Research Article

Satellite fault tolerant attitude control based on expert guided exploration of reinforcement learning agent

Received 18 Jul 2023, Accepted 03 Feb 2024, Published online: 09 Mar 2024

References

  • Agarwal, V., & Tewari, R. (2021). Improving energy efficiency in UAV attitude control using deep reinforcement learning. Journal of Scientific Research, 65(3), 209–219. https://doi.org/10.37398/JSR.2021.650325
  • Atkinson, R. L., Atkinson, R. C., Smith, E. E., Bem, D. J., & Nolen-Hoeksema, S. (1996). Hilgard’s introduction to psychology (12th ed.). Harcourt Brace College Publishers.
  • Bai, L., & Gao, Z. (2019). Integrated fault-tolerant stabilization control for satellite attitude systems with actuator and sensor faults. Journal of Control, Automation and Electrical Systems. https://doi.org/10.1007/s40313-019-00498-3
  • Carlucho, I., De Paula, M., & Acosta, G. (2020). An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots. ISA Transactions, 102, 280–294. https://doi.org/10.1016/j.isatra.2020.02.017
  • Castaldi, P., Mimmo, N., & Simani, S. (2019). LEO satellite active FTC with aerodynamic disturbance decoupled fault diagnosis. European Journal of Control, 51, 76–94. https://doi.org/10.1016/j.ejcon.2019.06.005
  • Chai, Y., Luo, J., & Ma, W. (2022). Data-driven game-based control of microsatellites for attitude takeover of target spacecraft with disturbance. ISA Transactions, 119, 93–105. https://doi.org/10.1016/j.isatra.2021.02.037
  • Chai, R., Tsourdos, A., Savvaris, A., Chai, S., Xia, Y., & Chen, C. L. (2019). Six-DOF spacecraft optimal trajectory planning and real-time attitude control: A deep neural network-based approach. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2019.2955400
  • Cheng, L., Wang, Z., Song, Y., & Jiang, F. (2020). Real-time optimal control for irregular asteroid landings using deep neural networks. Acta Astronautica, 170, 66–79. https://doi.org/10.1016/j.actaastro.2019.11.039
  • Chen, J., Xu, Q., Xue, X., Guo, Y., & Chen, R. (2022). Quantum-behaved particle swarm optimization of convolutional neural network for fault diagnosis. Journal of Experimental & Theoretical Artificial Intelligence, 1–17. https://doi.org/10.1080/0952813X.2022.2120089
  • Dayal, A., Cenkeramaddi, L., & Jha, A. (2022). Reward criteria impact on the performance of reinforcement learning agent for autonomous navigation. Applied Soft Computing, 126, 109241. https://doi.org/10.1016/j.asoc.2022.109241
  • Fujimoto, S., van Hoof, H., & Meger, D. (2018). Addressing function approximation error in actor-critic methods. International conference on machine learning, Stockholm (pp. 1587–1596).
  • Gao, Z., Zhou, Z., Jiang, G., Qian, M., & Lin, J. (2018). Active fault tolerant control scheme for satellite attitude systems: Multiple actuator faults case. International Journal of Control, Automation and Systems, 16(4), 1794–1804. https://doi.org/10.1007/s12555-016-0667-5
  • Gaudet, B., Furfaro, R., Linares, R., & Scorsoglio, A. (2021). Reinforcement metalearning for interception of maneuvering exoatmospheric targets with parasitic attitude loop. Journal of Spacecraft and Rockets, 58(2), 386–399. https://doi.org/10.2514/1.A34841
  • Henna, H., Toubakh, H., Kafi, M., & Sayed-Mouchaweh, M. (2020). Towards fault-tolerant strategy in satellite attitude control systems: A review. Annual Conference of the PHM Society, 12. Nashville. https://doi.org/10.36001/phmconf.2020.v12i1.1272
  • Henna, H., Toubakh, H., Kafi, M., & Sayed-Mouchaweh, M. (2022). Unsupervised data-driven approach for fault diagnostic of spacecraft gyroscope. Annual Conference of the PHM Society, 14. Nashville. https://doi.org/10.36001/phmconf.2022.v14i1.3216
  • Hu, J., Huang, X., Li, M., Guo, M., Xu, C., Zhao, Y., Liu, W., & Wang, X. (2022). Entry vehicle control system design for the Tianwen-1 mission. Astrodynamics, 6(1), 27–37. https://doi.org/10.1007/s42064-021-0124-y
  • Hu, H., Liu, L., Wang, Y., Cheng, Z., & Luo, Q. (2020). Active fault-tolerant attitude tracking control with adaptive gain for spacecrafts. Aerospace Science and Technology, 98, 105706. https://doi.org/10.1016/j.ast.2020.105706
  • Hu, Q., & Xiao, B. (2011). Fault-tolerant sliding mode attitude control for flexible spacecraft under loss of actuator effectiveness. Nonlinear Dynamics, 64(1), 13–23. https://doi.org/10.1007/s11071-010-9842-z
  • Hu, Q., & Yue, W. (2007). Markov decision processes with their applications. Springer Science & Business Media.
  • Iannelli, P., Angeletti, F., & Gasbarri, P. (2022). A model predictive control for attitude stabilization and spin control of a spacecraft with a flexible rotating payload. Acta Astronautica, 199, 401–411. https://doi.org/10.1016/j.actaastro.2022.07.024
  • Jørgensen, J. (1962). Psykologi – paa biologisk Grundlag [Psychology on a biological basis]. Scandinavian University Books.
  • Kaveh, M., & Mesgari, M. S. (2023). Application of meta-heuristic algorithms for training neural networks and deep learning architectures: A comprehensive review. Neural Processing Letters, 55(4), 4519–4622. https://doi.org/10.1007/s11063-022-11055-6
  • Laud, A., & DeJong, G. (2003). The influence of reward on the speed of reinforcement learning: An analysis of shaping. The 20th International Conference on Machine Learning (ICML-03), Washington, DC (pp. 440–447).
  • Liang, X., Wang, Q., Hu, C., & Dong, C. (2019). Observer-based H∞ fault-tolerant attitude control for satellite with actuator and sensor faults. Aerospace Science and Technology, 95, 105424. https://doi.org/10.1016/j.ast.2019.105424
  • Lillicrap, T., Hunt, J., Pritzel, A., Heess, N., Erez, T., & Tassa, Y. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.
  • Liu, Y., Wang, H., Wu, T., Lun, Y., Fan, J., & Wu, J. (2022). Attitude control for hypersonic reentry vehicles: An efficient deep reinforcement learning method. Applied Soft Computing, 123, 108865. https://doi.org/10.1016/j.asoc.2022.108865
  • Matignon, L., Laurent, G., & Le Fort-Piat, N. (2006). Reward function and initial values: Better choices for accelerated goal-directed reinforcement learning. 16th International Conference on Artificial Neural Networks–ICANN 2006. Athens, Greece.
  • Ma, Z., Wang, Y., Yang, Y., Wang, Z., Tang, L., & Ackland, S. (2018). Reinforcement learning-based satellite attitude stabilization method for non-cooperative target capturing. Sensors, 18(12), 4331. https://doi.org/10.3390/s18124331
  • McDowell, J. (2020). The low earth orbit satellite population and impacts of the SpaceX Starlink constellation. The Astrophysical Journal Letters, 892(2), L36. https://doi.org/10.3847/2041-8213/ab8016
  • Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
  • Ng, A., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. International Conference on Machine Learning (ICML), Bled, Slovenia (pp. 278–287).
  • Nugroho, L., Andiarti, R., Akmeliawati, R., Kutay, A., Larasati, D., & Wijaya, S. (2023). Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem. Engineering Applications of Artificial Intelligence, 120, 105798. https://doi.org/10.1016/j.engappai.2022.105798
  • Osoro, O., & Oughton, E. (2021). A techno-economic framework for satellite networks applied to low earth orbit constellations: Assessing Starlink, OneWeb and Kuiper. IEEE Access, 9, 141611–141625. https://doi.org/10.1109/ACCESS.2021.3119634
  • Pachler, N., Del Portillo, I., Crawley, E., & Cameron, B. (2021). An updated comparison of four low earth orbit satellite constellation systems to provide global broadband. IEEE international conference on communications workshops (ICC workshops), Montreal, QC, Canada (pp. 1–7). IEEE.
  • Patel, H. R., & Shah, V. A. (2021). Application of metaheuristic algorithms in interval type-2 fractional order fuzzy TID controller for nonlinear level control process under actuator and system component faults. International Journal of Intelligent Computing and Cybernetics, 14(1), 33–53. https://doi.org/10.1108/IJICC-08-2020-0104
  • Patel, H. R., & Shah, V. A. (2022a). Fuzzy logic based metaheuristic algorithm for optimization of type-1 fuzzy controller: Fault-tolerant control for nonlinear system with actuator fault. IFAC-Papersonline, 55(1), 715–721. https://doi.org/10.1016/j.ifacol.2022.04.117
  • Patel, H. R., & Shah, V. A. (2022b). Shadowed type-2 fuzzy sets in dynamic parameter adaption in cuckoo search and flower pollination algorithms for optimal design of fuzzy fault-tolerant controllers. Mathematical and Computational Applications, 27(6), 89. https://doi.org/10.3390/mca27060089
  • Puterman, M. (2014). Markov decision processes: Discrete stochastic dynamic programming. John Wiley & Sons.
  • Randløv, J., & Alstrøm, P. (1998). Learning to drive a bicycle using reinforcement learning and shaping. 15th International Conference on Machine Learning, Madison, Wisconsin, USA (pp. 463–471).
  • Rouabah, B., Toubakh, H., Kafi, M., & Sayed-Mouchaweh, M. (2022). Adaptive data-driven fault-tolerant control strategy for optimal power extraction in presence of broken rotor bars in wind turbine. ISA Transactions, 130, 92–103. https://doi.org/10.1016/j.isatra.2022.04.008
  • Saeedvand, S., Mandala, H., & Baltes, J. (2021). Hierarchical deep reinforcement learning to drag heavy objects by adult-sized humanoid robot. Applied Soft Computing, 110, 107601. https://doi.org/10.1016/j.asoc.2021.107601
  • Shirobokov, M., Trofimov, S., & Ovchinnikov, M. (2021). Survey of machine learning techniques in spacecraft control design. Acta Astronautica, 186, 87–97. https://doi.org/10.1016/j.actaastro.2021.05.018
  • Shuprajhaa, T., Sujit, S., & Srinivasan, K. (2022). Reinforcement learning based adaptive PID controller design for control of linear/nonlinear unstable processes. Applied Soft Computing, 128, 109450. https://doi.org/10.1016/j.asoc.2022.109450
  • Silva, M., Shan, M., Cervone, A., & Gill, E. (2019). Fuzzy control allocation of microthrusters for space debris removal using CubeSats. Engineering Applications of Artificial Intelligence, 81, 145–156. https://doi.org/10.1016/j.engappai.2019.02.008
  • Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic policy gradient algorithms. International conference on machine learning, Beijing, China (pp. 387–395).
  • Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. Prentice Hall.
  • Staddon, J. E. (1983). Adaptive behavior and learning. Cambridge University Press.
  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.
  • Sutton, R., McAllester, D., Singh, S., & Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems (NIPS) conference, Denver.
  • Su, R., Wu, F., & Zhao, J. (2019). Deep reinforcement learning method based on DDPG with simulated annealing for satellite attitude control system. 2019 Chinese Automation Congress (CAC), Hangzhou, China (pp. 390–395). IEEE.
  • Tang, J., Zeng, J., Wang, Y., Yuan, H., Liu, F., & Huang, H. (2021). Traffic flow prediction on urban road network based on license plate recognition data: Combining attention-LSTM with genetic algorithm. Transportmetrica A: Transport Science, 17(4), 1217–1243. https://doi.org/10.1080/23249935.2020.1845250
  • Toubakh, H., & Sayed-Mouchaweh, M. (2015). Hybrid dynamic classifier for drift-like fault diagnosis in a class of hybrid dynamic systems: Application to wind turbine converters. Neurocomputing, 171, 1496–1516. https://doi.org/10.1016/j.neucom.2015.07.073
  • Wang, X., Cai, J., Wang, R., Shu, G., Tian, H., Wang, M., & Yan, B. (2023). Deep reinforcement learning-PID based supervisor control method for indirect-contact heat transfer processes in energy systems. Engineering Applications of Artificial Intelligence, 117, 105551. https://doi.org/10.1016/j.engappai.2022.105551
  • Wang, Y., Zhang, H., & Zhang, G. (2019). cPSO-CNN: An efficient PSO-based algorithm for fine-tuning hyper-parameters of convolutional neural networks. Swarm and Evolutionary Computation, 49, 114–123. https://doi.org/10.1016/j.swevo.2019.06.002
  • Watkins, C., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3), 279–292. https://doi.org/10.1007/BF00992698
  • Wei, C., Chen, Q., Liu, J., Yin, Z., & Luo, J. (2021). An overview of prescribed performance control and its application to spacecraft attitude system. Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, 235(4), 435–447. https://doi.org/10.1177/0959651820952552
  • Yang, X., Yang, Y., Ye, D., Xiao, Y., & Sun, Z. (2022). Non-singular and continuous back-stepping predefined-time attitude tracking control for rigid spacecraft with predefined bound. 48th Annual Conference of the IEEE Industrial Electronics Society, Brussels, Belgium (pp. 1–6). IEEE.
  • Liu, Y., Ma, G., Lyu, Y., & Wang, P. (2022). Neural network-based reinforcement learning control for combined spacecraft attitude tracking maneuvers. Neurocomputing, 484, 67–78. https://doi.org/10.1016/j.neucom.2021.07.099
  • Zhang, Z., Li, X., An, J., Man, W., & Zhang, G. (2020). Model-free attitude control of spacecraft based on PID-guide TD3 algorithm. International Journal of Aerospace Engineering, 2020, 1–13. https://doi.org/10.1155/2020/8874619
