
QIAN WU1, XIAOFENG LI1,*, QI ZHU1, JUN YUAN1, RUOXI XU2, XIUTAO FU1, YI LI1, TIANYI QI1
1. Beijing Institute of Control Engineering, Beijing, 100094, China.
2. Beijing Institute of Control and Electronic Technology, 100038, China.
Xiaofeng_Li_bice@163.com

First author: QIAN WU, wuqian945@163.com
Second author and corresponding author: XIAOFENG LI, Xiaofeng_Li_bice@163.com
Third author: QI ZHU, Zhu_Qi_bice@163.com
Fourth author: JUN YUAN, Yuanjun502@gmail.com
Fifth author: RUOXI XU, xurxbeijing@163.com
Sixth author: XIUTAO FU, fu_bice_beijing@163.com
Seventh author: YI LI, liyi_1984_502@163.com
Eighth author: TIANYI QI, Qi_Tianyi@163.com

Acknowledgement:
This research was supported by the National Natural Science Foundation of China (NSFC) project – Research on On-Orbit Adaptive Evolution Theories and Methods for Space Vehicle Control Software.

Abstract
The integration of intelligent algorithms into satellite attitude control systems has revolutionized spacecraft autonomy, addressing the limitations of traditional control methods in handling uncertainties, nonlinearities, and computational constraints of modern space missions. This comprehensive review examines the evolution of satellite attitude control software technologies from 2010 to 2025, tracking the progression from classical filtering approaches to sophisticated intelligent systems. The analysis reveals three distinct developmental phases: the emergence of neural network-enhanced filtering (2010-2015), the adoption of deep reinforcement learning frameworks (2016-2020), and the implementation of distributed federated learning architectures (2021-2025). Key technological advances include the successful deployment of TensorFlow Lite on resource-constrained platforms like OPS-SAT, the compression of development cycles from years to months through automated code generation tools, and the achievement of arcsecond-level control accuracy through transformer-based architectures. The transition from MATLAB prototypes to embedded C++ implementations, coupled with the evolution from centralized PID controllers to distributed edge computing systems, has enabled real-time execution of complex neural networks on satellite platforms with power constraints below 20 watts. While significant progress has been demonstrated in operational missions including WorldView, PRISMA, and Starlink constellations, critical challenges remain in certifying machine learning components under existing aerospace standards and ensuring long-term stability of learning-based controllers in the space environment. Future developments must focus on establishing standardized verification frameworks for stochastic algorithms and developing explainable AI techniques suitable for safety-critical space applications.
Keywords: satellite attitude control; intelligent algorithms; deep reinforcement learning; real-time software implementation; federated learning
1. Introduction
The application of intelligent algorithms to the control of satellite attitude represents a revolutionary breakthrough in the engineering of spacecraft, driven by the growing complexities inherent in current space missions and the shortcomings of classical control methodology. Although conventional systems of assessing and controlling the attitude of spacecraft are mathematically based and generally accepted, they are severely hampered by the uncertainties, nonlinear behavior, and processing limitations that are common in current space missions [1]. A shift from classical control approaches to intelligent algorithm-based solutions represents a technical as well as conceptual breakthrough in the comprehension and realization of spacecraft autonomy and versatility.
The need for sophisticated algorithms in satellite attitude control becomes evident when considering the operational requirements of current and future space missions. Classical control techniques, based primarily on linear control theory and idealized dynamic models, face significant difficulties in maintaining optimal performance in the presence of unforeseen disturbances, faulty actuators, or changing mission objectives [2]. Such difficulties are particularly acute in emerging mission classes such as on-orbit servicing, formation flying, and mega-constellation management, where the computational requirements of classical approaches often exceed onboard processing resources and where those approaches lack the flexibility and robustness needed to operate effectively.
Over the last decade, satellite attitude control software has been marked by the gradual incorporation of machine learning paradigms into space systems. The evolution can be traced from early neural networks developed for fault detection to modern deep learning architectures capable of end-to-end attitude estimation and control [3]. Theoretical contributions of deep learning have illuminated the principles underlying these developments and opened up hierarchical feature extraction designs that can detect subtle spatial and temporal patterns in satellite telemetry data [4]. Recent experimental verifications of reinforcement learning-based controllers demonstrate both the theoretical importance of these contributions and their practical relevance, as such controllers have been shown to outperform conventional PID-based controllers in continuous reaction wheel control tasks [5].
The environment for the deployment of attitude control software has changed substantially, with modern architectures supporting both supervised learning for attitude estimation and reinforcement learning for control policy optimization. Neural network-based methods have shown a remarkable ability to handle pose estimation of non-cooperative spacecraft efficiently, achieving centimeter-level accuracy in relative navigation, a task that carries a very high computational cost with traditional filtering methods [6]. Recent developments have further complemented these capabilities with the introduction of digital twin technologies, which enable real-time simulation and evaluation of control strategies in virtual environments that accurately mirror the complex dynamics of space operations [7].
The evolution of architectural designs towards modular and scalable software infrastructure has greatly enabled the efficient and quick deployment as well as prototyping of smart control systems in a broad range of mission profiles. In addition, today’s simulation environments support extensive multiple-satellite configurations, thus allowing the deployment and verification of distributed control algorithms that efficiently accommodate changing constellation configurations as well as communications topologies [8]. Such software infrastructure advances have also been verified by the successful in-orbit demonstration of missions such as OPS-SAT, thus validating the possibility of deploying machine learning models on resource-limited satellite systems by employing optimized inference systems, such as TensorFlow Lite [9].
The fusion of advances in theoretical structures, software programs, and hardware parts has enabled unprecedented possibilities for increasing the effectiveness of satellite attitude control systems. At the same time, these advances have ushered in new challenges regarding the verification, validation, and certification procedures of intelligent systems. Current implementations have demonstrated the promise of multiobjective optimization for attitude control, seamlessly trading precision, energy efficiency, and computational power with sophisticated neural models that can cope with changing operational conditions [10]. This comprehensive review consolidates these advances, providing a systematic evaluation of the advances in intelligent algorithm-based satellite attitude control software technologies from 2010 to 2025. In addition, it pinpoints noteworthy deficiencies and sketches future research directions that are critical to driving the field toward autonomous operations of space vehicles.

  2. Theoretical Foundations and Evolution
    2.1 Deep Learning Algorithm Systems and Aerospace Applications
    Equipping satellite attitude control systems with deep learning represents a significant shift from model-based designs to data-driven systems that can cope with uncertain scenarios and complex behaviors. The evolution from simple neural networks to sophisticated transformer-based architectures (2015-2025) significantly enhanced the capacity to extract salient features and capture long-term patterns relevant to satellite control. Foundational work on human-level control through deep reinforcement learning [11] enabled the use of deep Q-networks for complex control problems by learning directly from high-dimensional sensory data without hand-crafted features. Such capability is highly applicable to satellites operating in uncertain space environments where conventional modeling approaches face considerable challenges.
    The development of transformer models over the last two years (2023-2024) has significantly changed the methods used in satellite attitude and pose estimation, reflected in modern applications that use self-attention mechanisms to accurately capture long-range relations in satellite telemetry observations. The transformer-network-based relative pose estimation model for satellite systems [12] has an advantage over convolutional neural networks because it avoids the built-in constraints of translation equivariance and limited receptive fields, achieving increased accuracy in non-cooperative satellite tracking by capturing spatial interactions globally. Table 1 presents a comprehensive comparison between CNN and Transformer architectures for satellite attitude control applications, highlighting their distinct characteristics and implementation requirements.
    Table 1: Architectural comparison between CNN and Transformer networks for satellite pose estimation
    Characteristic | CNN-based Architecture | Transformer-based Architecture
    Spatial Processing | Local receptive fields through convolutional kernels | Global attention mechanisms across entire input
    Temporal Dependency | Requires recurrent connections or temporal pooling | Native self-attention captures long-range dependencies
    Computational Complexity | O(n) for convolutional operations | O(n²) for self-attention, optimized with sparse patterns
    Memory Requirements | Lower memory footprint, suitable for embedded systems | Higher memory usage, requires optimization for satellites
    Feature Extraction | Hierarchical features through deep layers | Parallel multi-head attention for diverse features
    Inductive Bias | Translation equivariance, local connectivity | Minimal inductive bias, learns from data patterns
    Real-time Performance | Fast inference with optimized convolutions | Requires careful optimization for real-time constraints
    Adaptability | Fixed architecture post-training | Dynamic attention weights adapt to input
    Table 1 demonstrates that while CNNs offer computational efficiency crucial for resource-constrained satellite platforms, Transformers provide superior modeling capabilities for complex spatiotemporal relationships in satellite telemetry data, necessitating careful trade-offs in software implementation design.
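The global spatial mixing that distinguishes Transformers from CNNs in Table 1 can be illustrated with a single self-attention layer applied to a telemetry window. The sketch below is illustrative only; the dimensions, weight initialization, and the notion of six assumed attitude-sensor channels are assumptions for the example, not details of the cited architectures.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a telemetry sequence.

    x: (T, d) array of T telemetry timesteps with d features each.
    Every output timestep is a weighted mix of all timesteps, unlike a
    convolution, which only sees a local window of the sequence.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v               # (T, d_k) projections
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (T, T) pairwise affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over timesteps
    return weights @ v                                # globally mixed features

# Illustrative use: 128 timesteps of 6 assumed attitude-sensor channels.
rng = np.random.default_rng(0)
T, d, d_k = 128, 6, 16
x = rng.standard_normal((T, d))
w_q, w_k, w_v = (rng.standard_normal((d, d_k)) * 0.1 for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (128, 16)
```

The O(n²) score matrix in this sketch is precisely the memory cost flagged in Table 1, which onboard implementations typically mitigate with sparse or windowed attention patterns.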
    The computational efficiency challenge of implementing deep learning models on satellite platforms with limited resources has compelled the creation of lightweight architectures specially designed for onboard computers and embedded systems. The application of lightweight transformer models augmented with FastDTW algorithms for momentum wheel fault detection [13] successfully balances the trade-off between model size and inference speed, enabling real-time fault detection through optimized temporal pattern matching. In addition, the pyramid vision multitask transformer network proposed for satellite missions [14] builds upon this concept by incorporating a hierarchical feature extraction mechanism that analyzes multi-resolution satellite imagery through parallel pathways while simultaneously running pose estimation and fault detection tasks on shared computational resources, thus making the most of the limited processing power available on satellite systems.
    2.2 Reinforcement Learning Control Framework Development
    Reinforcement learning provides a framework for learning control policies through direct interaction with the operating environment. It eliminates the requirement for labeled training data and fixed control strategies. The soft actor-critic (SAC) algorithm applied to satellite control [15] incorporates entropy regularization into the reinforcement learning objective, which promotes exploration while keeping learning stable. This is particularly significant for satellite attitude control software, since suboptimal actions taken during training may cause severe mission failures. The maximum entropy framework of SAC alters the optimization objective by adding the policy's entropy to the reward function, fostering robust policies that support diverse action strategies rather than fixed behaviors that may not cope with variations in satellite operations.
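A minimal sketch of the entropy-augmented objective described above is given below; the reward definition (negative pointing error), the temperature value, and the function name are illustrative assumptions rather than details of the cited SAC implementation.

```python
import numpy as np

def soft_return(rewards, log_probs, alpha=0.2, gamma=0.99):
    """Discounted maximum-entropy return used as a SAC-style learning target.

    rewards:   per-step rewards, e.g. the negative attitude pointing error
    log_probs: log pi(a_t | s_t) of the actions actually taken
    alpha:     temperature weighting the entropy bonus against the reward
    """
    g = 0.0
    for r, logp in zip(reversed(rewards), reversed(log_probs)):
        # the -alpha * log pi term rewards keeping the policy exploratory
        g = (r - alpha * logp) + gamma * g
    return g

# Illustrative call with dummy trajectory data.
print(soft_return(rewards=[-0.3, -0.1, -0.05], log_probs=[-1.2, -0.9, -0.7]))
```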
    Table 2 presents a comprehensive comparison of reinforcement learning algorithms implemented in satellite attitude control software, highlighting the fundamental differences in their theoretical frameworks and practical software implementations.
    Table 2: Theoretical comparison of RL algorithms (PPO, SAC, TD3) for satellite control
    Algorithm | PPO | SAC | TD3
    Policy Type | Stochastic | Stochastic | Deterministic
    Optimization Objective | Clipped surrogate loss | Maximum entropy | Q-function with delayed updates
    Exploration Strategy | Policy stochasticity | Entropy bonus | Action noise
    Update Frequency | Every N steps | Every step | Delayed policy updates
    Sample Efficiency | Moderate | High | High
    Stability | High (trust region) | High (soft updates) | High (twin critics)
    Satellite Application | Discrete thruster control | Reaction wheel control | Complex maneuvers
    Computational Load | Low to moderate | Moderate | Moderate to high
    Table 2 reveals the distinct characteristics of each algorithm in satellite applications, with PPO employing trust region optimization suitable for discrete thruster control, SAC utilizing entropy maximization for reaction wheel control, and TD3 addressing function approximation errors through twin critics for complex satellite maneuvers.
    The application of the proximal policy optimization (PPO) algorithm in satellite systems [16] addresses the inherent instability of policy gradient methods by using a clipped surrogate objective that constrains policy changes within a trust region. This ensures progressively better expected rewards while protecting against the ill effects of catastrophic forgetting with respect to learned skills. The application of deep reinforcement learning for satellite proximity maneuvers [17] showcases the usability of these algorithms in safety-critical environments, where constraint-based training approaches run in simulation environments like Basilisk ensure collision avoidance without jeopardizing trajectory efficiency optimization, which is achieved through the careful design of reward shaping and curriculum learning techniques.
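The clipped surrogate objective that gives PPO its trust-region behavior can be written compactly; the sketch below is a generic formulation of the standard PPO objective, with the clip width and variable names chosen for illustration rather than taken from the cited satellite implementation.

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """Clipped surrogate objective from PPO (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s) for the sampled control actions
    advantage: estimated advantage of those actions
    clip_eps:  trust-region width limiting how far the policy may move
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # taking the elementwise minimum makes the update pessimistic about
    # large policy shifts, which is what prevents catastrophic updates
    return np.minimum(unclipped, clipped).mean()
```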
    The combination of domain-specific knowledge with reinforcement learning techniques has been identified as essential for improving sample efficiency and enabling safe exploration when developing satellite control software. The deep reinforcement learning framework for satellite attitude control [18] embeds physical constraints directly in its neural network design, using domain-specific activation functions that respect actuator saturation limits and angular momentum conservation principles, thereby providing an implicit safety mechanism within the software that prevents the generation of physically unrealistic control signals. The PID-guided TD3 algorithm adapted for satellite use [19] exemplifies the combination of classical control theory with modern deep learning, using a proportional-integral-derivative controller as the base policy to guide exploration during the initial learning phases; this stable prior accelerates convergence while retaining the ability to discover control strategies beyond traditional approaches.
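One way to realize the PID-guided exploration described above is to blend the PID prior with the learned actor's output and anneal the blend toward the actor over training; the class below is a minimal sketch of that idea, and the gains, decay rate, and normalized actuator limits are illustrative assumptions rather than values from the cited work.

```python
import numpy as np

class PIDGuidedExploration:
    """Blend a PID prior with a learned actor during early training.

    The mixing weight beta decays toward zero, so exploration starts
    close to the safe PID behaviour and gradually hands control over
    to the learned policy.
    """
    def __init__(self, kp=2.0, kd=0.5, beta=1.0, decay=0.995):
        self.kp, self.kd = kp, kd
        self.beta, self.decay = beta, decay

    def pid_action(self, att_error, rate_error):
        return -self.kp * att_error - self.kd * rate_error

    def action(self, actor_action, att_error, rate_error):
        prior = self.pid_action(att_error, rate_error)
        blended = self.beta * prior + (1.0 - self.beta) * actor_action
        self.beta *= self.decay                  # anneal toward the actor
        return np.clip(blended, -1.0, 1.0)       # normalized actuator limits
```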
    2.3 Evolution of Hybrid Intelligent Methods
    The evolution from simple weighted combinations (2011-2015) to sophisticated adaptive fusion architectures (2020-2025) represents a significant advancement in addressing the complexity and uncertainty inherent in satellite attitude control systems. The foundations for hybrid intelligent methods can be traced to early fuzzy clustering techniques [20] that enabled model identification without explicit system dynamics knowledge. Building upon these concepts, early implementations of neural network-based adaptive output feedback control for satellite formation flying [21] established the feasibility of combining traditional feedback linearization with neural network approximators to handle unmodeled dynamics, though these approaches were limited by the requirement for persistent excitation and struggled with transient performance during the adaptation phase. The adaptive critic-based approach to satellite orbital rendezvous problems [22] advanced this paradigm by introducing a dual-network architecture that simultaneously learns the value function and control policy, enabling optimal control synthesis without requiring explicit knowledge of satellite dynamics while maintaining theoretical convergence guarantees through Lyapunov-based stability analysis.
    The contemporary integration of physics-informed neural networks with normalizing flows for satellite control [23] represents a sophisticated software approach to incorporating physical laws as inductive biases while maintaining the flexibility to learn complex, high-dimensional distributions of satellite states and control actions. This framework leverages the bijective nature of normalizing flows to transform simple distributions into complex ones through a series of invertible transformations, as expressed by:

\[ p_X(x) = p_Z\left(f^{-1}(x)\right)\,\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right| \]

where $f$ represents the learned invertible transformation and $p_Z$ denotes the base distribution, enabling efficient sampling and likelihood estimation for satellite trajectory optimization under uncertainty. The physics-informed component ensures that learned representations respect fundamental conservation laws and dynamic constraints specific to satellite operations, reducing the sample complexity required for training while improving generalization to unseen operational conditions.
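The change-of-variables relation above translates directly into a log-likelihood computation; the sketch below shows it for a toy affine flow, where the transformation, its log-determinant, and the standard-normal base density are all assumptions made for illustration.

```python
import numpy as np

def flow_log_likelihood(x, inverse_fn, log_det_jac_fn, base_log_prob):
    """log p_X(x) = log p_Z(f^{-1}(x)) + log |det d f^{-1}(x)/dx|."""
    z = inverse_fn(x)
    return base_log_prob(z) + log_det_jac_fn(x)

# Toy affine flow x = s*z + b, so f^{-1}(x) = (x - b)/s.
s, b = np.array([2.0, 0.5]), np.array([0.1, -0.3])
inverse_fn = lambda x: (x - b) / s
log_det_jac_fn = lambda x: -np.sum(np.log(np.abs(s)))              # constant here
base_log_prob = lambda z: -0.5 * np.sum(z**2 + np.log(2 * np.pi))  # N(0, I)

print(flow_log_likelihood(np.array([0.4, 0.2]), inverse_fn,
                          log_det_jac_fn, base_log_prob))
```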
The application of imitation learning to handle unknown perturbations in satellite attitude control software [24] illustrates how learning from expert demonstrations can be made practical, including methods for online adaptation that bridge the gap between training conditions and real-world operation. This approach addresses the central difficulty of transferring policies learned in simulation to operational satellites by combining learning from high-fidelity simulations with fine-tuning on limited real flight data, providing a means of continual onboard learning that functions across varying conditions without discarding previously acquired knowledge. The layered design of these hybrid approaches allows complex satellite control tasks to be decomposed into simpler components, each handled by a dedicated neural module that can be trained and tested independently before the modules are integrated into the complete attitude control software system.

  3. Progress in Attitude Determination Software
    3.1 Development of Intelligent Sensor Data Processing
    The software for processing satellite attitude determination sensor data has undergone a profound shift, evolving from traditional filtering methods to architectures enhanced by intelligent algorithms, prompted by the increasing complexity of satellite missions and the greater computational power of modern satellite platforms. Prior to 2018, classical Kalman filtering and its extensions dominated this domain, providing reliable state estimation under Gaussian noise assumptions; nonetheless, they struggled with the nonlinear dynamics and non-Gaussian disturbances that are common in real-world satellite missions. The adoption of neural-network-enhanced filtering topologies after 2018 represents a significant shift in sensor data processing. In particular, adaptive quantized attitude takeover control schemes using cellular satellites [25] implement neural-network-derived adaptation mechanisms that adjust filter and controller parameters based on the system's real-time performance.
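A minimal sketch of the kind of adaptation described above is shown below: an EKF measurement update whose measurement noise covariance is scaled by an externally supplied factor. In the cited schemes a small network maps recent residual statistics to that factor; here the factor is simply an input, and all names and shapes are illustrative.

```python
import numpy as np

def adaptive_ekf_update(x, P, z, H, R_nominal, noise_scale):
    """EKF measurement update with an adapted measurement noise covariance.

    noise_scale stands in for the learned adaptation: values > 1 tell the
    filter to trust the sensor less, values < 1 to trust it more.
    """
    R = R_nominal * noise_scale                # adapted sensor noise
    y = z - H @ x                              # innovation
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```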
    The progression from MATLAB prototype implementations to embedded C++ represents a significant leap in the practical deployment of intelligent sensor processing on real satellite missions. The use of Time-Series Graph Convolutional Networks (TS-GCN) [26] provides one such example, allowing efficient processing while retaining high performance across a range of signal-to-noise ratios by using graph-based representations that capture the time-dependent relationships inherent in satellite telemetry streams. Table 3 provides a qualitative comparison of typical software frameworks utilized in satellite attitude determination systems, including the relative trade-offs that accompany the different implementation methodologies.
    Table 3: Comparison of Typical Software Frameworks for Satellite Attitude Determination
    Framework | Algorithm Base | Relative Complexity | Relative Accuracy | Real-time Capability | Implementation Maturity
    Classical EKF | Extended Kalman Filter | Low | Moderate | High | Mature/Embedded
    UKF | Unscented Transform | Medium | High | High | Mature
    Particle Filter | Monte Carlo Methods | High | High | Limited | Research/Operational
    NN-Enhanced EKF | Neural Network + EKF | Medium-High | Very High | Medium | Emerging
    Deep Learning | CNN/RNN/Transformer | High | Highest | Variable | Research/Prototype
    Table 3 illustrates the progression from traditional computationally effective methods to progressively sophisticated approaches based on neural networks, each being distinct with its own benefit based on mission parameters, computational resources available, and required accuracy. This emphasizes the importance of a comprehensive analysis of mission-specific needs and requirements in selecting the most appropriate software framework.
    The application of deep learning techniques to attitude estimation of co-orbiting satellites [27] has enabled increased coordination of satellite constellations. Such an enhancement is achieved by incorporating shared observations drawn from several platforms, which provide attitude estimates more accurately by applying collaborative filtering mechanisms inherent in distributed software systems.
    3.2 Evolution of Attitude Estimation Software Design
    The architectural evolution of attitude estimation software reflects the broader transition from monolithic batch processing systems to modular, real-time frameworks capable of adaptive model updates during operation. The shift from batch to real-time processing has been facilitated by advances in onboard computing hardware and the development of efficient software pipelines that leverage tensor processing units for accelerated neural network inference [28], enabling high-frequency attitude estimation while maintaining improved accuracy through optimized matrix operations specifically tailored for spacecraft dynamics.
    The integration of physics-informed neural networks into attitude estimation software [29] represents a sophisticated approach to incorporating domain knowledge while maintaining the flexibility of data-driven methods, constraining the solution space to physically realizable trajectories and significantly reducing training data requirements compared to purely data-driven approaches. Figure 1 illustrates the progressive improvement in attitude estimation accuracy achieved through successive generations of software implementations over the past fifteen years.

Figure 1: Performance Evolution of Attitude Estimation Accuracy (2010-2025)
Figure 1 demonstrates the continuous improvement trajectory in attitude estimation accuracy, with distinct phases corresponding to the maturation of different algorithmic approaches: classical filtering refinements in the early period, the introduction of machine learning enhancements in the intermediate phase, and the current integration of deep learning architectures, each transition characterized by fundamental changes in software architecture and implementation strategies rather than mere parameter tuning.
The dramatic accuracy improvements illustrated in Figure 1 have been facilitated not only by algorithmic advances but also by the maturation of software development ecosystems. Open-source frameworks like ROS and Gazebo have accelerated this evolution by providing standardized testing environments where novel algorithms can be rapidly validated before deployment. This infrastructure proves particularly valuable for complex applications such as machine learning-based LEO object tracking [30], where ROS architectures enable seamless integration of neural networks with real-time sensor processing. Beyond traditional simulation, hybrid digital twin approaches [31] represent the convergence of these software advances, creating continuously updated virtual satellites that enhance both algorithm development and operational monitoring capabilities.
3.3 Advances in Fault Diagnosis Software Technology
The transformation from rule-based fault detection systems to data-driven diagnostic frameworks has revolutionized the reliability and autonomy of satellite attitude control systems, with modern implementations capable of identifying and compensating for multiple simultaneous failures without ground intervention. The development of fault-tolerant attitude control software handling concurrent actuator and sensor failures [32] demonstrates the sophistication of current approaches, employing adaptive mechanisms that reconfigure control strategies in real-time based on identified failure modes while maintaining mission objectives through graceful degradation strategies.
The implementation of LSTM-based deep learning for enhanced fault detection in satellite attitude control systems [33] leverages temporal dependencies in sensor data to identify subtle anomaly patterns indicative of incipient failures, demonstrating improved detection capabilities compared to traditional threshold-based methods while reducing false positive rates through learned contextual understanding of nominal operational variations. The application of Type-II fuzzy terminal sliding mode control for magnetorquer-based attitude systems [34] addresses the challenge of actuator uncertainty through interval-valued fuzzy sets that capture the inherent imprecision in magnetic field models and actuator responses.
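A common way to exploit the temporal dependencies described above is a next-step prediction model whose residuals serve as anomaly scores; the Keras sketch below illustrates this pattern, with the window length, channel count, layer sizes, and residual-based scoring being assumptions for the example rather than parameters of the cited study.

```python
import numpy as np
import tensorflow as tf

def build_lstm_detector(window=64, channels=9):
    """LSTM that predicts the next telemetry sample from a sliding window;
    large prediction residuals flag incipient faults."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, channels)),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(channels),   # predicted next telemetry sample
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def anomaly_scores(model, windows, next_samples):
    """Residual norm per window; thresholding these yields fault alarms."""
    preds = model.predict(windows, verbose=0)
    return np.linalg.norm(preds - next_samples, axis=1)
```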
The evolution from offline to online diagnostic capabilities has been enabled by advances in model compression and edge computing, with nonlinear model predictive controllers improving tracking ability through genetic algorithm optimization [35] demonstrating real-time fault compensation capabilities on resource-constrained platforms. Table 4 presents documented deployments of intelligent fault diagnosis software in operational satellite missions.
Table 4: Real Mission Applications of Intelligent Fault Diagnosis Software
Application Context | Period | Implementation Approach | Reference
OPS-SAT Mission | 2019-present | TensorFlow Lite for onboard ML-based anomaly detection | Labrèche & Evans (2022)
Satellite Digital Twin Framework | 2024 | Network-based diagnosis with virtual-physical coupling | He et al. (2024)
Neural Adaptive Control Systems | 2020 | Adaptive neural architectures for fault accommodation | Raja & Singh (2020)
Table 4 presents documented applications of intelligent fault diagnosis software in both operational missions and research frameworks, revealing the progressive maturation of these technologies. Among these implementations, the OPS-SAT CubeSat mission stands out as a pivotal demonstration, successfully deploying TensorFlow Lite for onboard machine learning-based diagnostics and proving that autonomous fault detection can operate effectively within the severe computational constraints of small satellites. The mission’s broader impact on machine learning development has been demonstrated through data-centric competitions [36] that established benchmarks for training robust models with limited labeled data typical of space applications. This breakthrough has inspired complementary research directions, including digital twin frameworks for network-based diagnosis [37] that leverage virtual-physical synchronization for predictive maintenance, and adaptive neural architectures [38] that provide real-time fault accommodation through dynamic reconfiguration, collectively establishing a comprehensive ecosystem for resilient satellite operations.

  4. Progress in Attitude Control Software
    4.1 Evolution of Intelligent Controller Software Architecture
    The transformation of satellite attitude control software architecture from monolithic centralized systems to distributed intelligent frameworks represents a fundamental paradigm shift driven by increasing mission complexity and the availability of multi-core processing capabilities on modern satellite platforms. The implementation of Takagi-Sugeno fuzzy model predictive control [39] exemplifies this architectural evolution, where traditional single-algorithm implementations have been replaced by hybrid frameworks capable of switching between multiple control strategies based on real-time operational conditions, achieving robust performance across varying satellite dynamics through the integration of fuzzy inference systems with predictive control horizons. This evolution is illustrated in Figure 2, which traces the architectural transformation over the past fifteen years.

Figure 2: Evolution of Control Software Architecture (2010-2025)
Figure 2 illustrates the architectural evolution across three distinct phases: (a) Centralized architecture (2010-2015) routes all sensor inputs through a single CPU executing PID control before commanding actuators; (b) Hybrid architecture (2016-2020) distributes processing between dual CPUs running fuzzy logic and neural network algorithms in parallel with inter-processor communication; (c) Distributed architecture (2021-2025) employs four edge AI nodes coordinated through federated learning, with sensors and actuators directly connected to local processors, enabling autonomous decision-making while maintaining global optimization through the central coordinator.
Building upon the multi-algorithm switching capability shown in Figure 2, parameterized fuzzy logic controllers [40] have emerged as a practical solution for real-time parameter adaptation in satellite control systems, introducing self-tuning mechanisms that eliminate the traditional requirement for extensive ground-based parameter optimization while maintaining stability guarantees through Lyapunov-based design constraints. The integration of optimal adaptive fuzzy controllers [41] further advances this paradigm by incorporating online learning capabilities that continuously refine control parameters based on observed system performance, demonstrating the maturation of intelligent control architectures from static rule-based systems to dynamic learning frameworks capable of handling unprecedented mission scenarios without human intervention.
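The gain-scheduling idea behind such fuzzy controllers can be sketched as a Takagi-Sugeno style blend of local PD gain sets weighted by membership in "small error" and "large error" regions; the breakpoints and gains below are placeholders for illustration, not tuned flight values.

```python
import numpy as np

def fuzzy_blended_gains(error_mag, breakpoints=(0.01, 0.1),
                        gains_small=(8.0, 2.5), gains_large=(3.0, 1.0)):
    """Takagi-Sugeno style blending of two local (kp, kd) gain sets.

    Membership transitions linearly between the breakpoints (attitude
    error magnitude in rad); the returned gains are the membership-weighted mix.
    """
    lo, hi = breakpoints
    mu_large = np.clip((error_mag - lo) / (hi - lo), 0.0, 1.0)
    mu_small = 1.0 - mu_large
    kp = mu_small * gains_small[0] + mu_large * gains_large[0]
    kd = mu_small * gains_small[1] + mu_large * gains_large[1]
    return kp, kd

# Example: torque command from blended gains at a 0.05 rad pointing error.
kp, kd = fuzzy_blended_gains(0.05)
torque = -kp * 0.05 - kd * 0.002    # illustrative error and rate values
```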
4.2 Adaptive Control Software Technology Development
The progression from fixed-parameter control laws to fully adaptive intelligent systems has been facilitated by advances in both theoretical frameworks and computational hardware, enabling the deployment of sophisticated learning algorithms that were previously confined to ground-based simulations. The implementation of deep reinforcement learning on FloatSat testbeds [42] has validated the feasibility of deploying complex neural network-based controllers in representative space environments, achieving convergence to optimal control policies within operational constraints while demonstrating robustness to sensor noise and actuator uncertainties through domain randomization during training phases. Table 5 presents a comprehensive analysis of computational requirements and performance characteristics across different control algorithm implementations.
Table 5: Computational Requirements and Performance Trade-offs of Control Algorithms
Algorithm Type | Parameter Adaptation | CPU Load | Memory Usage | Response Time | Robustness Level
Classical PID | Fixed | Low (<10%) | Minimal (<1MB) | Fast (<1ms) | Limited
Fuzzy Logic | Rule-based | Medium (20-30%) | Low (5-10MB) | Medium (5-10ms) | Good
Neural Network | Online learning | High (40-60%) | Medium (50-100MB) | Variable (10-50ms) | Very Good
Reinforcement Learning | Continuous | Very High (>70%) | High (>200MB) | Initially slow, converging to fast | Excellent
Table 5 reveals the fundamental trade-offs between computational efficiency and adaptive capabilities, with traditional approaches offering predictable resource usage suitable for heritage systems while modern learning-based methods provide superior robustness at the cost of increased computational demands, necessitating careful mission-specific optimization of the control-computation balance.
Although the FloatSat experiments prove effective in controlled laboratory settings, operational satellite missions present additional complexities beyond ideal testing conditions. Among these, variations in mass properties during fuel depletion constitute one of the most difficult control situations, because the system dynamics change continuously as the mission progresses. Deep reinforcement learning-based approaches [43] have proved significant in solving this problem, allowing controllers to adapt automatically to mass changes caused by fuel depletion or payload ejection without requiring ground-station adjustments. Building on this adaptive capability, the data-driven prescribed performance control scheme [44] greatly enhances robustness by incorporating mathematical guarantees on transient and steady-state performance into the control synthesis, ensuring that mission-critical pointing precision is maintained even as adaptation progresses within the limited computational resources available on satellite processors.
4.3 Typical Application Validation and Performance Evaluation
The maturation of intelligent control software is most evident in its successful deployment across diverse mission profiles, from agile Earth observation satellites requiring rapid retargeting capabilities to complex formation flying missions demanding precise relative control. The implementation of coupled attitude-orbit control for on-orbit servicing spacecraft [45] demonstrates the capability of modern control software to handle multiple coupled dynamics simultaneously, achieving centimeter-level docking accuracy through the integration of nonlinear model predictive control with adaptive neural networks that compensate for unmodeled dynamics and external disturbances. As mission scales expand from individual spacecraft to massive constellations, the predictive maneuvering framework developed for mega-constellations [46] addresses the unprecedented challenge of coordinating thousands of satellites, utilizing distributed learning algorithms to optimize station-keeping and collision avoidance maneuvers while minimizing propellant consumption across the entire constellation.
The computational demands of coordinating such large-scale systems have driven the evolution toward federated learning architectures, representing a paradigm shift in how satellite constellations approach collective intelligence. Hierarchical federated learning systems [47] enable satellites to collaboratively improve their control algorithms through shared learning while preserving operational autonomy and communication efficiency, demonstrating that distributed intelligence can emerge from local learning processes. While these advanced learning frameworks push the boundaries of autonomous control, particle swarm optimization for PID tuning [48] bridges classical and intelligent control paradigms, demonstrating that even traditional controllers can benefit from intelligent optimization techniques to achieve near-optimal performance across varying operational conditions. Table 6 summarizes the performance achievements of these implementations across representative missions.
Table 6: Representative Mission Implementations of Intelligent Control Software
Mission Category | Example Missions | Control Method | Key Achievement | Software Platform
Agile Imaging | WorldView-3/4 | Adaptive Neural Control | <0.01° pointing, 4°/s slew | Custom RTOS
Formation Flying | PRISMA, TanDEM-X | Distributed RL | 10 cm relative control | Linux-based
Mega-Constellations | Starlink Phase 2 | Federated Learning | Autonomous collision avoidance | Proprietary
On-orbit Servicing | Research Demonstrations | Coupled NMPC-NN | cm-level docking | ROS-compatible
Table 6 illustrates the successful translation of intelligent control algorithms from theoretical concepts to operational systems, with each mission category demonstrating specific advantages over traditional methods, particularly in scenarios requiring autonomous adaptation or multi-spacecraft coordination. Among these implementations, constellation-scale applications face unique challenges in balancing computational autonomy with communication constraints. The resource-efficient federated learning framework [49] directly addresses this challenge through compression techniques and asynchronous update protocols that reduce inter-satellite communication overhead by orders of magnitude, enabling satellites to collectively improve their control performance without constant ground intervention, thus establishing the foundation for truly autonomous constellation operations.
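The compression and asynchronous aggregation ideas mentioned above can be sketched in a few lines: a top-k sparsifier that discards most of each model update before transmission, and a staleness-weighted averaging step at the aggregator. The k-fraction, the staleness weighting rule, and the function names are assumptions for illustration, not the protocol of the cited framework.

```python
import numpy as np

def top_k_sparsify(update, k_fraction=0.01):
    """Keep only the largest-magnitude entries of a model update,
    drastically reducing the bits sent over inter-satellite links."""
    flat = update.ravel()
    k = max(1, int(k_fraction * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.reshape(update.shape)

def staleness_weighted_average(global_weights, updates, staleness):
    """Asynchronous-style aggregation: older (staler) updates get less weight."""
    w = np.array([1.0 / (1.0 + s) for s in staleness])
    w /= w.sum()
    aggregate = sum(wi * ui for wi, ui in zip(w, updates))
    return global_weights + aggregate
```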

  5. Key Implementation Technologies and Platform Evolution
    5.1 Evolution of Computing Hardware Platforms
    The transformation of satellite onboard computing platforms from traditional single-core processors to heterogeneous architectures incorporating GPUs, FPGAs, and specialized AI accelerators represents a fundamental enabler for deploying intelligent algorithms in space environments. The period from 2015 to 2020 saw significant incorporation of GPU and FPGA technologies into satellite systems, driven by the demands of real-time image processing and neural network inference that exceed the capabilities of radiation-hardened CPUs operating below 1 GHz. Modern satellite designs increasingly rely on dedicated AI processors aimed at efficient neural network execution, delivering processing in the teraflops range while keeping power dissipation below 20 watts, a critical constraint because power generation and thermal control remain major challenges on satellite platforms.
    The computational capacity evolution of satellite platforms has progressed from megaflops in the early 2010s to current systems exceeding 10 GFLOPS for advanced missions, enabling the deployment of deep learning models that were previously confined to ground-based processing centers. This exponential growth in processing capability, coupled with improvements in radiation tolerance through redundancy and error correction mechanisms, has fundamentally altered the landscape of what algorithms can feasibly execute in the space environment. Table 7 illustrates the comparative analysis of different processing architectures currently employed in satellite systems.
    Table 7: Evolution of Satellite Computing Platforms (2010-2025)
    Period | Platform Type | Performance Level | Power Efficiency | Key Enabler | Typical Application
    2010-2014 | Rad-hard CPU | Sub-GFLOPS | Baseline | Heritage designs | Classical control
    2015-2017 | Hybrid CPU-FPGA | Low GFLOPS | Improved | COTS integration | Signal processing
    2018-2020 | GPU-enhanced | Medium GFLOPS | Moderate | CUDA cores | Image processing
    2021-2023 | AI Accelerators | High GFLOPS/TOPS | Optimized | Tensor cores | Neural inference
    2024-2025 | Neuromorphic | Very High TOPS | Highly efficient | Event-driven | Autonomous control
    Table 7 shows the fifteen-year evolution of satellite computing platforms, with a notable shift from traditional processors below the GFLOPS barrier to TOPS-level performance achieved with neuromorphic solutions. The progression through hybrid CPU-FPGA, GPU-based, and domain-specific AI accelerator systems reflects the increasing processing demands of progressively sophisticated algorithms, with each new generation delivering order-of-magnitude performance gains while maintaining or improving energy efficiency. This progress has made it possible to deploy complex neural networks and autonomous control systems that were computationally infeasible on earlier satellite generations.
    5.2 Software Development Environment and Toolchain
    Software frameworks for satellite applications have evolved from mission-specific, proprietary software to standardized environments that support model-based design and automated code generation, reducing development times from several years to a few months. Incorporating machine learning frameworks such as TensorFlow Lite and ONNX Runtime into embedded satellite systems has enabled smooth deployment of neural networks trained on ground infrastructure, while quantization and pruning techniques have achieved significant model size reductions without falling below acceptable accuracy levels. Finally, the automated code generation features of Simulink have become critical in mapping high-level control designs into flight-qualified C++ implementations, with MISRA-C compliance and DO-178C certification requirements addressed within the code generation process itself.
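As an illustration of the quantization step mentioned above, the sketch below applies TensorFlow Lite post-training quantization to a trained model; the saved-model path, the representative telemetry batches used for calibration, and the function name are placeholders rather than artifacts of any specific mission toolchain.

```python
import tensorflow as tf

def quantize_for_flight(saved_model_dir, representative_batches):
    """Post-training quantization of a trained controller/estimator network.

    representative_batches: an iterable of input arrays (e.g. telemetry
    windows) used to calibrate activation ranges during quantization.
    """
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    def representative_dataset():
        for batch in representative_batches:
            yield [batch]                      # one calibration sample at a time

    converter.representative_dataset = representative_dataset
    tflite_model = converter.convert()         # flatbuffer for the onboard computer
    return tflite_model
```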
    The implementation of continuous integration and deployment pipelines for satellite software, including mechanisms for on-orbit updates and version management, represents a paradigm shift from traditional approaches where software remained static throughout mission lifetimes. Modern satellite platforms support incremental software updates through redundant memory banks and rollback capabilities, enabling the deployment of improved algorithms and bug fixes without compromising mission continuity. The compression of development timelines has been facilitated by the adoption of software-in-the-loop and processor-in-the-loop testing methodologies that validate algorithm performance across the entire operational envelope before hardware integration, reducing the risk of costly failures during later development stages.
    5.3 Verification Testing and Reliability Assurance
    Intelligent algorithm verification and validation for satellite missions require exhaustive testing approaches that go beyond classical software verification in order to address the stochasticity of machine learning models and their associated failure modes. Modern satellite systems increasingly rely on distributed architectures in which multiple satellites cooperate through federated learning, introducing verification problems that require validating both individual and collective system behavior. These challenges stem from applying federated learning paradigms to satellite constellations [50], as the distributed learning architecture requires verification of the model's convergence properties while maintaining robustness under communication failures and Byzantine behaviors within the satellite network.
    Verification is further complicated when satellite systems are integrated with other platforms to augment their learning capacity. The combination of high-altitude platform stations (HAPS) with LEO satellite networks for federated collaborative learning [51] requires cross-platform verification techniques that evaluate algorithm performance in heterogeneous computing environments with differing communication latencies and processing capabilities, and that confirm the distributed learning process remains stable under these operational differences.
    Beyond functional correctness, the verification process must validate resource consumption constraints critical to satellite operations. Energy-aware federated learning protocols [52] require specialized verification procedures that assess not only algorithmic accuracy but also power efficiency across diverse operational scenarios, ensuring that learning processes do not exceed the stringent energy budgets of satellite platforms. Figure 3 presents the comprehensive verification and validation workflow for intelligent satellite control software.

Figure 3: V&V Workflow for Intelligent Satellite Control Software
Figure 3 presents the comprehensive V&V workflow through four metrics: (a) Fidelity progression increases from 15% at algorithm development to 96.5% at on-orbit validation; (b) Test coverage evolution shows cumulative growth across functional, performance, hardware, and environmental testing, reaching 273% total coverage; (c) Failure detection capability demonstrates steady improvement from 13% to 94% across testing stages; (d) Time-cost analysis reveals increasing resource requirements, with time extending from 8 to 36 weeks per stage while cost factors escalate from baseline to 28.5x, illustrating the trade-off between verification thoroughness and resource investment.

  6. Conclusions
    This comprehensive review has traced the remarkable evolution of intelligent algorithm-based satellite attitude control software technologies from 2010 to 2025, revealing a transformative progression from MATLAB prototypes to embedded C++ implementations capable of real-time execution on resource-constrained platforms. The successful deployment of frameworks such as TensorFlow Lite on OPS-SAT and the integration of ROS-based architectures in missions like PRISMA demonstrate the maturation of software development ecosystems, with modern toolchains including Simulink auto-code generation and ONNX Runtime enabling development cycles to compress from years to months. The documented advances in software implementation, from monolithic PID controllers to distributed federated learning systems, have achieved order-of-magnitude improvements in control accuracy while maintaining compatibility with flight-qualified hardware through sophisticated model compression and quantization techniques that reduce neural network sizes without compromising performance.
    Despite these software engineering achievements, significant challenges persist in certifying machine learning components under existing standards such as DO-178C, originally designed for deterministic software systems. The integration of Basilisk simulation environments with hardware-in-the-loop testing has partially addressed verification challenges, yet the stochastic nature of neural network outputs continues to complicate formal validation processes required for safety-critical space applications. Current implementations in WorldView and Starlink constellations demonstrate practical solutions through modular software architectures that enable incremental deployment of intelligent algorithms alongside traditional controllers, building operational confidence through gradual validation. Future developments will likely focus on standardizing software interfaces for AI components, automating the verification process through explainable AI techniques, and establishing continuous integration pipelines that support on-orbit software updates, fundamentally transforming how satellite control software is developed, validated, and maintained throughout extended mission lifetimes.
    References
    [1] Forbes, J.R., Fundamentals of spacecraft attitude determination and control [bookshelf]. IEEE Control Systems Magazine, 2015. 35(4): p. 56-58
    [2] Cooper, M.A. and B. Smeresky, An overview of evolutionary algorithms toward spacecraft attitude control. Advances in Spacecraft Attitude Control, 2020
    [3] Izzo, D., M. Märtens, and B. Pan, A survey on artificial intelligence trends in spacecraft guidance dynamics and control. Astrodynamics, 2019. 3(4): p. 287-299
    [4] LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. nature, 2015. 521(7553): p. 436-444
    [5] Tan, V., J.L. Labrador, and M.C. Talampas, MATA-RL: continuous reaction wheel attitude control using the mata simulation software and reinforcement learning. 2021
    [6] Sharma, S. and S. D’Amico, Neural network-based pose estimation for noncooperative spacecraft rendezvous. IEEE Transactions on Aerospace and Electronic Systems, 2020. 56(6): p. 4638-4658
    [7] Liu, W., et al., Digital twin of space environment: Development, challenges, applications, and future outlook. Remote Sensing, 2024. 16(16): p. 3023
    [8] Carneiro, J.V. and H. Schaub, Scalable architecture for rapid setup and execution of multi-satellite simulations. Advances in Space Research, 2024. 73(11): p. 5416-5425
    [9] Labrèche, G., et al. OPS-SAT spacecraft autonomy with TensorFlow lite, unsupervised learning, and online machine learning. in 2022 IEEE Aerospace Conference (AERO). 2022. IEEE
    [10] Soufi, O. and F.Z. Belouadha, An intelligent deep learning approach to spacecraft attitude control: The case of satellites. Journal of the Franklin Institute, 2024. 361(14): p. 107078
    [11] Mnih, V., et al., Human-level control through deep reinforcement learning. nature, 2015. 518(7540): p. 529-533
    [12] Ahmed, J., et al., Transformer network-aided relative pose estimation for non-cooperative spacecraft using vision sensor. International Journal of Aeronautical and Space Sciences, 2024. 25(3): p. 1146-1165
    [13] Gao, Y., et al., Fault Warning of Satellite Momentum Wheels with a Lightweight Transformer Improved by FastDTW. IEEE/CAA Journal of Automatica Sinica, 2025. 12(3): p. 539-549
    [14] Yang, H., et al., PVSPE: A pyramid vision multitask transformer network for spacecraft pose estimation. Advances in Space Research, 2024. 74(3): p. 1327-1342
    [15] Haarnoja, T., et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. in International conference on machine learning. 2018. Pmlr
    [16] Schulman, J., et al., Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017
    [17] Hovell, K. and S. Ulrich, Deep reinforcement learning for spacecraft proximity operations guidance. Journal of spacecraft and rockets, 2021. 58(2): p. 254-264
    [18] Gao, D., et al. Satellite attitude control with deep reinforcement learning. in 2020 Chinese Automation Congress (CAC). 2020. IEEE
    [19] Zhang, Z., et al., Model‐Free Attitude Control of Spacecraft Based on PID‐Guide TD3 Algorithm. International Journal of Aerospace Engineering, 2020. 2020(1): p. 8874619
    [20] Chiu, S.L., Fuzzy model identification based on cluster estimation. Journal of Intelligent & fuzzy systems, 1994. 2(3): p. 267-278
    [21] Zou, A.-M. and K.D. Kumar, Neural network-based adaptive output feedback formation control for multi-agent systems. Nonlinear dynamics, 2012. 70(2): p. 1283-1296
    [22] Heydari, A. and S. Balakrishnan, Adaptive critic-based solution to an orbital rendezvous problem. Journal of Guidance, Control, and Dynamics, 2014. 37(1): p. 344-350
    [23] Cena, C., M. Martini, and M. Chiaberge, Learning Satellite Attitude Dynamics with Physics-Informed Normalising Flow. arXiv preprint arXiv:2508.07841, 2025
    [24] Zhang, Z., H. Peng, and X. Bai, Imitation Learning for Satellite Attitude Control under Unknown Perturbations. arXiv preprint arXiv:2507.01161, 2025
    [25] Shi, M., B. Wu, and D. Wang, Neural-network-based adaptive quantized attitude takeover control of spacecraft by using cellular satellites. Advances in Space Research, 2022. 70(7): p. 1965-1978
    [26] Liu, S., et al., Real-Time Telemetry-Based Recognition and Prediction of Satellite State Using TS-GCN Network. Electronics, 2023. 12(23): p. 4824
    [27] Guthrie, B., et al., Image-based attitude determination of co-orbiting satellites using deep learning technologies. Aerospace Science and Technology, 2022. 120: p. 107232
    [28] Lotti, A., et al., Deep learning for real-time satellite pose estimation on tensor processing units. Journal of Spacecraft and Rockets, 2023. 60(3): p. 1034-1038
    [29] Varey, J., et al. Physics-Informed Neural Networks for Satellite State Estimation. in 2024 IEEE Aerospace Conference. 2024. IEEE
    [30] Guimarães, M., C. Soares, and C. Manfletti, Predicting the Properties of Resident Space Objects in LEO Using Graph Neural Networks. 2024
    [31] Xie, Y., et al., Hybrid digital twin for satellite temperature field perception and attitude control. Advanced Engineering Informatics, 2024. 60: p. 102405
    [32] Fazlyab, A.R., F. Fani Saberi, and M. Kabganian, Fault-tolerant attitude control of the satellite in the presence of simultaneous actuator and sensor faults. Scientific Reports, 2023. 13(1): p. 20802
    [33] Saraygord Afshari, S., Enhanced Fault Detection in Satellite Attitude Control Systems Using LSTM-Based Deep Learning and Redundant Reaction Wheels. Machines, 2024. 12(12): p. 856
    [34] Yadegari, H., J. Beyramzad, and E. Khanmirza, Magnetorquers-based satellite attitude control using interval type-II fuzzy terminal sliding mode control with time delay estimation. Advances in Space Research, 2022. 69(8): p. 3204-3225
    [35] Yasini, T., J. Roshanian, and A. Taghavipour, Improving the low orbit satellite tracking ability using nonlinear model predictive controller and Genetic Algorithm. Advances in Space Research, 2023. 71(6): p. 2723-2732
    [36] Meoni, G., et al., The OPS-SAT case: A data-centric competition for onboard satellite image classification. Astrodynamics, 2024. 8(4): p. 507-528
    [37] He, C., et al., Digital Twin Technology-Based Networking Solution in Low Earth Orbit Satellite Constellations. Electronics, 2024. 13(7): p. 1260
    [38] Raja, M., et al., Design of satellite attitude control systems using adaptive neural networks. Incas Bulletin, 2020. 12(3): p. 173-182
    [39] Aslam, S., et al., Model predictive control for Takagi–Sugeno fuzzy model-based Spacecraft combined energy and attitude control system. Advances in Space Research, 2023. 71(10): p. 4155-4172
    [40] Bello, Á., et al., Parameterized fuzzy-logic controllers for the attitude control of nanosatellites in low earth orbits. A comparative studio with PID controllers. Expert Systems with Applications, 2021. 174: p. 114679
    [41] Navabi, M., N.S. Hashkavaei, and M. Reyhanoglu, Satellite attitude control using optimal adaptive and fuzzy controllers. Acta astronautica, 2023. 204: p. 434-442
    [42] Faisal, M., D. Reimer, and S. Montenegro, Attitude Control of a Floating Satellite (FloatSat) via Deep Reinforcement Learning. 2025
    [43] Retagne, W., J. Dauer, and G. Waxenegger-Wilfing, Adaptive satellite attitude control for varying masses using deep reinforcement learning. Frontiers in Robotics and AI, 2024. 11: p. 1402846
    [44] Liu, Z., et al., Data-driven prescribed performance control for satellite with large rotational component. Advances in Space Research, 2023. 71(1): p. 744-755
    [45] Kasiri, A. and F. Fani Saberi, Coupled position and attitude control of a servicer spacecraft in rendezvous with an orbiting target. Scientific Reports, 2023. 13(1): p. 4182
    [46] Liu, H., S. Yu, and X. Wang, Mega-constellation satellite maneuver forecast via network with attention mechanism. Advances in Space Research, 2025. 75(6): p. 4942-4962
    [47] Mei, Q., et al., Intelligent hierarchical federated learning system based on semi-asynchronous and scheduled synchronous control strategies in satellite network. Autonomous Intelligent Systems, 2025. 5(1): p. 9
    [48] Mahdiabadi, M., et al., Optimal PID controller parameters tuning for a 3D satellite simulator based on particle swarm optimization algorithm. Journal of Space Science and Technology, 2025. 18(1): p. 53-65
    [49] Zhang, Y., et al., SatFed: A Resource-Efficient LEO-Satellite-Assisted Heterogeneous Federated Learning Framework. Engineering, 2025
    [50] Matthiesen, B., et al., Federated learning in satellite constellations. IEEE Network, 2023. 38(2): p. 232-239
    [51] Ramadan, K., Communication-Efficient Federated Learning for LEO Satellite Networks Integrated with HAPs Using Hybrid NOMA-OFDM. 2024
    [52] Razmi, N., et al., Energy-Aware Federated Learning in Satellite Constellations. arXiv preprint arXiv:2409.14832, 2024
