1 Introduction
The amount of resident space objects (RSOs) and the quantity of conflicts between RSOs are rapidly escalating. The number of space objects larger than 10 cm is presently approaching 21,000, the estimated population of objects between 1 and 10cm is about 500, 000, and for objects smaller than 1cm the number exceeds 100 million
(NASA, ). At the heart of the challenges for Space Situational Awareness (SSA) is to predict each object’s orbit efficiently and accurately. Several painful incidents have occurred in recent decades, such as the February 2009 collision of a U.S. Iridium communications satellite and a Russian Cosmos 2251 communication satellite (Kelso, 2012), and the RED threshold late notice conjunction threat to the International Space Station (ISS) from the “25090 PAMD” debris (Bergin, 2009). The limitation of the current orbit prediction capability is among the main causes for these collision events.Current approaches for orbit prediction are physicsbased, the success of which requires a good knowledge of the state of the space object at the start of the trajectory computation, and the environment information such as earth gravity, atmospheric drag, and solar radiation pressure, as well as the intent information for the maneuvering objects. However, our understanding of the space/time variation of atmospheric density is limited, and our information about the space object is not accurately updated, for example the maneuvering of a spacecraft that is owned by other nations could be unavailable when the orbit prediction is conducted. Moreover, our current surveillance resources are limited and expensive and spacetracking measurements are sparse and noisy. All of these issues lead to the fact that current errors in physicsbased predictions can be too large to be used for meaningful actions.
The machine learning (ML) method presents a different modeling and prediction capability compared to the physicsbased approach. The prediction can be made without explicitly modeling space objects, spacecraft maneuvers, and space environment. Instead, the models are learned based on large amounts of observed data, similar to human cognition in learning from past experience to predict future events. There are different types of machine learning, with supervised learning, reinforcement learning, and unsupervised learning as the three most common types
(AbuMostafa et al., 2012). Reinforcement learning is usually used to make decisions, while unsupervised learning is used to find patterns and structures within the data such as to cluster data into different groups without providing outputs to describe the groups. Supervised learning is such a method that learns a function or mapping from labeled data. The training data are pairs of input and its corresponding output. The supervised ML method is the appropriate approach for improving orbit prediction error based on historical measurements and error information. In the remainder of this paper, for brevity, we will refer to supervised machine learning methods as ML methods.The ML methods have shown great capability for a wide range of applications (AbuMostafa et al., 2012; Mitchell, 1997), including many in the aerospace field (Ampatzis and Izzo, 2009; Hartikainen et al., 2012; Sharma and Cutler, 2015). Hartikainen et al. combine physical models with nonparametric datadriven techniques to build a model for orbit prediction (Hartikainen et al., 2012)
. Their method focuses on the data mining aspect and can extract information of unknown forces from historical data. Sharma and Cutler have presented a learning approach to do orbit determination based on distribution regression and transfer learning methods
(Sharma and Cutler, 2015). Their tests show that the proposed machine learning approach is superior to the conventional methods such as the extended Kalman filter. Moreover, the method is able to estimate significantly varying orbital parameters.
This paper presents a computational framework that improves orbit prediction accuracy through the ML method. A crucial feature of the proposed approach is that the overall framework will introduce and embed a learningcentered strategy into the physicsbased prediction. To take advantage of the knowledge on the physicsbased models that are important and represent the state of the practice, we design the learning process to focus on modeling the prediction errors instead of the full dynamics. In this way, the prediction errors introduced in different steps including measurement, estimation, and prediction, are captured as a surrogate model. An extra benefit from this approach is that the learning is only burdened with finding the incremental corrections to the physicsbased prediction, which reduces the dimensionality for the learning task. The support vector machine (SVM) regression method (Smola and Schölkopf, 2004)
, which is an advanced method proven to be robust to outliers in the dataset, is chosen in this paper, although the proposed methodology is expected to be general and applicable to other ML methods. Furthermore, in contrast with other researches that focus on TLE data
(Legendre et al., 2006; Sang et al., 2014, 2013; Wang et al., 2009), we use a simulationbased space catalog environment to validate the proposed orbit prediction method. This approach is important, because a known “truth” will underlie all simulations, and the solutions from the machine learning can be compared to the “truth” to make definitive statements on the performance.Our simulation results, interestingly and innovatively, demonstrate three types of generalization capability for the proposed ML approach. First, the ML model can be used to improve the RSO’s orbit information that is not available during the learning process but share the same time interval as the training data. Second, the ML model can be used to improve predictions of the same RSO at future epochs. And third, the ML model based on one RSO can be applied to other RSOs that share some common features. Our results also provide insight upon several questions that are important for the SSA applications including: more observation stations could generate more historical data and thus better performance can be expected from the ML model; larger size of the training data can lead to better performance but there appears to be a limit; and the generalization capability to nearby RSOs implies that it is possible to build an ML system for a large number of RSOs while only representative RSOs are used for training.
The paper outline is as follows. The highfidelity simulation environment is first described in Section 2. Next, the framework of the machine learning approach to improve orbit prediction accuracy is presented in Section 3. In Section 4, the numerical simulation results are analyzed, where a performance metric is proposed to evaluate the ML approach. In Section 5, the generalization problem to future epochs is proposed and evaluated for SSA application. Then in Section 6, the generalization problem to different RSOs is examined by varying the inclination and the semimajor axis of the simulated RSO. Finally, conclusions and future studies are presented in the last section.
2 Simulation Environment
The framework of the orbit predictions through supervised machine learning is shown in Fig. 1. The top four blocks show the conventional orbit prediction process, while the bottom three blocks represent the proposed ML approach to enhance the conventional orbit prediction. In the following subsections, the top four blocks will be explained, and the bottom three blocks will be presented in Section 3.
2.1 Truth Dynamic Models
The truth dynamic models include

Basic Newtonian twobody gravitational force, with an accurate gravitational constant.

Highfidelity nonspherical gravity model of the Earth.

Thirdbody perturbations.

Air drag force model with highfidelity atmosphere model.

Solar radiation pressure model with highfidelity solar activity model.

Other parameters depending on the specific RSO, including the mass, inertia, geometry, material properties, etc.
The choice of truth dynamic model for the simulation depends on the time scale. In this paper, the truth dynamic model is expected to include the major factors that could contribute to the orbit prediction error within at least 60 days. Therefore, forces that have longperiod effects do not need to be modeled with high precision. The setup of the truth model is summarized in Table 1. The Newtonian gravitational force is added with an Earth gravitational constant of . The nonspherical effect of the Earth gravity is modeled using spherical harmonic functions, with coefficients provided by the EIGEN6S model (Förste et al., 2012), truncated with degree/order as the truth gravity. Thirdbody perturbations of all major solar system bodies are considered, including the Sun, all the planets, the Pluto, and the Moon. The position of these bodies are provided by DE430 data file from the JPL (Folkner et al., 2014). The DTM2000 model is used to approximate the atmosphere. The Marshall Solar Activity Future Estimate Solar (MSAFE) data from NASA is used to provide solar activity information, which has significant effect on the density and the speed of the atmosphere. The solar radiation pressure is calculated with the reference value as at 1 AU (149,597,870.0 km) from the Sun. And the effect of the penumbra and eclipse are considered. Additionally, a spherical RSO with a constant section area is assumed, and the drag coefficient and singleparameter reflection coefficient are assumed to be constant. These models are implemented using the Orekit, which is a low level space dynamics library written in Java (Maisonobe et al., May 3–6, 2010).
Parameters  Truth model  Assumed model 

Earth Shape  WGS84  WGS84 
Harmonics Gravity Field  
ThirdBody Purterbation  Sun + Solar Planets  Sun + Jupiter 
+ Pluto + the Moon  + the Moon  
Atomosphere Model  DTM2000  DTM2000 
Solar Activity Model  MSAFE  MSAFE 
2.2 Measurement Models
Measurements of a RSO can be obtained using groundbased radar or optical stations (Hill et al., 2012), or spacebased systems using specially designed satellite (Lyon, 2004). In this study, only ground radar stations are considered because the paper focuses on the general machine learning framework and the target RSO is assumed to be an LEO object. In the simulations, the radar station is located at a topocentric frame centered at a given geodetic point location defined on the WGS84 Earth ellipsoid. At each step of orbit propagation, the visibility of the RSO with respect to the ground stations is checked. If the RSO is visible to a station, the station will provide measurements including the azimuth , the elevation , and the range . These measurements are generated by converting the state vector of the RSO in the Earth Centered Inertial (ECI) frame (Vallado, 1997) to the topocentric frame and then calculating the angle information. A series of consecutive measurements are organized as a track. If more than one station detect the RSO at a particular time, all the stations will generate measurements which are combined into one track. Therefore, one track will start when it is visible to any station, and end when no stations can detect it.
Station  Eglin, FL  Clear, AK  Kaena Point, HI 

Latitude [deg]  30.57  64.29  21.57 
Longitude [deg]  86.21  149.19  158.27 
Altitude [m]  34.7  213.3  300.2 
Maximum Range [m]  13210  4910  6380 
Feasible Elevation [deg]  1–90  1–90  0–90 
[m]  32.1  62.5  92.5 
[deg]  0.0154  0.0791  0.0224 
[deg]  0.0147  0.024  0.0139 
In the simulations, three radar stations are simulated, with their parameters listed in Table 2
. The RSO is visible to a station only if the range is less than the maximum range and the elevation is within the feasible elevation range. Measurements are given at a 20second interval. The measurement errors are simulated as normal distributions with zero biases and standard deviations of
, , and for the azimuth, elevation and range respectively, as listed in Table 2. We remark that the constraint of the azimuth range of the station (Hill et al., 2012) has been omitted in the current study, so the error model does not depends on the azimuth.2.3 Estimation Models
Figure 2 illustrates the estimation strategy. The black curve represents the true orbit of a RSO. The colored (light orange and blue) boxes along the curve represent observation tracks. The estimation is carried out at the beginning epoch of each track, as represented by the red dots in the figure. For the th track in the figure, whose beginning epoch is , measurements on current track and all previous tracks with are considered, where hours is the chosen estimation window. This estimation window is shown by the gray rectangle underlying the true orbit in Fig. 2. Orange rectangles show all tracks in the estimation window that will be used to produce the estimation at , while blue rectangles show tracks outside the estimation window. We remark that this strategy is adopted to simulate the realistic process of the orbital determination, where single observation track may not have enough measurements.
The batch Least Square (LS) estimator (Crassidis and Junkins, 2011) is chosen to estimate the state of the RSO. In practice, the LS estimator is only suboptimal due to the nonlinearity of the model. However, the focus of this paper is the framework of the proposed ML approach and our hope is that it can work with most of the specific estimation methods. Additionally, we note that although the batch least square estimator can also generate the covariance matrix, which is important for collision risk assessment, only the orbital state is used in our simulation. This is due to the fact that the covariance information is not available for the twoline element (TLE) catalog (Vallado et al., 2006).
Starting from the block of estimation models in Fig. 1, simulations are carried out using an assumed dynamic model. In practice, the assumed dynamic model used during estimation and propagation is often only an approximation to the real physics.
In the simulations, the assumed model is set up with the spherical harmonic gravity model of degree/order , and with the thirdbody perturbations including only three major resources, the Sun, the Moon and the Jupiter, as summarized in Table 1. The orbit state and the drag coefficient of the RSO are estimated by the LS estimator, which generates the pairs of estimations denoted as . Other parameters of the RSO are fixed, including the cross section area, the mass, and the reflection coefficient of the surface. Finally all pairs of estimations are recorded for the orbit predictions. We remark that if the differences between the true model and the assumed model are smaller, the error of the estimation will be smaller, and also the deviations of propagating two estimates to the same epoch will be smaller. As a consequence, compared with other assumed models with larger differences, less information is embedded in the simulated catalog using the current assumed model. Using such conservative assumptions, the results in Sections 6, 5 and 4 show that the ML approach improves the orbit prediction accuracy. Therefore, we demonstrate the capability of the ML approach in a more significant way.
2.4 Orbit Prediction
After obtaining estimations for all the tracks, the prediction process is straightforward. Using the assumed dynamic model, which is the same as the one that has been used for estimation, as show in Fig. 1, each estimated state is propagated to a future epoch with , when another observation track begins. At the future epoch , the predicted state based on is compared with the true state , which provides the prediction errors. Since the assumed dynamic model, the measurement process, and the estimation process will all introduce errors, the resulting prediction error can grow quickly to a meaningless magnitude as the propagation time increases. Therefore, we use a maximum prediction duration days to restrict the orbit prediction process in this paper. This duration is also long enough for the surveillance and collision avoidance for the LEO objects.
3 Prediction Accuracy Improvement through Machine Learning
In this section, the proposed ML approach to improve orbit prediction accuracy is presented. We first introduce the SVM model, which is the adopted machine learning method in this paper. We then present the machine learning model of the orbit prediction error. And we discuss the dataset chosen for the SVM model to capture the orbit prediction error in the last subsection.
3.1 Support Vector Machine
The support vector machine (SVM) method is a machine learning algorithm that can be used for both classification and regression problems. One strength of the SVM method is that they are nonparametric techniques, so we do not need to specify the basis functions in prior. The SVM regression can handle nonlinear problems since it relies on kernel functions. Moreover, the SVM method has universal approximation capability with various kernels including the Gaussian kernel (Hammer and Gersmann, 2003; Micchelli et al., 2006). We note that although it cannot be verified that the orbit prediction problem under this study satisfies the fundamental assumptions of SVM’s universal approximation property, the result in this paper demonstrate that the SVM method is practically effective to reduce orbit prediction errors. The concept of the SVM is briefly reviewed below, and the details are referred to the references.
The SVM regression method aims to find a function that has at most deviation from the actual obtained targets for all the training data, and at the same time is required to be as flat as possible (Smola and Schölkopf, 2004). Suppose we have a set of training data , where denotes the space of input learning variables. In the linear case, the desired function takes the form
(1) 
where , , and represents the inner product operation. Then the training problem is to find the flattest function in the space , where the flatness of the function is represented by . The training problem is cast as a convex optimization problem to minimize the cost function , subject to the constraint for all the training data. By introducing slack variables and then solving the dual problem of the above quadratic problem, the support vector expansion of the regression function can be expresses as
(2) 
where and are the dual variables solved from the dual problem (Smola and Schölkopf, 2004). Comparing Eqs. 2 and 1, we can see that can be completely described as a linear combination of the training data . This property of the SVM makes it possible to deal with nonlinear regressions via kernels. Substituting the inner product in Eq. 2 by the kernel , the optimization problem is reformulated to find a flattest function in the feature space indicated by the kernel, expressed as
(3) 
3.2 Machine Learning Model of the Prediction Error
The concept of the ML approach to modify the prediction is illustrated in Fig. 3. After the ground station observes a RSO, the estimation of the RSO is conducted. The estimated state will have errors, and the following orbit prediction will further deviate from the true orbit because of the assumed dynamic models. We introduce the ML approach to directly modify this prediction, such that the MLmodified prediction will be closer to the true state. Importantly, as shown in Fig. 3, this modification does not address the state around the estimation at the current epoch, while in contrast, it directly improves the prediction at the future epoch.
3.3 Dataset Capturing the Orbit Prediction Error Model
The structure of the training data should be carefully designed to capture the prediction errors. We note that although the dataset designed in this paper is not the unique solution, results in the paper show that the ML approach can indeed improve the orbit prediction accuracy based on the proposed design of the dataset.
Figure 4 illustrates the construction of the dataset for the ML approach in this study. The horizontal axis represents the time . The dark solid curve represents the true orbit, and the dashed curves represent the predicted orbit arcs propagated based on the assumed dynamic models. The true state of the RSO at any specified epoch is denoted as , which can be expressed in different coordinate frames and forms. In the classical orbital elements (COE) form, the state is expressed as . In the Cartesian coordinates of the Earthcentered inertial (ECI) frame, the state is expressed as . At a specific time , we have the true state , estimated state , predicted state , and MLmodified state , where the second time parameter in the parenthesis indicates the prediction is based on the previous estimation at an earlier epoch .
In Fig. 4, the lighter gray box on the left hand side shows the learning process of the ML method, during which the training data are collected to train the SVM model. As annotated in the figure, at the epoch , there are three states about the RSO, the true state , the estimated state , and the predicted state which is based on the previous estimation . We have three types of error (as shown in Fig. 4):

True prediction error, .

True estimation error, .

Relative error of prediction, .
These three errors are related by the equation
(4) 
Notice that the relative prediction error is identical to the definition of the consistency error between two estimations in the work of Rivera and Bai (2016). These orbit prediction errors often represented in the RSW coordinate frame, where axis (radial) is the radial direction, the axis (alongtrack) is perpendicular to the axis in the orbital plane and points to the inertial velocity direction, and the axis (crosstrack) is along the angular momentum direction (Vallado, 1997).
Next, we construct the dataset for the ML model. Apart from the three types of errors, there are other variables available at , such as the measurement of the current track, which could also contain information about the orbit prediction error. Take the epoch as an instance, and assume the previous estimation is obtained at the epoch , with . Then, the learning variables of a particular data point of the dataset consist of the following:

The relative prediction error based on the previous estimation , expressed in both the form of COE as , and the form of RSW frame as . So we will have as a representation of , with redundancies. It is expected that carries the information of the model error between the assumed model and the truth model.

The prediction duration . This information is an important factor, as the longer the prediction, the larger the prediction error.

Current estimation of the orbit state, expressed in the COE form as . This information can reflect physical information of the orbit, such as the altitude and the shape. The atmosphere drag force and the solar radiation force also depend on these information.

The estimated drag coefficient at the previous estimation, and its error with respect to the current estimation . This information is expected to be important for RSOs in LEO.

Maximum elevation at the current track starting with , and the corresponding azimuth and range . This information is usually located around the middle of the track. They can reflect the difficulty and accuracy of the measurements along the track, because when the RSO has a larger range and smaller elevation, the sensitivity of the estimated state to the measurement will be smaller.
We note that it is not necessary to distinguish the estimation error and the propagation error for the proposed ML approach. The target variable for the above data point is chosen as:

True prediction error at a future epoch (), expressed in the RSW frame as .
The illustrated and in Fig. 4 are examples of the target variable, based on the same estimation but with different prediction durations of and respectively. With being a 6dimensional vector, six SVM models in total will be trained for all the components.
With these pairs of learning and target variables, the ML model is designed to directly modify a future orbit prediction. As shown by the prediction process in Fig. 4 (the darker gray block at the right side), the predicted state is adjusted by the additional MLmodification, which is the output of the ML model. If the learning and target variables are related, and if their relationship is at least partially contained in the designed dataset, the ML model should capture the underlying relationship and the MLmodification is expected to bring the prediction closer to the true orbit.
Collect all learning variables as . One data point of the dataset is expressed as , where corresponds to different state components of the target true error , and , () indicate the different epochs of the learning variables and the target variable respectively. Then finally, the whole dataset is constructed as,
where the th row shows prediction data from to all the following epochs represented by the th columns.
We choose these learning variables and target variables based on the rationale that the true state and the true models are usually not available, so any variables that explicitly depend on the true orbit information shall be not used, such as the true error . However, the information of the true error can be implicitly contained in the relative error to some degrees. In this way, the proposed ML approach actually also connects the predicted error with the true prediction error at a future epoch, with the assistance of other learning variables.
4 Numerical Results of Improving Orbit Prediction
In this section, the framework established in the previous section is applied to a simulated RSO. The ML approach to improve orbit prediction is demonstrated by training an SVM model first and then using it to improve the orbit prediction accuracy. We will also discuss the effect of the size of the training data at the last subsection.
4.1 Building the SVM model for the Simulated RSO
Parameters for the simulated RSO are summarized in Table 3. The initial orbital parameters are given in the ECI frame, which is directly converted from a TLE of the International Space Station (ISS). However, we note here that we study a generic RSO with areatomass ratio and reflection coefficient different from the ISS.
Parameter  Value 

UTC epoch  2016/09/28 12:09:45 
Semimajor axis [km]  6783.34 
Eccentricity  0.006793 
Inclination [deg]  51.6393 
Argument of perigee [deg]  14.5438 
RAAN [deg]  262.6471 
Mean anomaly [deg]  345.5909 
Areatomass ratio  0.05 
Drag coefficient  2.2 
Reflection coefficient  1.25 
The RSO is propagated for 30 days using the truth dynamic model. The numerical integrator is chosen to be the DormandPrince 8(5,3) method for all the propagation in this paper. The absolute tolerance on each position component in the ECI frame is set as 0.1 m, and the propagation step size is limited between 0.001 and 90 seconds.
Figure 5 demonstrates the orbit of the simulated RSO for 30 days. The orbit is colored by red at the beginning, and is seen to gradually change to blue at the end. This is expected as that the orbit processes within the simulated time interval, due to the perturbation forces. In Fig. 5, the green dots along the orbit show the locations where measurements are generated by the ground stations. The revolution of the instantaneous COE is shown in Fig. 6, alongside with zoomin plots in days to clearly show their variations. Apart from the clear precession motion, the orbit also shows shortterm and secular changes on the semimajor axis, eccentricity and inclination.
After obtaining the dataset following the description in the previous section, we perform data cleaning before starting to train the SVM model. The guidelines for the data cleaning usually depend on the specific dataset, the target problem, and also the experience of the modeler. Here, two types of data cleanings are performed. First, we remove the cases where the estimations of the drag coefficient lie outside of the interval . Because the atmosphere drag is sensitive to , a largely biased can cause significant errors. In practice, can be compared with the historical data, and those with large biases can be detected. Second, we use the Tukey’s test (Tukey, 1977)
to remove outliers in the dataset, which means data points with too large errors are eliminated. Specifically, denoting the quartiles of the alongtrack error
as , , and , the errors out of the interval are discarded as outliers.In this paper, the MATLAB (R2017b) implementation of SVM regression function fitrsvm is used (MathWorks, 2017
). The kernel function is set to the Gaussian kernel to deal with the nonlinearity of the prediction error. Some default settings of the MATLAB implementation are used: the scaling parameter for the Gaussian kernel is selected by a heuristic procedure with a subsampling process; the
margin is selected as where indicates the interquartile; the box constraint parameter is selected as ; the optimization solver for the SVM training is the Sequential Minimal Optimization (SMO) method. The learning variables will be standardized using the mean and standard deviation of each variable. The KarushKuhnTucker (KKT) violation is checked at every iteration to determine the convergence when solving the optimization problem, where the convergence tolerance is . Based on the collected dataset, six SVM models in total can be trained to capture the underlining orbit prediction errors for the six components of the true error .Additionally, in order to reproduce the training results, the seed of the random stream is set to 42 before the training. We remark that our experiments reveal that the results in the following discussions do not depend on the particular random seed. A different seed will slightly vary the specific metric but will not change the general pattern.
4.2 Validation of the Trained SVM Model
First, we consider a simpler case, where only the first ground station in Table 2 is considered. The measurements and estimation errors of a single station are more likely to be coherent, thus the underlining prediction errors are more likely to follow a deterministic pattern, which can be learned by the SVM. This case reflects the situation where a station only has access to its own historical data.
Figure 7 shows the true prediction error of the whole dataset, where the RSO has been propagated for 30 days and the maximum prediction duration for each estimated state is set is 7 days. The horizontal axis represents the prediction duration . The vertical axis represents the position components of in Fig. 7(a) and the velocity components in Fig. 7(b). In Fig. 7(a), the position errors are shown in the logarithm coordinate, since the alongtrack position error is significantly larger than the other two components. In Fig. 7(b), although is slightly more spread out than the other two components, all the errors are small and there is no dominant component. The most important task is to reduce the orbit prediction error of the alongtrack position error . Therefore, in the following study, we will only use as the study object and demonstrate the performance of reducing through the ML approach.
80% of the dataset is randomly chosen to train the SVM model while the remaining 20% is used to evaluate the performance of the learned model. For evaluation, the learning variable of the testing data is the input to the SVM model, and the output is the MLpredicted error, denoted as . The MLmodified prediction is achieved by subtracting from the prediction . So the residual orbit prediction error after the MLmodification is
(5) 
which shall be zero if the SVM model completely captures the underlying errors.
We use the concept of the volumeweighted symmetric mean absolute percentage error (volumeweighted MAPE) from statistics as the metric to evaluate the prediction accuracy between the reference value and the output value of a prediction model. The performance metric used to quantify the trained SVM model is defined as the ratio between the summation of the absolute errors between and on each data point in the testing data, and the summation of the original absolute true error of all data in the testing data, which can be mathematically expressed as,
(6) 
where is the size of the testing data. The metric reaches its lower boundary zero when the MLpredicted error is identical to the true error, but has no upper boundary. Notice that the numerator in Eq. 6 is actually the residual error after MLmodification, as defined in Eq. 5. Therefore, we have
(7) 
which indicates that the metric actually measures the percentage of the residual error with respect to the true error of the testing data. The lower this percentage is, the more errors are corrected, thus the better the performance of the trained SVM model is.
Notice that this chosen metric is consistent with the objective of the SVM model, which is to include all data point with a surface as smooth as possible through minimizing the total error within the training data. In the following analysis, this metric will be used to quantify the performance of the trained SVM model.
Figure 8 demonstrates the results of the SVM model on the training data. The horizontal axis represents the prediction duration . The vertical axis shows true, MLpredicted and residual alongtrack errors respectively, as annotated by the legend in the figure. In both Figs 8(b) and 8(a), the left figures directly show the scatter plot, where the black circle represents the true error , the green dot represents the MLpredicted error , and the red dot represents the residual error after MLmodifications. The right figures show the errorbar plot of the left scatter plot, where the center point represents the mean value of each clustered group of the error, and the length from the middle of the bar to the top (or the bottom) represents the standard deviation of the corresponding clustered group. For clarity, the three curves are slightly displaced along the horizontal axis to avoid overlapping. (Similar figures will appear in the remaining part of the paper, and we will only explain the difference there.)
In Fig. 8(a), the absolute values of the errors are demonstrated. The MLpredicted error is scattering within the same area of the true error , and the errorbar curve is very close to each other. The performance metric is 18.3%, indicating that the trained SVM model has captured most features of the underlying error model. The residual error is also very small as revealed by both the scatter plot and the errorbar plot. Therefore, we can conclude that the trained SVM model captures the orbit prediction error in the training dataset very well. In Fig. 8(b), learning results are demonstrated with the real value of the errors. As shown in both plots, the error is distributed around zero almost evenly. In this case, the orbit prediction error cannot be compensated for if we only perform a modification based on fitting the mean error with respect to the prediction duration , which was reported in the reference (Rivera and Bai, 2016). In contrast, the trained SVM model successfully compensates for the errors and reduces the standard deviation.
A common concern in the machine learning applications is the performance of the trained model on a different set of data that is not included in the training data. We test this generalization capability here, where the remaining testing data is used as the input to the trained SVM model, and the results are shown in Fig. 9. The metric is 22.6% now, which is larger than that of the training data. This is reasonable because the testing data can contain information beyond the training data. The errorbar plots in both subfigures show that the mean value and the standard deviation have been greatly reduced, even though the testing data is unknown to the trained SVM model.
In the above discussions, only one station is considered. A clear drawback of this situation is that a RSO is possible to be below the horizon of the station for a long time, as revealed by the gaps in Figs 8 and 9. Next, all three ground stations in Table 2 are included in the simulations. The obtained dataset is also randomly (with fixed seed to allow reproduction) divided into training (80%) and testing (20%) data. The result of the new SVM model on the testing data is shown in Fig. 10. We observe two clear differences between the new results and the results using only one station:

The data in the scatter plot is denser along the horizontal axis, due to the fact that more tracks are available when three stations are used. The results have been clustered into more groups in the error bar plots to show more details in the figure.

The maximum of true prediction errors (black circles) are smaller, due to the fact that the estimations become more accurate when more measurements are available.
The performance has been greatly improved when the three stations are used. The metric is 9.6% now, improved by more than a half compared with that of the single station situation, where is 22.6%. The mean errors have been reduced to almost zero, and the standard deviations have also been greatly reduced.
As revealed in the above analysis, with more available ground stations, the trained SVM model can capture the underlying orbit prediction error better, and the generalization performance is also significantly improved. In the following studies, we will only present results when three ground stations are used.
4.3 Different Partition of Dataset
An often accepted concept for the machine learning method is that the more training data used, the better performance the machine learning model has. In this subsection, we study this effect by investigating different partition of the dataset on the performance of the trained SVM model. Figure 11 shows the result of the trained SVM model, using 40% or 90% of the dataset as training data respectively. (Note in the follow analysis, for clarity, only the absolute value of the result are demonstrated.) The performance is clearly improved as shown by the thiner and steadier errorbar curves of the residual error . Also, the metric drops from 14.8% to 9.7%.
The variation of the metric with respect to the size of the training data is further investigated by varying the percentage of the training data from 10% to 90%. The results are shown in Fig. 12. Interestingly, we notice that the metric drops quickly at the beginning, but tends to a constant once the size of the training data exceeds 80%. This implies that the performance of the SVM model will not be further improved once sufficient amount of training data has been used. Additionally, the results validate that our choice of 80% dataset in the previous subsections is appropriate to achieve quality performance.
5 Generalize the SVM Model to Future Epochs
In the previous section, the generalization capability of the trained SVM model is evaluated similarly as the classical machine learning studies where the ML model performance is tested using testing data different from the training data, without distinguishing in which way the testing data is different from the training data. In improving the orbit prediction of RSOs, we are more interested in testing the SVM model for future epochs. Therefore, in this section, we introduce a constraint on the testing data, requiring them to be predictions at future epochs with respect to the training data.
5.1 Results on Simulated RSO
The simulation of the RSO is extended to 50 days, in order to provide testing data beyond the training data within the first 0–30 days. The data in the first 30 days is used to train the SVM model, the data in the 30–40 days (future 10 days) is collected as the testing data of the trained SVM model, other data are used to analyze the dependency of the performance on the amount of the testing data. Again, as the alongtrack error is the largest component in the RSW frame, we will only demonstrate the performance of the trained SVM model on .
Figure 13 shows the results using training data in the first 0–30 days to modify predictions in the following 10 days (30–40 days). In the plot of the absolute value, the mean error is reduced from around 50 km to less than 20 km at days. The standard deviation is reduced by more than a half in most cases. The plot of the real value implies that the error is distributed almost evenly along both positive and negative directions, and the mean value is reduced to almost zero with less fluctuations. The metric , the percentage of the residual error, is reduced to 35.5% of the true error. Compared with the results in the last section, in Fig. 10, the performance is worse as the metric becomes larger, which is expected since generalizing the trained SVM model to the future time should be harder than within the same time interval. Fundamentally, the orbit of the studied RSO has changed, as illustrated in Figs 6 and 5.
We note that the maximum prediction duration in Fig. 13 is 7 days, and the predictions in the testing data are made at epochs in the following 10 days. For example, a prediction made at days can base on the estimation given at days at the earliest (with days), and in this situation both the prediction and its based estimation are not included in the training data. So there is no contradiction between and time length of the testing data.
We show other two cases, with the testing data in future 5 days and 15 days respectively, in Fig. 14. The top one is for the testing data in 30–35 days, and the bottom one is for 30–45 days. Comparing Figs 14(b) and 14(a), we observe that the percentage of residual error after MLmodification increases as furtherintofuture testing data is used.
So far, we have demonstrated that the average performance evaluated by can be improved by the proposed ML approach. In the practice, it is also possible that the individual performance of the trained SVM model on one data point at a future epoch is concerned. The ideal case is that all the residual errors after the MLmodification are smaller than the true prediction errors. To evaluate the individual performance of the trained SVM model, we introduce an individual metric , defined with a given as
(8) 
The lower bound of is 0 when the orbit prediction error is perfectly compensated, while the upper bound of is positive infinity.
The performance of the trained SVM model on the simulated RSO, using the same data as in Fig. 13, is demonstrated in Fig. 15, with only the true error (black circles) and the residual error (colorful circles) are shown. The individual metric are calculated at each orbit prediction result, and then used to color the residual errors in the figures. Figure 15(a) shows all the data points whose errors are reduced by the trained SVM model, and Fig. 15(b) shows all the data points on which the MLmodification fails, i.e., the residual errors are increased compared with the true errors, with all the larger than 500% is colored by yellow. Comparing the two figures reveals that the trained SVM model works well on the majority of the testing data point. In Fig. 15(b), both the true error and the residual error are relatively small compared with those in Fig. 15(a), and most failures of the ML approach occur when days. We expect that is due to the fact the orbit prediction is accurate enough and below the modification capability of the trained SVM model. However, further studies are required to draw concrete conclusions.
Figure 16 demonstrates the distribution of the individual metric . The vertical axis is the percentage of the data point in the whole training data, whose is less than the corresponding represented by the horizontal axis. For example, the intersection of the curve and the dashed line indicates that for about 80% percent of the testing data, the percentage of the residual error is less than 100%, meaning that their prediction errors have been successfully reduced by the trained SVM model. So, generally, the ML approach can improve the orbit prediction accuracy at the majority of the future epochs. Since the results of the individual metric on different RSOs are similar, for brevity, the discussions on other RSOs will be omitted in the following paper.
5.2 Results on Other RSOs
To further demonstrate the capability of the trained ML model, six other RSOs with different orbits are studied. These RSOs are chosen from the International Laser Ranging Service (ILRS) (Pearlman et al., 2002), and the parameters are extracted from the TLE catalog, as summarized in Table 4. The orbit altitudes increase from left to right in Table 4, and the LAGEOS1 has the highest altitude of about 5860 km. The simulated RSOs are based on the twoline element (TLE) data of the real satellites, and are assumed with the same areatomass ratio of 0.05 m^{2}/kg, drag coefficient of 2.2, and reflection coefficient of 1.25, which are identical to those in Table 3. For simplicity, we will refer to these RSOs by the name of their base RSOs. However, we note that these simulated RSOs do not reflect true orbit information of the base RSOs, because the parameters in Table 4 are only used as the initial conditions for simulations in this paper.
Base RSO  SWARMA  LARETS  STELLA  HAIYANG 2A  LARES  LAGEOS1 

NORAD catalog number  39452  27944  22824  37781  38077  8820 
Orbit altitude [km]  460  691  804–812  971  1450  5860 
Semimajor axis [km]  6820.0  7060.2  7167.5  7340.6  7813.9  12268.9 
Eccentricity  0.00144  0.0002  0.0010  0.0012  0.0001  0.0045 
Inclination [deg]  87.36  98.204  98.6  99.35  69.5  109.84 
Argument of perigee [deg]  91.51  81.13  284.18  65.83  133.44  278.75 
RAAN [deg]  151.01  355.07  177.33  230.41  58.02  151.20 
Mean anomaly [deg]  297.75  59.94  176.41  169.42  151.57  313.31 
The results of generalization capability to future epochs of these six new RSOs are shown in Fig. 17. For all the RSOs, both the mean values and the standard deviations have been greatly reduced by the trained SVM models. The performance of the ML approach varies with the RSO’s orbit, and further studies are necessary to reveal the detailed relationship.
6 Generalization to Nearby RSOs
In this section, we investigate another generalization capability that is of potential interest for the SSA applications. In practice, we may only have accurate historical data about a limited number of RSOs. And the question is whether an error model learned from such RSOs can be applied to improve orbit prediction accuracy of other RSOs, for which not enough data is available to build the ML model.
6.1 Varying Orbit Inclination
Here we vary the orbit of the training RSO (based on the ISS in Table 3) to evaluate the generalization capability of the trained SVM model to nearby RSOs. The inclination of the learned RSO is . We have changed the inclination to , , and respectively, but have kept all the other orbital elements unchanged. All the new RSOs are propagated from the same starting epoch, and then are measured, estimated, and predicted within 0–50 days. The training data is the error data collected in the first 30 days of the training RSO. And then the trained model is used to modify the orbit prediction of the other RSOs in the following 10 and 20 days. In other words, the testing data of the trained SVM model is the error collected in the 30–40 days and 30–50 days of the other RSOs.
The results of modifying the prediction error of the new RSO with in following 10 days are shown in Fig. 18, where the testing data is the error data in 30–40 days. The mean error and the standard deviation have been greatly reduced. Compared with the performance on the training RSO, shown in Fig. 13, the performance metric has increased from 35.5% to 40.2%, which indicates the improvement still exists but the performance becomes worse.
The results for all the five RSOs are summarized in Fig. 19. The performance metric on the training RSO (with ) is the smallest. Compared with the training RSO, on other new RSOs are larger. But even in the worse case, is still less than 50%, which means more than half of the prediction errors can be corrected by the trained SVM model. Furthermore, the results show that the larger the deviation of the inclination from the RSO that has been learned, the worst the performance. Again, such results are intuitive, but nice to be seen from the machine learning results.
6.2 Varying Orbit Altitude
Another important parameter for RSOs in LEO is the orbit altitude. Here, the semimajor axis of the simulated RSO based on ISS is increased. The orbit prediction results in day 30–40 are used as the testing data, and the SVM model is still trained using the training data of the original RSO in day 0–30.
The results of the trained SVM on a new RSO with km are shown in Fig. 20. The semimajor axis of the new RSO is increased by 50 km, and the metric is 86.0%, which means the orbit prediction errors are only reduced by about 14.0%. The performance of the trained SVM model is not as good as that in Fig. 13, but it is expected because the simulated RSO is in LEO and an increment of 50 km on the semimajor axis can introduce a lot of differences.
The performance metric of RSOs with other increments are demonstrated in Fig. 21. We remark that a decrease of the semimajor axis of the original RSO will result in a reentry in less than 40 days, so only the increment of semimajor axis is considered. It is observed that is decreasing at the beginning. This could be caused by many factors, such as the randomness of the dataset, and the possible variation of the atmosphere density in this region. Meanwhile, the
increases quickly as the increment of semimajor axis becomes larger than 30 km. There are also some fluctuation on the performance when the increment is larger than 60 km. However, for these cases the metric are all larger than 100%, meaning that the ML approach does not improve the orbit prediction accuracy. For the moment, we can only conclude that the trained SVM model can be generalized to other RSOs with different semimajor axes, but it will fail when the variation is too large.
The results in this section are interesting as there is no information directly related to the new RSOs in the training dataset, but the ML approach could still be used to improve the orbit prediction accuracy. Some knowledge learned for the data depends on RSO’s properties such as the orbit altitude and inclination. But some other knowledge can be universal to other RSOs, such as the errors of the assumed model, measurement, and estimator, which depend on the environment and the catalog system. Our conjecture is that the proposed ML approach does not distinguish these differences, so the trained SVM model captures not only the information about the specific RSO that has been used for training, but also some information of the system.
7 Conclusions
In this paper, a new approach of using the supervised machine learning (ML) method to improve the orbit prediction accuracy of the resident space objects (RSOs) is proposed. The framework of incorporating an ML model in the orbit prediction process is formulated and the structure of the dataset is designed based on analyzing the prediction errors and also other possibly available parameters.
A simulationbased space catalog environment is developed to evaluate the performance of the proposed approach, and the support vector machine (SVM) model is chosen as the machine learning algorithm in the paper. RSOs in low Earth orbit are simulated to examine the performance of the SVM model. The results demonstrate that the trained SVM model can: 1) improve the information about the same RSO that shares the same time duration as the training data but are not available during training; 2) improve the prediction accuracy of the same RSO at future epochs; and 3) improve the prediction accuracy of other nearby RSOs unknown to the trained SVM model. It is shown that the trained SVM can reduce more than a half of the prediction error for all the three types of improvement. We note that this is an exploration study and further research is required before practical applications of the ML approach.
Further research is suggested to apply the established framework to real operational data, where the RSOs with large amount of measurement data can be used as the training objects and the trained SVM model can be used to improve orbit accuracy of the other RSOs that share some similar orbit characteristics.
Acknowledgment
The authors would acknowledge the research support from the Air Force Office of Scientific Research (AFOSR) FA95501610184 and the Office of Naval Research (ONR) N000141612729. Largescale simulations have been carried out on the School of Engineering (SOE) HPC cluster at Rutgers University. We also appreciate anonymous reviewers for insightful discussions and comments on the paper.
References
References
 AbuMostafa et al. (2012) AbuMostafa, Y.S., MagdonIsmail, M., Lin, H.T.. Learning from Data. AMLBook, 2012.

Ampatzis and Izzo (2009)
Ampatzis, C., Izzo, D..
Machine learning techniques for approximation of
objective functions in trajectory optimisation.
Proceedings of the International Joint Conference on Artificial Intelligence 2009 Workshop on Artificial Intelligence in Space 2009;2009(September).
 Bergin (2009) Bergin, C.. RED threshold late notice conjunction threat misses ISS. https://www.nasaspaceflight.com/2009/03/threattoisscrewsoyuz; 2009. Accessed on 20170205.
 Crassidis and Junkins (2011) Crassidis, J.L., Junkins, J.L.. Optimal Estimation of Dynamic Systems. Second edi ed. Chapman and Hall/CRC Applied Mathematics & Nonlinear Science. CRC Press, 2011.
 Folkner et al. (2014) Folkner, W.M., Williams, J.G., Boggs, D.H., Park, R.S., Kuchynka, P.. The Planetary and Lunar Ephemerides DE430 and DE431. JPL Interplanetary Network Progress Report 42196; JPL; 2014.
 Förste et al. (2012) Förste, C., Bruinsma, S.L., Shako, R., Abrikosov, O., Flechtner, F., Marty, J.C., Lemoine, J.M., Dahle, C., Neumeyer, H., Barthelmes, F., Biancale, R., Balmino, G., König, R.. A new release of EIGEN6: The latest combined global gravity field model including LAGEOS, GRACE and GOCE data from the collaboration of GFZ Potsdam and GRGS Toulouse. In: American Geophysical Union, General Assembly 2012. Vienna, Austria; volume 14; 2012. p. 2821.
 Hammer and Gersmann (2003) Hammer, B., Gersmann, K.. A Note on the Universal Approximation Capability of Support Vector Machines. Neural Processing Letters 2003;17(1):43–53. doi:10.1023/A:1022936519097.
 Hartikainen et al. (2012) Hartikainen, J., Seppänen, M., Särkkä, S.. StateSpace Inference for NonLinear Latent Force Models with Application to Satellite Orbit Prediction. Proceedings of the 29th International Conference on Machine Learning (ICML12) 2012;:903–910.
 Hill et al. (2012) Hill, K., Sabol, C., Alfriend, K.T.. Comparison of Covariance Based Track Association Approaches Using Simulated Radar Data. The Journal of the Astronautical Sciences 2012;59(12):281–300. doi:10.1007/s4029501300181.
 Kelso (2012) Kelso, T.. Iridium 33/Cosmos 2251 Collision. http://celestrak.com/events/collision/; 2012. Accessed on 20170215.
 Legendre et al. (2006) Legendre, P., Deguine, B., Garmier, R., Revelin, B.. Two Line Element Accuracy Assessment Based On A Mixture of Gaussian Laws. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. Reston, Virigina: American Institute of Aeronautics and Astronautics; 2006. p. 1–13. doi:10.2514/6.20066518.
 Lyon (2004) Lyon, R.H.. Geosynchronous Orbit Determination Using Space Surveillance Network Observations and Improved Radiative Force Modeling. Ph.D. thesis; Massachusetts Institute of Technology; 2004.
 Maisonobe et al. (May 3–6, 2010) Maisonobe, L., Pommier, V., Parraud, P.. Orekit: An Open Source Library for Operational Flight Dynamics Applications. In: 4th International Conference on Astrodynamics Tools and Techniques. May 3–6, 2010. .
 Micchelli et al. (2006) Micchelli, C.A., Xu, Y., Zhang, H.. Universal Kernels. Journal of Machine Learning Research 2006;7(Dec):2651–2667. 00204.
 Mitchell (1997) Mitchell, T.M.. Machine Learning. 1st ed. McGrawHill Education, 1997.
 (16) NASA, . Orbital Debris Program Office Frequently Asked Questions. http://celestrak.com/events/collision/. Accessed on 20170215.
 Pearlman et al. (2002) Pearlman, M.R., Degnan, J.J., Bosworth, J.M.. The International Laser Ranging Service. Advances in Space Research 2002;30(2):135–143. doi:10.1016/S02731177(02)002776.
 Rivera and Bai (2016) Rivera, J., Bai, X.. Improving the Orbit Propagation Accuracy of TwoLineElement Satellite. In: 67th International Astronautical Congress. Guadalajara, Mexico; 2016. p. 26–30.
 Sang et al. (2014) Sang, J., Bennett, J.C., Smith, C.. Experimental results of debris orbit predictions using sparse tracking data from Mt. Stromlo. Acta Astronautica 2014;102:258–268. doi:10.1016/j.actaastro.2014.06.012.
 Sang et al. (2013) Sang, J., Bennett, J.C., Smith, C.H.. Estimation of ballistic coefficients of low altitude debris objects from historical two line elements. Advances in Space Research 2013;52(1):117–124. doi:10.1016/j.asr.2013.03.010.
 Sharma and Cutler (2015) Sharma, S., Cutler, J.W.. Robust Orbit Determination and Classification: A Learning Theoretic Approach. IPN Progress Report 42203; Jet Propulsion Laboratory; 2015.
 Smola and Schölkopf (2004) Smola, A.J., Schölkopf, B.. A tutorial on support vector regression. Statistics and Computing 2004;14(3):199–222. doi:10.1023/B:STCO.0000035301.49549.88.
 The MathWorks (2017) The MathWorks, . Statistics and Machine Learning Toolbox User’s Guide R2017b. The MathWorks, Inc., Natick, Massachusetts, United States, 2017.
 Tukey (1977) Tukey, J.W.. Exploratory Data Analysis. AddisonWesley series in behavioral science. Reading, Mass: AddisonWesley Pub. Co, 1977.
 Vallado et al. (2006) Vallado, D., Crawford, P., Hujsak, R., Kelso, T.. Revisiting Spacetrack Report #3. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. American Institute of Aeronautics and Astronautics; 2006. p. 1–88. doi:10.2514/6.20066753.
 Vallado (1997) Vallado, D.A.. Fundamentals of Astrodynamics and Applications. The McGrawHill Companies, Inc., 1997.
 Wang et al. (2009) Wang, R., Liu, J., Zhang, Q.M.. Propagation errors analysis of TLE data. Advances in Space Research 2009;43(7):1065–1069. doi:10.1016/j.asr.2008.11.017.
Comments
There are no comments yet.