I Introduction
Traditionally, there are three primary control effectors in aircraft flight control: aileron, elevator, and rudder. Flight control systems are usually designed with one control effector for each degree of freedom. However, due to increased requirements on the reliability, maneuverability, and survivability of modern and future aircraft, control effectors are no longer limited to these three conventional ones. Newer designs often use a large number of control effectors to provide reconfiguration flexibility and the redundancy required for fault tolerance [Dorsett1996, Khan2020_ECC]. Redundancy in control effectors can also be utilized to enhance an airplane’s performance envelope. For example, thrust vectoring augments the dogfighting capability of the F-22 Raptor, and permits maneuvers such as Pugachev’s Cobra in other aircraft [Durham2017].
The problem of distributing control commands among several control actuators is known as Control Allocation. Several tools and approaches have been proposed and used to manage redundancy and to distribute the desired control effort among a set of actuators [Durham2017, Oppenheimer2011]. The conventional and most straightforward approach is to manually divide the control actuators into three sets, and to use each set as an ‘elevator’ (to produce pitching moment), an ‘aileron’ (to produce rolling moment), or a ‘rudder’ (to produce yawing moment). This method, formally known as Explicit Ganging, usually works well for aircraft with a relatively low degree of overactuation, but not for multirotor or fixed-wing aircraft with a large number of control effectors.
A much more sophisticated strategy is to separate the regulation or tracking part of the problem from the control distribution part. Fig. 1 portrays how the control law specifies the desired control effort (desired forces/moments) and how a separate control allocator is introduced in the control loop to allocate this desired control effort among the various actuators. In an overactuated system, the desired commands can be achieved with multiple combinations of control effector deflections. A control allocation method distributes the desired commands over the available control effector suite in such a way that the desired control effect is produced while some additional objectives are also fulfilled, such as minimization of deflections, drag, or radar cross-section (RCS).
At present, the most common practice is to use control allocation methods that assume a linear relationship between effector deflections and the resulting control moments. Such methods include direct allocation [Durham1994, Bordignon1996, Bodson2002], daisy chaining [Buffington1996], redistributed pseudoinverse or cascaded generalized inverse [Oppenheimer2011], and methods based on linear programming [Bodson2002, Buffington1999] and quadratic programming [Bodson2002, Enns1998]. For most practical applications, optimization-based methods are generally preferred due to their accuracy and flexibility [Johansen2013]. The assumption of linearity allows control allocation to be carried out with computationally efficient algorithms. However, the control allocation problem of an aircraft is inherently nonlinear and coupled; in particular, the cross-channel effectiveness of the surfaces is highly nonlinear [Niestroy2017, Rajput2018].
The problem of nonlinear control allocation is challenging due to the possibility of local minima and computational issues [Johansen2013], and the literature on this subject is so far limited. A more intuitive way is to treat the nonlinear mapping directly through sequential quadratic programming [Poonamallee2004], but the computational complexity can be overwhelming. Piecewise-linear [Bolender2004, Bolender2005] and time-varying affine [Luo2007] approximations of nonlinear problems have also been studied. For practical purposes, most modern aircraft employ affine control allocation schemes [Oppenheimer2011, Johansen2013], in either absolute or incremental form, due to their accuracy and computational efficiency compared to other methods. However, these affine methods still require computing the local Jacobian matrix and offset vector, and solving a linear or quadratic program, at each sampling instant. This requires not only substantial onboard computational power, but also an accurate model of the effectiveness function in the onboard computer, which can take significant storage depending on the number of effectors, the flight envelope, and the accuracy of the model.
Artificial Neural Networks (ANNs) are widely used computing systems that mimic the biological neural networks present in an animal brain. These systems learn to perform a task by considering given examples, generally without task-specific programming. There are many types of ANNs used for different machine learning tasks, e.g. Fully Connected ANNs, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). In this paper, we only consider Fully Connected ANNs. These ANNs have proved to be excellent function approximators and can approximate a broad class of nonlinear functions to arbitrary accuracy (Universal Approximation Theorems) [Csaji2001]. In spite of the widespread use of ANNs in almost all fields of engineering and science, their employment in control allocation has been limited.
Initially, Grogan et al. [Grogan1994] explored the use of an ANN based scheme with a single hidden layer for linear control allocation problems, and compared it with direct allocation for the F/A-18 HARV aircraft. However, based on training limitations and accuracy issues, they concluded that the ANN based scheme is not suitable for practical applications. It must be noted that their work was done in 1994, and machine learning tools have improved manifold since then. They concluded that direct allocation is computationally more efficient than an ANN for a linear allocation problem; for nonlinear allocation problems, however, optimization-based methods, although accurate, are computationally very expensive. An important difference in our work is the use of the Rectified Linear Unit (ReLU) activation function, which results in better training performance compared to the hyperbolic-tangent activation function which they used. Quite recently, there has been some research on machine learning based control allocation schemes. Chen [Chen2016] used RNNs for linear control allocation problems. Huan et al. [Huan2018] used deep autoencoders for the nonlinear control allocation problem, where they considered two separate networks: one (the encoder) acts as an allocator while the other (the decoder) mimics the system. Vries et al. [Vries2019] used a reinforcement learning approach for nonlinear control allocation.
In this research, we pose the general nonlinear control allocation problem from a different perspective: we seek a function which maps desired moments to control effector commands. This view of control allocation is similar to pseudoinverse methods for linear allocation problems, but differs from optimization-based methods for either linear or nonlinear allocation problems, which try to find control effector commands for one given desired moment vector. Owing to the excellent function approximation properties of ANNs, we use them to approximate this map between desired moments and control effectors, and thereby convert the allocation problem into a machine learning problem. We present training methodologies for different types of allocation problems, e.g. effector prioritization, fault tolerance, and reconfiguration. We then present a few important results on the stability and performance of nonlinear allocation schemes in general, and of this ANN based allocation scheme in particular. Finally, to demonstrate the efficacy of the proposed scheme, we compare its results with standard methods of control allocation for a miniature tailless flying-wing research aircraft [Qu2017, Rajput2015].
II Preliminaries
II-A Nonlinear Control Allocation
The general nonlinear control allocation problem is defined as follows: find the control vector $\delta \in \Omega$ such that
(1) $\Phi(x, \delta) = \tau_d$
where $\delta \in \Omega \subset \mathbb{R}^m$ is the actual input vector of the system being controlled, $\tau_d \in \mathbb{R}^n$ is the vector of desired moments (virtual control), $x \in \mathcal{X}$ is some state vector, and $\Phi : \mathcal{X} \times \Omega \to \mathcal{A}$ is the control effectiveness mapping, which is assumed to be at least continuous in both $x$ and $\delta$. The set $\mathcal{A}$ (defined below) is known as the Attainable Moment Set (AMS). The precise definitions used in later sections are as follows:
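Definition 1.
For an effectiveness mapping $\Phi$, input set $\Omega$, and state set $\mathcal{X}$, the pointwise AMS at a state $x \in \mathcal{X}$ is defined as
(2) $\mathcal{A}(x) = \{\Phi(x, \delta) : \delta \in \Omega\}$
and the total or complete AMS, or just AMS, is defined as
(3) $\mathcal{A} = \bigcup_{x \in \mathcal{X}} \mathcal{A}(x)$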
The system of equations (1), being underdetermined, usually possesses multiple solutions, provided that the constraints on the input are strictly satisfied; otherwise, no exact solution exists. Numerous methods have been proposed [Oppenheimer2011, Johansen2013] to solve this constrained allocation problem, especially for cases where the effectiveness mapping $\Phi$ is a constant linear map from the input set $\Omega$ to the AMS $\mathcal{A}$, for example Redistributed Pseudo-Inverse (RPI) or Cascaded Generalized Inverse (CGI), Direct Allocation, Daisy Chaining, and optimization-based methods.
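For a nonlinear effectiveness mapping $\Phi$, which is usually the case for modern overactuated aircraft, it is common practice to convert the nonlinear problem into a locally affine allocation problem at each sampling instant. At any sampling instant, with $\delta_0$ denoting the linearization point (for example, the effector position at the previous sample), (1) can be approximately written as
(4) $\tau_d \approx B\,\delta + c$
where $B = \left.\frac{\partial \Phi}{\partial \delta}\right|_{(x,\delta_0)}$ is the local Jacobian matrix and $c = \Phi(x, \delta_0) - B\,\delta_0$ is the offset vector.
Eq. (4) can be solved using any linear allocation method, depending upon the requirements, and highly efficient tools are available for this purpose. However, due to their superior performance, constrained optimization-based methods are generally preferred over pseudoinverse-based methods for practical applications [Johansen2013].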
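As an illustration of the locally affine approximation described above, the following minimal sketch computes a local Jacobian and offset by central finite differences. The effectiveness function `effectiveness`, its coefficients, and the helper `local_affine_model` are hypothetical and are not the aircraft model used in this paper; a real implementation would query the onboard aerodynamic model instead.

```python
import numpy as np

def effectiveness(x, delta):
    """Hypothetical nonlinear effectiveness map: 2 moments from 3 effectors."""
    alpha = x[0]
    return np.array([
        delta[0] - delta[1] + 0.1 * delta[2] ** 2,
        (1.0 + 0.5 * alpha) * (delta[0] + delta[1]) + 0.2 * delta[0] * delta[2],
    ])

def local_affine_model(phi, x, delta0, eps=1e-6):
    """Return (B, c) such that phi(x, delta) ~= B @ delta + c near delta0."""
    tau0 = phi(x, delta0)
    B = np.zeros((tau0.size, delta0.size))
    for j in range(delta0.size):
        step = np.zeros_like(delta0)
        step[j] = eps
        # Central difference for column j of the Jacobian.
        B[:, j] = (phi(x, delta0 + step) - phi(x, delta0 - step)) / (2 * eps)
    c = tau0 - B @ delta0  # offset so that the model is exact at delta0
    return B, c

x = np.array([0.1])
delta0 = np.array([0.05, -0.02, 0.10])
B, c = local_affine_model(effectiveness, x, delta0)

# Compare the affine model with the nonlinear map at a nearby point.
delta_near = delta0 + np.array([0.01, -0.01, 0.02])
approx = B @ delta_near + c
exact = effectiveness(x, delta_near)
```

The pair `(B, c)` would then be handed to whichever linear allocation method is chosen; note that it has to be recomputed at every sampling instant.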
The general nonlinear allocation problem (1) can be posed as the following weighted constrained optimization problem: given the state and the desired moment vector, solve
(5) 
subject to:
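where $\|\cdot\|_W$ represents the weighted 2-norm, defined as $\|v\|_W = \sqrt{v^\top W v}$, and $W$ is a symmetric positive definite matrix. It must be noted that in this case, for a given state and desired moment vector, the optimization yields a single vector of effector commands.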
II-B Artificial Neural Networks (ANNs)
Artificial Neural Networks (ANNs) are self-learning algorithms designed to mimic the way a human brain investigates and processes information using an interlocked mesh of neurons. ANNs are layered computation paradigms consisting of simple processing elements that can solve problems deemed complex by human or statistical standards, provided that enough data is available for training. In the past few years, the adaptive nature and excellent function approximation capability of ANNs have led to their prevalent use in a variety of applications [Goodfellow2016, Haykin2010].
The structure of an ANN consists of multiple layers of artificial neurons. Each neuron receives a vector from the previous layer as its input, and applies an affine transformation followed by a static nonlinear activation function. Therefore, the output of the $k$th layer is defined as
(6) $y_k = \sigma(W_k\, y_{k-1} + b_k)$
If the network has $n_k$ neurons in the $k$th layer and $n_{k-1}$ neurons in the $(k-1)$th layer, then $W_k \in \mathbb{R}^{n_k \times n_{k-1}}$ is the weight matrix, $b_k \in \mathbb{R}^{n_k}$ is the bias term, and $\sigma$ is the nonlinear activation function. The final output ($y_L$) is obtained by nesting these functions, which reflects the successive connections of an $L$-layered ANN, where $y_0$ is the input vector.
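The layer recursion described above can be sketched in a few lines of NumPy. The layer sizes, the random weights, and the choice of a linear (activation-free) output layer are assumptions made purely for illustration; the text does not specify the output activation.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit, applied elementwise."""
    return np.maximum(z, 0.0)

def forward(weights, biases, y0):
    """Evaluate a fully connected network layer by layer.

    Hidden layers use ReLU; the last layer is left linear (an assumption).
    """
    y = y0
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        z = W @ y + b                    # affine transformation of the previous layer
        y = z if k == last else relu(z)  # static nonlinearity on hidden layers
    return y

rng = np.random.default_rng(0)
sizes = [5, 16, 8, 5]  # input, two hidden layers, output
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in sizes[1:]]

y_out = forward(weights, biases, np.ones(5))
```

Each weight matrix has shape (neurons in this layer) by (neurons in the previous layer), matching the dimensions stated above.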
The learning part of an ANN is carried out using the backpropagation algorithm. Once the forward pass has computed the values from inputs to outputs, a loss function is used to calculate the error between the actual and the predicted values. The sensitivity of the cost with respect to each weight is then calculated in the backward pass, which is a recursive application of the chain rule along a computational graph. These gradients are then used by common optimization algorithms, such as stochastic gradient descent or the Adam optimizer, to minimize the loss function.
III Problem Formulation & Training Methodology
The key idea of this research is to find the mapping from desired moments to effector commands, instead of a single vector of effector commands as in standard allocation problems. Although similar work has already been done for the linear case [Grogan1994, Chen2016], in this work we develop a generalized theory and pose it as a machine learning problem. Let us define a projection operator and present a lemma, which will be used in the subsequent development.
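Definition 2 (Projection Operator).
Consider an $n$-dimensional set $\mathcal{S} \subseteq \mathbb{R}^n$; then the projection operator $P_{\mathcal{S}} : \mathbb{R}^n \to \mathcal{S}$ is defined as follows (see Fig. 2):
(7) $P_{\mathcal{S}}(v) = \arg\min_{s \in \mathcal{S}} \|v - s\|$
Remark.
It must be noted that for a rectangular (hypercubical) set, e.g. $\mathcal{S} = \{s : \underline{s} \le s \le \overline{s}\}$ with elementwise bounds, $P_{\mathcal{S}}$ reduces to vector saturation, i.e. $P_{\mathcal{S}}(v) = \min(\max(v, \underline{s}), \overline{s})$, applied elementwise.
Also, if $\mathcal{S} = \mathbb{R}^n$, then $P_{\mathcal{S}}$ reduces to the identity map.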
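The remark on rectangular sets can be checked numerically. The sketch below assumes the Euclidean norm in the projection and uses made-up bounds; it applies elementwise saturation and then verifies, against random points of the box, that no sampled point is closer to the original vector than the saturated one.

```python
import numpy as np

def project_onto_box(v, lower, upper):
    """Euclidean projection onto {s : lower <= s <= upper}, i.e. elementwise saturation."""
    return np.minimum(np.maximum(v, lower), upper)

# Illustrative bounds, not the limits of the aircraft studied here.
lower = np.array([-1.0, 0.0, -2.0])
upper = np.array([1.0, 2.0, 0.0])

v = np.array([1.5, -0.5, -1.0])
p = project_onto_box(v, lower, upper)

# Sanity check: compare against random candidate points inside the box.
rng = np.random.default_rng(0)
candidates = rng.uniform(lower, upper, size=(1000, 3))
closest_candidate = np.min(np.linalg.norm(candidates - v, axis=1))
distance_p = np.linalg.norm(v - p)
```

For a point already inside the box the function returns the point unchanged, which is the identity behaviour mentioned in the remark for the unconstrained case.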
Lemma 1.
Given an unconstrained optimization problem of the following form
(8) 
The optimal solution can be equivalently considered as solution of the following problem:
(9) 
Proof.
III-A Problem Formulation
Using the projection operator, we can rewrite the constrained optimization problem (5) as the following unconstrained one: given the state and the desired moment vector, solve
(10) 
where
Since generally , so for most of the cases there doesn’t exist any perfect for which . Our main goal is to find a map , which minimizes .
Remark.
It is worth noting that when the effectiveness mapping is affine in the input, say $\Phi(x, \delta) = B\,\delta + c$, and the input is unconstrained, i.e. $\Omega = \mathbb{R}^m$, problem (11) can be solved analytically, and it results in the following well-known pseudoinverse solution:
(12) 
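For the affine, unconstrained case mentioned in this remark, a minimum-norm solution can be computed with the Moore-Penrose pseudoinverse. The sketch below is an unweighted version under the assumption that the effectiveness has the form $B\delta + c$ with $B$ of full row rank; the matrix `B`, offset `c`, and desired moment `tau_d` are made-up numbers, and any weighting used in the paper's own expression is not reproduced here.

```python
import numpy as np

# Hypothetical affine effectiveness: 3 moments from 4 effectors.
B = np.array([[1.0, -1.0, 0.0, 0.5],
              [1.0, 1.0, 0.2, 0.0],
              [0.0, 0.1, 1.0, -1.0]])
c = np.array([0.01, -0.02, 0.0])
tau_d = np.array([0.3, -0.1, 0.2])

# Minimum-norm effector command that reproduces tau_d exactly.
delta = np.linalg.pinv(B) @ (tau_d - c)

# Equivalent closed form for a full-row-rank B: B^T (B B^T)^{-1} (tau_d - c).
delta_closed = B.T @ np.linalg.solve(B @ B.T, tau_d - c)
```

Because the solution is a fixed linear function of the desired moment, it is itself a map from moments to effector commands, which is the viewpoint adopted for the general nonlinear case.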
Up to this point the development has been quite general, and any numerical method for functional optimization problems can be used to solve (11). However, if we discretize the problem over the domain and consider an ANN as a candidate for the map, then we can equivalently pose (11) as the following learning problem: learn the network by minimizing the following cost over the network parameters.
(13) 
It must be noted that the training of this network is not a standard supervised machine learning problem. Therefore, it requires further consideration discussed subsequently.
III-B Training Methodology
To train a network, the first step is to obtain training data. In our case, given the state set $\mathcal{X}$, the input set $\Omega$, and the effectiveness map $\Phi$, we can generate random data points in $\mathcal{X} \times \Omega$, and applying $\Phi$ gives the corresponding points in the AMS $\mathcal{A}$. These random data points can be generated using methods such as Latin Hypercube Sampling. From this data, discard the effector deflections and keep only the (moment, state) pairs. This dataset is used as the input of the network. No separate output (target) dataset is required, as the network is trained according to the schematic shown in Fig. 3.
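The data generation step and the label-free training cost can be sketched as follows. Everything named here is an assumption made for illustration: the basic Latin Hypercube sampler `latin_hypercube`, the toy map `effectiveness`, the bounds, and the placeholder `zero_allocator` standing in for a network. The loss evaluates the allocator through the effectiveness map, so no target deflections are needed.

```python
import numpy as np

def latin_hypercube(n_samples, lower, upper, rng):
    """Basic Latin Hypercube Sampling: one sample per stratum in every dimension."""
    dim = lower.size
    strata = np.column_stack([rng.permutation(n_samples) for _ in range(dim)])
    unit = (strata + rng.random((n_samples, dim))) / n_samples
    return lower + unit * (upper - lower)

def effectiveness(x, delta):
    """Hypothetical map, one sample per row: x is (N, 1), delta is (N, 3)."""
    a = x[:, 0]
    m1 = delta[:, 0] - delta[:, 1] + 0.1 * delta[:, 2] ** 2
    m2 = (1.0 + 0.5 * a) * (delta[:, 0] + delta[:, 1]) + 0.2 * delta[:, 0] * delta[:, 2]
    return np.column_stack([m1, m2])

rng = np.random.default_rng(0)
lower = np.array([0.0, -1.0, -1.0, 0.0])  # [state, delta_1, delta_2, delta_3]
upper = np.array([0.2, 1.0, 1.0, 1.0])

samples = latin_hypercube(1000, lower, upper, rng)
x, delta = samples[:, :1], samples[:, 1:]
tau = effectiveness(x, delta)      # attainable moments
net_inputs = np.hstack([tau, x])   # the sampled deflections are deliberately dropped

def allocation_loss(allocator, tau, x, d_lower, d_upper):
    """Mean squared error between requested moments and moments actually produced."""
    delta_cmd = np.clip(allocator(tau, x), d_lower, d_upper)  # saturation
    err = tau - effectiveness(x, delta_cmd)
    return np.mean(np.sum(err ** 2, axis=1))

def zero_allocator(tau, x):
    """Placeholder allocator that commands zero deflection."""
    return np.zeros((tau.shape[0], 3))

loss0 = allocation_loss(zero_allocator, tau, x, lower[1:], upper[1:])
```

In an actual training run the placeholder would be replaced by the network, and the same loss would be differentiated with respect to the network parameters, which requires the effectiveness map to be available in differentiable form.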
In many practical scenarios, control allocation is used for much more than just the distribution of control commands. It is also employed for reconfiguration when faults occur [Khan2020_ECC], and for prioritization of control effectors [Buffington1996]. For prioritization of control effectors, the daisy chaining approach is usually employed, which divides all effectors into multiple sets depending upon their priority. The problem is then solved as a sequence of small allocation problems, one for each set. In our ANN based approach, this can be easily accomplished using a small ANN for each set of effectors, and the overall allocator can be obtained by appropriately stacking these small networks.
For reconfiguration based fault tolerance, there are two possible approaches in our ANN based framework. (i) A separate ANN could be trained using the standard approach for each failure case, and the networks switched during flight according to the occurrence of faults. (ii) A single network could be trained with an extra input which specifies the fault scenario. In this case, the training data would cover all possible faulty and healthy conditions.
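The sequential structure of daisy chaining can be sketched as below. This is only an illustration under stated assumptions: each priority group uses a clipped pseudoinverse as a stand-in allocator (in the ANN based scheme a small network would take its place), and the effectiveness matrices and limits are made up.

```python
import numpy as np

def clipped_pinv(B, tau, lower, upper):
    """Stand-in allocator for one group: pseudoinverse followed by saturation."""
    return np.clip(np.linalg.pinv(B) @ tau, lower, upper)

def daisy_chain(groups, tau_d):
    """Allocate to groups in priority order, passing the unmet moment onward.

    groups: list of (B, lower, upper) tuples, highest priority first.
    """
    residual = tau_d.copy()
    commands = []
    for B, lower, upper in groups:
        delta = clipped_pinv(B, residual, lower, upper)
        residual = residual - B @ delta  # moment still to be produced
        commands.append(delta)
    return commands, residual

# Two hypothetical effector groups acting on two moments.
B1 = np.eye(2)        # high-priority group
B2 = 0.5 * np.eye(2)  # low-priority group, used only when group 1 saturates
groups = [(B1, -np.ones(2), np.ones(2)),
          (B2, -2.0 * np.ones(2), 2.0 * np.ones(2))]

commands, residual = daisy_chain(groups, np.array([1.4, 0.3]))
```

Here the first group saturates on the first axis, so the second group is asked to supply only the remaining moment.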
IV Main Results
IV-A Performance of Control Allocator
There are multiple criteria for comparing the performance of control allocation methods; the most common are the Allocation Error and the Volume Ratio of the AMS of the method to the actual AMS [Durham2017]. In the literature, allocation error is usually compared along a desired trajectory in the moment space [Durham1994]. In this section, we present a slightly different yet more general definition, the Maximum Allocation Error (MAE). It can be used not only to compare different allocation methods but also for stability and robustness analysis. Moreover, we define the Volume Ratio for the general nonlinear control allocation problem. Even though tools are available to compute the Volume Ratio for linear methods [Durham2017], such tools do not exist for the general case.
Given the effectiveness map, the associated state, input, and attainable moment sets, and a control allocator, we define the following performance measures for this allocator. It must be noted that, for the following definitions, the control allocator does not need to be a function; it could just be an algorithm. However, it needs to be deterministic.
Definition 3.
The Maximum Allocation Error (MAE) of an allocator is defined as
(14) 
where $\|\cdot\|_1$ represents the 1-norm.
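A simple way to get a feel for this measure is to evaluate the 1-norm allocation error on a grid of desired moments and take the largest value. The sketch below does this for a made-up linear effectiveness matrix `B` and a deliberately poor allocator `ganged_allocator`; both are hypothetical. A grid search only gives a lower bound on the true maximum in general, which is why a global optimizer is used for the results reported later.

```python
import numpy as np

# Toy problem: 2 moments, 4 effectors limited to [-1, 1]; the true AMS is [-2, 2]^2.
B = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
d_lower, d_upper = -np.ones(4), np.ones(4)

def effectiveness(delta):
    return B @ delta

def ganged_allocator(tau_d):
    """Deliberately poor allocator: uses only effectors 0 and 2."""
    return np.array([tau_d[0], 0.0, tau_d[1], 0.0])

def allocation_error(tau_d):
    """1-norm of the difference between requested and produced moments."""
    delta = np.clip(ganged_allocator(tau_d), d_lower, d_upper)
    return np.sum(np.abs(tau_d - effectiveness(delta)))

grid = np.linspace(-2.0, 2.0, 41)
mae_estimate = max(allocation_error(np.array([a, b])) for a in grid for b in grid)
```

For this toy case the worst error occurs at the corners of the attainable set, where the unused effectors would have been needed.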
Definition 4.
The AMS of an allocator can be defined as
(15) 
and similar to Definition 1, the total or complete AMS of an allocator is defined as, .
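Definition 5.
The Volume Ratio of an allocator is defined as the ratio of the volume of the AMS of the allocator, $\mathcal{A}_{\mathcal{N}}$, to the volume of the actual AMS, $\mathcal{A}$. This can be written as
(16) $\mathrm{VR} = \dfrac{\mathrm{Vol}(\mathcal{A}_{\mathcal{N}})}{\mathrm{Vol}(\mathcal{A})}$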
Now, let us consider the practical aircraft control allocation problem, where the effectiveness map is, in general, a piecewise linear function. Using this property in our ANN based allocation scheme yields the following results:
Theorem 1.
For a piecewise linear effectiveness map, an ANN based allocator with only piecewise-linear activations (e.g. ReLU), and a hypercubical input set, the following holds:
(i) The cost of (14) is also a piecewise linear function.
(ii) The solution of (14) can only lie on the boundaries of the polytopes which divide the whole domain into finitely many regions; it cannot lie in the interior of any of these polytopes.
(iii) If there are $N$ such polytopic regions (over the whole domain, or for a fixed state in the pointwise case), and $\mathcal{A}_i$ is the AMS (pointwise AMS) of the $i$th region, then $\mathcal{A}_i$ is a convex polytope for all $i$, and the complete AMS (complete pointwise AMS) can be written as
(17) $\bigcup_{i=1}^{N} \mathcal{A}_i$
Proof.
To prove the first statement, note that for a hypercubical input set the projection operator reduces to vector saturation, and is therefore piecewise linear. Each layer of an ANN is an affine map followed by an activation function, and since the activation is assumed to be piecewise linear, the complete ANN is piecewise linear. Moreover, the cost is a 1-norm, defined as $\|v\|_1 = \sum_i |v_i|$, which is also piecewise linear. Recalling that any combination (sum or composition) of piecewise linear functions is also a piecewise linear function completes the proof of the first statement.
For the second and third statements, consider a single region (polytope) of the domain over which the cost is an affine function of the inputs. The optimum (either maximum or minimum) of a non-constant affine function over the region cannot lie in its interior; thus it must lie on the boundary. Each region, being a convex set, is mapped by an affine function to a convex polytope (pointwise AMS). Therefore, the complete AMS is the union of the pointwise AMS of all regions.
∎
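The piecewise-linear structure used in this proof can be made concrete for a ReLU network: within the region defined by a fixed pattern of active neurons, the network coincides with a single affine map. The sketch below extracts that local map for a small random network; the sizes, weights, and the linear output layer are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [3, 6, 4, 2]
Ws = [rng.standard_normal((o, i)) for i, o in zip(sizes[:-1], sizes[1:])]
bs = [rng.standard_normal(o) for o in sizes[1:]]

def forward(y):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        y = np.maximum(W @ y + b, 0.0)
    return Ws[-1] @ y + bs[-1]

def local_affine_map(y0):
    """Return (A, c) with forward(y) == A @ y + c on the activation region of y0."""
    A = np.eye(y0.size)
    c = np.zeros(y0.size)
    y = y0
    for W, b in zip(Ws[:-1], bs[:-1]):
        z = W @ y + b
        D = np.diag((z > 0).astype(float))  # mask of active neurons in this layer
        A, c = D @ W @ A, D @ (W @ c + b)   # compose the masked affine layer
        y = np.maximum(z, 0.0)
    return Ws[-1] @ A, Ws[-1] @ c + bs[-1]

y0 = np.array([0.3, -0.2, 0.5])
A, c = local_affine_map(y0)
```

Enumerating all such regions, rather than the one containing a given point, is the hard part; the sketch does not attempt it.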
Even though the theorem discussed above provides results of fundamental importance, its practical implementation is still being researched, for the following reasons:
(i) Given any two piecewise-linear functions, the authors have so far not been able to find an algorithm that determines the regions over which their composition is linear.
(ii) Given $N$ convex polytopes, the volume of their union can be computed by the inclusion-exclusion formula, but its computational cost is of the order of $2^N$. Another approach is to use the following identity:
(18) where represent volume, , and represents convex hull. The second term in Eq. (18), though is usually very small in magnitude as compared to first one, but requires a lot of computational power.
In the upcoming section of this paper, we use global optimization techniques for the computation of MAE and a Monte Carlo based method for Volume Ratio estimation.
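A Monte Carlo estimate of the Volume Ratio can be sketched as follows, for a toy problem in which the true AMS is a known box. The matrix `B`, the allocator `ganged_allocator`, and the tolerance are hypothetical. The sketch also assumes that a moment belongs to the allocator's AMS exactly when the allocator reproduces it, which holds for this toy allocator but would need to be checked for others.

```python
import numpy as np

# Toy problem: 2 moments, 4 effectors limited to [-1, 1]; the true AMS is [-2, 2]^2.
B = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])

def ganged_allocator(tau):
    """Deliberately poor allocator using only effectors 0 and 2 (one row per sample)."""
    zeros = np.zeros(tau.shape[0])
    return np.column_stack([tau[:, 0], zeros, tau[:, 1], zeros])

rng = np.random.default_rng(0)
tau_samples = rng.uniform(-2.0, 2.0, size=(20000, 2))  # uniform over the true AMS

delta = np.clip(ganged_allocator(tau_samples), -1.0, 1.0)
attained = delta @ B.T

# A sample counts as attainable by the allocator if it is reproduced exactly.
reached = np.all(np.abs(attained - tau_samples) < 1e-9, axis=1)
volume_ratio_estimate = reached.mean()
```

The allocator in this example can only reach the box [-1, 1]^2, so the estimate should be close to one quarter; the statistical error shrinks with the square root of the number of samples.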
IV-B Closed Loop Stability Analysis
Consider a system of the following form:
(19) 
where is the state vector, is the true control input, is the virtual control input, is control effectiveness, is a matrix with rank , and is a function of state; usually, it would be a subset of all states. Let’s assume from a control design technique that we have a virtual control law , which gives the ideal closedloop system
(20) 
After incorporating control allocator , we get the following actual closedloop system:
(21) 
The basic idea of the following result is to treat the allocation error as nonvanishing but bounded perturbation at the input.
Theorem 2.
Let the origin (), be an asymptotically stable (AS) equilibrium of the ideal closedloop system (20), and let be its Lyapunov function which satisfies
(22) 
, where , and are class functions. Suppose that the MAE satisfies
(23) 
Then, the solution of actual closedloop (21) satisfies
(24) 
and
(25) 
for some class function , and some finite , where is class function and is defined as
(26) 
Proof.
For simplifying notations, in this proof we have used , , and as norm. Using as Lyapunov function for actual closedloop (21) we get,
Applying Theorem 4.18 of [Khalil2002] completes the proof. ∎
Remark.
It should be noted that though the above results have been presented for a case of static controller , the same results can be applied to any continuous dynamic or observer based controller, even with multiloop control architecture. First step in such a case would be to write the complete closedloop system in the form of Eq. (21), where represents all states (system, controller, observer  combined). Then this result can be applied directly.
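The qualitative message of this result, that a bounded but nonvanishing perturbation leads to ultimately bounded rather than asymptotically convergent trajectories, can be illustrated on a scalar toy system. The system below, its perturbation, and the function `simulate` are assumptions chosen for illustration and are unrelated to the aircraft model; the state is driven by a stable nominal dynamics plus a constant worst-case input error of magnitude `epsilon`.

```python
import numpy as np

def simulate(x0, epsilon, dt=0.01, t_final=20.0):
    """Forward-Euler simulation of x' = -x + d with |d| <= epsilon."""
    steps = int(round(t_final / dt))
    x = x0
    trajectory = np.empty(steps + 1)
    trajectory[0] = x
    for k in range(steps):
        disturbance = epsilon  # worst-case constant perturbation
        x = x + dt * (-x + disturbance)
        trajectory[k + 1] = x
    return trajectory

epsilon = 0.1
traj = simulate(x0=5.0, epsilon=epsilon)
final_value = traj[-1]
```

The state does not converge to the origin but settles within a band whose size scales with the perturbation bound, which is the role played by the maximum allocation error in the theorem.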
V Aircraft Control Allocation: An Example
V-A Aircraft Specifications
In this work, we have used the aerodynamic model of a small-scale tailless flying-wing aircraft. The comprehensive specifications of the aircraft can be found in [Rajput2015, Qu2017].
The aircraft under discussion has six trailing-edge surfaces and two (left/right) pairs of clamshell surfaces. Downward deflection of the trailing-edge surfaces is considered positive, and upward deflection is considered negative.
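Table I
Surface                            Min.  Max.
Elevator (trailing-edge surfaces)  -20   20
Clamshell surfaces (first pair)    0     40
Clamshell surfaces (second pair)   -40   0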
In total, this aircraft has ten control surfaces. Any combination of these surfaces can be used to maneuver the aircraft. In this study, all trailing-edge surfaces are treated as a single elevator. This results in five independent control surfaces to be considered for the control allocation problem. The saturation limits of the control surfaces are given in Table I.
V-B Results & Discussion
Training data was randomly generated using Latin Hypercube sampling. A data set comprising 100,000 data points was generated. This dataset was split in a ratio of 0.7, 0.15, 0.15 into training, validation, and test data sets, respectively. The Keras API in Python was used to design and train the ANNs. The Adam optimizer was used as the optimization scheme, with an initial learning rate of 0.005. The exponential decay rates for the first and second moment estimates were set to 0.9 and 0.999, respectively. Mean Squared Error (MSE) was used as the loss function, while Root Mean Squared Error (RMSE) and R-squared ($R^2$) values were used as metrics to measure the performance of the networks. The Rectified Linear Unit (ReLU) was used as the activation function in all hidden layers due to its computational simplicity and its effectiveness in handling vanishing gradients. It should be noted that the theoretical results discussed in Section IV-A focused on piecewise linear activations, e.g. ReLU. Every network was trained for 50 epochs, with the data being fed to the network in batches of 128 samples. Validation losses were also monitored, and the learning rate was automatically reduced by a factor of 10% if the validation loss did not improve after a patience period of 5 epochs.
Fig. 4 shows the MSE history of the training and validation sets. It can be seen that for all network configurations the MSE flattens out. This indicates that the network has fully learnt the data set and that training it for more epochs would not improve its performance. Fig. 5 shows the learning rate history. It can be seen that in later epochs the learning rate has dropped to almost zero, which supports the previous claim that the network has been fully trained.
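The data split and the learning-rate reduction rule described above can be sketched without any deep learning framework. Both helpers, `split_indices` and `reduce_on_plateau`, are hypothetical names, and the reduction is interpreted here as multiplying the learning rate by 0.1, which is an assumption since the wording in the text admits other readings. The logic only loosely mirrors a reduce-on-plateau callback and is not the Keras implementation.

```python
import numpy as np

def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle indices and split them into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def reduce_on_plateau(val_losses, lr0=0.005, factor=0.1, patience=5):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` once the validation loss has failed
    to improve for `patience` consecutive epochs.
    """
    lr, best, wait, history = lr0, float("inf"), 0, []
    for loss in val_losses:
        history.append(lr)
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr, wait = lr * factor, 0
    return history

train_idx, val_idx, test_idx = split_indices(100_000)

# Validation loss improves twice and then stalls for six epochs.
lr_history = reduce_on_plateau([1.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5])
```

With a stalled validation loss the rate shrinks geometrically, which is consistent with the learning rate approaching zero in the later epochs.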
Table II
Control Allocator            No. of Parameters  MSE  $R^2$   MAE
5.4.5                        59                 …    0.9907  0.0300
5.8.5                        103                …    0.9969  0.0286
5.16.8.5                     287                …    0.9989  0.0226
5.16.8.4.5                   303                …    0.9987  0.0234
5.16.8.8.5                   359                …    0.9987  0.0232
5.8.16.8.5                   383                …    0.9984  0.0266
5.16.16.8.5                  559                …    0.9990  0.0215
5.32.16.5                    815                …    0.9994  0.0185
5.32.16.8.5                  911                …    0.9993  0.0207
5.16.32.16.5                 1263               …    0.9994  0.0185
QP based (polynomial model)  552                …    …       0.0326
The first column in Table II lists the different ANN architectures studied. For example, the first entry ‘5.4.5’ means that the network has five inputs, five outputs, and a single hidden layer comprising four neurons. Similarly, ‘5.32.16.5’ means that there are again five inputs and five outputs, but two hidden layers with 32 and 16 neurons, respectively. Of the five inputs of each network, the first three are the desired moments and the last two are the angle of attack ($\alpha$) and the sideslip angle ($\beta$). The outputs are the deflections of the five control surfaces. The second column gives the number of parameters required to implement the ANN based control allocator on the flight computer. The next two columns give the MSE and $R^2$ on the test data set. The last column gives the Maximum Allocation Error (MAE) as defined in Section IV-A. It was computed using MATLAB’s GlobalSearch function with fmincon as its local solver.
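The architecture strings can be turned into parameter counts with a short helper. The function `dense_parameter_count` below is a hypothetical name and counts only the weights and biases of the fully connected layers.

```python
def dense_parameter_count(architecture):
    """Weights plus biases of a fully connected network given as e.g. '5.16.8.5'."""
    sizes = [int(s) for s in architecture.split(".")]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

counts = {name: dense_parameter_count(name)
          for name in ["5.4.5", "5.8.5", "5.16.8.5", "5.32.16.5"]}
```

For these four architectures the helper gives 49, 93, 277, and 805, whereas Table II lists 59, 103, 287, and 815. The constant difference of 10 is not explained in the text; one possible reading, stated here only as an assumption, is that the tabulated counts also include constants such as input or output scaling factors.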
These network architectures were compared to the traditional quadratic programming (QP) based affine control allocation scheme. This method requires local slopes and offsets at each sampling instant, which in turn requires an onboard model of the control effectiveness. These slopes and offsets were obtained from a polynomial based model, which is standard practice.
From the values of MSE and $R^2$, it can be noted that network performance improves as the number of hidden layers and neurons increases. In our opinion, the ‘5.16.8.5’ network is the best suited, as there is no substantial improvement in performance when the number of hidden layers and neurons is increased beyond it. With a small number of parameters and decent performance (MSE, $R^2$, and MAE), it is a logical choice of network architecture. The selected architecture has a smaller maximum allocation error than the traditional scheme (0.0226 versus 0.0326) while requiring fewer parameters (287 versus 552).
The desired moment vector is applied in terms of predefined trajectories of the moment coefficients $C_l$, $C_m$, and $C_n$, which correspond to a helical path with radius 0.003. The angle of attack ($\alpha$) is varied from 0 deg (at 0 sec) to 8 deg (at 1 sec), and the sideslip angle ($\beta$) is varied from -12 deg (at 0 sec) to 12 deg (at 1 sec). Thus, the control allocation algorithm is evaluated over almost the complete variation of control interaction with respect to deflections, angle of attack, and sideslip angle. Fig. 6 depicts the moments generated by the different allocation schemes. It can be seen that the selected network follows the desired moments quite closely. Fig. 7 shows that both the ANN based allocator and the traditional method have similar (very small) allocation errors. Although both methods have similar performance, the proposed ANN based scheme is preferable because it requires less data storage. It is also computationally much more efficient than the traditional method, as it only requires a few matrix operations, whereas the standard method requires solving a quadratic program at each sampling instant. Table III shows the average time taken by both methods to allocate a single vector of desired moments. It can be seen that the ANN based scheme is much more efficient. However, it must be noted that these values were obtained with MATLAB’s tic-toc routine on an ordinary laptop. In an actual onboard implementation the values might therefore differ, but the relative difference would remain of a similar order.
Table III
Method           Time [msec]
ANN Based        0.143
QP Based [Poly]  15.68
VI Conclusion
In this research work, the general nonlinear control allocation problem was posed from a different perspective, namely to seek a function which maps desired moments to control effector commands. This view of control allocation is similar to pseudoinverse methods for linear allocation problems, but differs from optimization-based methods for either linear or nonlinear allocation problems, which try to find control effector commands for one given desired moment vector. The allocation problem was converted into a machine learning problem, and training methodologies for different types of allocation problems were presented, e.g. effector prioritization, fault tolerance, and reconfiguration.
Two important results for the nonlinear control allocation problem were presented in this work. Firstly, computational issues of performance measures for piecewise linear effectiveness functions with ANN based allocators were discussed. Secondly, conditions for closed-loop stability with the allocator in the loop were presented in terms of the maximum allocation error. However, a few research avenues remain to be explored regarding the development of efficient algorithms for detailed performance analysis of different allocators, e.g. volume computation of a union of convex polytopes, and the calculation of domain partitions for the composition of piecewise linear functions.
Finally, to demonstrate the efficacy of the proposed allocator, its results were compared with those obtained from the standard QP based method with a polynomial model. It was shown to have similar performance with fewer parameters and a much lower computational cost. The results also showed that the run time of the proposed ANN based control allocator was lower than that of the standard QP based method by roughly two orders of magnitude (0.143 msec versus 15.68 msec in Table III).