## I Introduction

Accurate state-of-charge (SoC) estimation is necessary for optimal battery management and safe and reliable utilization of battery powered devices, such as electric vehicles (EVs) and grid level energy storage. For lithium-ion batteries, in particular, SoC estimation is difficult because the relationship between the SoC and the open-circuit voltage (OCV) is non-linear, as can be seen in Fig. 1. In certain ranges of the SoC in Fig. 1, the voltage is completely flat with respect to the SoC due to phase changes occurring within the system; this makes it challenging to estimate the SoC from voltage measurements. A variety of methods have been proposed to estimate the SoC in lithium-ion batteries.

Unlike the fuel level in traditional combustion-engine vehicles, the SoC cannot be directly measured in EV applications. However, the SoC is linked to directly measurable quantities (voltage, current, temperature and capacity) and can be extracted by using intrinsic battery relations and/or control theory.

#### I-1 Open-Circuit Voltage (OCV) mapping

The techniques of estimating the SoC have been extensively investigated. The most straightforward method is to map the OCV to the SoC, as a one-to-one translation can be found between SoC and OCV under certain conditions. Given a specific OCV, the corresponding SoC can be accurately interpreted if the measured condition matches the one where the OCV-SoC map is acquired. In other words, the OCV-SoC map varies with the testing conditions, such as temperature and aging status, which introduces a significant amount of variability and can bias the SoC estimation [Chemali2016, Waag2014, Baronti2011]. Even the direction of current flow (charging/discharging) will affect the OCV-SoC map significantly according to [Roscher2011]. In addition, complete electrochemical equilibrium cannot be achieved within a short time frame [Waag2013c]. Therefore, while the battery is under load, it is unfeasible to perform real-time updates of the SoC based on OCV measurements. For these reasons, OCV-based SoC estimation is commonly used as a complementary or corrective method running in the background [Xiong2017].
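The OCV-to-SoC mapping described above can be sketched as a lookup table with linear interpolation. This is a minimal illustration, not the mapping used in any of the cited works: the table values are made-up placeholders rather than measured data, and the names `soc_from_ocv` and `soc_sensitivity` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical OCV-SoC table (placeholder values, NOT measured data).
# OCV must be strictly increasing so that the inverse lookup is one-to-one.
SOC_POINTS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
OCV_POINTS = [3.00, 3.50, 3.60, 3.62, 3.90, 4.20]  # volts


def soc_from_ocv(ocv, ocv_points=OCV_POINTS, soc_points=SOC_POINTS):
    """Linearly interpolate the SoC from a rested open-circuit voltage."""
    if ocv <= ocv_points[0]:
        return soc_points[0]
    if ocv >= ocv_points[-1]:
        return soc_points[-1]
    hi = bisect_left(ocv_points, ocv)  # first index with ocv_points[hi] >= ocv
    lo = hi - 1
    frac = (ocv - ocv_points[lo]) / (ocv_points[hi] - ocv_points[lo])
    return soc_points[lo] + frac * (soc_points[hi] - soc_points[lo])


def soc_sensitivity(ocv, dv=0.005):
    """SoC change caused by a +/- dv voltage measurement error around ocv."""
    return soc_from_ocv(ocv + dv) - soc_from_ocv(ocv - dv)
```

On the nearly flat segment of this placeholder table (3.60 V to 3.62 V), the same voltage error produces a much larger SoC error than on the steep segments, which is the plateau problem described in the introduction.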

#### I-2 Coulomb-Counting

Coulomb counting (or Ah counting) is an SoC estimation technique that integrates the battery current, i.e., counts the coulombs. Hence, it can identify an SoC difference but requires knowledge of an initial SoC value, which can be obtained from an OCV-SoC map under well-known conditions. The current passed into or out of the battery is integrated with respect to time and converted to the SoC using the following expression:

$$\mathrm{SoC}(t) = \mathrm{SoC}(t_0) + \frac{1}{C_n}\int_{t_0}^{t} \eta\, i(\tau)\, d\tau \tag{1}$$

where $\mathrm{SoC}(t_0)$ is the initial SoC and $C_n$ is the present capacity of the cell. The charging/discharging efficiency is denoted as $\eta$. The current that charges/discharges the battery is $i$. However, the accuracy of the SoC estimation is compromised if low-resolution current sensors are used or if the capacity is not updated as the battery ages [Ng2009, Baronti2011]. Especially in situations where the SoC cannot be regularly corrected by OCV-based methods, the predicted SoC drifts significantly away from the true value and misleads other functions in the BMS. As a result, coulomb counting is commonly used in the laboratory environment, where the aforementioned uncertainties can be reasonably controlled.
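A discrete-time version of the coulomb-counting expression in Eq. (1) can be sketched as follows. This is a minimal sketch under stated assumptions: rectangular integration with a fixed sample period, SoC expressed as a fraction, and clamping to the range 0 to 1 (the clamping is an addition for robustness, not part of Eq. (1)). The function name `coulomb_count` is hypothetical.

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah, efficiency=1.0):
    """Integrate current samples into an SoC trajectory (fractions in 0..1).

    Sign convention: positive current charges the cell, negative discharges.
    """
    capacity_as = capacity_ah * 3600.0  # ampere-hours -> ampere-seconds
    soc = soc0
    trajectory = [soc]
    for i_a in currents_a:
        soc += efficiency * i_a * dt_s / capacity_as  # rectangular integration
        soc = min(1.0, max(0.0, soc))  # assumption: clamp to the physical range
        trajectory.append(soc)
    return trajectory
```

Because the method is open-loop, any constant sensor offset is integrated as well: for a 3 Ah cell, a 10 mA offset accumulates to roughly 0.33% SoC per hour, which illustrates the drift discussed above.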

#### I-3 Model-based Observer

To reduce the uncertainties of the open-loop SoC estimation methods mentioned previously, techniques with feedback mechanisms to correct for possible bias and real-world compromises (such as sensor resolution) have been extensively investigated. Modern nonlinear state estimators and observers are commonly adopted. In particular, Kalman filter (KF) based techniques [Plett2004c, Wang2016c, Dai2012, Sepasi2014, Xiong2017, Plett2005], recursive least squares (RLS) methods [He2012, Xia2018], and sliding-mode observers [Ning2016, Belhani2013, Liu2016] have been heavily researched, as they provide reasonable estimation accuracy and relatively robust performance. However, constructing such an observer requires precise system modeling [He2012] for the specific type of battery in the system and repetitive hand tuning to select a well-behaved covariance matrix. As the battery ages, a battery model derived from a 'fresher' cell's data becomes biased and may even be invalid. The capacity decreases while the impedance increases for aged cells, which can result in an offset from the true SoC and even divergence of the observer. In addition, the initial states of the observer, which are fed from external sources, significantly affect the performance of the estimator in terms of convergence and accuracy.

#### I-4 Data-driven methods

With advancements in computation and an abundance of real world data, machine learning or specifically neural network-based methods are providing researchers with the ability to achieve significant advancements in many fields [Kri2012, ciresan2012, Hinton2012, Ma2015, Wang2016]. SoC estimation applying neural network-based methods has also drawn attention [Chemali2017, Chemali2018, Du2014, Charkhgard2010].

Compared to the 2% average SoC error achieved by model-based observers [Plett2004c, Wang2016c, Dai2012, Sepasi2014, Xiong2017, Plett2005], a 4% RMS error on terminal voltage is achieved with a 2-layer neural network with 30 neurons in the hidden layer [Charkhgard2010]. However, should further error reduction be desired, neural networks need the help of an external filter or observer (such as the Kalman filter in [Du2014]). The authors of [Chemali2018] directly mapped the measurements of the cell (instantaneous and average terminal voltage, temperature, and average current) to the SoC estimate and achieved a mean absolute error below 1%. That work also showed that the number of layers and neurons had a minimal effect on the SoC estimation accuracy; a 2-layer network with 2 neurons per layer seems to be a good compromise between computational time and estimation accuracy. Nevertheless, the worst-case error over the entire test was as high as 7%.

In cases where estimation performance is limited by an OCV-SoC plateau, seen in Fig. 1, or when complete equilibrium of the battery is infeasible, additional information can be gathered by injecting an augmented current profile into a battery under load. This paper hypothesizes that passing current pulses through a battery and measuring the voltage response to these pulses can be used to retrieve information about a lithium-ion battery's SoC. Because these measured electrochemical responses do not have an obvious relationship with the SoC, a neural network can be used to learn the relationship and reconstruct the information.

This paper is organized as follows. In Section II, a previously constructed electrochemical model is used to prove the concept and provide insight into the correlation between pulse amplitude and accuracy improvement. The testing procedure and experimental results using NMC cells, captured by a real-time battery testing system, are shown in Section III. In Section IV, a neural network constructed using TensorFlow is used to test whether OCV measurements and current pulse information can effectively reconstruct the SoC information of the system. The hardware and software being developed for practical implementation of the pulse-derived SoC estimation are described in Section V. The results of the paper are summarized and proposed future work is discussed in Section VI.

## II Hypothesis and Proof-of-Concept Simulation

The numerical details for the implementation of the Li_xV3O8 electrochemical model are found in refs. [Brady2016, Brady2018]. For the current pulses done in this paper, the cathode thickness was assumed to be 500 , the porosity was assumed to be 0.45, with the volume fraction of active material (Li_xV3O8) being 0.48, and the volume fraction of conductive material being 0.07, and the crystal size of the active material was assumed to be 120 nm in the [001] direction.

Fig. 1 shows how the OCV of Li_xV3O8 varies as a function of SoC. It is observed that the relationship between the SoC and the OCV is non-linear and in particular, when this material goes through a phase-change (approximately ), the OCV is constant, i.e. . Because of the non-linearity and because of the OCV-SoC plateau, it is difficult to estimate the SoC from OCV measurements alone. Fig. 2 (left - black data) shows how the estimates produced from the OCV (black) deviate from the true SoC. The OCV derived estimates are precise and accurate in the range , but are imprecise and inaccurate in the range .

To gain more information about the battery and thereby obtain better estimates of the SoC, pulses were constructed by lithiating the Li_xV3O8 cathode at a current rate of C/18, allowing the system to rest for 2 hours, then passing a pulse current at various amplitudes and measuring the potential for 60 seconds at a sampling rate of 1 Hz. An example of the voltage measurements derived from one of these current pulses can be seen in the inset of Fig. 1. Fig. 2 (left - brown data) shows that the estimates derived from voltage measurements during a current pulse are more accurate and precise than the estimates from the OCV measurements, especially in the range .

Fig. 2 (right) shows the relationship between accuracy of the SoC estimates and the amplitude of the pulse current. A pulse current of 0 corresponds to the estimate obtained using the OCV data. It is observed that as the amplitude of the injected current increases, the accuracy of the estimation improves and this improvement is most pronounced at low amplitudes. Additionally, the improvement appears to plateau at . If an objective function seeks to minimize estimation error, while also minimizing the cost incurred by pulsing the system, the relationship observed in Fig. 2 (right) implies that there is an optimal current to apply for this particular system. It should be noted that there is a preference toward low amplitude pulses especially in EV applications because low amplitude pulses are easier to implement through the balancing hardware.
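The trade-off described above can be written as a small optimization problem. The formulation below is only an illustrative assumption and is not taken from the paper: $e(I)$ stands for the estimation error at pulse amplitude $I$ (the curve in Fig. 2, right), $c(I)$ is a hypothetical non-decreasing cost of applying the pulse, and $\lambda \ge 0$ is a hypothetical weight between the two.

```latex
I^{*} = \arg\min_{I \ge 0} \; \bigl[\, e(I) + \lambda\, c(I) \,\bigr]
```

Since $e(I)$ falls steeply at low amplitudes and then flattens while $c(I)$ keeps growing, the minimizer under these assumptions lies at a finite amplitude, beyond which a larger pulse costs more than the accuracy it buys.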

## III Test Cases Design

Injecting pulses can improve the robustness of SoC estimation, as shown in the previous section. To build the initial machine learning model and implement the concept in real battery systems, the pulse data sets used to train the neural network are first obtained in a laboratory environment. A systematic testing procedure is therefore proposed to simplify the process for future researchers. The test cases largely follow the standard hybrid pulse power characterization (HPPC) test specified by the DOE [Idaho2010], but are modified to accommodate the optimized pulse amplitude discussed in Section II as well as the accelerated aging tests.

### III-A Testbench

The lithium nickel manganese cobalt oxide (NMC) cells are selected initially to explore the possibility of implementing machine learning. Compared with other chemistries, such as lithium cobalt oxide (LCO), NMC provides higher boost current and longer life-span and therefore is commonly used in automotive and energy storage systems [BatteryU].

The testbench consists of (i) a real-time battery cycler with thermocouples (Neware BTS4000 series); (ii) a host PC that records data and uploads it to a database; and (iii) NMC high-energy cells (parameters shown in Table I). The complete system is shown in Fig. 3.

Table I: Parameters of the NMC cells under test.

| Parameter | Value |
|---|---|
| Cell chemistry | NMC |
| Nominal capacity | 3000 mAh |
| Cut-off voltage/current | 2.5 V / 150 mA |
| Maximum voltage | 4.2 V |
| Maximum charging/discharging current | 4 A / 15 A |

The real-time battery cycler is capable of testing the cells using constant current (CC), constant voltage (CV), CCCV, and dynamic current profiles (driving cycles) with a sampling resolution of 0.1 s. The high-resolution measurements, including voltage, current and temperature, are uploaded to the database through the communication line with the host PC for later data processing.

### III-B Capacity Check

Before discussing the pulse train structure, note that the capacity should be checked regularly in order to interpret the SoC correctly. One of the most obvious consequences of cell aging is capacity fade. In addition to the uncertainties of the estimator, an outdated capacity value further degrades the accuracy of SoC estimation, because the SoC and the present capacity are coupled through the SoC definition in Eq. (1). Note that in this paper the sign convention for the current is positive for charging and negative for discharging.

If an accurate SoC is desired, knowing the capacity in advance is essential for training the machine learning algorithm. Capacity is usually specified at a given current level: a higher current (either charging or discharging) results in a lower measured capacity due to the internal resistance of the battery [Wang2016b]. Therefore, to obtain approximately the true capacity available in the cell, a relatively small current (0.1 C-rate) is used to fully charge and discharge the cell. The smaller the current, the more safely the effect of the internal resistance can be neglected. Both charge and discharge capacity can be obtained by integrating the corresponding parts of the current profile, as shown in Fig. 4. Note that this capacity should be used for back-calculating the precise SoC breakpoints in the following section.
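The capacity check and the back-calculation of SoC breakpoints can be sketched as follows. This is a minimal sketch that assumes uniformly sampled current with the paper's sign convention (positive for charging); the function names `measure_capacity_ah` and `seconds_per_soc_step` are hypothetical.

```python
def measure_capacity_ah(currents_a, dt_s):
    """Return (charge_ah, discharge_ah) by integrating each current sign separately."""
    charge_as = sum(i for i in currents_a if i > 0) * dt_s
    discharge_as = -sum(i for i in currents_a if i < 0) * dt_s
    return charge_as / 3600.0, discharge_as / 3600.0


def seconds_per_soc_step(capacity_ah, current_a, soc_step=0.10):
    """Constant-current time needed to move the SoC by soc_step (a fraction)."""
    return soc_step * capacity_ah * 3600.0 / abs(current_a)
```

With a measured discharge capacity in hand, the second helper gives the discharge time between breakpoints; at a 1 C-rate referenced to the measured capacity, a 10% step takes 360 s regardless of the capacity value.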

### III-C Pulse Train

The pulse train needs to be carefully designed in order to maximize the information that can be extracted from a limited number of data points. As discussed previously, the properties of the pulse determine the accuracy of the SoC estimation: a higher current level contributes to a lower estimation error. However, the feasibility of the current amplitude in a real battery system needs to be investigated, so a trade-off must be made between accuracy and feasibility. Especially in real-life EV systems, the pulses should not interfere with driving or leave any obvious sign that the BMS is trying to reconstruct the SoC. An example of such a violation is unexpected acceleration. Instead, the pulse can be as 'stealthy' as current sharing between cells when the battery needs balancing, which can be achieved by the active balancing topology described in Section V. In this paper, the current amplitude is chosen to be 1 C-rate, since it is expected to reduce the error further while keeping the cells away from the maximum allowed current.

The pulses are injected at every 10% SoC. Finer resolution can be achieved at the expense of testing time. First, the battery cells are fully charged by CCCV, followed by a 1- to 2-hour rest to allow complete equilibrium inside the battery. The battery is then discharged at 1 C-rate to 90% SoC. Allowing a 1-hour relaxation before injecting pulses lets the charge-transfer and/or diffusion effects induced by the previous current excitation settle, so that the subsequent voltage response is excited purely by the current pulses. If the cell is not well rested, the charge history is coupled into the pulse response, which makes the results less accurate. A 1-min charge pulse and a 1-min discharge pulse, with a 1-min rest between them, are injected at 90% SoC. Then the cell is discharged to 80% SoC and the same sequence is repeated. This procedure repeats until the cut-off voltage is reached. Sample results for the testing sequence are shown in Fig. 5.
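The pulse-train sequence described above can be written down as a step list for a battery cycler. This is a sketch under stated assumptions, not the authors' test script: the `(label, c_rate, duration_s)` tuple layout and the name `build_pulse_train` are hypothetical, a 1-hour rest is used after the full charge, and in practice the cycler would also terminate any step when the cut-off voltage is reached.

```python
def build_pulse_train(soc_step_pct=10, lowest_soc_pct=10, pulse_c_rate=1.0,
                      pulse_s=60, rest_between_s=60, relax_s=3600):
    """Return a list of (label, c_rate, duration_s) steps; c_rate > 0 charges."""
    steps = [
        ("cccv_full_charge", None, None),  # ends on the current cut-off, not on time
        ("rest", 0.0, relax_s),
    ]
    discharge_s = soc_step_pct / 100.0 * 3600.0  # time to move one step at 1 C
    for soc in range(100 - soc_step_pct, lowest_soc_pct - 1, -soc_step_pct):
        steps += [
            (f"discharge_to_{soc}pct", -1.0, discharge_s),
            ("rest", 0.0, relax_s),  # relaxation before the pulses
            ("charge_pulse", +pulse_c_rate, pulse_s),
            ("rest", 0.0, rest_between_s),
            ("discharge_pulse", -pulse_c_rate, pulse_s),
        ]
    return steps
```

The charge and discharge pulses have equal amplitude and duration, so each pulse pair returns the cell to the breakpoint SoC before the next discharge step begins.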

Cells occasionally show strong individuality in terms of aging trends and responses to current due to variations in manufacturing processes. Three cells are therefore tested under the same conditions as a batch, and the resulting data are compared and averaged to minimize this individuality. Fig. 6 illustrates the voltage responses excited by the current pulses at SoC levels from 90% to 20%. The major difference among them is that the operating voltage level gradually decreases as the cells deplete. There are also more subtle differences that can hardly be distinguished by the naked eye but can be captured by the machine learning algorithm, for example a larger voltage drop at the instant the current is applied as the cells discharge. At each SoC breakpoint, the results from the three cells are superimposed on each other and show high consistency across all SoC test points.

### III-D Overall Test Procedure

As the battery ages, the performance of model-based SoC estimators degrades significantly, as they rely heavily on an accurate model, especially on the capacity as expressed in Eq. (1). Normally, a joint estimator or a separate slow-reacting estimator needs to be added for capacity estimation [Wang2015, Zou], which inevitably increases the complexity of the BMS.

Aging a battery is a time-consuming task. To accelerate the aging process, a pre-defined aging procedure is proposed here. The cells under test are fully discharged with a CC at 1 C-rate, then fully charged with a CC at 1 C-rate to the maximum voltage, followed by a CV phase until the current drops below 150 mA. The aging test is terminated when the capacity reaches 80% of its original value, which is normally called end-of-life (EOL) for EV applications.
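The termination rule of the aging test can be expressed as a simple check over the sequence of capacity measurements. This is a minimal sketch; the function name `first_eol_check` is hypothetical, and the first entry of the list is assumed to be the original capacity.

```python
def first_eol_check(capacities_ah, eol_fraction=0.8):
    """Return the index of the first capacity check at or below EOL, else None.

    capacities_ah[0] is taken as the original (beginning-of-life) capacity.
    """
    threshold = eol_fraction * capacities_ah[0]
    for idx, cap in enumerate(capacities_ah):
        if cap <= threshold:
            return idx
    return None
```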

Combining the capacity check, pulse train and accelerated aging test completes the testing procedure design. The entire test procedure is summarized in Fig. 7.

## IV Machine Learning Finding the Correlation

Feedforward neural networks (FNNs), shown in Fig. 8 in 2-layer and multi-layer architectures, can in principle model most non-linear systems by mapping inputs to a desired output.

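In this paper, the pulse-train generated from the previous tests is fed as an input to an FNN, and an estimated SoC is provided as the output of the network. The network is trained by computing the difference between this estimated value and the ground-truth (ideal) SoC value. Therefore, a typical input sequence contains pulse-train information paired with the corresponding ground-truth SoC value and can be defined by $D = \{(\Psi_1, \mathrm{SoC}_1^{*}), (\Psi_2, \mathrm{SoC}_2^{*}), \ldots, (\Psi_N, \mathrm{SoC}_N^{*})\}$, where $\mathrm{SoC}_i^{*}$ and $\Psi_i$ are the ideal state-of-charge value and the vector representing the pulse-train input, respectively.

FNNs can be summarized by a sequence of matrix multiplications and can be represented by the composite function below. Let $w_{j,k}^{l}$ denote the weight connection between neuron $j$ in layer $l-1$ and neuron $k$ in layer $l$. Let $b_k^{l}$ and $h_k^{l}$ be the bias and the activation, respectively, of neuron $k$ in layer $l$. The hidden layer activations can be computed as follows:

$$h_k^{l} = \sigma\Bigl(\sum_{j} w_{j,k}^{l}\, h_j^{l-1} + b_k^{l}\Bigr) \tag{2}$$

where,

$$h^{l} = \widehat{\mathrm{SoC}}_i \quad \text{for } l = L \tag{3}$$

Here $L$ denotes the output layer and $\widehat{\mathrm{SoC}}_i$ is the estimated state-of-charge for pulse-train $\Psi_i$. The nonlinearity $\sigma$ used in these networks is the Rectified Linear Unit (ReLU), chosen for its simplicity during the feedforward and backpropagation steps. It is given by:

$$\sigma(x) = \max(0, x) \tag{4}$$

The error signal measuring the similarity of the estimated SoC value to the ground-truth value is given by:

$$e_i = \mathrm{SoC}_i^{*} - \widehat{\mathrm{SoC}}_i \tag{5}$$

The mean absolute error (MAE) summarizes the performance of the FNN over the entire dataset and is defined by:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \lvert e_i \rvert \tag{6}$$

where $N$ is the length of the pulse-train sequence $D$. A forward pass begins when the pulses are fed into the network and is complete when the FNN provides an estimate of the SoC and the overall loss is computed. A full training epoch, $\epsilon$, includes one forward pass and one backward pass; the latter describes the process of tuning the network weights and biases based on the loss function $\mathcal{L}$. This update is defined by the following composite function:

$$\begin{aligned}
m_{\epsilon} &= \beta_1 m_{\epsilon-1} + (1-\beta_1)\,\nabla\mathcal{L}(w_{\epsilon-1}) \\
r_{\epsilon} &= \beta_2 r_{\epsilon-1} + (1-\beta_2)\,\nabla\mathcal{L}(w_{\epsilon-1})^{2} \\
\tilde{m}_{\epsilon} &= m_{\epsilon} / (1-\beta_1^{\epsilon}) \\
\tilde{r}_{\epsilon} &= r_{\epsilon} / (1-\beta_2^{\epsilon}) \\
w_{\epsilon} &= w_{\epsilon-1} - \alpha\,\frac{\tilde{m}_{\epsilon}}{\sqrt{\tilde{r}_{\epsilon}} + \kappa}
\end{aligned} \tag{7}$$

where $\beta_1$ and $\beta_2$ are decay rates set to 0.9 and 0.999, respectively, $\alpha$ is the learning rate and $\kappa$ is a small constant term.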
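The decay rates of 0.9 and 0.999 quoted above match the defaults of the Adam optimizer, so the weight update in Eq. (7) can be sketched as a standard Adam step. This is an illustrative sketch rather than the authors' training code: the function name `adam_step` is hypothetical, and the default values of `lr` and `kappa` are assumptions (the paper's learning rate is not stated here, and `1e-8` is simply a common choice for the constant term).

```python
import math


def adam_step(weights, grads, m, r, epoch,
              lr=1e-3, beta1=0.9, beta2=0.999, kappa=1e-8):
    """One Adam update over flat lists; returns (weights, m, r). epoch starts at 1."""
    new_w, new_m, new_r = [], [], []
    for w, g, m_i, r_i in zip(weights, grads, m, r):
        m_i = beta1 * m_i + (1.0 - beta1) * g      # first-moment estimate
        r_i = beta2 * r_i + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** epoch)       # bias-corrected first moment
        r_hat = r_i / (1.0 - beta2 ** epoch)       # bias-corrected second moment
        new_w.append(w - lr * m_hat / (math.sqrt(r_hat) + kappa))
        new_m.append(m_i)
        new_r.append(r_i)
    return new_w, new_m, new_r
```

A property worth noting is that the very first step has a magnitude close to `lr` regardless of the gradient scale, because the bias-corrected moments reduce to the gradient and its square.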

Training of the FNN is done offline and only when network converges to a lower loss threshold can the networks be applied online. During online operation, only a forward pass is required in order to estimate SoC. Backward passes are no longer required once the model is appropriately trained. FNNs offer an advantage of faster computing time, once trained, since a forward pass is comprised mainly of a sequence of matrix multiplications.
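Since online operation needs only the forward pass, the computation on the target processor reduces to a few multiply-accumulate loops. The sketch below illustrates this in plain Python under stated assumptions: the names `forward` and `mean_absolute_error` are hypothetical, weights are stored as `W[k][j]` (from neuron `j` of the previous layer to neuron `k`), hidden layers use ReLU, and the output layer is assumed to be linear with a single neuron.

```python
def relu(x):
    """Rectified linear unit."""
    return x if x > 0.0 else 0.0


def forward(pulse, layers):
    """Forward pass; layers is a list of (W, b) with W[k][j] and b[k]."""
    h = list(pulse)
    last = len(layers) - 1
    for idx, (W, b) in enumerate(layers):
        z = [sum(w_kj * h_j for w_kj, h_j in zip(row, h)) + b_k
             for row, b_k in zip(W, b)]
        # Assumption: linear output layer, ReLU on all hidden layers.
        h = z if idx == last else [relu(v) for v in z]
    return h[0]


def mean_absolute_error(estimates, targets):
    """MAE between estimated and ground-truth SoC values."""
    return sum(abs(t - e) for e, t in zip(estimates, targets)) / len(targets)
```

For a network with one hidden layer of 100 neurons and a 60-sample pulse input, this amounts to roughly 6,100 multiply-accumulate operations per estimate, which is consistent with the claim that a standard microprocessor is sufficient.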

In this paper, TensorFlow [tf2015], a machine learning framework, is used with an NVIDIA TITAN Xp graphics processing unit (GPU). The TensorFlow and Keras frameworks provide the ability to prototype neural networks quickly and iterate on various architectures and loss functions. These frameworks also offer automatic gradient computation, thereby allowing for a seamless backward computation without any manual intervention.

Fig. 9 and Fig. 10: Estimation accuracy of a feed-forward neural network reconstructing SoC from a finite-time test sequence with 100 hidden nodes. (a) SoC percent error recorded as a function of ground-truth SoC values. (b) Example of the training process for FNNs; MAE and RMSE over the training and validation datasets are recorded as a function of training epochs.

An example of the training and validation process is shown in Fig. 9 and Fig. 10, where the estimation error is shown as a function of the true (ground-truth) SoC values. The mean error across the entire SoC range is well below 2%, which is competitive with the aforementioned model-based observers. The model is trained for 10,000 epochs; training and validation MAE are shown. In this work, training typically took 1 to 10 hours depending on the number of epochs chosen.

## V Application in Electric Vehicles

Although training is done on a GPU to capitalize on its parallel computing capability, a standard microprocessor can be used when applying the FNNs in real-world situations since, as mentioned above, the feedforward step consists of a series of matrix multiplications.

This section summarizes the experimental testbench used to implement the proposed machine learning algorithm in real battery systems. The diagram of the testbench is shown in Fig. 11. Note that the DC/DC converter illustrated in Fig. 11 has already been built. The DC/DC converter is a battery balancing circuit equipped with sensors and the necessary computing unit to perform basic BMS functions [Wang2018b, Wang2018c]. In addition, a microcontroller has been integrated with the converter to handle a reasonable computing load. The entire system consists of (i) the peripheral hardware (the DC/DC converter); (ii) central and localized controllers that actuate the pulse injection, the necessary BMS functions and circuit operation; and (iii) powerful computing units that analyze the uploaded data and generate the machine learning model for use by the microcontroller.

### V-A Pulse Injection Module

As the key novelty of the proposed concept, the pulses should be injected into the cells at the right time, with amplitude and duration as accurate as in the laboratory environment. The higher-level controller, which acts as the 'brain' of the BMS, first sends a command to the local microcontroller. This command describes the reference currents of the battery cells, specifying the amplitude and duration of the expected pulses. Once the command is received, the microcontroller generates the corresponding PWM signals and passes them to the DC/DC converter to drive the pulses into the battery cells. Additionally, the properties of the pulses (amplitude, period, etc.) can be adjusted arbitrarily by properly controlling the converter's behavior.

### V-B Measurements Update Module

The microcontroller is equipped with high-resolution analog-to-digital converters that translate the analog signals (such as voltage/current measurements) to digital values so that the computing unit can process them. The essential measurements (cell voltages, currents and temperatures) that are necessary inputs for the machine learning algorithm are captured and updated at a pre-defined sampling rate. At this sampling rate, the measurements are continuously uploaded to the computing unit via a communication protocol for further calculations of the machine learning model.

### V-C Machine Learning Model Update

During a machine learning model update, the aforementioned measurements captured by the microcontroller are transmitted to the computation center through either a wired or a wireless link. The machine learning model is iteratively trained and updated using the accumulated cell information. At a pre-defined rate, the updated machine learning model is sent back to the local controllers to correct for aging effects and temperature changes.

### V-D SoC Estimation

As explained previously, the current pulses are injected into the cells through the pulse injection module. The corresponding responses of the battery cells are recorded by the measurement update module and used as inputs for SoC estimation with the machine learning model. The estimation itself reduces to a sequence of matrix multiplications, which keeps the computation simple.

Two approaches using pulse injection to augment SoC estimation have been considered as candidates to update the SoC estimate in real time: 1) the machine learning model operates continuously to obtain the SoC values in real time; 2) the machine learning model operates only at certain moments, for example when the vehicle stops at a red light. Between the times when the SoC estimate is updated by machine learning, other SoC estimation techniques (such as coulomb counting and the extended Kalman filter, EKF) can be applied, which suits cost- and computation-constrained applications.

Fig. 12 shows how the latter method is employed in an actual battery system with the UDDS driving cycle. The red line represents the actual SoC. The blue line illustrates the SoC estimate of the latter approach, where the SoC is reset at every point when the electric vehicle stops. After the vehicle restarts, another SoC estimation technique (such as coulomb counting or EKF) resumes. As a result, the error accumulated in the previous driving period is eliminated.
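The second approach, in which coulomb counting runs while driving and the estimate is reset at each stop, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `hybrid_soc` is hypothetical, and `estimate_at_stop` is a placeholder hook standing in for the pulse injection and neural-network inference that would run on the real system.

```python
def hybrid_soc(soc0, samples, capacity_ah, dt_s, estimate_at_stop):
    """Track SoC with coulomb counting, resetting it whenever the vehicle stops.

    samples: iterable of (current_a, vehicle_stopped) pairs; current > 0 charges.
    estimate_at_stop: callable returning a pulse-based SoC estimate (hypothetical).
    """
    capacity_as = capacity_ah * 3600.0
    soc = soc0
    history = []
    for current_a, stopped in samples:
        if stopped:
            soc = estimate_at_stop()  # reset discards the accumulated drift
        else:
            soc += current_a * dt_s / capacity_as  # open-loop coulomb counting
        history.append(soc)
    return history
```

In this arrangement, the error of the open-loop estimator can only grow for the duration of one driving segment, so its bound depends on how often the vehicle stops rather than on the total trip length.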

Please note that the error presented between the two SoC curves, in Fig. 12, is exaggerated for greater clarity. In practice, the difference between them will be heavily dependent on the SoC estimation strategy adopted in the system and can be reduced significantly by correcting the estimated SoC more regularly.

## VI Conclusions and Future Work

This paper introduces a new strategy to augment the performance of SoC estimation powered by neural networks. A high-fidelity electrochemical battery model is used to validate the concept of pulse injection and demonstrate that higher current amplitudes contribute to more accurate SoC estimation. As the first batch of cell data is usually acquired in a laboratory environment, the testing procedure tailored for the pulse-injection augmentation is discussed and detailed steps are given. The method to construct an FNN for mapping the pulse measurements to a ground-truth SoC is provided. By applying the FNN to the data, the SoC can be reconstructed within an error bound of 2%. Because the trained FNN reduces to simple matrix multiplications, a standard microprocessor is capable of running it, which makes real-time computing feasible. Lastly, the experimental validation platform has been described. By using a previously developed BMS-ready balancing circuit, the pulse injection can be integrated into balancing current demands without interfering with driving behavior.

At the time of writing, the experimental setup has been completed and experimental tests are under way on a scaled battery pack. The test bench, which is representative of real-world conditions in transportation electrification, and the SoC estimation results will be published. Furthermore, the authors intend to apply the proposed technique to monitor battery aging. In fact, the battery response to a time series is expected to encode a range of information on the internal behavior of the battery. Future research will study finite-time sequences that can reveal general aging information, i.e., the State of Health (SoH), as well as specific aging effects such as active material dissolution, surface layer formation, and atomic structure rearrangement.

## VII Acknowledgement

This research was undertaken, in part, through funding from the Columbia University Data Science Institute (DSI) Seed Fund Program. It was facilitated by NVIDIA Corporation with the donation of a Titan Xp GPU. We acknowledge computing resources from Columbia University’s Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010. We would also like to thank Robert C. Mohr for his contributions in the experimental setup.

## References
