Data Set Description: Identifying the Physics Behind an Electric Motor – Data-Driven Learning of the Electrical Behavior (Part I)

by Sören Hanke, et al.

Two of the most important aspects of electric vehicles are their efficiency and achievable range. In order to achieve high efficiency and thus a long range, it is essential to avoid over-dimensioning the drive train. Therefore, the drive train has to be kept as lightweight as possible while at the same time being utilized to the best possible extent. This can only be achieved if the dynamic behavior of the drive train is accurately known by the controller. The task of the controller is to achieve a desired torque at the wheels of the car by controlling the currents of the electric motor. With machine learning modeling techniques, accurate models describing the behavior can be extracted from measurement data and then used by the controller. For the comparison of the different modeling approaches, a data set consisting of about 40 million data points was recorded at a test bench for electric drive trains. The data set is published on Kaggle, an online community of data scientists.



I Preliminary remark

The description of the data set consists of two parts. Part I (this document) gives a simplified introduction to the system behind the data and explains how to use the data set [2].

Part II explains the system in more detail, covers some basic approaches to extracting models, and also discusses a possible way to obtain a balanced data set in which the samples are evenly distributed in a subset used for (deep) machine learning (ML) methods [3].

II Introduction

The drive train of an electric vehicle consists of a battery, an inverter, an electric motor and a controller (Fig. 1).

For this data set, the battery is assumed to be an ideal power supply and, hence, the focus is on the interaction between controller, inverter and motor. The inverter is a power electronic device with switchable semiconductors which converts the electric energy provided by the battery from a two-phase DC voltage to a three-phase AC voltage with varying amplitude and frequency. This is required in order to operate the motor at different rotational speeds while generating a certain torque. The motor itself converts electrical to mechanical energy and vice versa. However, in the end the motor is a passive system without any actuators, which is why the controller acts only on the inverter switching states.

For high performance (i.e., accurate, fast and efficient torque control), the controller needs a model that predicts the behavior of the electromechanical drive sufficiently well in all operating points. This means that operating-point-dependent effects, such as nonlinearities of the inverter or the magnetic saturation of the motor, must be covered.

To summarize, the objective behind this data set is to predict the electrical behavior of both inverter and motor by one single framework.

The following two sections first describe the basic operating principle (III) and then how the data set can be used to obtain models from it including details on choosing the inputs and outputs (IV).

Fig. 1: Simplified structure of the drive train in an electric vehicle, [4]

III Operating principle

Fig. 2 shows the components and their basic electrical models in more detail. The battery can be seen as a voltage source having the voltage $u_{DC}$, and the inverter consists of three switches ($s_a$, $s_b$, $s_c$), each having two possible switching states (+1, -1). Thus a total of eight different combinations of the switching states, yielding different voltages at the inverter terminals (a, b, c), are possible (Tab. I). These states are also called elementary vectors $v_n$, with $n$ denoting the index of a vector.

If the switching states are modulated / operated in a suitable sequence at a considerably high switching frequency, the inverter outputs a three-phase AC voltage with a certain average amplitude and average frequency (fundamental component of the voltage).

Due to the structure of the inverter, the vectors $v_1$ and $v_8$ (all three switches at -1 or all at +1, see Tab. I) result in the same voltages at the terminals. For this reason, it is sufficient to use only one of these two vectors in the provided data set.

The motor is modeled in the so-called dq-coordinate system, which is a typical state transformation in the motor control domain used to simplify the overall control design. If the momentary rotation angle of the motor is known, the inverter voltages can be transformed to the dq-system and fed into the motor model.
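The mapping from an elementary vector to the dq-voltages can be illustrated with a minimal sketch. It assumes ideal switches, terminal voltages of plus or minus half the battery voltage with respect to the DC midpoint, an amplitude-invariant Clarke transformation, and that the angle passed in is the electrical rotor angle in radians. The function name `dq_voltage` and the argument `u_dc` are hypothetical and not part of the data set.

```python
import math

# Switching states (s_a, s_b, s_c) of the eight elementary vectors, as in Tab. I.
SWITCHING_STATES = {
    1: (-1, -1, -1),
    2: (+1, -1, -1),
    3: (+1, +1, -1),
    4: (-1, +1, -1),
    5: (-1, +1, +1),
    6: (-1, -1, +1),
    7: (+1, -1, +1),
    8: (+1, +1, +1),
}


def dq_voltage(n, epsilon, u_dc):
    """Return (u_d, u_q) for elementary vector n at electrical angle epsilon (rad)."""
    s_a, s_b, s_c = SWITCHING_STATES[n]
    # Terminal voltages w.r.t. the DC midpoint (assumption: ideal switches).
    u_a, u_b, u_c = (s * u_dc / 2 for s in (s_a, s_b, s_c))
    # Amplitude-invariant Clarke transformation; the common-mode part cancels.
    u_alpha = (2 * u_a - u_b - u_c) / 3
    u_beta = (u_b - u_c) / math.sqrt(3)
    # Park transformation: rotate the stator-fixed vector into the rotor frame.
    u_d = math.cos(epsilon) * u_alpha + math.sin(epsilon) * u_beta
    u_q = -math.sin(epsilon) * u_alpha + math.cos(epsilon) * u_beta
    return u_d, u_q
```

Under these assumptions, the two vectors with all switches in the same position both map to zero voltage in the dq-system, which is why they are interchangeable from the motor's point of view.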

Fig. 2: Basic electrical modeling of the drive train

$n$   $s_a$   $s_b$   $s_c$
1     -1      -1      -1
2     +1      -1      -1
3     +1      +1      -1
4     -1      +1      -1
5     -1      +1      +1
6     -1      -1      +1
7     +1      -1      +1
8     +1      +1      +1
TABLE I: Inverter switching states

Fig. 3 and Fig. 4 show how the controller can regulate the motor currents by selecting appropriate elementary vectors $v_n$. The motor currents are directly related to the torque but easier to measure and to predict. Hence, the basic motor control level addresses the electrical currents flowing through the stator windings. As the controller used is based on the model predictive control (MPC) principle, the following steps are carried out recurrently in every controller cycle (explained for the cycle between the time points $t_k$ and $t_{k+1}$):

  1. Measurement of $i_d$, $i_q$, $\varepsilon$ at time point $t_k$

  2. Prediction of the currents at $t_{k+1}$ for the different elementary vectors $v_n$ that can be applied: $\hat{i}_{d,n}$, $\hat{i}_{q,n}$

  3. Selection of the elementary vector that brings both currents as close as possible to the set points

  4. Application of the selected elementary vector for one cycle (here between $t_k$ and $t_{k+1}$)

  5. Repetition of steps 1-4 for the next cycle beginning at $t_{k+1}$

For the prediction in step 2, the controller uses the models that describe the system behavior for the different elementary vectors. How these models can be extracted from the data set is discussed in section IV.

The selection of the appropriate vector in step 3 is based on the evaluation of the cost function (1). For each elementary vector, the deviation between the predicted currents and the set points is evaluated. Fig. 3 shows this deviation for the predicted q-current that would result from choosing one particular vector. The vector that yields the lowest cost is selected by the MPC.
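Steps 2 and 3 of the controller cycle can be sketched as follows. Cost function (1) is not reproduced in this text, so the sum of the squared deviations in d- and q-direction is assumed here as one plausible choice. The names `select_vector`, `predict` and `toy_predict`, the candidate indices 1 to 7 and the current steps in the toy model are hypothetical placeholders, not the authors' implementation.

```python
def select_vector(predict, i_d_k, i_q_k, epsilon_k, i_d_ref, i_q_ref,
                  candidates=range(1, 8)):
    """Return (index, cost) of the vector whose prediction is closest to the set points."""
    best_n, best_cost = None, float("inf")
    for n in candidates:
        # Step 2: predict the currents at the end of the cycle for vector n.
        i_d_hat, i_q_hat = predict(i_d_k, i_q_k, epsilon_k, n)
        # Assumed cost: squared deviation from the set points in d and q.
        cost = (i_d_ref - i_d_hat) ** 2 + (i_q_ref - i_q_hat) ** 2
        # Step 3: keep the vector with the lowest cost so far.
        if cost < best_cost:
            best_n, best_cost = n, cost
    return best_n, best_cost


def toy_predict(i_d, i_q, epsilon, n):
    """Hypothetical stand-in model: each vector adds a fixed current step."""
    steps = {
        1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.5, 1.0), 4: (-0.5, 1.0),
        5: (-1.0, 0.0), 6: (-0.5, -1.0), 7: (0.5, -1.0),
    }
    d_step, q_step = steps[n]
    return i_d + d_step, i_q + q_step
```

In a real controller, `toy_predict` would be replaced by a model extracted from the data set, and the loop would run once per controller cycle.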

Fig. 3: Predictions of the motor currents in d- and q-direction for the possible elementary vectors

However, as shown in Fig. 4, there will be a prediction error due to deviations of the models from the real plant behavior. Increasing the accuracy of the models reduces the prediction error and allows the MPC to determine the most appropriate vector.

Fig. 4: Curve shape of the motor currents with highlighted measurements and predictions

IV How to extract models from the data set

To obtain models that describe the behavior of the plant (inverter + motor) with high accuracy, a data-driven approach based on measurements can be used. The data set provides about 40 million samples that can be used to train, for example, artificial neural networks or other machine learning models.

A sample in the data set (each row) consists of the measured dq-currents at two consecutive time points (e.g. $t_k$ and $t_{k+1}$), the angle at the earlier of the two time points, and the information about the elementary vector selected in the controller cycle between them ($n_k$) as well as the vector selected in the cycle before ($n_{k-1}$). An overview of the included variables is given in Tab. II.

The rotational speed of the motor was constant for all samples and hence this variable is not part of the data set. In the future, an extended data set for varying rotational motor speeds may be added, and the rotational speed would then become part of the input space. Similarly, the motor temperature was nearly constant during all measurements and therefore does not need to be considered in the given data set.

Note that successive rows or samples in the set do not constitute a time series.

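Variable        Description                                 Data type   Classification
$i_{d,k}$       measured d-current at $t_k$                 single      inputs
$i_{q,k}$       measured q-current at $t_k$                 single      inputs
$\varepsilon_k$ measured rotational angle at $t_k$          single      inputs
$n_k$           element. vector applied at $t_k$            integer     inputs
$n_{k-1}$       element. vector applied at $t_{k-1}$        integer     inputs
$i_{d,k+1}$     measured d-current at $t_{k+1}$             single      targets
$i_{q,k+1}$     measured q-current at $t_{k+1}$             single      targets
TABLE II: Variables contained in the data set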

As a result of the measurements at $t_k$ and $t_{k+1}$, the real behavior of the currents for a given vector is known. This knowledge can now be used to derive models. Fig. 5, Fig. 6 and Fig. 7 show three variants of how to define the inputs and targets for machine learning modeling.

Fig. 5: Variant A: Choosing the inputs and targets for a modeling approach using a single model

Variant A uses a single model that aims to cover the behavior of all elementary vectors. In addition to the known values at time $t_k$, the index of the vector to be analyzed is input to the model. The information about which vector was used in the interval before ($n_{k-1}$) can also be an input. This might be helpful for capturing more detailed effects such as the inverter dead time or the interlocking time, as they appear when switching between elementary vectors. Targets are the currents at the end of the controller cycle.
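As an illustration of variant A, the input vector of the single model could be assembled as in the sketch below. The one-hot encoding of the vector indices and the sine/cosine encoding of the angle are assumptions made for this example, as are the function name `encode_sample_variant_a` and the default of seven distinct vectors; the source does not prescribe any particular encoding.

```python
import math


def encode_sample_variant_a(i_d_k, i_q_k, epsilon_k, n_k, n_prev, num_vectors=7):
    """Build one input row: currents, encoded angle, and both vector indices."""

    def one_hot(n):
        # Vector indices are assumed to start at 1.
        return [1.0 if n == j else 0.0 for j in range(1, num_vectors + 1)]

    # sin/cos avoids the jump of the angle at the wrap-around (assumption).
    angle_features = [math.sin(epsilon_k), math.cos(epsilon_k)]
    return [i_d_k, i_q_k] + angle_features + one_hot(n_k) + one_hot(n_prev)
```

The targets for such a row would be the two currents measured at the end of the controller cycle.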

Alternatively, it is possible to extract multiple models, each covering the specific behavior of one of the used vectors (variant B). The index of the vector to be analyzed ($n_k$) is then used by the controller to switch between the models.

Fig. 6: Variant B: Choosing the inputs and targets for a modeling approach using multiple models

To account for the vector which was used in the interval before ($n_{k-1}$), it is also possible to extract one model per combination of former and next vector. In this variant C, each model covers the behavior of a particular transition between the former vector and the next vector as well as the behavior of the currents with the next vector applied during the next cycle.
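The partitioning of the samples behind variants B and C can be sketched as follows: each sample is assigned to the training set of exactly one model, selected either by the applied vector alone (variant B) or by the pair of former and applied vector (variant C). The dictionary keys such as `"n_prev"` and `"i_d_k1"` and the function name `group_samples` are hypothetical and do not necessarily match the column names of the published files.

```python
from collections import defaultdict


def group_samples(samples, use_previous_vector=False):
    """Split samples into per-model training sets.

    use_previous_vector=False: one group per applied vector (variant B).
    use_previous_vector=True: one group per (former, applied) pair (variant C).
    """
    groups = defaultdict(list)
    for s in samples:
        key = (s["n_prev"], s["n_k"]) if use_previous_vector else s["n_k"]
        inputs = (s["i_d_k"], s["i_q_k"], s["epsilon_k"])
        targets = (s["i_d_k1"], s["i_q_k1"])
        groups[key].append((inputs, targets))
    return dict(groups)
```

Each group can then be used to train its own model, and the controller selects the model by the same key at run time.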

Fig. 7: Variant C: Choosing the inputs and targets for a modeling approach using multiple models, with each of them considering the behavior of a particular transition between the former and the next switching vector.

The accuracy of a model can be evaluated with an RMSE-based cost function.

Considering $e_{d,m}$ and $e_{q,m}$ as elements of an error vector for each sample $m$, the cost function represents the root mean square of this error vector (RMSE) over the $M$ samples used for the training of the respective model. The error vector elements are the deviations between the targeted outputs ($i_{d,k+1}$, $i_{q,k+1}$) and the predictions of the model ($\hat{i}_{d,k+1}$, $\hat{i}_{q,k+1}$) for the d- and the q-current.
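A minimal sketch of such an RMSE evaluation is given below. Since the equation itself is not reproduced in this text, the normalization is an assumption: the squared d- and q-errors of each sample are summed and the total is divided by the number of samples before taking the square root. The function name `rmse_cost` is hypothetical.

```python
import math


def rmse_cost(targets, predictions):
    """RMSE over samples of (i_d, i_q) targets and predictions.

    Assumed normalization: sqrt( (1/M) * sum_m (e_d_m^2 + e_q_m^2) ).
    """
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    squared_sum = 0.0
    for (i_d, i_q), (i_d_hat, i_q_hat) in zip(targets, predictions):
        e_d = i_d - i_d_hat  # error in d-direction
        e_q = i_q - i_q_hat  # error in q-direction
        squared_sum += e_d ** 2 + e_q ** 2
    return math.sqrt(squared_sum / len(targets))
```

If a different normalization is preferred, for example dividing by twice the number of samples, only the last line changes; the ranking of models on the same data is unaffected.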

Please note:
For the sake of simplicity, the battery voltage $u_{DC}$ is considered as ideally constant in this contribution. Moreover, the rotational speed and the motor temperature are kept constant, too. It is planned to extend the data set to variations of these three variables in the future. However, the presented data-driven modeling ideas can be directly extended to consider these varying operating conditions by extending the input space with these additional features.

Devices under test:
The drive system under test consists of an interior permanent magnet synchronous motor (IPMSM) and a 2-level IGBT inverter. The most important test bench parameters are summarized in Tab. III. Fig. 8 shows the test bench with the transient recorder in the front and the motor used in the background.

A more detailed list of the available equipment can be found in Part II of the description.

IPMSM                                     Brusa HSM16.17.12-C01
Stator resistance                         18 mΩ
Inductance in d-direction
Inductance in q-direction
Permanent magnet flux                     66 mV s
Pole pair number                          3
Inverter                                  3 × SKiiP 1242GB120-4D
Topology                                  voltage source inverter, 2-level, IGBT
Controller hardware                       dSPACE
Processor board                           DS1006MC, 4 cores, 2.8 GHz
Measurement devices
Transient recorder                        Yokogawa DL850
Power analyzer                            Yokogawa WT3000
Current probes (zero-flux transducers)    3 × Yokogawa, 500 A, 2 MHz
Torque sensor                             HBM, T10FS, 2 kN m
TABLE III: Test bench parameters

Fig. 8: Test bench with the used PMSM in the background

Link to the uploaded data set:
The data set is published on Kaggle, an online community of data scientists: