I Preliminary remark
The description of the data set consists of two parts. Part I (this document) gives a simplified introduction to the system behind the data and explains how to use the data set (https://arxiv.org/abs/2003.07273) [2].
Part II explains the system in more detail, covers some basic approaches to extracting models, and also discusses a possible way to obtain a balanced data set in which the samples are evenly distributed, for use with (deep) machine learning (ML) methods (https://arxiv.org/abs/2003.06268) [3].
II Introduction
The drive train of an electric vehicle consists of a battery, an inverter, an electric motor and a controller (Fig. 1).
For this data set, the battery is assumed to be an ideal power supply and, hence, the focus is on the interaction between controller, inverter and motor. The inverter is a power electronic device with switchable semiconductors which converts the electric energy provided by the battery from a DC voltage to a three-phase AC voltage with varying amplitude and frequency. This is required in order to operate the motor at different rotational speeds while generating a certain torque. The motor itself converts electrical to mechanical energy and vice versa. Ultimately, however, the motor is a passive system without any actuators, which is why the controller acts only on the inverter switching states.
For high performance (i.e. accurate, fast and efficient torque control), the controller needs a model that predicts the behavior of the electromechanical drive sufficiently well at all operating points. This means that operating-point-dependent effects, such as nonlinearities of the inverter or the magnetic saturation of the motor, must be covered.
To summarize, the objective behind this data set is to predict the electrical behavior of both inverter and motor by one single framework.
III Operating principle
Fig. 2 shows the components and their basic electrical models in more detail. The battery can be seen as a voltage source with the voltage $u_\mathrm{DC}$, and the inverter consists of three switches ($s_a$, $s_b$, $s_c$), each having two possible switching states (+1, -1). Thus, a total of eight different combinations of the switching states are possible, yielding different voltages at the inverter terminals (a, b, c) (Tab. I). These states are also called elementary vectors $\boldsymbol{v}_n$, with $n$ denoting the index of a vector. If the switching states are modulated in a suitable sequence at a considerably high switching frequency, the inverter outputs a three-phase AC voltage with a certain average amplitude and average frequency (fundamental component of the voltage).
Due to the structure of the inverter, the vectors $\boldsymbol{v}_1$ and $\boldsymbol{v}_8$ result in the same voltages at the terminals. For this reason, it is sufficient to use only vector $\boldsymbol{v}_1$ in the provided data set.
The motor is modeled in the so-called dq coordinate system, a typical state transformation in the motor control domain that simplifies the overall control design. If the momentary rotation angle of the motor is known, the inverter voltages can be transformed to the dq system and fed into the motor model.
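The transformation into the dq system can be sketched in a few lines. This is a minimal example that assumes the common amplitude-invariant Clarke/Park convention and an electrical rotor angle `epsilon` in radians; the function name `abc_to_dq` and the chosen convention are assumptions and are not taken from the data set description.

```python
import math

def abc_to_dq(u_a, u_b, u_c, epsilon):
    """Transform three phase quantities to the dq system.

    Assumes the amplitude-invariant convention and an electrical
    rotor angle `epsilon` in radians (both are assumptions).
    """
    # Clarke transformation: three phases -> stationary alpha/beta frame
    u_alpha = (2.0 / 3.0) * (u_a - 0.5 * u_b - 0.5 * u_c)
    u_beta = (2.0 / 3.0) * (math.sqrt(3.0) / 2.0) * (u_b - u_c)
    # Park transformation: rotate by -epsilon into the rotor-fixed frame
    u_d = math.cos(epsilon) * u_alpha + math.sin(epsilon) * u_beta
    u_q = -math.sin(epsilon) * u_alpha + math.cos(epsilon) * u_beta
    return u_d, u_q
```

With this convention, a balanced three-phase quantity whose phase angle equals `epsilon` maps to a constant d-component and a vanishing q-component, which is what makes the dq system convenient for control design.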
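| $n$ | $s_a$ | $s_b$ | $s_c$ |
|---|---|---|---|
| 1 | -1 | -1 | -1 |
| 2 | +1 | -1 | -1 |
| 3 | +1 | +1 | -1 |
| 4 | -1 | +1 | -1 |
| 5 | -1 | +1 | +1 |
| 6 | -1 | -1 | +1 |
| 7 | +1 | -1 | +1 |
| 8 | +1 | +1 | +1 |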
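How the switching states of Tab. I map to voltages can be illustrated with a short sketch. It assumes an ideal two-level inverter feeding a symmetric, star-connected load, with each terminal at plus or minus half the DC-link voltage relative to the DC-link midpoint. The names `ELEMENTARY_VECTORS` and `phase_voltages` and the value of 300 V are hypothetical.

```python
# Switching states (s_a, s_b, s_c) of the eight elementary vectors (Tab. I)
ELEMENTARY_VECTORS = {
    1: (-1, -1, -1),
    2: (+1, -1, -1),
    3: (+1, +1, -1),
    4: (-1, +1, -1),
    5: (-1, +1, +1),
    6: (-1, -1, +1),
    7: (+1, -1, +1),
    8: (+1, +1, +1),
}

def phase_voltages(n, u_dc):
    """Phase voltages (a, b, c) for elementary vector n.

    Assumes an ideal inverter and a symmetric star-connected load,
    so the common-mode part of the terminal potentials drops out.
    """
    s = ELEMENTARY_VECTORS[n]
    # Terminal potentials relative to the DC-link midpoint
    terminal = [0.5 * u_dc * s_x for s_x in s]
    # Remove the common-mode component (potential of the star point)
    common = sum(terminal) / 3.0
    return tuple(t - common for t in terminal)

u_dc = 300.0  # hypothetical DC-link voltage in volts
for n in sorted(ELEMENTARY_VECTORS):
    print(n, phase_voltages(n, u_dc))
```

Under these assumptions, vectors 1 and 8 both yield zero phase voltages, which illustrates why only one of the two needs to be covered.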
Fig. 3 and Fig. 4 show how the controller can regulate the motor currents by selecting appropriate elementary vectors $\boldsymbol{v}_n$. The motor currents are directly related to the torque but are easier to measure and to predict. Hence, the basic motor control level addresses the electrical currents flowing through the stator windings. As the controller used is based on the model predictive control (MPC) principle, the following steps are carried out recurrently in every controller cycle (explained for the cycle between the time points $k$ and $k+1$):

1. Measurement of $i_d$, $i_q$, $\varepsilon$ at time point $k$

2. Prediction of the currents at $k+1$ for the different elementary vectors that can be applied: $\hat{i}_{d,k+1}$, $\hat{i}_{q,k+1}$

3. Selection of the elementary vector that brings both currents as close as possible to the set points

4. Application of the selected elementary vector for one cycle (here between $k$ and $k+1$)

5. Repetition of steps 1 to 4 for the next cycle beginning at $k+1$
For the prediction in step 2, the controller uses the models that describe the system behavior for the different elementary vectors. How these models can be extracted from the data set is discussed in section IV.
The selection of the appropriate vector in step 3 is based on the evaluation of the cost function (1). For each elementary vector, the deviation between prediction and set points is evaluated. Fig. 3 shows the deviation for the predicted q-current that would result when choosing one particular vector. The vector that yields the lowest cost is selected by the MPC.
(1) 
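One possible form of such a cost function, assuming squared deviations with equal weighting of both currents, is given below. The notation $i_d^*$, $i_q^*$ for the set points and $\hat{i}_{d,k+1}$, $\hat{i}_{q,k+1}$ for the predictions is an assumption, and other norms or weightings are conceivable.

```latex
J(\boldsymbol{v}_n) =
  \left(i_d^* - \hat{i}_{d,k+1}(\boldsymbol{v}_n)\right)^2
  + \left(i_q^* - \hat{i}_{q,k+1}(\boldsymbol{v}_n)\right)^2
```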
However, as shown in Fig. 4, there will be a prediction error due to deviations of the models from the real plant behavior. Increasing the accuracy of the models reduces this prediction error and allows the MPC to determine the most appropriate vector.
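Steps 2 and 3 of the control cycle can be sketched as follows. This is a simplified illustration, not the controller used for the measurements: `predict` stands for any model that maps the measured state and a vector index to predicted currents, and the squared-error cost, the stand-in model `toy_predict` and all numeric values are assumptions made for the example.

```python
def select_vector(i_d, i_q, epsilon, set_points, predict, candidates=range(1, 8)):
    """Return the index of the elementary vector with the lowest cost.

    `predict(i_d, i_q, epsilon, n)` must return the predicted currents
    (i_d, i_q) at the next time point for vector index n.
    By default the vectors 1 to 7 are evaluated.
    """
    i_d_ref, i_q_ref = set_points
    best_n, best_cost = None, float("inf")
    for n in candidates:
        # Step 2: predict the currents for this candidate vector
        i_d_pred, i_q_pred = predict(i_d, i_q, epsilon, n)
        # Step 3: evaluate the deviation from the set points
        # (the squared error is an assumption of this sketch)
        cost = (i_d_ref - i_d_pred) ** 2 + (i_q_ref - i_q_pred) ** 2
        if cost < best_cost:
            best_n, best_cost = n, cost
    return best_n, best_cost


def toy_predict(i_d, i_q, epsilon, n):
    """Hypothetical stand-in model: vector n shifts i_q by a fixed step."""
    steps = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0, 5: -1.0, 6: -2.0, 7: -3.0}
    return i_d, i_q + steps[n]


n_best, cost = select_vector(0.0, 10.0, 0.0, (0.0, 12.2), toy_predict)
print(n_best, cost)
```

In a real controller, `toy_predict` would be replaced by the models extracted from the data set, so the quality of the selection depends directly on their accuracy.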
IV How to extract models from the data set
To obtain models that describe the behavior of the plant (inverter + motor) with high accuracy, a data-driven approach based on measurements can be used. The data set provides about 40 million samples that can be used to train, for example, artificial neural networks or other machine learning models.
A sample in the data set (each row) consists of the measured dq currents at two consecutive time points (e.g. $k$ and $k+1$), the angle at the earlier of the two time points, the elementary vector selected in the controller cycle between them ($n_k$), and the vector selected in the cycle before ($n_{k-1}$). An overview of the included variables is given in Tab. II.
The rotational speed of the motor was held constant for all samples and hence this variable is not part of the data set. In the future, an extended data set for varying rotational motor speeds may be added, and the rotational speed would then become part of the input space. Similarly, the motor temperature was nearly constant during all measurements and, therefore, does not need to be considered in the given data set.
Note that successive rows (samples) in the data set do not constitute a time series.
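| Variable | Description | Data type | Classification |
|---|---|---|---|
| $i_{d,k}$ | measured d-current at $k$ | single | input |
| $i_{q,k}$ | measured q-current at $k$ | single | input |
| $\varepsilon_k$ | measured rotational angle at $k$ | single | input |
| $n_k$ | elementary vector applied at $k$ | integer | input |
| $n_{k-1}$ | elementary vector applied at $k-1$ | integer | input |
| $i_{d,k+1}$ | measured d-current at $k+1$ | single | target |
| $i_{q,k+1}$ | measured q-current at $k+1$ | single | target |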
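Splitting a row into inputs and targets according to Tab. II could look like the following sketch. The column names (`id_k`, `iq_k`, `epsilon_k`, `n_k`, `n_1k`, `id_k1`, `iq_k1`) and the inline sample values are assumptions for illustration; the header of the downloaded file should be checked before use.

```python
import csv
import io

# Synthetic rows for illustration only (not taken from the data set)
SAMPLE_CSV = """id_k,iq_k,epsilon_k,n_k,n_1k,id_k1,iq_k1
-80.1,120.4,0.52,3,2,-79.6,121.9
-60.3,90.7,2.10,1,5,-60.9,90.1
"""

INPUT_COLUMNS = ["id_k", "iq_k", "epsilon_k", "n_k", "n_1k"]
TARGET_COLUMNS = ["id_k1", "iq_k1"]
INTEGER_COLUMNS = {"n_k", "n_1k"}

def read_samples(file_obj):
    """Read rows and split each one into an (inputs, targets) pair."""
    samples = []
    for row in csv.DictReader(file_obj):
        inputs = [
            # Vector indices are integers, all other inputs are floats
            int(float(row[c])) if c in INTEGER_COLUMNS else float(row[c])
            for c in INPUT_COLUMNS
        ]
        targets = [float(row[c]) for c in TARGET_COLUMNS]
        samples.append((inputs, targets))
    return samples

samples = read_samples(io.StringIO(SAMPLE_CSV))
print(samples[0])
```

Because the rows do not form a time series, they can be shuffled or subsampled freely before training.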
As a result of the measurements at $k$ and $k+1$, the real behavior of the currents for a given vector is known. This knowledge can now be used to derive models. Fig. 5, Fig. 6 and Fig. 7 show three variants of how to define the inputs and targets for machine learning modeling.
Variant A uses a single model that aims to cover the behavior of all elementary vectors. In addition to the known values at time $k$, the index $n_k$ of the vector to be analyzed is input to the model. The information about which vector was used in the interval before ($n_{k-1}$) can also be an input. This might be helpful to capture more detailed effects such as the inverter dead time or the interlocking time, which appear when switching between elementary vectors. Targets are the currents at the end of the controller cycle.
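For variant A, the inputs of a sample could be assembled into one feature vector as sketched below. The encoding (sine and cosine of the angle, one-hot encoding of the vector indices) is a common choice for periodic and categorical quantities and is an assumption of this sketch, not a prescription from the data set description. The same holds for the function names and the assumption that the vector indices 1 to 7 occur.

```python
import math

NUM_VECTORS = 7  # vector indices 1..7 are assumed to occur in the data

def one_hot(n, size=NUM_VECTORS):
    """One-hot encode a 1-based vector index."""
    return [1.0 if i == n - 1 else 0.0 for i in range(size)]

def variant_a_features(i_d, i_q, epsilon, n_k, n_prev=None):
    """Build the model input for variant A.

    `n_prev` (vector of the previous cycle) is optional, mirroring
    the text: it may or may not be included as an input.
    """
    features = [i_d, i_q, math.sin(epsilon), math.cos(epsilon)]
    features += one_hot(n_k)
    if n_prev is not None:
        features += one_hot(n_prev)
    return features

x = variant_a_features(-80.1, 120.4, 0.0, n_k=3, n_prev=2)
print(len(x))
```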
Alternatively, it is possible to extract multiple models, each covering the specific behavior of one of the used vectors (variant B). The index of the vector to be analyzed ($n_k$) is then used by the controller to switch between the models.
To account for the vector that was used in the interval before ($n_{k-1}$), it is also possible to extract one model for each combination of former and next vector. In this variant C, each model covers the behavior of a particular transition between the former vector and the next vector as well as the behavior of the currents with the next vector applied during the following cycle.
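For variants B and C, the samples have to be partitioned so that each model is trained only on its own subset. A minimal sketch, assuming each sample is a dictionary with the hypothetical keys `n_k` and `n_prev`, could look like this:

```python
from collections import defaultdict

def group_samples(samples, use_previous_vector=False):
    """Partition samples into per-model training sets.

    Each sample is a dict with the hypothetical keys 'n_k' and 'n_prev'.
    Variant B groups by n_k; variant C groups by (n_prev, n_k).
    """
    groups = defaultdict(list)
    for s in samples:
        key = (s["n_prev"], s["n_k"]) if use_previous_vector else s["n_k"]
        groups[key].append(s)
    return dict(groups)

# Synthetic samples for illustration only
samples = [
    {"n_prev": 2, "n_k": 3, "i_d": -80.1, "i_q": 120.4},
    {"n_prev": 5, "n_k": 3, "i_d": -60.3, "i_q": 90.7},
    {"n_prev": 2, "n_k": 1, "i_d": -70.0, "i_q": 100.0},
]

variant_b = group_samples(samples)
variant_c = group_samples(samples, use_previous_vector=True)
print(sorted(variant_b), sorted(variant_c))
```

Variant C splits the data into many more subsets than variant B, so each model sees fewer samples; this trade-off is worth checking against the sample counts per group.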
The accuracy of a model can be evaluated according to the cost function (2).
(2) 
Considering $e_{d,m}$ and $e_{q,m}$ as elements of an error vector for each sample $m$, the cost function represents the root mean square of this error vector (RMSE) over the $M$ samples used for the training of the respective model. The error vector elements are the deviations between the targeted outputs ($i_{d,k+1}$, $i_{q,k+1}$) and the predictions of the model ($\hat{i}_{d,k+1}$, $\hat{i}_{q,k+1}$) for the d- and the q-current.
Please note:
For the sake of simplicity, the DC-link voltage $u_\mathrm{DC}$ is considered ideally constant in this contribution.
Moreover, the rotational speed and the motor temperature are kept constant, too.
It is planned to extend the data set to variations of these three variables in the future.
However, the presented data-driven modeling ideas can be directly extended to these varying operating conditions by extending the input space with these additional features.
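The RMSE evaluation of a model described above can be sketched as follows. The normalization used here (mean over all samples of the summed squared d- and q-errors, followed by the square root) is one plausible reading and an assumption of this example, as are the function name `rmse` and the synthetic numbers.

```python
import math

def rmse(targets, predictions):
    """Root mean square error over samples of (i_d, i_q) pairs.

    Assumed definition: sqrt(mean over samples of e_d^2 + e_q^2).
    """
    if len(targets) != len(predictions) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for (i_d, i_q), (i_d_hat, i_q_hat) in zip(targets, predictions):
        e_d = i_d - i_d_hat  # d-current error of this sample
        e_q = i_q - i_q_hat  # q-current error of this sample
        total += e_d ** 2 + e_q ** 2
    return math.sqrt(total / len(targets))

# Synthetic values for illustration only
targets = [(0.0, 0.0), (1.0, 1.0)]
predictions = [(3.0, 4.0), (1.0, 1.0)]
print(rmse(targets, predictions))
```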
Devices under test:
The drive system under test consists of an interior permanent magnet synchronous motor (IPMSM) and a 2-level IGBT inverter.
The most important test bench parameters are summarized in Tab. III.
Fig. 8 shows the test bench with the transient recorder in the front and the motor used in the background.
A more detailed list of the available equipment can be found in Part II of the description.
| IPMSM | Brusa HSM1-6.17.12-C01 |
|---|---|
| Stator resistance | 18 mΩ |
| Inductance in d-direction | |
| Inductance in q-direction | |
| Permanent magnet flux | 66 mVs |
| Pole pair number | 3 |
| Inverter | 3× SKiiP 1242GB120-4D |
| Topology | voltage source inverter, 2-level, IGBT |
| Controller hardware | dSPACE |
| Processor board | DS1006MC, 4 cores, 2.8 GHz |
| Measurement devices | |
| Transient recorder | Yokogawa DL850 |
| Power analyzer | Yokogawa WT3000 |
| Current probes (zero-flux transducers) | 3× Yokogawa, 500 A, 2 MHz |
| Torque sensor | HBM, T10FS, 2 kNm |
Link to the uploaded data set:
The data set is published on Kaggle, an online community of data scientists: https://www.kaggle.com/hankelea/system-identification-of-an-electric-motor
References
 [1] S. Hanke. Data set: Identifying the Physics Behind an Electric Motor – Data-Driven Learning of the Electrical Behavior. https://www.kaggle.com/hankelea/system-identification-of-an-electric-motor.
 [2] S. Hanke, O. Wallscheid, and J. Böcker. Data Set Description: Identifying the Physics Behind an Electric Motor – Data-Driven Learning of the Electrical Behavior (Part I). arXiv:2003.07273, 2020. https://arxiv.org/abs/2003.07273.
 [3] S. Hanke, O. Wallscheid, and J. Böcker. Data Set Description: Identifying the Physics Behind an Electric Motor – Data-Driven Learning of the Electrical Behavior (Part II). arXiv:2003.06268, 2020. https://arxiv.org/abs/2003.06268.
 [4] Francis Ray. Electric vehicle at charging station (upper part of Fig. 1). Pixabay image #3321668, https://pixabay.com/illustrations/car-electric-car-auto-automobile-3321668.