I Introduction
Driving in urban scenarios, especially the intersection scenario, is one of the most challenging problems for an autonomous driving (AD) system. According to [1], a general AD system is composed of several subsystems, including sensing, navigation, decision-making, planning, and control. The key challenge of the intersection scenario is the interaction between the autonomous driving vehicle (ADV) and social vehicles, which mainly poses challenges on the decision-making and control modules [2]. Since the behavioral intention of social vehicles is uncertain, the ADV is forced to negotiate and make decisions quickly under strong interactions; otherwise, traffic accidents are very likely to occur. Generally, the intersection scenario discussed in this paper mainly refers to the ADV passing through a cross-junction while interacting with social vehicles.
Reinforcement learning (RL) methods learn an optimal policy through trial-and-error: an RL agent is able to improve its performance by updating the policy repeatedly. In recent years, RL has become a key technology in the field of AD decision-making and control [3, 4, 5]. In [6], a method combining deep learning (DL) and RL is proposed to address the lane-keeping task with visual input for an ADV on a highway track. In [7], an RL method with rule-based constraints is proposed to train the ADV for the lane-changing task in the highway scenario. Besides, several RL-based methods have been proposed to deal with decision-making and control problems in intersection scenarios. In [8], an end-to-end framework is proposed for ADV control, in which the RL agent takes raw data as input and directly outputs the control command of the vehicle. The proposed scenario mixes junctions and roundabouts, but the traffic flow in the scenario is not adjustable. In [9], a multi-agent RL framework is proposed for behavior-model design in intersection scenarios, in which both rule-based and RL-based baselines are provided. The proposed simulator is similar to the one in [10]; both of them lack a detailed vehicle dynamics model and a high-resolution simulator. In [11], a simulator based on the CARLA simulator [12] is proposed, which provides a set of real-world road maps as an intersection AD benchmark. Though a partially observable Markov decision process (POMDP) agent is provided, an RL interface is not deployed for further research. In [13], an RL framework named ULTRA is proposed with a delicate scenario design; in that work, the behavior of social vehicles is modeled by the SMARTS simulator [14], and the dynamics of the simulation are idealized as well. We compare the features of some existing frameworks in TABLE I.

As discussed above, previous work has not integrated a tunable intersection scenario benchmark with an RL-based baseline in a high-resolution simulator. In this paper, we propose a training and testing RL framework addressing complex intersection scenarios for the AD problem. We call our benchmark Reinforcement Learning Complex Intersection Scenario, or RL-CIS for short. As for the original contributions, this paper:
- proposes a benchmark called RL-CIS for evaluating RL-based AD agents, which includes both stochastic and deterministic test sets as well as a training environment for the RL agent,
- develops an RL training environment for intersection scenarios and proposes a traffic flow generation method based on a stochastic process,
- provides a set of basic performance metrics for the AD agent, together with a set of baselines including both RL and rule-based methods. The experimental results show that our RL agent outperforms some rule-based methods in intersection scenarios.
TABLE I: Feature comparison of existing frameworks and RL-CIS.

| Features | Vehicle dynamics | RL interface | Scenario diversity | Scenario randomness | Evaluation benchmark |
|---|---|---|---|---|---|
| CARLA scenario runner | | | | | |
| Highway env | | | | | |
| Interp e2e driving | | | | | |
| Summit | | | | | |
| BARK | | | | | |
| ULTRA | | | | | |
| RL-CIS (ours) | | | | | |
II Intersection Scenarios
II-A Design of Test Scenarios
In order to evaluate the AD agent, we design a set of intersection scenarios based on [15], which is proposed for ADV field test evaluation. Inspired by [15] and [16], a scenario is defined through a 3-level structure. The first level is the functional scenario, in which the road network structure, target route, behavior of the traffic flow, and other necessary features are defined. The second level is the logical scenario, in which the range of all parameters is given, constituting the set of all available scenarios. The final level is the concrete scenario: a specific group of parameters is selected to instantiate a single concrete scenario, which is then rendered for the ADV test.
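To make the 3-level structure concrete, the sketch below shows one way the functional/logical/concrete hierarchy could be encoded; all class and field names are our own illustration and are not taken from the benchmark's implementation.

```python
# Illustrative sketch (not the authors' code) of the 3-level scenario structure.
from dataclasses import dataclass
from typing import Tuple
import random

@dataclass
class FunctionalScenario:          # level 1: road network, task, traffic behavior
    ego_task: str                  # e.g. "turn_left" (hypothetical label)
    traffic_flow_route: str        # e.g. "oncoming_straight"

@dataclass
class LogicalScenario:             # level 2: ranges of the tunable parameters
    functional: FunctionalScenario
    target_speed_range: Tuple[float, float]   # km/h
    gap_distance_range: Tuple[float, float]   # m

@dataclass
class ConcreteScenario:            # level 3: one specific parameter assignment
    functional: FunctionalScenario
    target_speed: float            # km/h
    gap_distance: float            # m

def instantiate(logical: LogicalScenario) -> ConcreteScenario:
    """Pick a single concrete scenario out of the logical scenario's ranges."""
    return ConcreteScenario(
        functional=logical.functional,
        target_speed=random.uniform(*logical.target_speed_range),
        gap_distance=random.uniform(*logical.gap_distance_range),
    )
```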
II-A1 Variability
Variability is the most critical feature of the proposed benchmark. It is composed of the diversity of the functional scenarios and the randomness of the logical scenarios, which are the two major rules in scenario design.
Diversity
Diversity is emphasized in functional scenario design. In this phase, the diversity of functional scenarios refers to the variety of driving tasks and behaviors of social vehicles, which leads to diverse interactions between the ego vehicle and social vehicles. In [8] and [11], though the traffic flow is random, the diversity of interaction between the ADV and social vehicles is not guaranteed, which means some types of interaction may never occur in the whole test set.
Specifically, in a cross-intersection, the ADV has three potential passing directions: turning left, turning right, and going straight. For each driving task, the ego vehicle may encounter interacting traffic flow from different directions. The variation of these interactions is the major source of diversity in intersection scenarios.
Randomness
Randomness is the critical rule in logical scenario design. In intersection scenarios, randomness mainly concerns two aspects. The first is the randomness of the behavior model of social vehicles, defined as how a social vehicle reacts to potential traffic conflicts. The second aspect is the range of critical kinetic parameters, such as the target speed and the minimum brake distance for some rule-based methods. In the design of the range of tunable parameters, there is a trade-off: the parameter range should be as broad as possible to generalize the scene, but it should not exceed a reasonable range.

II-A2 Deterministic Test Scenarios
The un-signalized intersection scenario is one of the most challenging scenarios for the ADV and is the main focus of this paper. In a signalized intersection, the interaction between vehicles is governed by the traffic lights, assuming that all traffic participants obey the traffic rules. Although both signalized and un-signalized junctions are common in real-world traffic, the un-signalized intersection is selected intentionally to stress the interaction problem in the intersection scenario.
In this paper, we select the cross-intersection scene, which is the most typical road-network structure among intersection scenarios. We use the Town03 map in the CARLA simulator to configure the scenario, as shown in Fig. 1. The blue lines refer to the planned turning routes in the intersection. In a cross-intersection, three typical tasks are mainly considered: turning left, turning right, and going straight. For each turning task, there are multiple potential interacting traffic flows, which are determined by the road-network structure. To further decompose scenarios, each functional scenario extracts only a single interacting traffic flow.
Therefore, we propose five functional scenarios, as shown in Fig. 2. Functional scenarios (a) and (b) refer to the left-turning task, interacting with a going-straight and a right-turning traffic flow coming from the opposite direction, respectively. Scenarios (c) and (d) refer to the going-straight task, interacting with a going-straight and a left-turning traffic flow coming from the left and the opposite direction, respectively. Scenario (e) refers to the right-turning task, interacting with a going-straight traffic flow coming from the left direction. The traffic flow in each scenario is composed of a continuous sequence of social vehicles, and each vehicle in the traffic flow has a predefined route, as illustrated in Fig. 1.
The behavior model of all social vehicles is determined by two rules. The first is speed tracking: the social vehicle accelerates until it reaches the target speed and then maintains it. The second rule specifies how the social vehicle reacts to potential conflicts, for which we employ the autonomous emergency braking (AEB) method. Generally, the AEB model monitors a certain range in the front direction; if any obstacle is detected, the vehicle brakes until the detection area is clear, after which it resumes pursuing the target speed.
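A minimal sketch of this two-rule behavior model is given below; the acceleration and braking magnitudes are assumed values, not the ones used in the benchmark.

```python
# Sketch of the social-vehicle behavior model: speed tracking + AEB.
def social_vehicle_control(current_speed: float,
                           target_speed: float,
                           obstacle_in_front: bool,
                           max_accel: float = 2.0,    # m/s^2, assumed
                           max_brake: float = 6.0) -> float:
    """Return a longitudinal acceleration command for one social vehicle."""
    if obstacle_in_front:
        # Rule 2 (AEB): brake while the forward detection area is occupied.
        return -max_brake
    # Rule 1 (speed tracking): accelerate until the target speed is reached.
    if current_speed < target_speed:
        return max_accel
    return 0.0  # hold the target speed
```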
Besides the design of functional scenarios, each logical scenario is instantiated with a set of determined kinetic parameters. In our proposed intersection scenarios, the adjustable kinetic parameters are the target speed of each vehicle in the traffic flow and the gap distance between adjacent vehicles, as shown in Fig. 2. The gap distance is fixed in each concrete scenario instance to guarantee the stability of the test. The two parameters are determined through discretization of a certain range: the target velocity is sampled uniformly from its interval (in km/h) with a step length of 2, and the gap distance is sampled uniformly from its interval (in m) with a step length of 2.
II-A3 Stochastic Test Scenarios
Besides the deterministic test, we propose a stochastic test set for RL-based agent evaluation. In the stochastic test, a comparatively more random traffic flow is provided. Specifically, the behavior model of social vehicles is determined by CARLA's built-in Autopilot function. CARLA Autopilot is a rule-based AD framework which includes navigation, planning, and control modules. A social vehicle controlled by CARLA Autopilot randomly plans its route to pass through the junction. Regarding driving behavior, a CARLA Autopilot agent has a Boolean switch for collision avoidance against certain vehicles. In our experiment, we switch off the collision avoidance of all social vehicles against the RL-based ADV. Because the RL agent learns its policy through trial-and-error, switching off this collision detection helps the RL agent explore and learn policies more efficiently.
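A possible way to configure such traffic in CARLA is sketched below, assuming a recent CARLA release whose `TrafficManager` exposes `collision_detection()`; the actor arguments are placeholders for vehicles spawned elsewhere.

```python
# Sketch of the stochastic-test traffic setup (assumed CARLA >= 0.9.10 API).
import carla  # the argument types below come from the CARLA Python API

def configure_stochastic_traffic(client, ego_vehicle, social_vehicles):
    traffic_manager = client.get_trafficmanager()
    for vehicle in social_vehicles:
        # Let CARLA Autopilot drive the social vehicle through the junction.
        vehicle.set_autopilot(True, traffic_manager.get_port())
        # Switch off collision avoidance against the RL-controlled ego vehicle,
        # so the RL agent must learn to avoid collisions itself.
        traffic_manager.collision_detection(vehicle, ego_vehicle, False)
```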
Fig. 2: The five functional scenarios (a)–(e) described above.
II-B Design of Training Scenarios
Generally, the training scenario set is supposed to cover the test scenario set as much as possible while retaining some variability. The traffic flow setting in the training scenarios is similar to the one used in the test scenarios: we deploy a rule-based traffic flow with adjustable kinetic parameters, and each traffic flow holds a fixed route. The kinetic parameters are the target speed and the gap distance, the same as in the deterministic test. However, in the training scenarios the two parameters vary for each vehicle within the same traffic flow; they are sampled from the same intervals as defined in the deterministic test whenever a new social vehicle is spawned. The behavior model of all social vehicles in the training scenarios uses the same assumptions as in the deterministic test, combining speed tracking and the AEB model.
We deploy three agents for the turning-left, turning-right, and going-straight tasks, respectively. In each training procedure, the functional scenarios sharing the same task route are trained at the same time. For example, in the left-turning task, the RL agent confronts the traffic flows shown in Fig. 2 (a) and (b) simultaneously. The traffic flow on other roads is not activated.
II-B1 OU-Process Parameter Generation
In the training phase, the RL agent is supposed to fully explore all available traffic situations. As discussed above, the randomness of scenarios is highly affected by the distribution of the kinetic parameters of the social vehicles, which are the target speed and the gap distance in our proposed test scenarios. Therefore, we deploy a kinetic parameter generation method based on the Ornstein–Uhlenbeck (OU) process [17]. The OU process generates a sequence of kinetic parameters for the whole traffic flow, and each social vehicle of the traffic flow is given a set of parameters sequentially.
For kinetic parameter generation, the OU process has two significant advantages. The first is that the probability distribution of the OU process is Gaussian. The second is that the OU process is mean-reverting, which limits the difference between two contiguous sampling values. The stochastic differential equation (SDE) of the OU process is

$$ \mathrm{d}x_t = \theta(\mu - x_t)\,\mathrm{d}t + \sigma\,\mathrm{d}W_t \tag{1} $$

in which $x_t$ refers to the target speed of the social vehicle, $\mu$ refers to the expectation of the variable, $\theta$ denotes the damping factor of the OU process, $\sigma$ scales the noise term, and $W_t$ denotes the Wiener process. The corresponding noise-free ordinary differential equation (ODE) of the OU process is

$$ \frac{\mathrm{d}x_t}{\mathrm{d}t} = \theta(\mu - x_t) \tag{2} $$
Since the kinetic parameters of vehicles are bounded by an interval, inspired by [18] we deploy a clipped OU process to avoid over-accumulation at the interval boundary. The clipping process is shown in Algorithm 1. We denote the interval for parameter sampling as $[x_{\min}, x_{\max}]$.
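A minimal sketch of a clipped OU sampler is shown below; it uses plain clipping rather than the exact procedure of Algorithm 1, and the damping factor, noise scale, and step size are assumed values.

```python
# Sketch of an OU-based target-speed generator with clipping to [x_min, x_max].
import numpy as np

def clipped_ou_sequence(n: int, mu: float, x_min: float, x_max: float,
                        theta: float = 0.15, sigma: float = 1.0,
                        dt: float = 1.0, seed: int = 0) -> np.ndarray:
    """Generate n target-speed samples, one per spawned social vehicle."""
    rng = np.random.default_rng(seed)
    x = mu
    samples = []
    for _ in range(n):
        # Euler-Maruyama step of dx = theta*(mu - x)dt + sigma*dW.
        x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        # Clip to the admissible interval (the paper's Algorithm 1 additionally
        # avoids accumulation at the boundary; plain clipping is used here).
        x = float(np.clip(x, x_min, x_max))
        samples.append(x)
    return np.array(samples)
```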
As for the gap distance between social vehicles, the value is sampled from a truncated Gaussian distribution

$$ d_{\text{gap}} \sim \mathcal{N}(\mu_d,\, \sigma^2), \quad d_{\text{gap}} \in [d_{\min}, d_{\max}] \tag{3} $$

in which $\mathcal{N}$ refers to the normal distribution, $[d_{\min}, d_{\max}]$ refers to the value interval of the gap distance, and $\sigma^2$ denotes the variance of the truncated Gaussian distribution. In this paper, $\sigma$ is determined through
$$ \sigma = \frac{d_{\max} - d_{\min}}{k} \tag{4} $$

where $k$ is a tunable parameter used to adjust the concentration of the parameters relative to the mean value. According to our proposed sampling method, the mean gap distance $\mu_d$ is obtained as a linear mapping of the target speed over the sampling interval.
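The following sketch illustrates this gap-distance sampling under stated assumptions: the speed and gap intervals, the linear speed-to-mean mapping, and the concentration parameter `k` are all placeholder choices.

```python
# Sketch of truncated-Gaussian gap-distance sampling conditioned on target speed.
from scipy.stats import truncnorm

def sample_gap_distance(target_speed: float,
                        speed_range=(10.0, 30.0),   # km/h, assumed interval
                        gap_range=(10.0, 30.0),     # m,   assumed interval
                        k: float = 4.0) -> float:
    v_lo, v_hi = speed_range
    d_lo, d_hi = gap_range
    # Mean gap: linear mapping of the target speed over the sampling interval.
    mu = d_lo + (target_speed - v_lo) / (v_hi - v_lo) * (d_hi - d_lo)
    sigma = (d_hi - d_lo) / k          # concentration controlled by k (assumed form)
    a, b = (d_lo - mu) / sigma, (d_hi - mu) / sigma   # truncnorm standard bounds
    return float(truncnorm.rvs(a, b, loc=mu, scale=sigma))
```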
III Baseline Methods

III-A Baselines for Intersection Scenarios

III-A1 Rule-based Methods
In this part, we select several classic rule-based methods and an RL algorithm as our proposed baseline methods for the intersection scenarios.
Intelligent Driver Model (IDM)
The intelligent driver model (IDM) [19] is one of the most popular rule-based baselines for the ADV. The IDM model is designed based on car-following behavior and is defined by the following equations

$$ \dot{v} = a_{\max}\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^{*}(v, \Delta v)}{s}\right)^{2}\right], \qquad s^{*}(v, \Delta v) = s_0 + vT + \frac{v\,\Delta v}{2\sqrt{a_{\max} b}} \tag{5} $$

where $v_0$ refers to the desired velocity that the vehicle would drive at in free traffic, $s_0$ refers to the minimum desired net distance to the car in front, $T$ refers to the minimum possible time headway to the vehicle in front, $a_{\max}$ refers to the maximum vehicle acceleration, $b$ refers to a comfortable braking deceleration, and the exponent $\delta$ usually takes a value of 4.
Besides, the acceleration of the vehicle can be separated into a free-road term and an interaction term

$$ \dot{v}_{\text{free}} = a_{\max}\left[1 - \left(\frac{v}{v_0}\right)^{\delta}\right], \qquad \dot{v}_{\text{int}} = -a_{\max}\left(\frac{s^{*}(v, \Delta v)}{s}\right)^{2} \tag{6} $$
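For reference, a compact implementation of the IDM acceleration in (5)-(6) is sketched below; the parameter defaults are typical textbook values, not necessarily those used in our experiments.

```python
# Sketch of the IDM acceleration law (Eqs. (5)-(6)).
import math

def idm_acceleration(v: float, gap: float, delta_v: float,
                     v0: float = 10.0,    # desired velocity (m/s), assumed
                     s0: float = 2.0,     # minimum net distance (m)
                     T: float = 1.5,      # desired time headway (s)
                     a_max: float = 2.0,  # maximum acceleration (m/s^2)
                     b: float = 3.0,      # comfortable deceleration (m/s^2)
                     delta: float = 4.0) -> float:
    """v: own speed, gap: net distance to the leader, delta_v: v - v_leader."""
    s_star = s0 + v * T + v * delta_v / (2.0 * math.sqrt(a_max * b))
    free_term = 1.0 - (v / v0) ** delta                 # free-road term
    interaction_term = (s_star / max(gap, 1e-3)) ** 2   # interaction term
    return a_max * (free_term - interaction_term)
```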
Autonomous Emergency Braking (AEB) Model
The AEB method is a widely used technique in level-2 ADVs. In real-world deployment, the AEB method processes sensor data with a deterministic algorithm and performs the brake action if a potential collision is detected. In the CARLA simulator, we directly use the ground-truth values as the input for the deployment of the AEB method.
In our work, the AEB method is determined by two rules: the first is the longitudinal detection range $d_{\mathrm{det}}$, and the second is the expansion factor $\lambda$ of the bounding box of the social vehicle. During the driving process, the ego vehicle detects along its longitudinal direction over a length of $d_{\mathrm{det}}$. In the meantime, the bounding box of each social vehicle is expanded from its original physical model according to the expansion factor. More specifically, the size of the expanded bounding box is calculated through

$$ (l', w') = \lambda \cdot (l, w) \tag{7} $$

where $(l, w)$ refer to the original size of the social vehicle's physical model. If the bounding box of any social vehicle penetrates the front detection area of the ego vehicle, the AEB model performs a maximum braking action until the detection area is clear.
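A simplified sketch of this AEB rule is given below; it approximates the detection area and the expanded bounding boxes as axis-aligned rectangles in the ego frame, and the detection length and expansion factor are assumed values.

```python
# Sketch of the AEB check: expand social-vehicle boxes and test overlap with
# the ego vehicle's forward detection strip (axis-aligned, ego frame).
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, x_max, y_min, y_max) in ego frame

def aeb_should_brake(social_boxes: List[Box],
                     detect_length: float = 15.0,   # m, assumed d_det
                     ego_half_width: float = 1.0,   # m, assumed
                     expansion: float = 1.2) -> bool:
    # Forward detection strip of the ego vehicle.
    strip = (0.0, detect_length, -ego_half_width, ego_half_width)
    for (x0, x1, y0, y1) in social_boxes:
        # Expand the social vehicle's box around its center by the factor.
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        hx, hy = expansion * (x1 - x0) / 2, expansion * (y1 - y0) / 2
        box = (cx - hx, cx + hx, cy - hy, cy + hy)
        # Axis-aligned overlap test between the strip and the expanded box.
        if box[0] <= strip[1] and box[1] >= strip[0] and \
           box[2] <= strip[3] and box[3] >= strip[2]:
            return True   # brake with maximum deceleration
    return False
```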
III-A2 Reinforcement Learning
State Representation
For the intersection scenario, inspired by [20], we use the ground-truth kinetic information of the ego vehicle and social vehicles for state representation. Such a choice is common in autonomous driving system design, since the major challenge of the intersection scenario comes from the interaction between the ego vehicle and social vehicles. For the ego vehicle, the state vector contains its speed and a 3-dimensional one-hot vector indicating the ego vehicle's current position. For each social vehicle, the state vector contains its two-dimensional velocity, Cartesian coordinates, and heading angle under the ego vehicle's coordinate system. The total state representation combines the ego vehicle and the 5 nearest social vehicles: all six state vectors are concatenated into a 33-dimensional vector as the RL input state.

Action Space
The action space is constructed as a 2-dimensional continuous variable. The action vector is transformed for speed tracking and then scaled (in m/s) to the target speed of the ego vehicle for longitudinal control.
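The sketch below illustrates one way to assemble the state vector and map the action to a target speed; the exact feature layout, dimensions, and maximum speed are assumptions, and only the overall structure (ego features plus the 5 nearest social vehicles, a 2-dimensional action in [-1, 1]) follows the text.

```python
# Sketch of the state/action interface (feature layout and v_max are assumed).
import numpy as np

def build_state(ego_speed: float, ego_position_onehot: np.ndarray,
                social_features: np.ndarray) -> np.ndarray:
    """ego_position_onehot: shape (3,); social_features: shape (5, k) holding
    velocity, position, and heading of the 5 nearest vehicles in the ego frame."""
    ego = np.concatenate(([ego_speed], ego_position_onehot))
    return np.concatenate([ego, social_features.reshape(-1)]).astype(np.float32)

def action_to_target_speed(action: np.ndarray, v_max: float = 8.0) -> float:
    """Map the first action dimension from [-1, 1] to [0, v_max] m/s (assumed)."""
    return float((action[0] + 1.0) / 2.0 * v_max)
```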
Reward Design
Inspired by [21], the reward function is defined through events. More specifically, the reward function combines two parts: a reward for each timestep and a final reward at the end of an episode. The complete reward function can be written as

$$ R = \sum_{t=0}^{T} r_{\text{step}}(t) + r_{\text{end}} \tag{8} $$

where $T$ refers to the maximum time limit of one episode, $r_{\text{step}}$ is the common per-timestep sub-task reward that encourages the ego vehicle to improve traffic efficiency, and $r_{\text{end}}$ is the terminal reward determined by the outcome of the episode.
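A hypothetical instantiation of such an event-based reward is sketched below; the numerical values are placeholders and do not reproduce the reward constants used in the paper.

```python
# Sketch of an event-based reward (all constants are assumed placeholders).
def step_reward(collision: bool, success: bool, timestep: int,
                max_steps: int = 300,          # episode time limit T (assumed)
                r_step: float = -0.1,          # per-step penalty for efficiency
                r_success: float = 10.0,       # terminal reward on success
                r_collision: float = -10.0):
    """Return (reward, done) for one environment step."""
    if collision:
        return r_collision, True
    if success:
        return r_success, True
    if timestep >= max_steps:
        return r_step, True    # timeout ends the episode without a bonus
    return r_step, False
```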
RL Algorithms
For the intersection scenario, we deploy the TD3 algorithm [22] as our RL baseline. For the neural network design, the state vector is divided into ego and social-vehicle components, and each component is processed by an encoder network formed by fully-connected (FC) layers. The outputs of the encoders are concatenated and followed by an FC layer. The actor and critic networks of the TD3 algorithm share the same network structure in our experiments.
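A sketch of this encoder structure in PyTorch is shown below; the layer widths and the split of the 33-dimensional state into ego and social parts are assumptions.

```python
# Sketch of the shared actor/critic backbone: separate FC encoders for the
# ego and social parts of the state, concatenated and fed to a head layer.
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    def __init__(self, ego_dim: int = 4, social_dim: int = 29,
                 hidden: int = 64, out_dim: int = 2):
        super().__init__()
        self.ego_dim = ego_dim
        self.ego_enc = nn.Sequential(nn.Linear(ego_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.social_enc = nn.Sequential(nn.Linear(social_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, out_dim)    # e.g. actor output

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        ego = state[..., :self.ego_dim]               # split the 33-dim state
        social = state[..., self.ego_dim:]
        z = torch.cat([self.ego_enc(ego), self.social_enc(social)], dim=-1)
        return torch.tanh(self.head(z))               # action in [-1, 1]
```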
IV Simulation Experiments

IV-A Evaluation Metrics
IV-A1 Intersection Scenarios
Many metrics can be used to measure the behavior of agents [23]. For an AD system, safety and efficiency are the performance indices of greatest concern. In our framework, the success rate and the average passing time are the general metrics for performance evaluation. The success rate directly indicates how well the agent performs the specified task. In this paper, the success rate is defined as

$$ \text{Success rate} = \frac{N_{\text{success}}}{N_{\text{total}}} \times 100\% \tag{9} $$

where $N_{\text{success}}$ is the number of successful episodes and $N_{\text{total}}$ is the total number of test episodes.
Secondly, efficiency is measured through the average duration of a single test episode. Since functional scenarios targeting the same turning task share the same route, the average passing time is compared by route; that is to say, scenarios are divided into three groups for the average passing time comparison. It is important to note that we only count the time of successful tests.
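The two metrics can be computed from episode logs as sketched below; the log format is our own assumption.

```python
# Sketch: success rate over all episodes and average passing time over
# successful episodes only, grouped by task route.
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def evaluate(episodes: Iterable[Tuple[str, bool, float]]) -> Dict[str, Tuple[float, float]]:
    """episodes: (route, success, passing_time) for each test episode."""
    stats = defaultdict(lambda: [0, 0, 0.0])   # route -> [n_total, n_success, time_sum]
    for route, success, t in episodes:
        stats[route][0] += 1
        if success:
            stats[route][1] += 1
            stats[route][2] += t
    return {route: (100.0 * n_s / n, (t_sum / n_s) if n_s else float("nan"))
            for route, (n, n_s, t_sum) in stats.items()}
```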
IV-B Results and Analysis

IV-B1 Training Process
The learning curves of the RL baselines in the intersection training scenarios are depicted in Fig. 3. From the figure, we can see that the RL agent for each task converges quickly within 2000 episodes, and for each route task the learning curve reaches a maximum level within 5000 episodes. The left-turning and right-turning agents then maintain relatively high and stable performance, while the going-straight agent shows slight fluctuation. This is mainly because the agent in the going-straight task must interact with the left-turning traffic flow coming from the opposite direction, which brings a greater challenge than the other tasks.
Fig. 3: Learning curves of the RL agents for the three driving tasks.
In the training process of each task, the interacting traffic flows are activated according to the definition of the logical scenarios, as described in the experiment settings. The generated kinetic parameters of each traffic flow are recorded; the sampling sequence and distribution during training of the left-turning task are shown in Fig. 4. The distribution of the sampled parameters is approximately Gaussian, while the sampling curve in the time domain is rather smooth.
Fig. 4: Kinetic parameter sampling sequence and distribution during training of the left-turning task.
IV-B2 Deterministic Test
In the deterministic test, we deploy a TD3 agent as the RL baseline. Besides, two rule-based methods are deployed as a comparison, namely the IDM and the AEB model. In our deployment in CARLA, the IDM model detects a certain distance along the driving route and pursues either speed-tracking or car-following behavior as described in (5) and (6). Compared to the AEB method, the IDM agent adjusts its velocity rather smoothly. The experimental results are shown in TABLE II. We evaluate the TD3 agent and the rule-based agents in all five functional scenarios. Since the rule-based agents perform relatively poorly, their statistics are aggregated by task route. In the turning-left and turning-right experiments, the RL agent reaches a success rate near 90% and exceeds the rule-based agents in both success rate and average time. According to the results, going straight is the most challenging task, since the left-turning social vehicles are rather fast and hard for the ego vehicle to negotiate with.
TABLE II: Experimental results of the deterministic test.

| Functional scenario | (a) | (b) | (c) | (d) | (e) |
|---|---|---|---|---|---|
| Ego route | Turning left | Turning right | Going straight | | |
| TD3: Success rate (%) | 94.8 | 93.8 | 89.24 | 99.0 | 80.0 |
| TD3: Average time (s) | 6.83 | 6.65 | 7.04 | 6.94 | 4.79 |
| IDM: Success rate (%) | 67.7 | 62.8 | 47.4 | | |
| IDM: Average time (s) | 8.40 | 8.23 | 8.59 | | |
| AEB: Success rate (%) | 72.74 | 50.0 | 48.96 | | |
| AEB: Average time (s) | 8.66 | 7.21 | 7.18 | | |
IV-B3 Stochastic Test
The experimental results of the stochastic test are shown in TABLE III. In the stochastic test, the kinetic parameters of the traffic flows are uniformly sampled from an interval, which keeps the traffic flows from being extremely dense. Therefore, the RL agent reaches a significantly higher success rate for each task compared to the deterministic test.
In this part, the RL method is compared with the rule-based methods as well. As the table shows, the RL agent outperforms the rule-based methods in both success rate and average time. The overall success rate of the TD3 agent is above 90%. Though in the turning-left and going-straight tasks the rule-based agents achieve a better average time, they are at a definite disadvantage in terms of safety. We believe such results occur because the IDM and AEB methods have limited input, which makes them incapable of detecting potential conflicts with social vehicles from the cross direction. In conclusion, the model-free RL agent dominates our proposed benchmark of the intersection scene.
TABLE III: Experimental results of the stochastic test.

| Methods | Driving task | Success rate (%) | Average time (s) |
|---|---|---|---|
| TD3 (ours) | Left | 95.9 | 9.78 |
| | Right | 96.9 | 7.64 |
| | Straight | 91.9 | 9.20 |
| IDM | Left | 68.3 | 11.21 |
| | Right | 72.7 | 11.05 |
| | Straight | 33.7 | 22.6 |
| AEB | Left | 71.3 | 9.06 |
| | Right | 88.7 | 8.86 |
| | Straight | 58.3 | 8.83 |
V Conclusion
In this paper, we propose RL-CIS, a framework to train and test RL-based AD agents in intersection scenarios. Firstly, a group of un-signalized intersection functional scenarios is designed. Then the behavior model of social vehicles is determined with two critical kinetic parameters, which compose the whole logical scenario set. The concrete scenarios are obtained by discretizing the logical scenarios and constitute the deterministic test set in our proposed framework. Besides, we deploy a set of stochastic tests to further evaluate the RL-based AD agent. Meanwhile, a training environment for the RL agent is developed, in which a stochastic-process-based sampling method is deployed for traffic flow parameter generation. Both the training and test sets are built in the CARLA simulator. In addition, we offer a set of baselines for the intersection AD benchmark, including the TD3, IDM, and AEB methods. According to the experimental results, the RL agent shows significant superiority compared with the rule-based methods.
References
- [1] E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, “A survey of autonomous driving: Common practices and emerging technologies,” IEEE access, vol. 8, pp. 58 443–58 469, 2020.
- [2] B. Kiran, I. Sobh, V. Talpaert, P. Mannion, A. Sallab, S. Yogamani, and P. Pérez, “Deep reinforcement learning for autonomous driving: A survey,” arXiv preprint arXiv:2002.00444, 2020.
- [3] B. Wang, D. Zhao, and J. Cheng, “Adaptive cruise control via adaptive dynamic programming with experience replay,” Soft Computing, vol. 23, no. 12, pp. 4131–4144, 2019.
- [4] D. Zhao, Z. Hu, Z. Xia, C. Alippi, Y. Zhu, and D. Wang, “Full-range adaptive cruise control based on supervised adaptive dynamic programming,” Neurocomputing, vol. 125, pp. 57–67, 2014.
- [5] D. Zhao, Z. Xia, and Q. Zhang, “Model-free optimal control based intelligent cruise control with hardware-in-the-loop demonstration [research frontier],” IEEE Computational Intelligence Magazine, vol. 12, no. 2, pp. 56–69, 2017.
- [6] D. Li, D. Zhao, Q. Zhang, and Y. Chen, “Reinforcement learning and deep learning based lateral control for autonomous driving,” IEEE Computational Intelligence Magazine, vol. 14, no. 2, pp. 83–98, 2019.
- [7] J. Wang, Q. Zhang, D. Zhao, and Y. Chen, “Lane change decision-making through deep reinforcement learning with rule-based constraints,” in 2019 International Joint Conference on Neural Networks. IEEE, 2019, pp. 1–6.
- [8] J. Chen, S. E. Li, and M. Tomizuka, “Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning,” IEEE Transactions on Intelligent Transportation Systems, 2021.
- [9] J. Bernhard, K. Esterle, P. Hart, and T. Kessler, “Bark: Open behavior benchmarking in multi-agent environments,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020. [Online]. Available: https://arxiv.org/pdf/2003.02604.pdf
- [10] E. Leurent, “An environment for autonomous driving decision-making,” https://github.com/eleurent/highway-env, 2018.
- [11] P. Cai, Y. Lee, Y. Luo, and D. Hsu, “Summit: A simulator for urban driving in massive mixed traffic,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 4023–4029.
- [12] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “CARLA: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning, 2017, pp. 1–16.
- [13] M. Elsayed, K. Hassanzadeh, N. M. Nguyen, M. Alban, X. Zhu, D. Graves, and J. Luo, “Ultra: A reinforcement learning generalization benchmark for autonomous driving,” 2020. [Online]. Available: https://ml4ad.github.io/files/papers2020/ULTRA:Areinforcementlearninggeneralizationbenchmarkforautonomousdriving.pdf
- [14] M. Zhou, J. Luo, J. Villella, Y. Yang, D. Rusu, J. Miao, W. Zhang, M. Alban, I. Fadakar, Z. Chen et al., “Smarts: Scalable multi-agent reinforcement learning training school for autonomous driving,” arXiv preprint arXiv:2010.09776, 2020.
- [15] China ITS Industry Alliance. (2020) Requirements of simulation scenario set for automated driving vehicle. [Online]. Available: http://www.ttbz.org.cn/StandardManage/Detail/38842/
- [16] E. Deelman, K. Vahi, G. Juve, M. Rynge, S. Callaghan, P. J. Maechling, R. Mayani, W. Chen, R. Ferreira da Silva, M. Livny, and K. Wenger, “Pegasus: a workflow management system for science automation,” Future Generation Computer Systems, vol. 46, pp. 17–35, 2015, funding Acknowledgements: NSF ACI SDCI 0722019, NSF ACI SI2-SSI 1148515 and NSF OCI-1053575. [Online]. Available: http://pegasus.isi.edu/publications/2014/2014-fgcs-deelman.pdf
- [17] S. Finch, “Ornstein-uhlenbeck process,” 2004.
- [18] G. Huang, M. Mandjes, and P. Spreij, “Limit theorems for reflected ornstein–uhlenbeck processes,” Statistica Neerlandica, vol. 68, no. 1, pp. 25–42, 2014.
- [19] A. Kesting, M. Treiber, and D. Helbing, “Enhanced intelligent driver model to access the impact of driving strategies on traffic capacity,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 368, no. 1928, pp. 4585–4605, 2010.
- [20] E. Leurent and J. Mercat, “Social attention for autonomous decision-making in dense traffic,” arXiv preprint arXiv:1911.12250, 2019.
- [21] T. Tram, A. Jansson, R. Grönberg, M. Ali, and J. Sjöberg, “Learning negotiating behavior between cars in intersections using deep q-learning,” in 2018 21st International Conference on Intelligent Transportation Systems. IEEE, 2018, pp. 3169–3174.
- [22] S. Fujimoto, H. Hoof, and D. Meger, “Addressing function approximation error in actor-critic methods,” in International Conference on Machine Learning. PMLR, 2018, pp. 1587–1596.
- [23] A. Tampuu, T. Matiisen, M. Semikin, D. Fishman, and N. Muhammad, “A survey of end-to-end driving: Architectures and training methods,” IEEE Transactions on Neural Networks and Learning Systems, 2020.