Clones in Deep Learning Code: What, Where, and Why?

07/28/2021
by Hadhemi Jebnoun, et al.

Deep learning applications are becoming increasingly popular, and developers of deep learning systems strive to write more efficient code. These systems evolve constantly, which imposes tighter development timelines and increases complexity, and may in turn lead to bad design decisions. Copy-paste is widely used among deep learning developers because they rely on common frameworks and duplicate similar tasks, yet developers often fail to propagate changes to all clone fragments during maintenance. To our knowledge, no study has examined code cloning practices in deep learning development. Given the negative impacts of clones on software quality reported in studies of traditional systems, it is important to understand the characteristics and potential impacts of code clones in deep learning systems. To this end, we use the NiCad tool to detect clones in 59 Python, 14 C#, and 6 Java deep learning systems and in an equal number of traditional software systems. We then analyze the frequency and distribution of code clones in both kinds of systems, and further analyze their distribution using a location-based taxonomy. We also study the correlation between bugs and code clones to assess the impact of clones on the quality of the studied systems. Finally, we introduce a code clone taxonomy for deep learning programs and identify the development phases in which cloning carries the highest risk of faults. Our results show that code cloning is a frequent practice in deep learning systems and that deep learning developers often clone code from files in distant repositories in the system. In addition, we found that code cloning occurs more frequently during DL model construction, and that hyperparameter setting is the phase in which cloning is riskiest, since it often leads to faults.
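
To make the notion of a code clone concrete, the following is a minimal sketch of exact-clone detection at function granularity. It is not NiCad and does not reproduce the study's pipeline. It assumes Python sources, treats two functions as clones when their bodies have identical syntax trees, and uses hypothetical file names and helper names (`normalized_body_hash`, `find_exact_clones`) chosen only for illustration.

```python
import ast
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def normalized_body_hash(func: ast.FunctionDef) -> str:
    """Hash a function body by its syntax tree, ignoring layout and comments."""
    # ast.dump omits line/column attributes by default, so whitespace-only
    # and comment-only differences do not change the hash.
    body_dump = "".join(ast.dump(stmt) for stmt in func.body)
    return hashlib.sha256(body_dump.encode("utf-8")).hexdigest()


def find_exact_clones(sources: Dict[str, str]) -> List[List[Tuple[str, str]]]:
    """Group (filename, function name) pairs whose bodies are structurally identical."""
    groups = defaultdict(list)
    for filename, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.FunctionDef):
                groups[normalized_body_hash(node)].append((filename, node.name))
    # Only groups with more than one member are clone classes.
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical example: a hyperparameter-setting helper copied between scripts.
sources = {
    "train_a.py": "def build():\n    lr = 0.01\n    return lr\n",
    "train_b.py": "def make():\n    lr = 0.01  # copied\n    return lr\n",
    "util.py": "def other():\n    return 1\n",
}
clones = find_exact_clones(sources)
print(clones)
```

This sketch only catches bodies that are identical up to formatting and comments. Clones with renamed variables or small edits, which dedicated detectors also target, would need further normalization.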

