Software Engineering Meets Deep Learning: A Literature Review

09/25/2019
by Fabio Ferreira, et al.

Deep learning (DL) is being used nowadays in many traditional software engineering (SE) problems and tasks, such as software documentation, defect prediction, and software testing. However, since the renaissance of DL techniques is still very recent, we lack works that summarize and condense the most recent and relevant research conducted in the intersection of DL and SE. Therefore, in this paper we describe the first results of a literature review covering 81 papers about DL & SE.


1 Introduction

Deep learning (DL) applications are increasingly important in many areas, such as automatic text translation [79], image recognition [41, 29], self-driving cars [2, 37], and smart cities [30, 59]. Furthermore, various frameworks—such as TensorFlow (https://www.tensorflow.org) and PyTorch (https://pytorch.org)—are available nowadays to facilitate the implementation of DL applications. Interestingly, software engineering (SE) researchers are also starting to explore the application of DL in traditional SE problems and areas, such as documentation [94, 55, 60], defect prediction [66, 72, 74, 33], and testing [57, 52, 47].

However, since the cross-pollination between DL & SE is very recent, we do not have a clear map of the research conducted at the intersection of these two areas. Such a map can help researchers interested in starting to apply DL to SE problems. It can also give researchers who already work with DL & SE a clear picture of similar research in the area. Finally, mapping the research conducted at the intersection of DL & SE might help practitioners and industrial organizations to better understand the problems, solutions, and opportunities that exist in this area.

In this article, we provide the first results of our ongoing effort to review and summarize the most recent and relevant literature about DL & SE. To this end, we collect and analyze 81 papers recently published in major SE conferences and journals. We show the growth in the number of papers about DL & SE over the years. We also reveal the most common problems recently tackled by such papers. Finally, we provide data on the most common DL techniques used by SE researchers.

2 Deep Learning in a Nutshell

Deep Learning (DL) is a subfield of Machine Learning (ML) that relies on multiple layers of Neural Networks (NN) to model high-level representations [21]. Similarly to traditional ML, DL techniques are suitable for classification, clustering, and regression problems. To better understand how DL differs from ML, suppose we are trying to classify which modules in a system are likely to be defective. If we decide to use conventional machine learning, we need a labeled dataset with relevant features able to distinguish defective from non-defective modules. To create this dataset, we usually apply several feature extraction approaches to derive meaningful features and then train our model. This is the key difference between traditional ML and DL techniques: while in traditional ML the features are handcrafted, with DL they are learned automatically by neural networks [44, 45, 22].
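
To make the contrast concrete, here is a minimal sketch in PyTorch (the metric names, vocabulary size, and layer sizes are our own illustrative assumptions, not taken from any reviewed paper): a traditional classifier would consume handcrafted module metrics, whereas a DL model consumes raw token IDs and learns its own representation.

```python
import torch
import torch.nn as nn

# Traditional ML input: handcrafted features, e.g., LOC, complexity, churn
# (a conventional classifier would be trained directly on these metrics).
handcrafted = torch.tensor([[250.0, 12.0, 3.0]])  # one module, three metrics

# DL input: raw token IDs extracted from the module's source code.
tokens = torch.randint(0, 5000, (1, 100))  # vocabulary of 5,000; 100 tokens

# The embedding layer learns a representation of the tokens end-to-end,
# replacing the manual feature extraction step.
feature_learner = nn.Sequential(
    nn.Embedding(5000, 32),
    nn.Flatten(),
    nn.Linear(100 * 32, 2),  # two classes: defective vs. non-defective
)
logits = feature_learner(tokens)  # shape: (1, 2)
```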

Currently, there are many types of NNs, such as Convolutional Neural Networks, Recurrent Neural Networks, Auto-Encoders, Generative Adversarial Networks, and Deep Reinforcement Learning [21]. In the following, we outline four common classes of NNs that are useful in several SE problems:

Multilayer Perceptrons (MLP): They are suitable for classification and regression prediction problems. MLPs can be adapted to different types of data, such as image, text, and time series data. In addition, when evaluating the performance of different algorithms on a particular problem, we can use MLP results as a baseline for comparison. Basically, MLPs consist of one or more layers of neurons: the input layer receives the data, the hidden layers provide levels of abstraction, and the output layer is responsible for making predictions.
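
As a minimal sketch (in PyTorch, with a hypothetical feature count and layer sizes of our own choosing), an MLP for the defect prediction scenario above could look as follows:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, n_features=20, n_hidden=64, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),  # input layer receives the data
            nn.ReLU(),
            nn.Linear(n_hidden, n_hidden),    # hidden layer: abstraction level
            nn.ReLU(),
            nn.Linear(n_hidden, n_classes),   # output layer makes predictions
        )

    def forward(self, x):
        return self.net(x)

# Classify a batch of 8 modules, each described by 20 handcrafted metrics.
model = MLP()
logits = model(torch.randn(8, 20))  # shape: (8, 2)
```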


Convolutional Neural Networks (CNN): Although they were designed for image recognition, we can use CNNs for other classification and regression prediction problems. They can also be adapted to different types of data, such as image, text, and sequence input data. In summary, the input layer in a CNN receives the data and the hidden layers are responsible for feature extraction. There are three types of layers in a CNN: convolutional layers, pooling layers, and fully-connected layers. A convolutional layer applies a filter to the input multiple times to build a feature map, and a pooling layer reduces the spatial size of the feature map. The CNN output can then feed, for instance, a fully-connected layer to create the model and make predictions.
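
A minimal sketch of this convolution/pooling/fully-connected pipeline, again in PyTorch and with input and layer sizes that are purely illustrative:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolution builds a feature map
            nn.ReLU(),
            nn.MaxPool2d(2),  # pooling reduces the spatial size of the feature map
        )
        self.classifier = nn.Linear(16 * 14 * 14, n_classes)  # fully-connected layer

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# E.g., a batch of 4 single-channel 28x28 inputs.
model = SmallCNN()
out = model(torch.randn(4, 1, 28, 28))  # shape: (4, 2)
```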

Recurrent Neural Networks (RNN): They are a specialized type of NN for sequence prediction problems, i.e., they are designed to receive historical sequence data and predict the next output value(s) in the sequence. The main difference with respect to a traditional MLP can be thought of as loops in the architecture: the hidden layers use not only the current input, but also the previously received inputs. Conceptually, this feedback loop adds memory to the network. The Long Short-Term Memory (LSTM) is a special type of RNN able to learn long-term dependencies. Indeed, LSTMs are among the most widely used RNNs, applied in many different settings with outstanding results [87, 62].
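
The sketch below (PyTorch; the sequence length, feature count, and hidden size are illustrative assumptions) shows an LSTM reading a historical sequence and predicting the next value:

```python
import torch
import torch.nn as nn

class NextValueLSTM(nn.Module):
    def __init__(self, n_features=1, n_hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, n_hidden, batch_first=True)
        self.head = nn.Linear(n_hidden, n_features)

    def forward(self, x):              # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)          # hidden state carries memory of past inputs
        return self.head(out[:, -1])   # predict from the last time step

# A 10-step history per sample -> the next value.
model = NextValueLSTM()
pred = model(torch.randn(8, 10, 1))    # shape: (8, 1)
```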

Hybrid Neural Network Architectures (HNN): They refer to architectures combining two or more types of NNs. Usually, CNNs and RNNs are used as layers in a wider model. As an example from industry, Google's translation service uses LSTM architectures [79].
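
A minimal hybrid sketch (our own illustrative composition, not taken from any reviewed paper), where convolutional layers extract per-step features and an LSTM models the resulting sequence:

```python
import torch
import torch.nn as nn

class HybridNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        # CNN stage: extracts local features along the sequence.
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # RNN stage: models the sequence of CNN features.
        self.lstm = nn.LSTM(8, 32, batch_first=True)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                      # x: (batch, 1, seq_len)
        f = self.cnn(x)                        # (batch, 8, seq_len // 2)
        out, _ = self.lstm(f.transpose(1, 2))  # LSTM over the CNN feature sequence
        return self.head(out[:, -1])

model = HybridNet()
out = model(torch.randn(4, 1, 16))  # shape: (4, 2)
```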

3 Methodology

To collect the papers, we searched for deep learn* in the following digital libraries: Scopus, ACM Digital Library, IEEE Xplore, Web of Science, SpringerLink, and Wiley Online Library. However, we only considered papers published in the software engineering conferences and journals indexed by CSIndexbr (https://csindexbr.org), a Computer Science indexing system [69]. CSIndexbr is considered a GOTO ranking (http://gotorankings.org) [1], i.e., an information system that provides good, transparent, open, and objective data about CS departments and institutions.

The software engineering venues listed by CSIndexbr are presented in Table 1. As can be observed, the system indexes 15 conferences and 12 journals in software engineering, including top conferences (ICSE, FSE, and ASE), top journals (IEEE TSE and ACM TOSEM), and also next-tier conferences (MSR, ICSME, ISSTA, etc.) and journals (EMSE, JSS, IST, etc.). CSIndexbr follows quantitative criteria, based on metrics such as h5-index and the number of submitted and accepted papers, to decide whether to index a conference or journal.

Acronym Name
ICSE Int. Conference on Software Engineering
FSE Foundations of Software Engineering
MSR Mining Software Repositories
ASE Automated Software Engineering
ISSTA Int. Symposium on Software Testing and Analysis
ICSME Int. Conference on Software Maintenance and Evolution
ICST Int. Conference on Software Testing, Verification and Validation
MODELS Int. Conference on Model Driven Engineering Languages and Systems
SANER Int. Conference on Software Analysis, Evolution and Reengineering
SPLC Systems and Software Product Line Conference
RE Int. Requirements Engineering Conference
FASE Fundamental Approaches to Software Engineering
ICPC Int. Conference on Program Comprehension
ESEM Int. Symposium on Empirical Software Engineering and Measurement
ICSA Int. Conference on Software Architecture
IEEE TSE IEEE Transactions on Software Engineering
ACM TOSEM ACM Transactions on Software Engineering and Methodology
JSS Journal of Systems and Software
IEEE Software IEEE Software
EMSE Empirical Software Engineering
SoSyM Software and Systems Modeling
IST Information and Software Technology
SCP Science of Computer Programming
SPE Software Practice and Experience
SQJ Software Quality Journal
JSEP Journal of Software: Evolution and Process
REJ Requirements Engineering Journal
Table 1: Venues

By searching for deep learn*, we found 141 papers in the conferences and journals listed in Table 1. The search was performed on September 15, 2019. Then, we removed papers with fewer than 10 pages, due to our decision to focus on full papers only. The only exception is papers published in IEEE Software (a magazine), for which we defined a threshold of six pages. By applying this size threshold, we eliminated 49 papers.

Then, we manually read the title and abstract of the remaining papers to confirm they indeed qualify as research that uses DL on SE-related problems. As a result, we eliminated 11 papers: 5 papers that are not related to SE (e.g., one paper that evaluates an “achievement-driven methodology to give students more control of their learning with enough flexibility to engage them in deeper learning”), two papers published in other tracks (one at ICSE-SEET and one at ICSE-SEIP), two papers that only mention deep learning in the abstract, and two papers that were superseded by a journal version, i.e., we discarded the conference version and only considered the extended version of the work. Our final dataset has 81 papers.

4 Results

4.1 Publication Date

In our data collection, we did not define an initial publication date for the candidate papers. Despite that, we found a single paper published in 2015; all other papers are from subsequent years, as illustrated in Figure 1. Although the year is not over, more papers have already been published in 2019 than in 2018, which shows an increasing interest in applying deep learning to software engineering.

Figure 1: Papers by year

4.2 Authors Affiliation

We found 12 papers (14.8%) with at least one author affiliated with industry. Microsoft Research has the highest number of papers (3), followed by Clova AI, Facebook, GrammaTech, Nvidia, Accenture, Fiat Chrysler, IBM, and Codeplay (each with a single paper).

4.3 Authors Country

Figure 2 shows the number of papers according to the authors' countries. Since papers can have authors from multiple countries, the sum is greater than 81 (the number of papers we reviewed in the study). Most papers have at least one author from China (33 papers), followed by the USA (31 papers) and Australia (16 papers). In total, we found authors from 20 countries.

Figure 2: Papers by authors' country

4.4 Publication Venues

Figure 3 shows the number of papers by publication venue. In our dataset, 61 papers are from conferences (75.3%) and 20 papers from journals (24.7%). ICSE and FSE account for most papers (24 papers, 29.6%). IEEE TSE is the journal with the highest number of papers (7 papers, 8.6%). We did not find papers about DL & SE in nine venues: MODELS, SPLC, RE, FASE, ICSA, ACM TOSEM, SoSyM, SCP, and SQJ.

Figure 3: Papers by venue

4.5 Research Problem

Regarding the investigated research problem, we classified the papers into three principal groups: (1) papers that investigate the usage of SE tools and techniques in the development of DL-based systems; (2) papers that propose the usage of DL-based techniques to solve SE-related problems; and (3) position papers or tutorials. Figure 4 summarizes our classification. The following subsections describe the papers in each group.

Figure 4: Papers by research problem

4.5.1 Using Software Engineering Techniques in Deep Learning-based Software

We classified 10 papers in this category (12.3%), including papers that adapt SE tools and techniques to DL-based software (8 papers) and papers that describe empirical studies of DL-based software (2 papers). Papers that apply SE to DL are mostly focused on solving particular problems that appear when testing DL-based software [61, 54, 82, 65, 40, 14]. However, we also found papers that describe quantitative metrics to assess DL-based software [13] and to support the deployment of DL-based software [8]. Finally, we found two empirical studies of DL-based software, both investigating the characteristics of the bugs reported in such systems [38, 89].

4.5.2 Using Deep Learning Techniques in Software Engineering Problems

The usage of DL in SE is concentrated in three main problems: documentation, testing, and defect prediction. We provide more details in the following paragraphs:

Documentation: This category has the highest number of papers (13 papers, 16%). Seven papers study problems associated with Stack Overflow questions and answers, including the usage of DL techniques to cluster related posts [55, 18, 83, 84], to recommend tags [94], to support cross-language post search, i.e., translating non-English queries to English before searching [55], and to extract API tips [73]. Furthermore, we found papers about the automatic generation of code comments [34], the automatic identification of source code fragments in videos [60, 91], the classification of JavaDoc-based documents [19], and source code summarization, i.e., using DL techniques to provide a high-level natural language description of the function performed by a code unit [70].

Testing: We found seven papers (8.6%) using DL in software testing, covering fuzzing [9, 20, 93], fault localization [90, 47], mutation testing [57], and testing of mobile apps [52].

Defect Prediction: We also found seven papers (8.6%) that use DL for defect prediction. Three papers use DL to extract semantic features directly from source code to improve defect prediction models [74, 72, 11]. Other papers also extract semantic features, but from commit descriptions [33] or commit sequences [75]. Finally, there are papers that investigate the usage of particular DL models, such as deep forests [95] and stacked denoising autoencoders [66].

Other research problems: Other important research problems handled using deep learning are code search [24, 23, 35, 3], security [10, 4, 28, 85], and software language modeling [78, 32, 68, 92]. The next most investigated research problems, with three papers each, are bug localization [42, 36, 80] and clone detection [46, 86, 76]. We also found two papers on each of the following problems: code smell detection [15, 51], mobile development [25, 26], program repair [67, 77], sentiment analysis [63, 50], and type inference [31, 56].

Finally, we found one paper about each of the following problems: anomaly detection [88], API migration [5], bug report summarization [48], decompilation [39], design pattern detection [64], duplicate bug detection [12], effort estimation [7], formal methods [43], program comprehension [58], software categorization [71], software maintenance [81], traceability [27], and UI design [6].

4.5.3 Position Papers

We classified three papers (3.7%) in this category, all published at IEEE Software. They describe the challenges and opportunities of using DL in automotive software [16, 17] or provide a quick tutorial on machine learning and DL [53].

4.6 Neural Network Techniques

Figure 5 shows a chart with the most common deep learning techniques used by the analyzed papers. The most common technique is Convolutional Neural Network (CNN) (18 papers, 22.2%), followed by Recurrent Neural Networks (RNN) (17 papers, 20.9%) and Hybrid Neural Networks (HNN) (12 papers, 14.8%).

Figure 5: Papers by deep learning technique

Table 2 shows the distribution of the DL techniques by research problem. As we can observe, RNNs are used in all problems except security and bug localization. Although CNNs appear in more papers (18), they are concentrated in only four problems (documentation, testing, bug localization, and clone detection).

Table 2: Neural network techniques by research problem (rows: documentation, defect prediction, testing, code search, security, software language modeling, bug localization, and clone detection; columns: CNN, RNN, HNN, LSTM, DBN, and MLP)

5 Related Work

We found that Li, Jiang, Ren, Li, and Zhang also provide an arXiv preprint describing a literature review on the usage of DL in SE [49]. However, they review papers published before March 2018, while we cover papers published before September 2019. This fact probably explains the difference regarding papers in top conferences: they report 14 papers at ICSE/FSE/ASE, whereas we report 30. Moreover, they only list papers from two journals (IST and Expert Systems with Applications), while we found papers in five journals and one magazine; consequently, we analyze, for example, seven papers published at IEEE TSE. By contrast, they consider a broader range of conferences, e.g., SEKE, QRS, and SNAPL. Finally, we provide an analysis of the neural networks used by the reviewed papers according to the research problem they investigate.

6 Conclusion

In this work, we analyzed 81 recent papers that apply DL techniques to SE problems or vice versa. Our main findings are as follows:

  • DL is gaining momentum among SE researchers. For example, 35 papers (43.2%) are from 2019, while only one is from 2015.

  • The authors of most papers are from China (33 papers) or USA (31 papers).

  • 12 papers (14.8%) have at least one author from industry.

  • The top-3 research problems tackled by the analyzed papers are documentation (13 papers), defect prediction (7 papers), and testing (7 papers).

  • The most common neural network types used in the analyzed papers are Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN).

The list of papers and the data analyzed in this work are available at: https://docs.google.com/spreadsheets/d/1wRIqYVh-qXEocfoup8A6O1OGaCmbcXHgnt_YZmD-e-Q/edit?usp=sharing

Acknowledgments

Our research is supported by FAPEMIG and CNPq.

References

  • [1] E. Berger, S. M. Blackburn, C. E. Brodley, H. V. Jagadish, K. S. McKinley, M. A. Nascimento, M. Shin, K. Wang, and L. Xie (2019) GOTO rankings considered helpful. Communications of the ACM 62 (7), pp. 29–30. Cited by: §3.
  • [2] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba (2016) End to end learning for self-driving cars. CoRR abs/1604.07316. Cited by: §1.
  • [3] J. Cambronero, H. Li, S. Kim, K. Sen, and S. Chandra (2019) When deep learning met code search. In FSE, pp. 964–974. Cited by: §4.5.2.
  • [4] C. Chen, W. Diao, Y. Zeng, S. Guo, and C. Hu (2018) DRLgencert: deep learning-based automated testing of certificate verification in SSL/TLS implementations. In ICSME, pp. 48–58. Cited by: §4.5.2.
  • [5] C. Chen, Z. Xing, Y. Liu, and K. L. X. Ong (2019) Mining likely analogical apis across third-party libraries via large-scale unsupervised API semantics embedding. IEEE TSE 14 (8), pp. 1–1. Cited by: §4.5.2.
  • [6] C. Chen, T. Su, G. Meng, Z. Xing, and Y. Liu (2018) From UI design image to GUI skeleton: a neural machine translator to bootstrap mobile GUI implementation. In ICSE, pp. 665–676. Cited by: §4.5.2.
  • [7] M. Choetkiertikul, H. K. Dam, T. Tran, T. Pham, A. Ghose, and T. Menzies (2019) A deep learning model for estimating story points. IEEE TSE 45 (7), pp. 637–656. Cited by: §4.5.2.
  • [8] E. D. Coninck, S. Bohez, S. Leroux, T. Verbelen, B. Vankeirsbilck, P. Simoens, and B. Dhoedt (2018) DIANNE: a modular framework for designing, training and deploying deep neural networks on heterogeneous distributed infrastructure. JSS 141 (1), pp. 52 – 65. Cited by: §4.5.1.
  • [9] C. Cummins, P. Petoumenos, A. Murray, and H. Leather (2018) Compiler fuzzing through deep learning. In ISSTA, pp. 95–105. Cited by: §4.5.2.
  • [10] H. K. Dam, T. Tran, T. T. M. Pham, S. W. Ng, J. Grundy, and A. Ghose (2018) Automatic feature learning for predicting vulnerable software components. IEEE TSE 14 (8), pp. 1–19. Cited by: §4.5.2.
  • [11] H. K. Dam, T. Pham, S. W. Ng, T. Tran, J. Grundy, A. Ghose, T. Kim, and C. Kim (2019) Lessons learned from using a deep tree-based model for software defect prediction in practice. In MSR, pp. 46–57. Cited by: §4.5.2.
  • [12] J. Deshmukh, A. K. M, S. Podder, S. Sengupta, and N. Dubash (2017) Towards accurate duplicate bug retrieval using deep learning techniques. In ICSME, pp. 115–124. Cited by: §4.5.2.
  • [13] X. Du, X. Xie, Y. Li, L. Ma, Y. Liu, and J. Zhao (2019) DeepStellar: model-based quantitative analysis of stateful deep learning systems. In FSE, pp. 477–487. Cited by: §4.5.1.
  • [14] A. Dwarakanath, M. Ahuja, S. Sikand, R. M. Rao, R. P. J. C. Bose, N. Dubash, and S. Podder (2018) Identifying implementation bugs in machine learning based image classifiers using metamorphic testing. In ISSTA, pp. 118–128. Cited by: §4.5.1.
  • [15] S. Fakhoury, V. Arnaoudova, C. Noiseux, F. Khomh, and G. Antoniol (2018) Keep it simple: is deep learning good for linguistic smell detection?. In SANER, pp. 602–611. Cited by: §4.5.2.
  • [16] F. Falcini, G. Lami, and A. M. Costanza (2017) Deep learning in automotive software. IEEE Software 34 (3), pp. 56–63. Cited by: §4.5.3.
  • [17] F. Falcini, G. Lami, and A. Mitidieri (2017) Yet another challenge for the automotive software: deep learning. IEEE Software, pp. 1–13. Cited by: §4.5.3.
  • [18] W. Fu and T. Menzies (2017) Easy over hard: a case study on deep learning. In FSE, pp. 49–60. Cited by: §4.5.2.
  • [19] D. Fucci, A. Mollaalizadehbahnemiri, and W. Maalej (2019) On using machine learning to identify knowledge in API reference documentation. In FSE, pp. 109–119. Cited by: §4.5.2.
  • [20] P. Godefroid, H. Peleg, and R. Singh (2017) Learn&Fuzz: machine learning for input fuzzing. In ASE, pp. 50–59. Cited by: §4.5.2.
  • [21] I. Goodfellow, Y. Bengio, and A. Courville (2016) Deep learning. MIT press. Cited by: §2, §2.
  • [22] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, X. Wang, and G. Wang (2015) Recent advances in convolutional neural networks. CoRR abs/1512.07108. Cited by: §2.
  • [23] X. Gu, H. Zhang, and S. Kim (2018) Deep code search. In ICSE, pp. 933–944. Cited by: §4.5.2.
  • [24] X. Gu, H. Zhang, D. Zhang, and S. Kim (2016) Deep API learning. In FSE, pp. 631–642. Cited by: §4.5.2.
  • [25] C. Guo, D. Huang, N. Dong, Q. Ye, J. Xu, Y. Fan, H. Yang, and Y. Xu (2019) Deep review sharing. In SANER, pp. 61–72. Cited by: §4.5.2.
  • [26] C. Guo, W. Wang, Y. Wu, N. Dong, Q. Ye, J. Xu, and S. Zhang (2019) Systematic comprehension for developer reply in mobile system forum. SANER, pp. 242–252. Cited by: §4.5.2.
  • [27] J. Guo, J. Cheng, and J. Cleland-Huang (2017) Semantically enhanced software traceability using deep learning techniques. In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), pp. 3–14. Cited by: §4.5.2.
  • [28] Z. Han, X. Li, Z. Xing, H. Liu, and Z. Feng (2017) Learning to predict severity of software vulnerability using only vulnerability description. In 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 125–136. Cited by: §4.5.2.
  • [29] K. He, X. Zhang, S. Ren, and J. Sun (2015) Deep residual learning for image recognition. CoRR abs/1512.03385. Cited by: §1.
  • [30] Y. He, F. R. Yu, N. Zhao, V. C. M. Leung, and H. Yin (2017) Software-defined networks with mobile edge computing and caching for smart cities: a big data deep reinforcement learning approach. IEEE Communications Magazine 55 (12), pp. 31–37. Cited by: §1.
  • [31] V. J. Hellendoorn, C. Bird, E. T. Barr, and M. Allamanis (2018) Deep learning type inference. In FSE, pp. 152–162. Cited by: §4.5.2.
  • [32] V. J. Hellendoorn and P. Devanbu (2017) Are deep neural networks the best choice for modeling source code?. In FSE, pp. 763–773. Cited by: §4.5.2.
  • [33] T. Hoang, H. Khanh Dam, Y. Kamei, D. Lo, and N. Ubayashi (2019) DeepJIT: an end-to-end deep learning framework for just-in-time defect prediction. In MSR, pp. 34–45. Cited by: §1, §4.5.2.
  • [34] X. Hu, G. Li, X. Xia, D. Lo, and Z. Jin (2018) Deep code comment generation. In ICPC, pp. 200–210. Cited by: §4.5.2.
  • [35] Q. Huang, Y. Yang, and M. Cheng (2019) Deep learning the semantics of change sequences for query expansion. Software: Practice and Experience, pp. 1–18. Cited by: §4.5.2.
  • [36] X. Huo, F. Thung, M. Li, D. Lo, and S. Shi (2019) Deep transfer bug localization. IEEE TSE, pp. 1–12. Cited by: §4.5.2.
  • [37] B. Huval, T. Wang, S. Tandon, J. Kiske, W. Song, J. Pazhayampallil, M. Andriluka, P. Rajpurkar, T. Migimatsu, R. Cheng-Yue, F. A. Mujica, A. Coates, and A. Y. Ng (2015) An empirical evaluation of deep learning on highway driving. CoRR abs/1504.01716. Cited by: §1.
  • [38] M. J. Islam, G. Nguyen, R. Pan, and H. Rajan (2019) A comprehensive study on deep learning bug characteristics. In FSE, pp. 510–520. Cited by: §4.5.1.
  • [39] D. S. Katz, J. Ruchti, and E. Schulte (2018) Using recurrent neural networks for decompilation. In SANER, pp. 346–356. Cited by: §4.5.2.
  • [40] J. Kim, R. Feldt, and S. Yoo (2019) Guiding deep learning system testing using surprise adequacy. In ICSE, pp. 1039–1049. Cited by: §4.5.1.
  • [41] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2017-05) ImageNet classification with deep convolutional neural networks. Communications of the ACM 60 (6), pp. 84–90. Cited by: §1.
  • [42] A. N. Lam, A. T. Nguyen, H. A. Nguyen, and T. N. Nguyen (2017) Bug localization with combination of deep learning and information retrieval. In ICPC, pp. 218–229. Cited by: §4.5.2.
  • [43] T. B. Le and D. Lo (2018) Deep specification mining. In ISSTA, pp. 106–117. Cited by: §4.5.2.
  • [44] Y. LeCun, Y. Bengio, and G. Hinton (2015) Deep learning. Nature 521 (7553), pp. 436. Cited by: §2.
  • [45] Y. LeCun, B. E. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. E. Hubbard, and L. D. Jackel (1990) Handwritten digit recognition with a back-propagation network. In Advances in Neural Information Processing Systems 2, D. S. Touretzky (Ed.), pp. 396–404. Cited by: §2.
  • [46] L. Li, H. Feng, W. Zhuang, N. Meng, and B. Ryder (2017) CCLearner: a deep learning-based clone detection approach. In ICSME, pp. 249–260. Cited by: §4.5.2.
  • [47] X. Li, W. Li, Y. Zhang, and L. Zhang (2019) DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. In ISSTA, pp. 169–180. Cited by: §1, §4.5.2.
  • [48] X. Li, H. Jiang, D. Liu, Z. Ren, and G. Li (2018) Unsupervised deep bug report summarization. In ICPC, pp. 144–155. Cited by: §4.5.2.
  • [49] X. Li, H. Jiang, Z. Ren, G. Li, and J. Zhang (2018) Deep learning in software engineering. arXiv abs/1805.04825. Cited by: §5.
  • [50] B. Lin, F. Zampetti, G. Bavota, M. Di Penta, M. Lanza, and R. Oliveto (2018) Sentiment analysis for software engineering: how far can we go?. In ICSE, pp. 94–104. Cited by: §4.5.2.
  • [51] H. Liu, J. Jin, Z. Xu, Y. Bu, Y. Zou, and L. Zhang (2019) Deep learning based code smell detection. IEEE TSE, pp. 1–28. Cited by: §4.5.2.
  • [52] P. Liu, X. Zhang, M. Pistoia, Y. Zheng, M. Marques, and L. Zeng (2017) Automatic text input generation for mobile testing. In ICSE, pp. 643–653. Cited by: §1, §4.5.2.
  • [53] P. Louridas and C. Ebert (2016) Machine learning. IEEE Software 33 (5), pp. 110–115. Cited by: §4.5.3.
  • [54] L. Ma, F. Juefei-Xu, F. Zhang, J. Sun, M. Xue, B. Li, C. Chen, T. Su, L. Li, Y. Liu, J. Zhao, and Y. Wang (2018) DeepGauge: multi-granularity testing criteria for deep learning systems. In ASE, pp. 120–131. Cited by: §4.5.1.
  • [55] S. Majumder, N. Balaji, K. Brey, W. Fu, and T. Menzies (2018) 500+ times faster than deep learning: a case study exploring faster methods for text mining StackOverflow. In MSR, pp. 554–563. Cited by: §1, §4.5.2.
  • [56] R. S. Malik, J. Patra, and M. Pradel (2019) NL2Type: inferring JavaScript function types from natural language information. In ICSE, pp. 304–315. Cited by: §4.5.2.
  • [57] D. Mao, L. Chen, and L. Zhang (2019) An extensive study on cross-project predictive mutation testing. In ICST, pp. 160–171. Cited by: §1, §4.5.2.
  • [58] Q. Mi, J. Keung, Y. Xiao, S. Mensah, and Y. Gao (2018) Improving code readability classification using convolutional neural networks. IST 104 (1), pp. 60–71. Cited by: §4.5.2.
  • [59] M. Mohammadi, A. Al-Fuqaha, M. Guizani, and J. Oh (2018) Semisupervised deep reinforcement learning in support of iot and smart city services. IEEE Internet of Things Journal 5 (2), pp. 624–635. Cited by: §1.
  • [60] J. Ott, A. Atchison, P. Harnack, A. Bergh, and E. Linstead (2018) A deep learning approach to identifying source code in images and video. In MSR, pp. 376–386. Cited by: §1, §4.5.2.
  • [61] H. V. Pham, T. Lutellier, W. Qi, and L. Tan (2019) CRADLE: cross-backend validation to detect and localize bugs in deep learning libraries. In ICSE, pp. 1027–1038. Cited by: §4.5.1.
  • [62] H. Sak, A. W. Senior, and F. Beaufays (2014) Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. CoRR abs/1402.1128. Cited by: §2.
  • [63] H. Sankar, V. Subramaniyaswamy, V. Vijayakumar, S. A. Kumar, R. Logesh, and A. Umamakeswari (2019) Intelligent sentiment analysis approach using edge computing-based deep learning technique. Software: Practice and Experience 0 (0), pp. 1–13. Cited by: §4.5.2.
  • [64] H. Thaller, L. Linsbauer, and A. Egyed (2019) Feature maps: a comprehensible software representation for design pattern detection. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 207–217. Cited by: §4.5.2.
  • [65] Y. Tian, K. Pei, S. Jana, and B. Ray (2018) DeepTest: automated testing of deep-neural-network-driven autonomous cars. In ICSE, pp. 303–314. Cited by: §4.5.1.
  • [66] H. Tong, B. Liu, and S. Wang (2018) Software defect prediction using stacked denoising autoencoders and two-stage ensemble learning. IST 96 (1), pp. 94 – 111. Cited by: §1, §4.5.2.
  • [67] M. Tufano, J. Pantiuchina, C. Watson, G. Bavota, and D. Poshyvanyk (2019) On learning meaningful code changes via neural machine translation. In ICSE, pp. 25–36. Cited by: §4.5.2.
  • [68] M. Tufano, C. Watson, G. Bavota, M. D. Penta, M. White, and D. Poshyvanyk (2018) Deep learning similarities from different representations of source code. In MSR, pp. 542–553. Cited by: §4.5.2.
  • [69] M. T. Valente and K. Paixao (2018) CSIndexbr: exploring the Brazilian scientific production in Computer Science. arXiv abs/1807.09266. Cited by: §3.
  • [70] Y. Wan, Z. Zhao, M. Yang, G. Xu, H. Ying, J. Wu, and P. S. Yu (2018) Improving automatic source code summarization via deep reinforcement learning. In ASE, pp. 397–407. Cited by: §4.5.2.
  • [71] C. Wang, X. Peng, M. Liu, Z. Xing, X. Bai, B. Xie, and T. Wang (2019) A learning-based approach for automatic construction of domain glossary from source code and documentation. In FSE, pp. 97–108. Cited by: §4.5.2.
  • [72] S. Wang, T. Liu, J. Nam, and L. Tan (2018) Deep semantic feature learning for software defect prediction. IEEE Transactions on Software Engineering, pp. 1–26. Cited by: §1, §4.5.2.
  • [73] S. Wang, N. Phan, Y. Wang, and Y. Zhao (2019) Extracting API tips from developer question and answer websites. In MSR, pp. 321–332. Cited by: §4.5.2.
  • [74] S. Wang, T. Liu, and L. Tan (2016) Automatically learning semantic features for defect prediction. In ICSE, pp. 297–308. Cited by: §1, §4.5.2.
  • [75] M. Wen, R. Wu, and S. C. Cheung (2018) How well do change sequences predict defects? sequence learning from software changes. IEEE Transactions on Software Engineering, pp. 1–20. Cited by: §4.5.2.
  • [76] M. White, M. Tufano, C. Vendome, and D. Poshyvanyk (2016) Deep learning code fragments for code clone detection. In ASE, pp. 87–98. Cited by: §4.5.2.
  • [77] M. White, M. Tufano, M. Martinez, M. Monperrus, and D. Poshyvanyk (2019) Sorting and transforming program repair ingredients via deep learning code similarities. In SANER, pp. 479–490. Cited by: §4.5.2.
  • [78] M. White, C. Vendome, M. Linares-Vásquez, and D. Poshyvanyk (2015) Toward deep learning software repositories. In MSR, pp. 334–345. Cited by: §4.5.2.
  • [79] Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. Johnson, X. Liu, L. Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, W. Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, O. Vinyals, G. Corrado, M. Hughes, and J. Dean (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. CoRR abs/1609.08144. Cited by: §1, §2.
  • [80] Y. Xiao, J. Keung, K. E. Bennin, and Q. Mi (2019) Improving bug localization with word embedding and enhanced convolutional neural networks. IST 105 (1), pp. 17 – 29. Cited by: §4.5.2.
  • [81] R. Xie, L. Chen, W. Ye, Z. Li, T. Hu, D. Du, and S. Zhang (2019) DeepLink: a code knowledge graph based deep learning approach for issue-commit link recovery. In SANER, pp. 434–444. Cited by: §4.5.2.
  • [82] X. Xie, L. Ma, F. Juefei-Xu, M. Xue, H. Chen, Y. Liu, J. Zhao, B. Li, J. Yin, and S. See (2019) DeepHunter: a coverage-guided fuzz testing framework for deep neural networks. In ISSTA, pp. 146–157. Cited by: §4.5.1.
  • [83] B. Xu, D. Ye, Z. Xing, X. Xia, G. Chen, and S. Li (2016) Predicting semantically linkable knowledge in developer online forums via convolutional neural network. In ASE, pp. 51–62. Cited by: §4.5.2.
  • [84] B. Xu, A. Shirani, D. Lo, and M. A. Alipour (2018) Prediction of relatedness in Stack Overflow: deep learning vs. svm: a reproducibility study. In ESEM, pp. 21:1–21:10. Cited by: §4.5.2.
  • [85] R. Yan, X. Xiao, G. Hu, S. Peng, and Y. Jiang (2018) New deep learning method to detect code injection attacks on hybrid applications. JSS 137 (1), pp. 67 – 77. Cited by: §4.5.2.
  • [86] H. Yu, W. Lam, L. Chen, G. Li, T. Xie, and Q. Wang (2019) Neural detection of semantic code clones via tree-based convolution. In ICPC, pp. 70–80. Cited by: §4.5.2.
  • [87] Y. Wang (2017) A new concept using LSTM neural networks for dynamic system identification. In 2017 American Control Conference (ACC), pp. 5324–5329. Cited by: §2.
  • [88] X. Zhang, Y. Xu, Q. Lin, B. Qiao, H. Zhang, Y. Dang, C. Xie, X. Yang, Q. Cheng, Z. Li, J. Chen, X. He, R. Yao, J. Lou, M. Chintalapati, F. Shen, and D. Zhang (2019) Robust log-based anomaly detection on unstable log data. In FSE, pp. 807–817. Cited by: §4.5.2.
  • [89] Y. Zhang, Y. Chen, S. Cheung, Y. Xiong, and L. Zhang (2018) An empirical study on TensorFlow program bugs. In ISSTA, pp. 129–140. Cited by: §4.5.1.
  • [90] Z. Zhang, Y. Lei, X. Mao, and P. Li (2019) CNN-FL: an effective approach for localizing faults using convolutional neural networks. In SANER, pp. 445–455. Cited by: §4.5.2.
  • [91] D. Zhao, Z. Xing, C. Chen, X. Xia, and G. Li (2019) ActionNet: vision-based workflow action recognition from programming screencasts. In ICSE, pp. 350–361. Cited by: §4.5.2.
  • [92] G. Zhao and J. Huang (2018) DeepSim: deep learning code functional similarity. In FSE, pp. 141–151. Cited by: §4.5.2.
  • [93] H. Zhao, Z. Li, H. Wei, J. Shi, and Y. Huang (2019) SeqFuzzer: an industrial protocol fuzzing framework from a deep learning perspective. In ICST, pp. 59–67. Cited by: §4.5.2.
  • [94] P. Zhou, J. Liu, X. Liu, Z. Yang, and J. Grundy (2019) Is deep learning better than traditional approaches in tag recommendation for software information sites?. IST 109 (1), pp. 1 – 13. Cited by: §1, §4.5.2.
  • [95] T. Zhou, X. Sun, X. Xia, B. Li, and X. Chen (2019) Improving defect prediction with deep forest. IST 114 (1), pp. 204 – 216. Cited by: §4.5.2.