Code search is the most frequent developer activity in the software development process (Caitlin15). Reusable code examples help improve the efficiency of developers during development (Brandt09; Shuai2020). Given a natural language query that describes the developer’s intent, the goal of code search is to find the most relevant code snippet in a large source code corpus.
Many code search engines have been developed. They mainly rely on traditional information retrieval (IR) techniques such as keyword matching (Meili15) or a combination of text similarity and Application Programming Interface (API) matching (Lv15). Recently, many works have applied deep learning methods (he2016deep; ChoMGBBSB14; wang2019tag2gauss; wang2019tag2vec; yang2020domain) to code search (Gu2018; Cambronero2019; Yan2020; Li2020; Feng2020; Zhu2020; Shuai2020; Ye2020; Haldar2020; Ling2020; Ling2020a; wang2020cocogum), using neural networks to capture deep semantic correlations between natural language queries and code snippets, and have achieved promising performance improvements. These methods employ various types of model structures, including sequential models (Gu2018; Cambronero2019; Yan2020; Li2020; Feng2020; Zhu2020; Shuai2020; Ye2020; Haldar2020), graph models (Ling2020; Guo2020), and transformers (Feng2020).
Existing deep learning code search methods mainly use a single model to represent queries and code snippets. However, code may carry diverse information along different dimensions, such as business logic, specific algorithms, and hardware communication, making it hard for a single code representation module to cover all of these perspectives. Similarly, because a specific query may focus on several perspectives, it is difficult for a single query representation module to capture different user intents.