Code Search Intent Classification Using Weak Supervision

11/24/2020
by Nikitha Rao, et al.

Developers use search for various tasks such as finding code, documentation, and debugging information. In particular, developers rely heavily on web search to find code examples and snippets while coding. Natural-language-based code search has recently been an active area of research; however, the lack of large-scale real-world datasets is a significant bottleneck. In this work, we propose a weak-supervision-based approach for detecting code search intent in search queries for the C# and Java programming languages. We evaluate the approach against several baselines on a real-world dataset comprising over 1 million queries mined from the Bing web search engine and show that the CNN-based model can achieve an accuracy of 77 and Java respectively. Furthermore, we are releasing the first large-scale real-world dataset of code search queries mined from the Bing web search engine. We hope that the dataset will aid future research on code search.


