Quality-aware skill translation models for expert finding on StackOverflow

07/16/2019
by Arash Dargahi Nobari, et al.

StackOverflow has emerged as a resource for talent recognition in recent years. While users write in technical language on StackOverflow, recruiters search for relevant candidates using their own terminology. This creates a vocabulary gap between the terms recruiters use and the terms candidates use. Because of this gap, state-of-the-art expert finding models cannot effectively address the expert finding problem on StackOverflow. We propose two translation models to bridge this gap. The first is a statistical method and the second is based on word embeddings. Several translations of a given query are used during the scoring step, and the results of the intermediate queries are blended to obtain the final ranking. We also propose a new approach that takes the quality of documents into account in the scoring step. We report several observations that illustrate the effectiveness of the translation approaches and of the quality-aware scoring approach. Our experiments indicate the following. First, although the statistical and word embedding translation approaches provide different translations for each query, both considerably improve recall. Second, the quality-aware scoring approach substantially improves precision. Finally, our best proposed method improves the MAP measure by up to 46% on average compared with the state-of-the-art expert finding approach.
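The abstract describes blending the results of several translated queries and weighting documents by quality. The following is only a minimal sketch of that general idea, not the authors' model: it assumes each translation carries a weight, each document carries a quality value in [0, 1], and a candidate's score is a plain weighted sum of term frequencies. The function name `rank_experts`, the field names `author`, `tokens`, and `quality`, and all numbers are hypothetical.

```python
from collections import defaultdict


def rank_experts(translations, documents):
    """Rank authors for a query that was expanded into weighted translations.

    translations: dict mapping a translated term to its weight (assumed given).
    documents: list of dicts with hypothetical keys 'author', 'tokens', 'quality'.
    Illustrative weighted sum only; the paper's actual scoring may differ.
    """
    scores = defaultdict(float)
    for term, weight in translations.items():
        for doc in documents:
            tokens = doc["tokens"]
            if not tokens:
                continue
            # Relative frequency of the translated term in this document.
            tf = tokens.count(term) / len(tokens)
            # Blend: translation weight times document quality times term evidence.
            scores[doc["author"]] += weight * doc["quality"] * tf
    # Highest blended score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: a recruiter query translated into two technical terms.
translations = {"java": 0.6, "spring": 0.4}
documents = [
    {"author": "alice", "tokens": ["spring", "boot", "config", "java"], "quality": 0.9},
    {"author": "bob", "tokens": ["java", "java", "loop", "bug"], "quality": 0.2},
    {"author": "carol", "tokens": ["python", "list"], "quality": 1.0},
]
ranking = rank_experts(translations, documents)
```

In this toy data, the low quality value attached to the second author's post outweighs its higher term frequency, which is the kind of effect a quality-aware scoring step is meant to capture.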


