Cross-lingual Short-text Matching with Deep Learning

11/13/2018
by Asmelash Teka Hadgu, et al.

The problem of short-text matching is formulated as follows: given a pair of sentences or questions, a matching model determines whether the two inputs have the same meaning. Models that can automatically identify questions with the same meaning have a wide range of applications in question-answering sites and modern chatbots. In this article, we describe the approach taken by team hahu to solve this problem in the context of the "CIKM AnalytiCup 2018 - Cross-lingual Short-text Matching of Question Pairs" contest, sponsored by Alibaba. Our solution is an end-to-end system based on current advances in deep learning; it avoids heavy feature engineering and achieves improved performance over traditional machine-learning approaches. The log-loss scores for the first and second rounds of the contest are 0.35 and 0.39, respectively. The team was ranked 7th out of 1027 teams in the organizers' overall ranking, which combined the two contest scores with innovation, system integrity, understanding of the data, and practicality of the solution for business.
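For readers unfamiliar with the metric, the log-loss scores quoted in the abstract can be read against the minimal sketch below. It assumes the standard binary cross-entropy averaged over question pairs; the contest's actual scoring script is not given here, and the function name `log_loss` and the clipping constant `eps` are illustrative choices, not part of the original work.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy over question pairs.

    labels: 1 if the pair has the same meaning, else 0.
    probs:  predicted probability that the pair has the same meaning.
    eps:    clipping constant (an assumption) to avoid log(0).
    """
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and equal in length")
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip the prediction so the logarithm stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Hypothetical predictions for three question pairs.
score = log_loss([1, 0, 1], [0.9, 0.2, 0.6])
```

Lower is better: a model that always predicts 0.5 scores ln 2 (about 0.693) under this definition, which gives a baseline for interpreting scores such as 0.35 and 0.39.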


research
05/26/2020

A Study of Neural Matching Models for Cross-lingual IR

In this study, we investigate interaction-based neural matching models f...
research
05/23/2023

Evaluating and Modeling Attribution for Cross-Lingual Question Answering

Trustworthy answer content is abundant in many high-resource languages a...
research
11/06/2022

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space

With the recent developments in cross-lingual Text-to-Speech (TTS) syste...
research
01/05/2018

aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

As an alternative to question answering methods based on feature enginee...
research
07/31/2017

SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation

Semantic Textual Similarity (STS) measures the meaning similarity of sen...
research
06/25/2021

ParaLaw Nets – Cross-lingual Sentence-level Pretraining for Legal Text Processing

Ambiguity is a characteristic of natural language, which makes expressio...
research
10/08/2019

TraffickCam: Explainable Image Matching For Sex Trafficking Investigations

Investigations of sex trafficking sometimes have access to photographs o...
