xPQA: Cross-Lingual Product Question Answering across 12 Languages

05/16/2023
by Xiaoyu Shen, et al.
Product Question Answering (PQA) systems are key in e-commerce applications, providing responses to customers' questions as they shop for products. While existing work on PQA focuses mainly on English, in practice there is a need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages across 9 branches, and report results on two tasks: (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. We evaluate various approaches involving machine translation at runtime or offline, leveraging multilingual pre-trained LMs, and including or excluding xPQA training data. We find that (1) in-domain data is essential, as cross-lingual rankers trained on other domains perform poorly on the PQA task; (2) candidate ranking often prefers runtime-translation approaches, while answer generation prefers multilingual approaches; and (3) translating offline to augment multilingual models helps candidate ranking mainly on languages with non-Latin scripts, and helps answer generation mainly on languages with Latin scripts. Still, there remains a significant performance gap between the English and the cross-lingual test sets.

research
10/22/2020

XOR QA: Cross-lingual Open-Retrieval Question Answering

Multilingual question answering tasks typically assume answers exist in ...
research
10/23/2020

Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering

Coupled with the availability of large scale datasets, deep learning arc...
research
09/07/2022

Improving the Cross-Lingual Generalisation in Visual Question Answering

While several benefits were realized for multilingual vision-language pr...
research
07/13/2023

MegaWika: Millions of reports and their sources across 50 diverse languages

To foster the development of new models for collaborative AI-assisted re...
research
02/26/2023

CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer

Recent advances in cross-lingual commonsense reasoning (CSR) are facilit...
research
07/31/2017

SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation

Semantic Textual Similarity (STS) measures the meaning similarity of sen...
research
10/22/2020

Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension

We propose a simple method to generate large amounts of multilingual que...
