Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling

11/12/2022
by Philipp Scharpf, et al.

The increasing number of questions on Question Answering (QA) platforms like Math Stack Exchange (MSE) signals a growing need for answers to math-related questions. However, there is currently very little research on open data QA systems that retrieve mathematical formulae by their concept names or by querying formula identifier relationships from knowledge graphs. In this paper, we aim to bridge this gap by presenting data mining methods and benchmark results that employ Mathematical Entity Linking (MathEL) and Unsupervised Formula Labeling (UFL) for semantic formula search and mathematical question answering (MathQA) on the arXiv preprint repository, Wikipedia, and Wikidata, which is part of the Wikimedia ecosystem of free knowledge. Based on different types of information needs, we evaluate our system in 15 information need modes, assessing over 7,000 query results. Furthermore, we compare its performance to a commercial knowledge base and calculation engine (Wolfram Alpha) and a search engine (Google). The open source system is hosted by Wikimedia at https://mathqa.wmflabs.org. A demo video is available at purl.org/mathqa.
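
To make the idea of retrieving a formula by its concept name more concrete, the following is a minimal sketch, not the system's actual implementation. It assumes that the formula is stored on Wikidata under the "defining formula" property (P2534) and that the concept name exactly matches an item label. The function names build_formula_query and build_request_url are hypothetical, and the code only constructs the query and request URL without sending anything.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
# Assumption: formulae are read from the Wikidata property "defining formula".
DEFINING_FORMULA = "P2534"


def build_formula_query(concept_name: str, language: str = "en") -> str:
    """Build a SPARQL query that looks up formulae by an exact item label.

    Hypothetical helper for illustration; a real system would likely add
    alias matching and ranking on top of this.
    """
    # Escape backslashes and double quotes so the name stays a valid
    # SPARQL string literal.
    escaped = concept_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?formula WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{language} ;\n'
        f"        wdt:{DEFINING_FORMULA} ?formula .\n"
        "}\n"
        "LIMIT 5"
    )


def build_request_url(concept_name: str) -> str:
    """Return a GET URL for the query, asking for JSON results."""
    params = {"query": build_formula_query(concept_name), "format": "json"}
    return f"{WIKIDATA_SPARQL_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the query text only; no network request is made here.
    print(build_formula_query("Pythagorean theorem"))
```

Exact label matching is the simplest possible lookup and is case-sensitive. Linking a free-text question to the right item is the harder part that entity linking is meant to address.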

research
06/28/2019

Introducing MathQA – A Math-Aware Question Answering System

We present an open source math-aware Question Answering System based on ...
research
04/11/2021

Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation

Mathematical information retrieval (MathIR) applications such as semanti...
research
12/04/2020

ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?

The zbMATH database contains more than 4 million bibliographic entries. ...
research
06/21/2023

CompMix: A Benchmark for Heterogeneous Question Answering

Fact-centric question answering (QA) often requires access to multiple, ...
research
03/25/2023

Thistle: A Vector Database in Rust

We present Thistle, a fully functional vector database. Thistle is an en...
research
07/01/2019

Katecheo: A Portable and Modular System for Multi-Topic Question Answering

We introduce a modular system that can be deployed on any Kubernetes clu...
research
06/13/2020

Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

Training and refreshing a web-scale Question Answering (QA) system for a...
