ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?

12/04/2020
by Philipp Scharpf, et al.

The zbMATH database contains more than 4 million bibliographic entries. We aim to provide easy access to these entries and therefore maintain different index structures, including a formula index. To optimize the findability of the entries in our database, we continuously investigate new approaches to satisfy the information needs of our users. We believe that the findings from the ARQMath evaluation will generate new insights into which index structures are most suitable to satisfy mathematical information needs. Search engines, recommender systems, plagiarism checking software, and many other value-added services acting on databases such as the arXiv and zbMATH need to combine natural and formula language. One initial approach to address this challenge is to enrich the mostly unstructured document data via Entity Linking. The ARQMath Task at CLEF 2020 aims to tackle the problem of linking newly posted questions from Math Stack Exchange (MSE) to existing ones that were already answered by the community. To deeply understand MSE information needs, answer types, and formula types, we performed manual runs for tasks 1 and 2. Furthermore, for task 2, we explored several formula retrieval methods, such as fuzzy string search, k-nearest neighbors, and our recently introduced approach to retrieve Mathematical Objects of Interest (MOI) with textual search queries. The task results show that neither our automated methods nor our manual runs achieved good scores in the competition. However, the perceived quality of the hits returned by the MOI search particularly motivates us to conduct further research on MOI.

research
11/12/2022

Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling

The increasing number of questions on Question Answering (QA) platforms ...
research
04/11/2021

Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation

Mathematical information retrieval (MathIR) applications such as semanti...
research
03/03/2023

Discovery and Recognition of Formula Concepts using Machine Learning

Citation-based Information Retrieval (IR) methods for scientific documen...
research
02/07/2020

Discovering Mathematical Objects of Interest – A Study of Mathematical Notations

Mathematical notation, i.e., the writing system used to communicate conc...
research
04/13/2018

Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

Mathematical formulae represent complex semantic information in a concis...
research
12/05/2017

One for All: Towards Language Independent Named Entity Linking

Entity linking (EL) is the task of disambiguating mentions in text by as...
