
Semantic Search in Millions of Equations

by Katharina Morik, et al.

Given the growing number of publications, searching for relevant papers becomes tedious. In particular, search across disciplines or schools of thought is not supported. This is mainly because retrieval relies on keyword queries: technical terms differ between sciences and change over time. Relevant articles might be better identified by their mathematical problem descriptions. Just looking at the equations in a paper already gives a hint as to whether the paper is relevant. Hence, we propose a new approach for the retrieval of mathematical expressions based on machine learning. We design an unsupervised representation learning task that combines embedding learning with self-supervised learning. Using graph convolutional neural networks, we embed mathematical expressions into low-dimensional vector spaces that allow efficient nearest neighbor queries. To train our models, we collect a huge dataset with over 29 million mathematical expressions from over 900,000 publications. The math is converted into an XML format, which we view as graph data. Our empirical evaluations, involving a new dataset of manually annotated search queries, show the benefits of using embedding models for mathematical retrieval.
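The two mechanical ideas in the abstract, viewing XML-encoded math as graph data and answering queries by nearest neighbor search in an embedding space, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the MathML-style snippet, the helper names `xml_to_graph` and `nearest_neighbors`, and the hand-written 2-dimensional vectors that stand in for embeddings a trained graph convolutional network would produce. It is not the authors' pipeline.

```python
import xml.etree.ElementTree as ET

import numpy as np


def xml_to_graph(xml_string):
    """Turn an XML expression tree into (labels, edges).

    labels[i] is the tag of node i (plus its text, if any);
    edges holds (parent_index, child_index) pairs.
    """
    root = ET.fromstring(xml_string)
    labels, edges = [], []

    def visit(elem, parent):
        idx = len(labels)
        text = (elem.text or "").strip()
        labels.append(f"{elem.tag}:{text}" if text else elem.tag)
        if parent is not None:
            edges.append((parent, idx))
        for child in elem:
            visit(child, idx)

    visit(root, None)
    return labels, edges


def nearest_neighbors(query, embeddings, k=2):
    """Return indices of the k rows most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    return np.argsort(-scores)[:k].tolist()


# Hypothetical MathML-style encoding of "x + 1".
mathml = "<math><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></math>"
labels, edges = xml_to_graph(mathml)

# Toy stand-ins for learned expression embeddings (one row per expression).
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
top = nearest_neighbors(np.array([0.2, 1.0]), toy_embeddings, k=2)
```

In a real system, the graph would be fed to the embedding model and the brute-force similarity scan would typically be replaced by an approximate nearest neighbor index; the sketch only shows the shape of the data at each step.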




The Search for Equations - Learning to Identify Similarities between Mathematical Expressions

On your search for scientific articles relevant to your research questio...

CPS-MEBR: Click Feedback-Aware Web Page Summarization for Multi-Embedding-Based Retrieval

Embedding-based retrieval (EBR) is a technique to use embeddings to repr...

Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context

Mathematical formulae represent complex semantic information in a concis...

Semantic query-by-example speech search using visual grounding

A number of recent studies have started to investigate how speech system...

Math-Aware Search Engines: Physics Applications and Overview

Search engines for equations now exist, which return results matching th...