Identifying the Units of Measurement in Tabular Data

11/23/2021
by Taha Ceritli, et al.

We consider the problem of identifying the units of measurement in a data column that contains both numeric values and unit symbols in each row, e.g., "5.2 l", "7 pints". In this case we seek to identify the dimension of the column (e.g., volume) and relate the unit symbols to valid units (e.g., litre, pint) obtained from a knowledge graph. We present PUC, a Probabilistic Unit Canonicalizer that can accurately identify the units of measurement, extract semantic descriptions of quantitative data columns, and canonicalize their entries. We also present the first messy real-world tabular datasets annotated for units of measurement, which can enable and accelerate research in this area. Our experiments on these datasets show that PUC achieves better results than existing solutions.
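To make the task concrete, the following is a minimal Python sketch of the input and output it involves. It is not PUC: it replaces the probabilistic model with a simple majority vote and replaces the knowledge graph with a small hand-written lookup table. The names `UNIT_TABLE`, `parse_row`, and `infer_column` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny hand-written table standing in for a knowledge graph.
# Maps a lower-cased unit symbol to (canonical unit, dimension).
UNIT_TABLE = {
    "l": ("litre", "volume"),
    "litre": ("litre", "volume"),
    "litres": ("litre", "volume"),
    "ml": ("millilitre", "volume"),
    "pint": ("pint", "volume"),
    "pints": ("pint", "volume"),
    "km": ("kilometre", "length"),
    "cm": ("centimetre", "length"),
}

# A cell is expected to look like "<number> <symbol>", e.g. "5.2 l".
ROW_PATTERN = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_row(cell):
    """Split a cell into (numeric value, lower-cased unit symbol), or None."""
    match = ROW_PATTERN.match(cell)
    if match is None:
        return None
    return float(match.group(1)), match.group(2).lower()


def infer_column(cells):
    """Return (dimension, entries) for a column of raw strings.

    The dimension is chosen by majority vote over recognised symbols.
    Each entry is (value, canonical unit), or None when the cell cannot
    be parsed or its unit does not belong to the chosen dimension.
    """
    parsed = [parse_row(cell) for cell in cells]

    # Count how many rows support each dimension.
    votes = Counter()
    for item in parsed:
        if item is not None and item[1] in UNIT_TABLE:
            votes[UNIT_TABLE[item[1]][1]] += 1
    if not votes:
        return None, [None] * len(cells)
    dimension = votes.most_common(1)[0][0]

    # Canonicalize every row that agrees with the chosen dimension.
    entries = []
    for item in parsed:
        if item is None or item[1] not in UNIT_TABLE:
            entries.append(None)
            continue
        value, symbol = item
        unit, unit_dimension = UNIT_TABLE[symbol]
        entries.append((value, unit) if unit_dimension == dimension else None)
    return dimension, entries
```

For example, `infer_column(["5.2 l", "7 pints", "3 km"])` selects volume as the column dimension, canonicalizes the first two cells to litre and pint, and leaves the third as `None`. A lookup of this kind cannot resolve ambiguous symbols (for instance, "m" as metre or minute) or misspelled units, which is the sort of messiness a probabilistic approach is meant to handle.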
