Machop: an End-to-End Generalized Entity Matching Framework

06/10/2022
by   Jin Wang, et al.

Real-world applications frequently need to solve a general form of the Entity Matching (EM) problem in order to find associated entities. Such scenarios include matching jobs to candidates in job targeting, matching students with courses in online education, matching products with user reviews on e-commerce websites, and beyond. These tasks impose new requirements, such as matching data entries with diverse formats or supporting a flexible and semantics-rich matching definition, which go beyond current EM task formulations and approaches. In this paper, we introduce the problem of Generalized Entity Matching (GEM), which satisfies these practical requirements, and present an end-to-end pipeline, Machop, as the solution. Machop allows end-users to define new matching tasks from scratch and apply them to new domains in a step-by-step manner. Machop casts the GEM problem as sequence pair classification so as to utilize the language understanding capability of Transformer-based language models (LMs) such as BERT. Moreover, it features a novel external knowledge injection approach with structure-aware pooling methods that allow domain experts to guide the LM to focus on the key matching information, further contributing to the overall performance. Our experiments and case studies on real-world datasets from a popular recruiting platform show a significant 17.1 score against state-of-the-art methods along with meaningful matching results that are human-understandable.
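The idea of casting matching as sequence pair classification can be illustrated with a minimal sketch of how two entries in different formats might be turned into a single BERT-style input. Everything below is an assumption made for illustration, not Machop's actual code: the function names `serialize` and `to_sequence_pair` are hypothetical, the `[COL]`/`[VAL]` markers follow the serialization convention used by Ditto-style systems, and the job and resume data are made up.

```python
from typing import Any, Mapping, Union

# An entry is either free text or a flat attribute-value record (assumption).
Entry = Union[str, Mapping[str, Any]]


def serialize(entry: Entry) -> str:
    """Flatten an entry into one whitespace-normalized string.

    Structured records are written as "[COL] name [VAL] value" segments;
    this marker scheme is an illustrative assumption.
    """
    if isinstance(entry, str):
        return " ".join(entry.split())
    parts = []
    for key, value in entry.items():
        text = " ".join(str(value).split())
        parts.append(f"[COL] {key} [VAL] {text}")
    return " ".join(parts)


def to_sequence_pair(left: Entry, right: Entry) -> str:
    """Join two serialized entries in the BERT sequence-pair layout."""
    return f"[CLS] {serialize(left)} [SEP] {serialize(right)} [SEP]"


# Hypothetical example: a structured job posting and a free-text resume line.
job = {"title": "Data Engineer", "location": "Austin, TX"}
resume = "Built  ETL pipelines in Python and Spark."
pair = to_sequence_pair(job, resume)
print(pair)
```

In practice, a tokenizer for the chosen LM would normally add the special tokens itself when given the two serialized strings as a pair, and a binary classifier over the resulting representation would predict match or non-match. The explicit string layout here only shows what the model input conceptually looks like.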

research
04/01/2020

Deep Entity Matching with Pre-Trained Language Models

We present Ditto, a novel entity matching system based on pre-trained Tr...
research
06/15/2021

Machamp: A Generalized Entity Matching Benchmark

Entity Matching (EM) refers to the problem of determining whether two di...
research
08/02/2023

MultiEM: Efficient and Effective Unsupervised Multi-Table Entity Matching

Entity Matching (EM), which aims to identify all entity pairs referring ...
research
06/08/2021

Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

Entity Matching (EM) aims at recognizing entity records that denote the ...
research
01/12/2022

CompanyName2Vec: Company Entity Matching Based on Job Ads

Entity Matching is an essential part of all real-world systems that take...
research
07/26/2018

General Context-Aware Data Matching and Merging Framework

Due to numerous public information sources and services, many methods to...
research
05/12/2022

Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction

Entity matching (EM) is the most critical step for entity resolution (ER...
