Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching

02/23/2022
by Geewook Kim, et al.

Semi-structured query systems for document-oriented databases have many real-world applications. One application of particular interest is matching each financial receipt image with its corresponding place of interest (POI, e.g., a restaurant) in a nationwide database. The problem is especially challenging in a real production environment, where the database contains many similar or incomplete entries and queries are noisy (e.g., because of errors in optical character recognition). In this work, we aim to address the practical challenges of using embedding-based retrieval for the query grounding problem in semi-structured data. Leveraging recent advances in deep language encoding for retrieval, we conduct extensive experiments to find the most effective combination of modules for embedding and retrieving both queries and database entries without any manually engineered component. The proposed model significantly outperforms the conventional manual pattern-based model while requiring much lower development and maintenance costs. We also discuss some core observations from our experiments, which could be helpful for practitioners working on similar problems in other domains.
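
To make the idea of embedding-based query grounding concrete, the following is a minimal sketch and not the authors' model. It assumes a toy encoder (hashed character trigrams) in place of the deep language encoders the abstract refers to, and all names (`serialize`, `embed`, `cosine`, `retrieve`), the vector size `DIM`, and the sample records are hypothetical. It only illustrates the overall shape: flatten each semi-structured record to text, embed the query and the database entries in the same space, and return the nearest entries.

```python
import math
import zlib
from collections import Counter

DIM = 2 ** 16  # hypothetical size of the hashed feature space


def serialize(record: dict) -> str:
    """Flatten a semi-structured record into one string, skipping empty fields."""
    return " | ".join(f"{k}: {v}" for k, v in sorted(record.items()) if v)


def embed(text: str, dim: int = DIM) -> dict:
    """Toy stand-in for a learned encoder: L2-normalized hashed character trigrams."""
    padded = f"  {text.lower()}  "
    counts = Counter(
        zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim
        for i in range(len(padded) - 2)
    )
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {idx: c / norm for idx, c in counts.items()}


def cosine(a: dict, b: dict) -> float:
    """Dot product of two sparse, already normalized vectors."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def retrieve(query: dict, database: list, top_k: int = 1) -> list:
    """Return the top_k (score, entry) pairs ranked by cosine similarity."""
    q = embed(serialize(query))
    scored = [(cosine(q, embed(serialize(entry))), entry) for entry in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up POI entries; the third has a missing field, as incomplete entries would.
database = [
    {"name": "Blue Bottle Coffee", "address": "12 Main St"},
    {"name": "Golden Dragon Noodles", "address": "88 Harbor Rd"},
    {"name": "Blue Bottle Cafe", "address": ""},
]

# Made-up receipt query with OCR-style noise in the name.
query = {"name": "Blue Bottl Cofee", "address": "12 Main St"}

best_score, best_entry = retrieve(query, database, top_k=1)[0]
print(best_entry["name"], round(best_score, 3))
```

Character n-grams are used here only because they tolerate small OCR errors without any training. In a production system, `embed` would be replaced by a trained encoder, and the linear scan in `retrieve` by an approximate nearest-neighbor index.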


