Introduction of a novel word embedding approach based on technology labels extracted from patent data

01/31/2021
by Mark Standke, et al.

Patent language is becoming more diverse, which makes finding synonyms for patent searches increasingly challenging. Moreover, most approaches for dealing with diverse patent language rely on manual search and human intuition. This paper introduces a word embedding approach that uses statistical analysis of human-labeled data to produce accurate, language-independent word vectors for technical terms. It focuses on explaining the idea behind the statistical analysis and presents first qualitative results. The resulting algorithm was developed by the former EQMania UG (eqmania.com) and can be tested at eqalice.com until April 2021.
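
The abstract does not spell out the statistical analysis, so the following is only a minimal sketch of the general idea of deriving word vectors from human-assigned technology labels, not the authors' algorithm. It assumes that each term can be represented by how often it occurs in documents carrying each label, and that terms from different languages become comparable because they share the same label space. The toy documents, the use of classification-style codes as labels, and the names `DOCS`, `label_vectors`, and `cosine` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (text, human-assigned technology labels).
# The label strings are illustrative stand-ins for patent classification codes.
DOCS = [
    ("electric motor rotor", ["H02K"]),
    ("elektromotor mit rotor", ["H02K"]),
    ("wind turbine rotor blade", ["F03D"]),
]


def label_vectors(docs):
    """Map each term to a unit-length vector of term/label co-occurrence counts."""
    counts = defaultdict(Counter)
    for text, labels in docs:
        # Count each term once per document, for every label on that document.
        for term in set(text.lower().split()):
            for label in labels:
                counts[term][label] += 1
    vectors = {}
    for term, label_counts in counts.items():
        norm = math.sqrt(sum(c * c for c in label_counts.values()))
        vectors[term] = {label: c / norm for label, c in label_counts.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (dot product)."""
    return sum(weight * v.get(label, 0.0) for label, weight in u.items())


vectors = label_vectors(DOCS)
same_field = cosine(vectors["motor"], vectors["elektromotor"])
other_field = cosine(vectors["motor"], vectors["turbine"])
print(same_field, other_field)
```

In this sketch, "motor" and "elektromotor" end up with identical vectors because they only occur under the same label, while "motor" and "turbine" share no label at all. A real system would need far more data and a more careful statistical treatment than raw counts.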

