
It Runs in the Family: Searching for Similar Names using Digitized Family Trees

by Aviad Elyashar, et al.

Searching for a person's name is a common online activity. However, web search engines return few accurate results for queries containing names. The underlying cause is that a given name can have multiple legitimate spelling variations, whereas regular text typically has a single correct spelling. Today, most techniques for suggesting related names rely on pattern matching and phonetic encoding, and they frequently perform poorly. Here, we propose a novel approach to the problem of similar name suggestion. Our algorithm combines historical data collected from genealogy websites with graph algorithms. In contrast to previous approaches, which suggest similar names based on encoded representations or patterns, we propose a general approach that suggests similar names based on the construction and analysis of family trees. Combining this historical information with network algorithms yields a large name-based graph that offers a great number of suggestions drawn from historical ancestors. Similar names are extracted from the graph using generic ordering functions, which outperform algorithms that suggest names based on a single dimension, a restriction that limits their performance. Utilizing a large-scale online genealogy dataset with over 17M profiles and more than 200K unique first names, we constructed a large name-based graph. Using this graph along with 7,399 labeled given names and their true synonyms, we evaluated our approach and showed that it outperforms other algorithms, including phonetic and string similarity algorithms, in terms of accuracy, F1, and precision. We suggest our algorithm as a useful tool for suggesting similar names based on a name-based graph constructed from family trees.
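The general idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how edges are formed or which ordering functions are used, so everything below is an assumption made for illustration. The sketch assumes that each family-tree record can be reduced to (ancestor first name, descendant first name) pairs, links those names in an undirected graph, and ranks candidates by edit distance first and by graph distance (number of hops) second. The function names `build_name_graph` and `suggest_similar_names`, the `max_hops` and `top_k` parameters, and the sample names are all hypothetical.

```python
from collections import defaultdict, deque


def build_name_graph(family_links):
    """Build an undirected name graph from (ancestor, descendant) first-name pairs.

    Assumption: edge weights count how often two names co-occur in a lineage.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for ancestor, descendant in family_links:
        a, d = ancestor.lower(), descendant.lower()
        if a == d:
            continue  # identical names carry no "similar name" information
        graph[a][d] += 1
        graph[d][a] += 1
    return graph


def edit_distance(s, t):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]


def suggest_similar_names(graph, name, max_hops=2, top_k=5):
    """Collect names within max_hops of `name`, then order them.

    Assumed ordering: edit distance first, hop count second, name as tiebreak.
    """
    name = name.lower()
    if name not in graph:
        return []
    hops = {name: 0}
    queue = deque([name])
    while queue:  # breadth-first search bounded by max_hops
        current = queue.popleft()
        if hops[current] == max_hops:
            continue
        for neighbor in graph[current]:
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    candidates = [n for n in hops if n != name]
    candidates.sort(key=lambda n: (edit_distance(name, n), hops[n], n))
    return candidates[:top_k]


# Toy data, invented purely for illustration.
links = [("Elizabeth", "Elisabeth"), ("Elisabeth", "Eliza"),
         ("Elizabeth", "John"), ("John", "Jon")]
graph = build_name_graph(links)
print(suggest_similar_names(graph, "Elizabeth", top_k=2))
```

Combining a string measure with a graph measure in the sort key is one simple way to avoid ranking on a single dimension; other orderings (for example, weighting by edge counts) would fit the same structure.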



