'Moving On' – Investigating Inventors' Ethnic Origins Using Supervised Learning

01/03/2022
by   Matthias Niggli, et al.

Patent data provides rich information about technical inventions, but does not disclose the ethnic origin of inventors. In this paper, I use supervised learning techniques to infer this information. To do so, I construct a dataset of 95'202 labeled names and train a recurrent neural network with long short-term memory (LSTM) to predict ethnic origins based on names. The trained network achieves an overall performance of 91% across 17 ethnic origins. I use this model to classify and investigate the ethnic origins of 2.68 million inventors and provide novel descriptive evidence regarding their ethnic origin composition over time and across countries and technological fields. The global ethnic origin composition has become more diverse over the last decades, mostly due to a relative increase of Asian-origin inventors. Furthermore, the prevalence of foreign-origin inventors is especially high in the USA, but has also increased in other high-income economies. In the USA, but not in other high-income countries, this increase was mainly driven by an inflow of non-western inventors into emerging high-technology fields.
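
To make the name-based prediction step more concrete, here is a minimal sketch of how a name could be turned into a fixed-length sequence of one-hot character vectors, the kind of input an LSTM classifier typically consumes. The abstract does not describe the actual encoding, so the alphabet, the maximum length, and the helper `encode_name` are all assumptions made for illustration only.

```python
import string

# Assumed alphabet (hypothetical): lowercase ASCII letters, space and hyphen.
ALPHABET = string.ascii_lowercase + " -"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MAX_LEN = 30  # assumed maximum name length; longer names are truncated


def encode_name(name, max_len=MAX_LEN):
    """Return a max_len x len(ALPHABET) one-hot matrix for a name.

    Characters outside ALPHABET are dropped, and short names are
    padded with all-zero rows at the end.
    """
    chars = [ch for ch in name.lower() if ch in CHAR_INDEX][:max_len]
    matrix = [[0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(chars):
        matrix[pos][CHAR_INDEX[ch]] = 1
    return matrix


encoded = encode_name("Matthias Niggli")
print(len(encoded), len(encoded[0]))  # sequence length, alphabet size
```

A matrix like this would be fed to the network one row (character) at a time, with a final softmax layer producing one probability per ethnic origin class. Names with accented or non-Latin characters would need a richer alphabet than the one assumed here.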


research
12/10/2014

Bach in 2014: Music Composition with Recurrent Neural Network

We propose a framework for computer music composition that uses resilien...
research
03/09/2015

Compositional Distributional Semantics with Long Short Term Memory

We are proposing an extension of the recursive neural network that makes...
research
07/10/2020

Artificial Neural Network Approach for the Identification of Clove Buds Origin Based on Metabolites Composition

This paper examines the use of artificial neural network approach in ide...
research
09/21/2022

Caught in the Crossfire: Fears of Chinese-American Scientists

The US leadership in science and technology has greatly benefitted from ...
research
11/24/2022

Data Origin Inference in Machine Learning

It is a growing direction to utilize unintended memorization in ML model...
research
09/11/2018

DeepProteomics: Protein family classification using Shallow and Deep Networks

The knowledge regarding the function of proteins is necessary as it give...
research
01/24/2022

A Two-phase Recommendation Framework for Consistent Java Method Names

In software engineering (SE) tasks, the naming approach is so important ...
