Informed Machine Learning, Centrality, CNN, Relevant Document Detection, Repatriation of Indigenous Human Remains

03/25/2023
by   Md Abul Bashar, et al.

Among the pressing issues facing Australian and other First Nations peoples is the repatriation of the bodily remains of their ancestors, which are currently held in Western scientific institutions. Securing the return of these remains to their communities for reburial depends largely on locating information within scientific and other literature, published between 1790 and 1970, that documents their theft, donation, sale, or exchange between institutions. This article reports on collaborative research by data scientists and social science researchers in the Research, Reconcile, Renew Network (RRR) to develop and apply text mining techniques to identify this vital information. We describe our work to date on developing a machine learning-based solution to automate the process of finding and semantically analysing relevant texts. Classification models, particularly deep learning-based models, are known to have low accuracy when trained with only a small number of labelled (i.e. relevant/non-relevant) documents. To improve the accuracy of our detection model, we explore the use of an Informed Neural Network (INN) model that describes documentary content using expert-informed contextual knowledge. Only a few labelled documents are used to provide specificity to the model, using conceptually related keywords identified by RRR experts in provenance research. The results confirm the value of using an INN model for identifying relevant documents related to the investigation of the global commercial trade in Indigenous human remains. Empirical analysis suggests that this INN model can be generalized for use by other researchers in the social sciences and humanities who want to extract relevant information from large textual corpora.
