Transferring Knowledge from Text to Predict Disease Onset

08/06/2016
by Yun Liu, et al.

In many domains, such as medicine, training data is in short supply. In such cases, external knowledge is often helpful in building predictive models. We propose a novel method to incorporate publicly available domain expertise to build accurate models. Specifically, we use word2vec models trained on a domain-specific corpus to estimate the relevance of each feature's text description to the prediction problem. We use these relevance estimates to rescale the features, causing more important features to experience weaker regularization. We apply our method to predict the onset of five chronic diseases in the next five years in two genders and two age groups. Our rescaling approach improves the accuracy of the model, particularly when there are few positive examples. Furthermore, our method selects 60% fewer features, easing interpretation by physicians. Our method is applicable to other domains where feature and outcome descriptions are available.
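
The rescaling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes toy 2-d vectors in place of trained word2vec embeddings, uses cosine similarity with a small positive floor as the relevance estimate, and swaps in closed-form ridge regression as the regularized model because it makes the effect easy to verify. All function names are hypothetical. The sketch checks that multiplying feature j by its relevance r_j before fitting gives the same coefficients as leaving the features unscaled and penalizing coefficient j by lambda / r_j^2, so more relevant features are regularized less.

```python
import numpy as np

def cosine_relevance(feature_vecs, outcome_vec, floor=0.05):
    """Cosine similarity of each feature embedding to the outcome embedding,
    clipped to [floor, 1] so every relevance is strictly positive (assumption)."""
    f = feature_vecs / np.linalg.norm(feature_vecs, axis=1, keepdims=True)
    o = outcome_vec / np.linalg.norm(outcome_vec)
    return np.clip(f @ o, floor, 1.0)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rescaled_ridge_fit(X, y, lam, relevance):
    """Scale each feature by its relevance, fit ordinary ridge, then map the
    coefficients back to the original feature scale."""
    w_scaled = ridge_fit(X * relevance, y, lam)
    return w_scaled * relevance

def weighted_ridge_fit(X, y, lam, relevance):
    """Equivalent view: unscaled features, per-feature penalty lam / r_j^2."""
    penalty = np.diag(lam / relevance**2)
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Toy stand-ins for embeddings of three feature descriptions and one outcome.
feature_vecs = np.array([[0.9, 0.1], [0.5, 0.5], [0.0, 1.0]])
outcome_vec = np.array([1.0, 0.0])
relevance = cosine_relevance(feature_vecs, outcome_vec)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=40)

w_rescaled = rescaled_ridge_fit(X, y, lam=5.0, relevance=relevance)
w_weighted = weighted_ridge_fit(X, y, lam=5.0, relevance=relevance)
print("relevance:", relevance)
print("same coefficients:", np.allclose(w_rescaled, w_weighted))
```

The same argument carries over to other norm penalties: with an L1 penalty, scaling feature j by r_j corresponds to a per-feature penalty of lambda / r_j, which is why rescaling can also change which features are selected.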

research
06/29/2023

Speech-based Age and Gender Prediction with Transformers

We report on the curation of several publicly available datasets for age...
research
12/07/2016

Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets

Providing accurate predictions is challenging for machine learning algor...
research
10/27/2022

A Curriculum Learning Approach for Multi-domain Text Classification Using Keyword weight Ranking

Text classification is a very classic NLP task, but it has two prominent...
research
05/23/2022

Non-Parametric Domain Adaptation for End-to-End Speech Translation

End-to-End Speech Translation (E2E-ST) has received increasing attention...
research
04/15/2023

Continual Domain Adaptation through Pruning-aided Domain-specific Weight Modulation

In this paper, we propose to develop a method to address unsupervised do...
research
06/22/2020

What shapes feature representations? Exploring datasets, architectures, and training

In naturalistic learning problems, a model's input contains a wide range...
research
08/25/2022

Fundamentals of Task-Agnostic Data Valuation

We study valuing the data of a data owner/seller for a data seeker/buyer...
