Drug Repurposing for Cancer: An NLP Approach to Identify Low-Cost Therapies

More than 200 generic drugs approved by the U.S. Food and Drug Administration for non-cancer indications have shown promise for treating cancer. Due to their long history of safe patient use, low cost, and widespread availability, repurposing of generic drugs represents a major opportunity to rapidly improve outcomes for cancer patients and reduce healthcare costs worldwide. Evidence on the efficacy of non-cancer generic drugs being tested for cancer exists in scientific publications, but trying to manually identify and extract such evidence is intractable. In this paper, we introduce a system to automate this evidence extraction from PubMed abstracts. Our primary contribution is to define the natural language processing pipeline required to obtain such evidence, comprising the following modules: querying, filtering, cancer type entity extraction, therapeutic association classification, and study type classification. Using the subject matter expertise on our team, we create our own datasets for these specialized domain-specific tasks. We obtain promising performance in each of the modules by utilizing modern language modeling techniques and plan to treat them as baseline approaches for future improvement of individual components.
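As a rough illustration of how the five modules named above could be chained, the following minimal Python sketch passes abstracts through querying, filtering, cancer type entity extraction, therapeutic association classification, and study type classification. It is not the authors' implementation: the abstract reports using modern language modeling techniques, whereas this sketch substitutes simple keyword rules so that it stays self-contained. All function names, the term lists, and the label sets ("effective", "not effective", "inconclusive", and the study type labels) are hypothetical. The querying step only builds a search string and does not contact PubMed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Abstract:
    pmid: str
    text: str


@dataclass
class Evidence:
    pmid: str
    drug: str
    cancer_types: List[str]
    association: str
    study_type: str


# Toy vocabularies; assumptions for illustration only.
CANCER_TERMS = ("breast cancer", "colorectal cancer", "glioblastoma")
STUDY_CUES = (
    ("randomized", "clinical trial"),
    ("mice", "preclinical in vivo"),
    ("cell line", "preclinical in vitro"),
)


def build_query(drug: str) -> str:
    """Querying: build a search string (hypothetical format, no network call)."""
    return f'"{drug}" AND (cancer OR neoplasm)'


def is_relevant(abstract: Abstract, drug: str) -> bool:
    """Filtering: keep abstracts that mention both the drug and cancer."""
    text = abstract.text.lower()
    return drug.lower() in text and "cancer" in text


def extract_cancer_types(abstract: Abstract) -> List[str]:
    """Cancer type entity extraction: dictionary lookup stand-in."""
    text = abstract.text.lower()
    return [term for term in CANCER_TERMS if term in text]


def classify_association(abstract: Abstract) -> str:
    """Therapeutic association classification: keyword stand-in."""
    text = abstract.text.lower()
    if "no significant" in text or "did not" in text:
        return "not effective"
    if any(cue in text for cue in ("reduced", "improved", "inhibited")):
        return "effective"
    return "inconclusive"


def classify_study_type(abstract: Abstract) -> str:
    """Study type classification: first matching cue wins."""
    text = abstract.text.lower()
    for cue, label in STUDY_CUES:
        if cue in text:
            return label
    return "other"


def run_pipeline(drug: str, abstracts: List[Abstract]) -> List[Evidence]:
    """Run the filtering and extraction stages over already-retrieved abstracts."""
    results = []
    for abstract in abstracts:
        if not is_relevant(abstract, drug):
            continue
        results.append(
            Evidence(
                pmid=abstract.pmid,
                drug=drug,
                cancer_types=extract_cancer_types(abstract),
                association=classify_association(abstract),
                study_type=classify_study_type(abstract),
            )
        )
    return results


if __name__ == "__main__":
    docs = [
        Abstract("1", "Metformin reduced tumor growth in breast cancer cell lines."),
        Abstract("2", "Metformin for type 2 diabetes management."),
    ]
    print(build_query("metformin"))
    for evidence in run_pipeline("metformin", docs):
        print(evidence)
```

Keeping each stage as a separate function mirrors the modular structure described in the abstract, so any keyword rule here could be swapped for a trained model without changing the rest of the chain.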




