Identifying translational science through embeddings of controlled vocabularies
Objective: Translational science aims at "translating" basic scientific discoveries into clinical applications. Identifying translational science has practical uses, such as evaluating the effectiveness of investments made in large programs like the Clinical and Translational Science Awards. Although several methods have been proposed to group publications, the primary unit of research output, into categories, we still lack a quantitative way to place papers on the full, continuous spectrum from basic research to clinical medicine. Methods: Here we learn vector representations of the controlled-vocabulary terms assigned to MEDLINE papers to obtain a Translational Axis (TA) that points from basic science to clinical medicine. The projected position of a term on the TA, expressed as a continuous quantity, indicates the term's "appliedness." The position of a paper, determined by the average location of its terms, quantifies the paper's degree of "appliedness," which we term its "level score." Results: We validate our method by comparing it with previous techniques, showing excellent agreement while uncovering significant variation in the scores of papers within previously defined categories. The measure allows us to characterize the standing of journals, disciplines, and the entire biomedical literature along the basic-applied spectrum. Analysis of a large-scale citation network reveals two main findings. First, direct citations occur mainly between papers with similar scores. Second, shortest paths are more likely to end at a paper closer to the basic end of the spectrum, regardless of where the starting paper lies on the spectrum. Conclusions: The proposed method provides a quantitative way to identify translational science.
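The scoring idea described in the Methods can be illustrated with a minimal sketch. The abstract does not specify how the TA is constructed or how positions are normalized, so everything below is an assumption for illustration: the toy three-dimensional term vectors are made up, the axis is taken as the unit vector from the centroid of a few "basic" seed terms to the centroid of a few "clinical" seed terms, a term's position is taken as its cosine similarity with that axis, and the function names (`translational_axis`, `term_position`, `level_score`) are hypothetical.

```python
import numpy as np

# Hypothetical 3-d embeddings for four terms (made-up values, illustration only).
term_vectors = {
    "Mice": np.array([1.0, 0.0, 0.2]),
    "Cells": np.array([0.8, 0.2, 0.0]),
    "Humans": np.array([0.0, 1.0, 0.2]),
    "Patients": np.array([0.2, 0.8, 0.0]),
}


def translational_axis(vectors, basic_terms, clinical_terms):
    """Unit vector pointing from the basic centroid to the clinical centroid.

    Assumption: the abstract only says the axis points from basic science
    to clinical medicine; the centroid difference is one simple way to get
    such a direction.
    """
    basic = np.mean([vectors[t] for t in basic_terms], axis=0)
    clinical = np.mean([vectors[t] for t in clinical_terms], axis=0)
    axis = clinical - basic
    return axis / np.linalg.norm(axis)


def term_position(vectors, axis, term):
    """Position of a term on the axis, here its cosine similarity with it."""
    v = vectors[term]
    return float(np.dot(v, axis) / np.linalg.norm(v))


def level_score(vectors, axis, paper_terms):
    """Level score of a paper: the mean position of its terms on the axis."""
    return float(np.mean([term_position(vectors, axis, t) for t in paper_terms]))


axis = translational_axis(term_vectors, ["Mice", "Cells"], ["Humans", "Patients"])

basic_paper = level_score(term_vectors, axis, ["Mice", "Cells"])
mixed_paper = level_score(term_vectors, axis, ["Mice", "Humans"])
clinical_paper = level_score(term_vectors, axis, ["Humans", "Patients"])

print(basic_paper, mixed_paper, clinical_paper)
```

With these toy vectors, a paper tagged only with basic-side terms scores lower than a paper mixing both kinds of terms, which in turn scores lower than a paper tagged only with clinical-side terms. Because the score is a mean over terms, it varies continuously rather than assigning papers to a fixed set of categories.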