Document similarity measures can support semi-automated identification of unreported links between trial registrations and published trial articles

09/07/2017 ∙ by Adam G. Dunn, et al. ∙ Macquarie University

Objectives: Trial registries can be used to measure reporting biases and support systematic reviews but 45% [...] the article reporting on the trial. We evaluated the use of document similarity methods to identify unreported links between ClinicalTrials.gov and PubMed. Study Design and Setting: We extracted terms and concepts from a dataset of 72,469 ClinicalTrials.gov registrations and 276,307 PubMed articles, and tested methods for ranking articles across 16,005 reported links and 90 manually-identified unreported links. Performance was measured by the median rank of matching articles, and the proportion of unreported links that could be found by screening ranked candidate articles in order. Results: The best performing concept-based representation produced a median rank of 3 (IQR 1-21) for reported links and 3 (IQR 1-19) for the manually-identified unreported links, and term-based representations produced a median rank of 2 (IQR 1-20) for reported links and 2 (IQR 1-12) for unreported links. The matching article was ranked first for 40% of registrations, and screening the first 50 candidate articles per registration identified 86% of the unreported links. [...] the growth in the corpus of reported links between ClinicalTrials.gov and PubMed, we found that document similarity methods can assist in the identification of unreported links between trial registrations and corresponding articles.




1 Background

Clinical trial registries were established to track the conduct of clinical trials and make basic information about trials publicly available. A number of policies now mandate prospective registration for clinical studies of regulated interventions [1, 2, 3]. ClinicalTrials.gov is a US-based registry for clinical studies and is the largest single database of trial registrations. ClinicalTrials.gov also links registrations to published results by connecting to research articles indexed in bibliographic databases [4, 5]. This linkage is achieved using a unique identifier (the NCT Number) for each study. Publishers may include the NCT Number in the abstract or full text of published articles and in the metadata stored by PubMed, which provides access to a bibliographic database with information for more than 26 million biomedical articles.

While the introduction of trial registries has been invaluable for monitoring trial reporting, a substantial proportion of trial reports remain disconnected from their registrations. In a 2012 study examining the quality of linking in ClinicalTrials.gov, 44% of registrations without linked publications were found to have corresponding published articles that could be identified by manual searches [7]. In a systematic review of studies that examine reported and unreported links between registrations and articles, the median proportion of registrations with reported links was 23% and the median proportion of unreported links (those that required manual searches) was 17%, with the remainder unpublished [6].

The quality of this linkage between bibliographic databases and trial registries affects the time it takes to measure reporting biases. This includes determining which clinical studies remain unpublished [8, 9, 10, 11, 12], or comparing registered outcomes with what is reported in published articles [13, 14, 15, 16, 17, 18, 19, 20, 21]. Without comprehensive linking, research to evaluate reporting biases must instead rely on time-consuming manual searches to identify unreported links.

The presence of unreported links also limits the value of trial registries for systematic reviews. If links between registrations and articles were comprehensive, registrations could be more effectively used to automate the identification of trials for inclusion in systematic reviews [22, 23], as well as provide early signals that a systematic review should be updated [24, 25, 26, 27, 28, 29, 30, 31, 32, 33].

Our aim in this study was to evaluate whether we could use information contained in the recorded links between registrations and PubMed articles to help identify unreported links. The longer-term goal is to develop robust methods to identify all published research associated with a trial registration, whether or not links are provided in the registration record.

2 Methods

In the following experiments, we use similarities in the text from registrations in ClinicalTrials.gov and articles in PubMed. Using the set of reported links as a baseline, we test a series of different methods to represent the text as features, and assign weights to each of the features. The resulting set of features is used to produce an automatic, weighted search query for use in PubMed, where the objective is to rank the matching article as high as possible in a list of candidate articles. The approach is expected to replace the need for an expert to construct search queries in PubMed for every registration without a link to a published article.

2.1 Study data

We included trial registrations in ClinicalTrials.gov for trials that were received by ClinicalTrials.gov on or after October 1 2007, were marked as completed, and described an interventional study. The date was selected to correspond to the passing of the Food and Drug Administration Amendments Act of 2007, which expanded registration requirements for studies registered with ClinicalTrials.gov. A final search of ClinicalTrials.gov was conducted on April 14 2017. Data extracted from registrations included titles, trial summaries, and conditions studied.

We next selected all articles indexed in PubMed that reported a clinical trial and were published on or after October 1 2007. Articles were assumed to be reporting the results of trials if they included an NCT Number as a secondary source identifier or listed “clinical trial”, “controlled clinical trial”, or “randomized controlled trial” as a publication type, and did not include “meta-analysis” or “review” as a publication type. A final search was conducted on April 14 2017. Data extracted from each PubMed article entry included the title text, abstract text, and any NCT Numbers stored as a secondary source identifier in the metadata. Where PubMed entries included NCT Numbers as secondary source identifiers, we described these as reported links and created a dataset comprising the set of registrations with known links from one or more articles.
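The inclusion rule for articles can be sketched as a simple predicate. This is an illustrative sketch only, not the authors' code: the record layout (`pmid`, `published`, `publication_types`, `nct_numbers`) is a hypothetical one chosen for the example, and it assumes the exclusion of meta-analyses and reviews applies to every article, including those carrying an NCT Number.

```python
from datetime import date

# Publication types named in the text; matching is done in lower case.
TRIAL_TYPES = {"clinical trial", "controlled clinical trial", "randomized controlled trial"}
EXCLUDED_TYPES = {"meta-analysis", "review"}
START_DATE = date(2007, 10, 1)

def is_candidate_trial_report(article):
    """Return True if a (hypothetical) article record meets the inclusion rule."""
    types = {t.lower() for t in article["publication_types"]}
    if article["published"] < START_DATE:
        return False
    if types & EXCLUDED_TYPES:
        return False
    # Either an NCT Number in the metadata or a trial publication type qualifies.
    return bool(article["nct_numbers"]) or bool(types & TRIAL_TYPES)

def reported_links(articles):
    """Map each NCT Number to the PubMed IDs that list it in their metadata."""
    links = {}
    for article in articles:
        for nct in article["nct_numbers"]:
            links.setdefault(nct, []).append(article["pmid"])
    return links
```

Registrations that appear as keys in the result of `reported_links` would form the set of reported links; all others are candidates for unreported links.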

We then created a second set of registrations for testing, comprising 90 registrations that had unreported links to trial articles identified by manually checking 200 registrations with trial completion dates between January 1 2007 and December 31 2015. The 200 registrations had no reported links to trial articles in PubMed at the time of the search. We manually searched PubMed and other bibliographic databases to identify articles that reported the results of the trials, following a search strategy previously described and common to studies examining outcome reporting biases [6]. The search uses study design information, investigator names, locations, and other identifying features to search PubMed for the matching article. To confirm a match, we compared the number of participants, the study design and the length of the study, and any information about when and where the trial was undertaken. Where there were multiple matches, we selected the article published closest to the completion date of the trial.

2.2 Feature representations and distance measures

Data elements were extracted for each registry entry including brief title, detailed title, brief summary, detailed summary, and condition. Articles were represented by the text of the titles and abstracts in PubMed.

A term-based representation of registrations and trial reports was created using the set of extracted words after removing punctuation. Using existing methods for extracting concepts from free text [34, 35, 36, 37, 38, 39, 40], we also created a concept-based representation where each document was represented by the set of clinical concepts. These concepts were produced by MetaMap [41], which identifies concepts from the Unified Medical Language System (UMLS). We used a local MetaMap server that recognised concepts from the 2014AA UMLS release, and used MetaMap with word sense disambiguation. This method uses a term’s context in a sentence to address the ambiguity of words with multiple meanings. The concept that was ranked highest by MetaMap for each phrase in the text was recorded.
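As a rough illustration of the term-based representation, the sketch below strips punctuation and splits text into terms. The text only states that punctuation was removed; lower-casing and replacing punctuation with a space are assumptions made for this example. The concept-based representation is not sketched because it depends on a running MetaMap server.

```python
import re

def extract_terms(text):
    """Split free text into terms after removing punctuation.

    Assumptions for this sketch: text is lower-cased and each punctuation
    character is replaced by a space before splitting on whitespace.
    """
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return cleaned.split()

# Example with made-up text:
# extract_terms("Phase-2 trial: aspirin, 100mg.")
# returns ['phase', '2', 'trial', 'aspirin', '100mg']
```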

We did not use information related to investigators, funding, or authors, to allow unbiased testing of these document similarity measures. While investigator names, numbers of participants, and funding information are all useful for determining whether a published article describes a trial from a registration, this information would need to be implemented differently from the terms and concepts used here and could introduce biases in the estimation of publication rates for studies with different funding sources.

For both the term-based and concept-based representations, we tested two ways to assign values to each of the features. A binary vector representation captured the presence or absence of a given feature (term or concept) in a document. A term frequency-inverse document frequency (tf-idf) vector representation was also calculated [42, 43], given by the product of the number of times the feature appears in the document (log-transformed) and the inverse document frequency, which is given by the inverse of the proportion of documents in which the feature appears (also log-transformed). The log transformations are typically applied to the tf-idf calculation to help to ensure that individual terms or concepts that are used frequently in very few registrations and articles do not dominate the document similarity calculations described below.
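The two weighting schemes can be sketched with sparse dictionaries. The exact form of the log transformations is not given in the text, so this sketch assumes one common variant, tf = 1 + ln(count) and idf = ln(N / df); other variants would give different weights.

```python
import math
from collections import Counter

def document_frequencies(documents):
    """Count, for each feature, the number of documents it appears in."""
    df = Counter()
    for features in documents:
        df.update(set(features))
    return df

def binary_vector(features):
    """Presence/absence weights: 1.0 for every feature in the document."""
    return {f: 1.0 for f in set(features)}

def tfidf_vector(features, df, n_documents):
    """tf-idf weights, assuming tf = 1 + ln(count) and idf = ln(N / df)."""
    counts = Counter(features)
    return {
        f: (1.0 + math.log(c)) * math.log(n_documents / df[f])
        for f, c in counts.items()
        if df.get(f, 0) > 0  # skip features never seen in the corpus
    }
```

A feature that appears in every document receives a weight of zero under this idf, so it does not contribute to the similarity calculations.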

We evaluated the performance of three standard pairwise measures of distance between the vectors representing registrations and the vectors representing trial reports. The normalised Euclidean distance is given by the straight-line distance between the two vectors representing a registry entry and an article, divided by the number of features present in the article. The cosine similarity is the cosine of the angle between the two vectors representing a registry entry and an article; the corresponding distance is one minus this value. The Jaccard distance is one minus the Jaccard index, where the index is the number of features present in both a registration vector and a trial report vector, divided by the number of features present in either vector.
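A minimal sketch of the three measures follows, operating on sparse vectors stored as dictionaries that map features to weights. It is an illustration under stated assumptions rather than the authors' implementation: cosine similarity and the Jaccard index are converted to distances by subtracting from one, and "features present in the article" is taken to mean the number of keys in the article's dictionary.

```python
import math

def dot(a, b):
    """Dot product of two sparse vectors (dicts of feature -> weight)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(f, 0.0) for f, w in a.items())

def cosine_distance(a, b):
    """One minus the cosine of the angle between the two vectors."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 if norm == 0 else 1.0 - dot(a, b) / norm

def jaccard_distance(a, b):
    """One minus (features in both) / (features in either)."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return 1.0 - len(set(a) & set(b)) / len(union)

def normalised_euclidean(registration, article):
    """Straight-line distance divided by the number of article features."""
    if not article:
        return math.inf
    features = set(registration) | set(article)
    squared = sum(
        (registration.get(f, 0.0) - article.get(f, 0.0)) ** 2 for f in features
    )
    return math.sqrt(squared) / len(article)
```

All three functions return smaller values for more similar documents, so the same sorting step can be used with any of them.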

For each registry entry, we produced a ranked list of PubMed articles, created by sorting the set of all articles by their distance from the registration. Where a registry entry had known links from more than one trial report in PubMed, we determined the rank for the first linked trial report published after the completion date of the trial.
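The ranking step can be sketched as a sort over all candidate articles. The function names and data layout are hypothetical, documents are represented as plain feature sets for brevity, and publication dates may be any orderable values (ISO date strings or `datetime.date` objects). Ties are broken by input order here, which is an assumption; the text does not say how ties were handled.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def rank_articles(registration, articles, distance):
    """Return article IDs sorted from nearest to farthest from the registration."""
    return sorted(articles, key=lambda pmid: distance(registration, articles[pmid]))

def rank_of(pmid, ranked):
    """1-based position of an article in the ranked list."""
    return ranked.index(pmid) + 1

def first_report_after_completion(linked_articles, completion_date):
    """Pick the earliest linked article published after trial completion.

    linked_articles is a list of (pmid, published) pairs; returns None when
    no linked article was published after the completion date.
    """
    eligible = [(published, pmid) for pmid, published in linked_articles
                if published > completion_date]
    return min(eligible)[1] if eligible else None
```

For a registration with several reported links, `first_report_after_completion` selects the article whose rank is then looked up with `rank_of`.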

2.3 Performance measures

Performance was assessed using the final ranks of the linked (reported or unreported) trial reports, reported as median and interquartile range. We additionally determined the proportion of registrations for which the linked (reported or unreported) trial report was ranked first among all candidates, and the proportion of registrations for which the linked (reported or unreported) trial report was within the first 50 candidates (recall@50). We also reported the extension of this measure for recall after checking any number of candidates (recall@N), as a visual demonstration of the amount of manual effort required to identify a given percentage of previously unreported links in a cohort of registrations. The recall@N is the cumulative proportion of registrations for which the linked (reported or unreported) trial report can be found after having read the top N candidates, and is defined for values between 0 (where the proportion of found trial reports is always 0) and the total number of candidates (where the proportion is always 1).
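These measures can be computed directly from the list of ranks of the matching articles, one rank per registration. The sketch below is illustrative; the quartile method (`inclusive`) is an assumption, since the text does not state how the interquartile range was calculated.

```python
from statistics import median, quantiles

def recall_at(ranks, n):
    """Proportion of registrations whose matching article is in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

def summarise(ranks):
    """Median rank, IQR, first-ranked proportion, and recall@50."""
    q1, _, q3 = quantiles(ranks, n=4, method="inclusive")
    return {
        "median": median(ranks),
        "iqr": (q1, q3),
        "first_ranked": recall_at(ranks, 1),
        "recall@50": recall_at(ranks, 50),
    }

def recall_curve(ranks, max_n):
    """recall@N for N = 0..max_n, for plotting screening effort against recall."""
    return [recall_at(ranks, n) for n in range(max_n + 1)]
```

By construction `recall_at(ranks, 0)` is 0 and recall reaches 1 once N is at least the largest rank, matching the definition given above.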

Code for the document similarity methods is available online via a GitHub repository. The repository includes scripts for using the document similarity methods to rank PubMed articles given an NCT Number.

3 Results

We identified 72,469 registrations in ClinicalTrials.gov and 276,307 candidate trial reports in PubMed that met the inclusion criteria (Figure 1). Among the registrations, 16,005 (22.1%) had one or more links from trial reports published in the same period (Table I).

                         Count (% of total)
Trial type
 Phase 0 & 1             2,318 (14.5%)
 Phase 2                 2,931 (18.3%)
 Phase 3                 2,842 (17.8%)
 Phase 4                 2,091 (13.1%)
 Unknown                 5,823 (36.4%)
Participants
 Under 50                4,946 (30.9%)
 50-200                  5,719 (35.7%)
 Over 200                4,990 (31.2%)
 Unknown                 98 (0.6%)
Funding
 Industry only           4,759 (29.7%)
 Mixed                   1,461 (9.1%)
 No industry             9,785 (61.1%)
Total                    16,005 (100%)

TABLE I: Characteristics of 16,005 trials with completion dates after October 1 2007 and links from PubMed articles.

Fig. 1: The study data included 72,469 registrations from ClinicalTrials.gov and the metadata from 276,307 articles available in PubMed. A testing set of 200 registrations was manually reviewed and 90 of those were found to have unreported links to trial reports.

In the set of 200 registrations, 90 were found to have unreported links to trial reports in PubMed. Of these, 33% (30 of 90) included the NCT Number in the full text of the article but the identifier was not reported in the PubMed entry metadata or provided by investigators in ClinicalTrials.gov (Table II).

                  Count (% of total)   Matching article in PubMed (% of group)   NCT Number included in article (% of group)
Trial type
 Phase 0 & 1      34 (17%)             12/34 (35%)                               5/34 (15%)
 Phase 2          52 (26%)             29/52 (56%)                               7/52 (13%)
 Phase 3          35 (18%)             15/35 (43%)                               6/35 (17%)
 Phase 4          30 (15%)             16/30 (53%)                               6/30 (20%)
 Unknown          49 (24%)             18/49 (37%)                               6/49 (12%)
Participants
 Under 50         88 (44%)             40/88 (45%)                               13/88 (15%)
 50-200           85 (42%)             36/85 (42%)                               13/85 (15%)
 Over 200         25 (12%)             14/25 (56%)                               4/25 (16%)
 Unknown          2 (1%)               0/2 (0%)                                  NA
Funding
 Industry only    91 (46%)             33/91 (36%)                               10/91 (11%)
 Mixed            17 (8%)              13/17 (76%)                               5/17 (29%)
 No industry      92 (46%)             44/92 (48%)                               15/92 (16%)
Total             200 (100%)           90/200 (45%)                              30/200 (15%)

TABLE II: Characteristics of 200 trials with completion dates after January 1 2007 and no reported links to PubMed.

The number of unique concepts found across both corpora was 490,814, and of these 46,777 were found more than once in both the registrations and the articles (Figure 2). There were 789,848 unique terms across both corpora, and of these 78,856 were found more than once in both the registrations and the articles. After reducing the sets of concepts and terms to include only those that were found more than once in both the registrations and articles, the number of remaining concepts varied between 2 and 273 (median 32) per registration, and between 2 and 296 (median 79) per article. After applying the same process, the number of remaining terms varied between 2 and 1,576 (median 108) per registration, and between 2 and 472 (median 142) per article.
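The reduction to shared features can be sketched as a vocabulary filter. The text does not say whether "found more than once" counts occurrences or documents; this sketch assumes total occurrences within each corpus, and the function names are hypothetical.

```python
from collections import Counter

def shared_vocabulary(registration_docs, article_docs, min_count=2):
    """Features occurring at least min_count times in each of the two corpora."""
    reg_counts = Counter(f for doc in registration_docs for f in doc)
    art_counts = Counter(f for doc in article_docs for f in doc)
    return {
        f for f, c in reg_counts.items()
        if c >= min_count and art_counts[f] >= min_count
    }

def restrict(doc, vocabulary):
    """Keep only the features of a document that are in the shared vocabulary."""
    return [f for f in doc if f in vocabulary]
```

Applying `restrict` to every registration and article before weighting would leave only features that can contribute to a match in both directions.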

Fig. 2: The distribution of 490,814 concepts (left) and 789,848 terms (right) that were present in 276,307 articles (top) and the 72,469 registrations (bottom).

The median ranks produced by testing combinations of feature representations and distance measures on reported links varied substantially (Table III). There were only small differences between the median ranks of the term-based and concept-based representations for the best-performing combination, which in both cases used tf-idf weights and the cosine distance. With both representations, this combination placed the linked article first among all 276,307 candidate trial reports for 40% of registrations, and within the top 50 ranked candidates for 82% to 83% of registrations.

Feature representations   Median rank (IQR)     First-ranked candidate*   Recall@50
Terms (binary)
 Euclidean                533 (8-7,900)         17.2%                     34.4%
 Jaccard                  6 (1-160)             34.8%                     67.1%
 Cosine                   4 (1-81)              37.4%                     71.9%
Terms (tf-idf)
 Euclidean                363 (3-15,110)        18.5%                     40.2%
 Cosine                   2 (1-20)              40.2%                     83.2%
Concepts (binary)
 Euclidean                25 (1-1,679)          28.8%                     54.2%
 Jaccard                  11 (1-191)            28.3%                     64.1%
 Cosine                   6 (1-89)              32.2%                     70.9%
Concepts (tf-idf)
 Euclidean                97 (2-7,230)          23.1%                     46.9%
 Cosine                   3 (1-21)              39.6%                     82.3%

*The first-ranked candidate was the article that reported the results of the trial in the registration.

TABLE III: The effect of using terms or concepts on the performance by feature representation and distance measures in a set of 16,005 registrations with reported links to articles.

The results were similar in the 90 registrations with unreported links to trial articles, which were used to test the method in a practical setting. The best-performing combinations used the tf-idf score and the cosine distance measure (Table IV), and the term-based and concept-based representations produced similar results. The maximum recall@50 was 85.9% for terms and 82.4% for concepts. For other combinations of scores and distance measures, concept-based representations generally outperformed the equivalent term-based representations (Figure 3).

Feature representations   Median rank (IQR)     First-ranked candidate*   Recall@50
Terms (binary)
 Euclidean                5,030 (225-25,675)    8.2%                      16.5%
 Jaccard                  16 (1-342)            30.6%                     62.4%
 Cosine                   12 (1-206)            34.1%                     67.1%
Terms (tf-idf)
 Euclidean                832 (6-29,048)        12.9%                     38.8%
 Cosine                   2 (1-12)              40.0%                     85.9%
Concepts (binary)
 Euclidean                221 (2-7,386)         22.4%                     41.2%
 Jaccard                  17 (2-383)            24.7%                     61.2%
 Cosine                   8 (1-92)              25.9%                     69.4%
Concepts (tf-idf)
 Euclidean                291 (2-19,387)        17.6%                     40.0%
 Cosine                   3 (1-19)              29.4%                     82.4%

TABLE IV: The effect of using terms or concepts on the performance by feature representation and distance measures in a set of 90 registrations with unreported links to trial reports.

Fig. 3: Recall@N for term-based and concept-based feature representations among the set of 90 registrations with unreported links to trial reports. The proportions of unreported links that were found by checking the first candidate (and first 50 candidates) are indicated by the two vertical lines.

4 Discussion

The results demonstrate that relatively simple representations across shared terms or concepts can be used to support the identification of unreported links between a trial’s registration and the article reporting its results. For two in five registrations, the method ranks the matching article first among all candidates. For more than four in every five registrations, a user would only need to check 50 candidates to identify an unreported link to its matching article.

4.1 Comparisons with prior research

To the best of our knowledge no other methods have been proposed to automate the identification of unreported links between trials’ registrations and the articles reporting their results. The most closely related research in the application domain is a method for identifying multiple publications from the same clinical trial [44]. Conceptually, the method is similar to concept-mapping that is used in certain search methods [45], but differs because it produces what is effectively an automated and weighted query using the language of registrations rather than requiring users to author queries themselves. Related research in clinical epidemiology includes technologies that operate on bibliographic databases to support the screening of articles for inclusion in a systematic review [22]. Recent developments included supervised, unsupervised, and hybrid methods [37, 46, 47], and several have taken advantage of semantic similarities between documents [48, 49].

Our results on rates of reported and unreported links are consistent with previous observational studies examining ClinicalTrials.gov and PubMed. A series of studies showed that 27.8% of 8,907 registrations for completed, interventional, Phase 2 or later trials had one or more machine-readable links to PubMed, and 44% of a sample of 50 registrations without known links had matching trial reports [7, 50, 51].

4.2 Implications

Our methods support novel approaches to the system-wide monitoring of trial reporting. A number of studies have used ClinicalTrials.gov to examine reporting biases, including both publication bias and outcome reporting bias [52, 6]. In these studies, investigators relied on the manual identification of links between registrations and published articles to ensure that all published results were identified. This is a time-consuming and rate-limiting step. By replacing the need for experts to construct search queries in PubMed for each registration without a link to a published article, our proposed approach could be used to reduce the expertise and time needed to identify unreported links between trial registrations and the articles that report their results. Because a substantial proportion of links remain unreported, and this proportion does not appear to have changed over time, the method is likely to be of continued value until existing registrations are comprehensively linked to their results.

In a systematic review of studies that examine both reported and unreported links between trial registrations and articles, the results indicated that there was a smaller proportion of unreported links when investigators started from a cohort of articles and tried to find links to registrations. This appeared to be because articles often included trial registry identifiers in the full text that were not included in the metadata in bibliographic databases [6]. We found a similar result: for 30 of 90 unreported links, the NCT Number was available in the full text of the article. All stakeholders, including the trial investigators, the journals, the bibliographic databases, and the organisations funding the trials, could improve the number of links available in metadata, either by adding the information directly to ClinicalTrials.gov, or by ensuring the information is available in the metadata provided to PubMed.

The research also has implications for automating systematic review processes. Methods designed to help systematic reviewers identify articles for inclusion in systematic reviews often use machine learning to replicate human screening of articles. The best-performing methods in this area are able to reduce workload by between 30% and 70% with an estimated loss of 5% of relevant studies [22]. Comprehensive matching of registrations and trial reports could provide a more reliable and complete basis from which to develop methods for automating systematic review processes. Rather than relying on the minimal descriptions available for articles in bibliographic databases [53], information from ClinicalTrials.gov could provide an earlier and more complete description of the trials and yield more accurate machine learning results for article screening and selection.

4.3 Limitations and future research

This study has several limitations. In the set of unreported links, the manual search protocol may not have identified all published results. If the 90 articles we identified were easier to find by manual searches because they shared a larger number of terms, this may have over-estimated the performance in practice. However, the rate of publication for these entries was similar to prior reports [6]. It is also possible that the set of reported links included articles that were not presenting the results of the trials (such as pilot studies, protocols, or secondary analyses), which could influence the performance relative to the manually-curated studies, which were all selected as the first articles presenting the results of the trials. To minimise the risk of including incorrect articles in the set of reported links, we included the reported link to the article that was published after the completion date of the trial and excluded studies that were published before the stated completion date.

The methods described here represent a series of baseline results that could be implemented directly in a process for finding unreported links, but could be improved in several ways. First, to ensure a fair comparison was made for the two approaches, we only included a limited selection of the fields available in ClinicalTrials.gov to represent the registrations. If we had included the names, affiliations, and countries of the investigators, it is likely that the performance of the document similarity method would have increased (but only for the term-based representation because these fields would not be recognised as UMLS concepts). Likewise, text from the full articles could have improved the performance of the method, but only when finding links to articles published in open access journals. Second, a further pre-processing step to reduce the number of candidate articles may improve the performance, especially where distance measures perform poorly. Finally, machine learning methods such as learning-to-rank approaches could be used to reduce, re-weight, or transform the sets of features by training them on the sets of reported links, and this may also yield improvements in performance given the volume and rate of growth in the number of reported links that can be used for training. Further research in this area should also consider how a user-friendly implementation of the methods tested here compares to a baseline in which experts and non-experts construct their own search strategies for PubMed.

5 Conclusion

Information contained in clinical trial registries may be useful for monitoring trial reporting activities and synthesising evidence, but this type of surveillance remains limited due to the growing number of incomplete links between ClinicalTrials.gov and PubMed. Here we evaluate a method for partially automating the identification of published articles related to registrations. Our results demonstrate that document similarity methods can be used to replace the need to construct search strategies in a bibliographic database when trying to identify unreported links. While the approach does not replace the need to screen the resulting ranked set of candidates, it demonstrates the potential for automation in this space.


This work was supported by the Agency for Healthcare Research and Quality (R03HS024798). The authors have no conflicts of interest to disclose.


  • [1] D. A. Zarin, T. Tse, R. J. Williams, R. M. Califf, and N. C. Ide, “The ClinicalTrials.gov Results Database – Update and Key Issues,” New England Journal of Medicine, vol. 364, pp. 852–60, 2011.
  • [2] D. A. Zarin, T. Tse, and N. C. Ide, “Trial Registration at ClinicalTrials.gov between May and October 2005,” New England Journal of Medicine, vol. 353, pp. 2779–87, 2005.
  • [3] A. T. McCray, “Better Access to Information about Clinical Trials,” Annals of Internal Medicine, vol. 133, pp. 609–14, 2000.
  • [4] D. A. Zarin and T. Tse, “Sharing Individual Participant Data (IPD) within the Context of the Trial Reporting System (TRS),” PLoS Medicine, vol. 13, p. e1001946, 2016.
  • [5] ——, “Moving Toward Transparency of Clinical Trials,” Science, vol. 319, pp. 1340–2, 2008.
  • [6] R. Bashir, F. T. Bourgeois, and A. G. Dunn, “A systematic review of the processes used to link clinical trial registrations to their published results,” Systematic Reviews, vol. 6, p. 123, 2017.
  • [7] V. Huser and J. J. Cimino, “Precision and Negative Predictive Value of Links between ClinicalTrials.gov and PubMed,” in AMIA Annual Symposium Proceedings. AMIA, 2012, pp. 400–8.
  • [8] C. Riveros, A. Dechartres, E. Perrodeau, R. Haneef, I. Boutron, and P. Ravaud, “Timing and Completeness of Trial Results Posted at ClinicalTrials.gov and Published in Journals,” PLoS Medicine, vol. 10, p. e1001566, 2013.
  • [9] M. Baudart, P. Ravaud, G. Baron, A. Dechartres, R. Haneef, and I. Boutron, “Public availability of results of observational studies evaluating an intervention registered at ClinicalTrials.gov,” BMC Medicine, vol. 14, pp. 1–11, 2016.
  • [10] K. Dickersin, S. Chan, T. C. Chalmers, H. S. Sacks, and H. Smith Jr., “Publication bias and clinical trials,” Controlled Clinical Trials, vol. 8, pp. 343–53, 1987.
  • [11] C. Schmucker, L. K. Schell, S. Portalupi et al., “Extent of Non-Publication in Cohorts of Studies Approved by Research Ethics Committees or Included in Trial Registries,” PLoS ONE, vol. 9, p. e114023, 2014.
  • [12] I. Chalmers, P. Glasziou, and F. Godlee, “All trials must be registered and the results published,” BMJ, vol. 346, p. f105, 2013.
  • [13] F. T. Bourgeois, S. Murthy, and K. D. Mandl, “Outcome Reporting Among Drug Trials Registered in ClinicalTrials.gov,” Annals of Internal Medicine, vol. 153, pp. 158–66, 2010.
  • [14] A. Chan, A. Hróbjartsson, M. T. Haahr, P. C. Gøtzsche, and D. G. Altman, “Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles,” JAMA, vol. 291, pp. 2457–65, 2004.
  • [15] A.-W. Chan and D. G. Altman, “Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors,” BMJ, vol. 330, p. 753, 2005.
  • [16] K. Rising, P. Bacchetti, and L. Bero, “Reporting Bias in Drug Trials Submitted to the Food and Drug Administration: Review of Publication and Presentation,” PLoS Medicine, vol. 5, p. e217, 2008.
  • [17] A.-W. Chan, “Bias, Spin, and Misreporting: Time for Full Access to Trial Protocols and Results,” PLoS Medicine, vol. 5, p. e230, 2008.
  • [18] N. Rasmussen, K. Lee, and L. Bero, “Association of trial registration with the results and conclusions of published trials of new oncology drugs,” Trials, vol. 10, p. 116, 2009.
  • [19] R. Rosenthal and K. Dwan, “Comparison of Randomized Controlled Trial Registry Entries and Content of Reports in Surgical Journals,” Annals of Surgery, vol. 257, pp. 1007–15, 2013.
  • [20] S. Pranić and A. Marušić, “Changes to registration elements and results in a cohort of trials were not reflected in published articles,” Journal of Clinical Epidemiology, vol. 70, pp. 26–37, 2016.
  • [21] A. Scott, J. J. Rucklidge, and R. T. Mulder, “Is Mandatory Prospective Trial Registration Working to Prevent Publication of Unregistered Trials and Selective Outcome Reporting? An Observational Study of Five Psychiatry Journals That Mandate Prospective Clinical Trial Registration,” PLoS ONE, vol. 10, p. e0133718, 2015.
  • [22] A. O’Mara-Eves, J. Thomas, J. McNaught, M. Miwa, and S. Ananiadou, “Using text mining for study identification in systematic reviews: a systematic review of current approaches,” Systematic Reviews, vol. 4, p. 5, 2015.
  • [23] G. Tsafnat, P. Glasziou, M. K. Choong, A. G. Dunn, F. Galgani, and E. Coiera, “Systematic review automation technologies,” Systematic Reviews, vol. 3, p. 74, 2014.
  • [24] Y. Takwoingi, S. Hopewell, D. Tovey, and A. J. Sutton, “A multicomponent decision tool for prioritising the updating of systematic reviews,” BMJ, vol. 347, p. f7191, 2013.
  • [25] P. G. Shekelle, A. Motala, B. Johnsen, and S. J. Newberry, “Assessment of a method to detect signals for updating systematic reviews,” Systematic Reviews, vol. 3, p. 13, 2014.
  • [26] P. Pattanittum, M. Laopaiboon, D. Moher, P. Lumbiganon, and C. Ngamjarus, “A Comparison of Statistical Methods for Identifying Out-of-Date Systematic Reviews,” PLoS ONE, vol. 7, p. e48894, 2012.
  • [27] D. Moher, A. Tsertsvadze, A. C. Tricco et al., “When and how to update systematic reviews,” Cochrane Database of Systematic Reviews, vol. 1, p. MR000023.pub3, 2008.
  • [28] P. Garner, S. Hopewell, J. Chandler et al., “When and how to update systematic reviews: consensus and checklist,” BMJ, vol. 354, p. i3507, 2016.
  • [29] A. Cohen, K. Ambert, and M. McDonagh, “Cross-topic learning for work prioritization in systematic review creation and update,” Journal of the American Medical Informatics Association, vol. 16, pp. 690–704, 2009.
  • [30] ——, “Studying the potential impact of automated document classification on scheduling a systematic review update,” BMC Medical Informatics and Decision Making, vol. 12, p. 33, 2012.
  • [31] M. Chung, S. J. Newberry, M. T. Ansari et al., “Two methods provide similar signals for the need to update systematic reviews,” Journal of Clinical Epidemiology, vol. 65, pp. 660–8, 2012.
  • [32] N. Barrowman, M. Fang, M. Sampson, and D. Moher, “Identifying null meta-analyses that are ripe for updating,” BMC Medical Research Methodology, vol. 3, p. 13, 2003.
  • [33] N. Ahmadzai, S. J. Newberry, M. A. Maglione et al., “A surveillance system to assess the need for updating systematic reviews,” Systematic Reviews, vol. 2, p. 104, 2013.
  • [34] B. C. Wallace, T. A. Trikalinos, J. Lau, C. Brodley, and C. H. Schmid, “Semi-automated screening of biomedical citations for systematic reviews,” BMC Bioinformatics, vol. 11, p. 55, 2010.
  • [35] S. Matwin, A. Kouznetsov, D. Inkpen, O. Frunza, and P. O’Blenis, “A new algorithm for reducing the workload of experts in performing systematic reviews,” Journal of the American Medical Informatics Association, vol. 17, pp. 446–53, 2010.
  • [36] A. M. Cohen, “Optimizing Feature Representation for Automated Systematic Review Work Prioritization,” in AMIA Annual Symposium Proceedings, 2008, pp. 121–5.
  • [37] X. Ji and P. Y. Yen, “Using MEDLINE Elemental Similarity to Assist in the Article Screening Process for Systematic Reviews,” JMIR Medical Informatics, vol. 3, p. e28, 2015.
  • [38] W. Hersh, S. Price, and L. Donohoe, “Assessing thesaurus-based query expansion using the UMLS Metathesaurus,” in Proceedings of the AMIA Symposium, 2000, pp. 344–8.
  • [39] B. Koopman, G. Zuccon, P. Bruza, L. Sitbon, and M. Lawley, “An evaluation of corpus-driven measures of medical concept similarity for information retrieval,” in Proceedings of the 21st ACM international conference on Information and knowledge management. ACM, 2012, pp. 2439–42.
  • [40] X. Zhang, L. Jing, X. Hu, M. Ng, and X. Zhou, “A Comparative Study of Ontology Based Term Similarity Measures on PubMed Document Clustering,” in Advances in Databases: Concepts, Systems and Applications, R. Kotagiri, P. R. Krishna, M. Mohania, and E. Nantajeewarawat, Eds. Springer Berlin Heidelberg, 2007, pp. 115–26.
  • [41] A. R. Aronson and F.-M. Lang, “An overview of MetaMap: historical perspective and recent advances,” Journal of the American Medical Informatics Association, vol. 17, pp. 229–36, 2010.
  • [42] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, pp. 513–23, 1988.
  • [43] K. S. Jones, “A statistical interpretation of term specificity and its application in retrieval,” Journal of Documentation, vol. 28, pp. 11–21, 1972.
  • [44] W. Shao, C. E. Adams, A. M. Cohen et al., “Aggregator: A machine learning approach to identifying MEDLINE articles that derive from the same underlying clinical trial,” Methods, vol. 74, pp. 65–70, 2015.
  • [45] N. C. Ide, R. F. Loane, and D. Demner-Fushman, “Essie: A Concept-based Search Engine for Structured Biomedical Text,” Journal of the American Medical Informatics Association, vol. 14, pp. 253–63, 2007.
  • [46] X. Ji, A. Ritter, and P.-Y. Yen, “Using ontology-based semantic similarity to facilitate the article screening process for systematic reviews,” Journal of Biomedical Informatics, vol. 69, pp. 33–42, 2017.
  • [47] M. Miwa, J. Thomas, A. O’Mara-Eves, and S. Ananiadou, “Reducing systematic review workload through certainty-based screening,” Journal of Biomedical Informatics, vol. 51, pp. 242–253, 2014.
  • [48] T. K. Saha, M. Ouzzani, H. M. Hammady, and A. K. Elmagarmid, “A large scale study of SVM based methods for abstract screening in systematic reviews,” arXiv, p. 1610.00192v2, 2016.
  • [49] K. Hashimoto, G. Kontonatsios, M. Miwa, and S. Ananiadou, “Topic detection using paragraph vectors to support active learning in systematic reviews,” Journal of Biomedical Informatics, vol. 62, pp. 59–65, 2016.
  • [50] V. Huser and J. Cimino, “Linking ClinicalTrials.gov and PubMed to track results of interventional human clinical trials,” PLoS One, vol. 8, p. e68409, 2013.
  • [51] ——, “Evaluating adherence to the International Committee of Medical Journal Editors’ policy of mandatory, timely clinical trial registration,” Journal of the American Medical Informatics Association, vol. 20, pp. e169–e174, 2013.
  • [52] K. Dwan, C. Gamble, P. R. Williamson et al., “Systematic review of the empirical evidence of study publication bias and outcome reporting bias — an updated review,” PLoS One, vol. 8, p. e66844, 2013.
  • [53] S. Kiritchenko, B. de Bruijn, S. Carini, J. Martin, and I. Sim, “ExaCT: automatic extraction of clinical trial characteristics from journal publications,” BMC Medical Informatics and Decision Making, vol. 10, p. 56, 2010.