Final Report on MITRE Evaluations for the DARPA Big Mechanism Program

11/08/2022
by Matthew Peterson, et al.

This report presents the evaluation approach developed for the DARPA Big Mechanism program, which aimed to develop computer systems that read research papers, integrate the information into a computer model of cancer mechanisms, and frame new hypotheses. We employed an iterative, incremental approach to evaluating the three phases of the program. In Phase I, we evaluated the ability of system and human teams to read-with-a-model to capture mechanistic information from the biomedical literature, integrated with information from expert-curated biological databases. In Phase II, we evaluated the ability of systems to assemble fragments of information into a mechanistic model. The Phase III evaluation focused on the ability of systems to provide explanations of experimental observations based on models assembled (largely automatically) by the Big Mechanism process. The evaluation for each phase built on earlier evaluations and guided developers towards creating capabilities for the new phase. The report describes our approach, including innovations such as a reference set (a curated data set limited to the major findings of each paper) to assess the accuracy of systems in extracting mechanistic findings in the absence of a gold standard, and a method to evaluate model-based explanations of experimental data. Results of the evaluation and supporting materials are included in the appendices.


