Complex Coordinate-Based Meta-Analysis with Probabilistic Programming

12/02/2020
by Valentin Iovene, et al.

With the growing number of published functional magnetic resonance imaging (fMRI) studies, meta-analysis databases and models have become an integral part of brain mapping research. Coordinate-based meta-analysis (CBMA) databases are built by automatically extracting both coordinates of reported peak activations and term associations using natural language processing (NLP) techniques. Solving term-based queries on these databases makes it possible to obtain statistical maps of the brain related to specific cognitive processes. However, with tools like Neurosynth, only single-term queries lead to statistically reliable results. When solving richer queries, too few studies from the database contribute to the statistical estimations. We design a probabilistic domain-specific language (DSL) built on Datalog and one of its probabilistic extensions, CP-Logic, for expressing and solving rich logic-based queries. We encode a CBMA database into a probabilistic program. Using the joint distribution of its Bayesian network translation, we show that solutions of queries on this program compute the correct probability distributions of voxel activations. We explain how recent lifted query processing algorithms make it possible to scale to the size of large neuroimaging data, where state-of-the-art knowledge compilation (KC) techniques fail to solve queries fast enough for practical applications. Finally, we introduce a method for relating studies to terms probabilistically, leading to better solutions for conjunctive queries on smaller databases. We demonstrate results for two-term conjunctive queries, both on simulated meta-analysis databases and on the widely used Neurosynth database.
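
To make the last point of the abstract concrete, here is a minimal Python sketch of a two-term conjunctive query on a toy CBMA database. It is not the authors' DSL and does not use Datalog or CP-Logic. The study names, voxel ids, association scores, the 0.5 threshold, and the functions `hard_conjunctive` and `soft_conjunctive` are all hypothetical. Treating term associations as independent within a study is a simplifying assumption of this sketch, not a claim about the paper's model.

```python
from math import prod
from typing import Dict, List, Optional, Set

# Toy database: every value below is made up for illustration.
# Each study reports a set of activated voxel ids.
reported: Dict[str, Set[int]] = {
    "s1": {1, 2},
    "s2": {2, 3},
    "s3": {2},
    "s4": {4},
}

# Hypothetical study-term association scores in [0, 1].
term_prob: Dict[str, Dict[str, float]] = {
    "s1": {"emotion": 0.9, "memory": 0.8},
    "s2": {"emotion": 0.7, "memory": 0.1},
    "s3": {"emotion": 0.2, "memory": 0.9},
    "s4": {"emotion": 0.0, "memory": 0.0},
}


def hard_conjunctive(voxel: int, terms: List[str],
                     threshold: float = 0.5) -> Optional[float]:
    """Estimate P(voxel active | all terms) with thresholded term tagging.

    Only studies whose score exceeds the threshold for every term are
    kept, so very few studies may contribute to the estimate.
    """
    selected = [
        s for s in reported
        if all(term_prob[s].get(t, 0.0) > threshold for t in terms)
    ]
    if not selected:
        return None  # no study matches the conjunction
    return sum(voxel in reported[s] for s in selected) / len(selected)


def soft_conjunctive(voxel: int, terms: List[str]) -> Optional[float]:
    """Estimate the same quantity with probabilistic term association.

    Each study is weighted by the product of its term scores
    (independence assumed), so every study with a nonzero weight
    contributes to the estimate.
    """
    weights = {
        s: prod(term_prob[s].get(t, 0.0) for t in terms) for s in reported
    }
    total = sum(weights.values())
    if total == 0:
        return None
    active = sum(w for s, w in weights.items() if voxel in reported[s])
    return active / total


query = ["emotion", "memory"]
for v in (1, 2, 3):
    print(v, hard_conjunctive(v, query), soft_conjunctive(v, query))
```

In this toy data, only study `s1` passes the threshold for both terms, so the thresholded estimate rests on a single study. The weighted version also draws on `s2` and `s3`, which is the intuition behind relating studies to terms probabilistically when the database is small.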

