
Graph cuts always find a global optimum (with a catch)
We prove that the alpha-expansion algorithm for MAP inference always ret...

Trajectory Inspection: A Method for Iterative Clinician-Driven Design of Reinforcement Learning Studies
Treatment policies learned via reinforcement learning (RL) from observat...

Robust Benchmarking for Machine Learning of Clinical Entity Extraction
Clinical studies often require understanding elements of a patient's nar...

Fast, Structured Clinical Documentation via Contextual Autocomplete
We present a system that uses a learned autocompletion mechanism to faci...

PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming
Data cleaning can be naturally framed as probabilistic inference in a ge...

Deep Contextual Clinical Prediction with Reverse Distillation
Healthcare providers are increasingly using learned methods to predict a...

Consistent Estimators for Learning to Defer to an Expert
Learning algorithms are often used in conjunction with expert decision m...

Treatment Policy Learning in Multiobjective Settings with Fully Observed Outcomes
In several medical decision-making problems, such as antibiotic prescrip...

Knowledge Base Completion for Constructing Problem-Oriented Medical Records
Both electronic health records and personal health records are typically...

Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects
Practitioners in diverse fields such as healthcare, economics and educat...

Estimation of Utility-Maximizing Bounds on Potential Outcomes
Estimation of individual treatment effects is often used as the basis fo...

Open Set Medical Diagnosis
Machine-learned diagnosis models have shown promise as medical aides but...

Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph
Increasingly large electronic health records (EHRs) provide an opportuni...

Characterization of Overlap in Observational Studies
Overlap between treatment groups is required for nonparametric estimatio...

Benefits of Overparameterization in Single-Layer Latent Variable Generative Models
One of the most surprising and exciting discoveries in supervised learn...

Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
We introduce an off-policy evaluation procedure for highlighting episode...

Support and Invertibility in Domain-Invariant Representations
Learning domain-invariant representations has become a popular approach ...

Overcomplete Independent Component Analysis via SDP
We present a novel algorithm for overcomplete independent components ana...

Prototypical Clustering Networks for Dermatological Disease Diagnosis
We consider the problem of image classification for the purpose of aidin...

Block Stability for MAP Inference
To understand the empirical success of approximate MAP inference, recent...

Evaluating Reinforcement Learning Algorithms in Observational Health Settings
Much attention has been devoted recently to the development of machine l...

Why Is My Classifier Discriminatory?
Recent attempts to achieve fairness in predictive models focus on the ba...

Learning Weighted Representations for Generalization Across Designs
Predictive models that generalize well under distributional shift are of...

Semi-Amortized Variational Autoencoders
Amortized variational inference (AVI) replaces instance-specific local i...

Alpha-expansion is Exact on Stable Instances
Approximate algorithms for structured prediction problems, such as the ...

Causal Effect Inference with Deep Latent-Variable Models
Learning individual-level causal effects from observational data, such a...

Grounded Recurrent Neural Networks
In this work, we present the Grounded Recurrent Neural Network (GRNN), a...

Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning
This work presents a novel objective function for the unsupervised train...

Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
We consider multiclass classification where the predictor has a hierarc...

Structured Inference Networks for Nonlinear State Space Models
Gaussian state space models have been used for decades as generative mod...

Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization
This work proposes a new algorithm for automated and simultaneous phenot...

Clinical Tagging with Joint Probabilistic Models
We describe a method for parameter estimation in bipartite probabilistic...

Estimating individual treatment effect: generalization bounds and algorithms
There is intense interest in applying machine learning to problems of ca...

Recurrent Neural Networks for Multivariate Time Series with Missing Values
Multivariate time series data in practical applications, such as health ...

Learning Representations for Counterfactual Inference
Observational studies are rising in importance due to the widespread acc...

Deep Kalman Filters
Kalman Filters are one of the most influential models of time-varying ph...

Anchored Discrete Factor Analysis
We present a semi-supervised learning algorithm for learning discrete fa...

Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the tree-rew...

Train and Test Tightness of LP Relaxations in Structured Prediction
Structured prediction is used in areas such as computer vision and natur...

Character-Aware Neural Language Models
We describe a simple neural language model that relies only on character...

Incorporating Type II Error Probabilities from Independence Tests into Score-Based Learning of Bayesian Network Structure
We give a new consistent scoring function for structure learning of Baye...

Tight Error Bounds for Structured Prediction
Structured prediction tasks in machine learning involve the simultaneous...

Lifted Tree-Reweighted Variational Inference
We analyze variational inference for highly symmetric graphical models s...

Unsupervised Learning of Noisy-Or Bayesian Networks
This paper considers the problem of learning the parameters in Bayesian ...

SparsityBoost: A New Scoring Function for Learning Bayesian Network Structure
We give a new consistent scoring function for structure learning of Baye...

A Practical Algorithm for Topic Modeling with Provable Guarantees
Topic models provide a useful method for dimensionality reduction and ex...

Efficiently Searching for Frustrated Cycles in MAP Inference
Dual decomposition provides a tractable framework for designing algorith...

Tightening LP Relaxations for MAP using Message Passing
Linear Programming (LP) relaxations have become powerful tools for findi...
David Sontag
David Sontag is an Assistant Professor in the Department of Electrical Engineering and Computer Science (EECS) at MIT. From 2011 to 2016 he was an Assistant Professor in Computer Science and Data Science at New York University’s Courant Institute of Mathematical Sciences. His honors include the Sprowls award for outstanding doctoral thesis in Computer Science at MIT in 2010; best paper awards at the conferences Empirical Methods in Natural Language Processing (EMNLP), Uncertainty in Artificial Intelligence (UAI), and Neural Information Processing Systems (NIPS); faculty awards from Google, Facebook, and Adobe; and an NSF CAREER Award.