
Minmax Optimization: Stable Limit Points of Gradient Descent Ascent are Locally Optimal
Minmax optimization, especially in its general nonconvex-nonconcave form...
02/02/2019 ∙ by Chi Jin, et al.

The Power of Batching in Multiple Hypothesis Testing
One important partition of algorithms for controlling the false discover...
10/11/2019 ∙ by Tijana Zrnic, et al.

LSTree: Model Interpretation When the Data Are Linguistic
We study the problem of interpreting trained classification models in th...
02/11/2019 ∙ by Jianbo Chen, et al.

Stochastic Gradient Descent Escapes Saddle Points Efficiently
This paper considers the perturbed stochastic gradient descent algorithm...
02/13/2019 ∙ by Chi Jin, et al.

Bayesian Robustness: A Nonasymptotic Viewpoint
We study the problem of robustly estimating the posterior distribution f...
07/27/2019 ∙ by Kush Bhatia, et al.

A Short Note on Concentration Inequalities for Random Vectors with Sub-Gaussian Norm
In this note, we derive concentration inequalities for random vectors wi...
02/11/2019 ∙ by Chi Jin, et al.

On the Complexity of Approximating Multimarginal Optimal Transport
We study the complexity of approximating the multimarginal optimal trans...
09/30/2019 ∙ by Tianyi Lin, et al.

Global Error Bounds and Linear Convergence for Gradient-Based Algorithms for Trend Filtering and ℓ_1-Convex Clustering
We propose a class of first-order gradient-type optimization algorithms ...
04/16/2019 ∙ by Nhat Ho, et al.

Towards Understanding the Transferability of Deep Representations
Deep neural networks trained on a wide range of datasets demonstrate imp...
09/26/2019 ∙ by Hong Liu, et al.

High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm
We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-or...
08/28/2019 ∙ by Wenlong Mou, et al.

Bridging Theory and Algorithm for Domain Adaptation
This paper addresses the problem of unsupervised domain adaptation from th...
04/11/2019 ∙ by Yuchen Zhang, et al.

Cost-Effective Incentive Allocation via Structured Counterfactual Inference
We address a practical problem ubiquitous in modern industry, in which a...
02/07/2019 ∙ by Romain Lopez, et al.

Boundary Attack++: Query-Efficient Decision-Based Adversarial Attack
Decision-based adversarial attack studies the generation of adversarial ...
04/03/2019 ∙ by Jianbo Chen, et al.

A joint model of unpaired data from scRNA-seq and spatial transcriptomics for imputing missing gene expression measurements
Spatial studies of transcriptome provide biologists with gene expression...
05/06/2019 ∙ by Romain Lopez, et al.

Convergence Rates for Gaussian Mixtures of Experts
We provide a theoretical treatment of over-specified Gaussian mixtures o...
07/09/2019 ∙ by Nhat Ho, et al.

Fundamental limits of detection in the spiked Wigner model
We study the fundamental limits of detecting the presence of an additive...
06/25/2018 ∙ by Ahmed El Alaoui, et al.

Theoretically Principled Trade-off between Robustness and Accuracy
We identify a trade-off between robustness and accuracy that serves as a...
01/24/2019 ∙ by Hongyang Zhang, et al.

Is There an Analog of Nesterov Acceleration for MCMC?
We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as ...
02/04/2019 ∙ by Yian Ma, et al.

A Dynamical Systems Perspective on Nesterov Acceleration
We present a dynamical system framework for understanding Nesterov's acc...
05/17/2019 ∙ by Michael Muehlebach, et al.

ML-LOO: Detecting Adversarial Examples with Feature Attribution
Deep neural networks obtain state-of-the-art performance on a series of ...
06/08/2019 ∙ by Puyudi Yang, et al.

Approximate Sherali-Adams Relaxations for MAP Inference via Entropy Regularization
Maximum a posteriori (MAP) inference is a fundamental computational para...
07/02/2019 ∙ by Jonathan N. Lee, et al.

Quantitative W_1 Convergence of Langevin-Like Stochastic Processes with Non-Convex Potential State-Dependent Noise
We prove quantitative convergence rates at which discrete Langevin-like ...
07/07/2019 ∙ by Xiang Cheng, et al.

Provably Efficient Reinforcement Learning with Linear Function Approximation
Modern Reinforcement Learning (RL) is commonly applied to practical prob...
07/11/2019 ∙ by Chi Jin, et al.

Learning Stages: Phenomenon, Root Cause, Mechanism Hypothesis, and Implications
Under a Step-Decay learning rate strategy (decaying the learning rate after...
08/05/2019 ∙ by Kaichao You, et al.

A Higher-Order Swiss Army Infinitesimal Jackknife
Cross validation (CV) and the bootstrap are ubiquitous model-agnostic to...
07/28/2019 ∙ by Ryan Giordano, et al.

Competing Bandits in Matching Markets
Stable matching, a classical model for two-sided markets, has long been ...
06/12/2019 ∙ by Lydia T. Liu, et al.

Policy-Gradient Algorithms Have No Guarantees of Convergence in Continuous Action and State Multi-Agent Settings
We show by counterexample that policy-gradient algorithms have no guaran...
07/08/2019 ∙ by Eric Mazumdar, et al.

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Nesterov's accelerated gradient descent (AGD), an instance of the genera...
11/28/2017 ∙ by Chi Jin, et al.

Stochastic Cubic Regularization for Fast Nonconvex Optimization
This paper proposes a stochastic variant of a classic algorithm, the cu...
11/08/2017 ∙ by Nilesh Tripuraneni, et al.

First-order Methods Almost Always Avoid Saddle Points
We establish that first-order methods avoid saddle points for almost all...
10/20/2017 ∙ by Jason D. Lee, et al.

Online control of the false discovery rate with decaying memory
In the online multiple testing problem, p-values corresponding to differ...
10/02/2017 ∙ by Aaditya Ramdas, et al.

DAGGER: A sequential algorithm for FDR control on DAGs
We propose a top-down algorithm for multiple testing on directed acyclic...
09/29/2017 ∙ by Aaditya Ramdas, et al.

Kernel Feature Selection via Conditional Covariance Minimization
We propose a framework for feature selection that employs kernel-based m...
07/04/2017 ∙ by Jianbo Chen, et al.

Fast Black-box Variational Inference through Stochastic Trust-Region Optimization
We introduce TrustVI, a fast second-order algorithm for black-box variat...
06/07/2017 ∙ by Jeffrey Regier, et al.

Gradient Descent Can Take Exponential Time to Escape Saddle Points
Although gradient descent (GD) almost always escapes saddle points asymp...
05/29/2017 ∙ by Simon S. Du, et al.

A unified treatment of multiple testing with prior knowledge using the p-filter
A significant literature studies ways of employing prior knowledge to im...
03/18/2017 ∙ by Aaditya Ramdas, et al.

How to Escape Saddle Points Efficiently
This paper shows that a perturbed form of gradient descent converges to ...
03/02/2017 ∙ by Chi Jin, et al.
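The core idea behind perturbed gradient descent can be sketched in a few lines. This is a simplified illustration, not the paper's exact algorithm: the step size, perturbation radius, gradient threshold, and step count below are illustrative placeholders.

```python
import numpy as np

def perturbed_gradient_descent(grad, x0, eta=0.01, radius=0.1,
                               g_thresh=1e-3, n_steps=1000, seed=0):
    """Gradient descent that adds a small random perturbation whenever
    the gradient is small, helping escape strict saddle points."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = grad(x)
        if np.linalg.norm(g) < g_thresh:
            # Near a stationary point: jump to a random point on a small
            # sphere around x, then let descent resume from there.
            noise = rng.normal(size=x.shape)
            x = x + radius * noise / np.linalg.norm(noise)
        else:
            x = x - eta * g  # ordinary gradient step
    return x

# f(x, y) = x^2 - y^2 has a strict saddle point at the origin, where
# plain gradient descent initialized at (0, 0) would stay forever.
grad = lambda v: np.array([2 * v[0], -2 * v[1]])
x_final = perturbed_gradient_descent(grad, [0.0, 0.0])
```

On this toy saddle, the perturbation kicks the iterate off the unstable point, after which the descent direction along the negative-curvature axis carries it away from the saddle.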

Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method
We develop and analyze a procedure for gradient-based optimization that ...
09/12/2016 ∙ by Lihua Lei, et al.

CYCLADES: Conflict-free Asynchronous Machine Learning
We present CYCLADES, a general framework for parallelizing stochastic op...
05/31/2016 ∙ by Xinghao Pan, et al.

Communication-Efficient Distributed Statistical Inference
We present a Communication-efficient Surrogate Likelihood (CSL) framewor...
05/25/2016 ∙ by Michael I. Jordan, et al.

Deep Transfer Learning with Joint Adaptation Networks
Deep networks have been successfully applied to learn transferable featu...
05/21/2016 ∙ by Mingsheng Long, et al.

On kernel methods for covariates that are rankings
Permutation-valued features arise in a variety of applications, either i...
03/25/2016 ∙ by Horia Mania, et al.

A Variational Perspective on Accelerated Methods in Optimization
Accelerated gradient methods play a central role in optimization, achiev...
03/14/2016 ∙ by Andre Wibisono, et al.

Asymptotic behavior of ℓ_p-based Laplacian regularization in semi-supervised learning
Given a weighted graph with N vertices, consider a real-valued regressio...
03/02/2016 ∙ by Ahmed El Alaoui, et al.

Gradient Descent Converges to Minimizers
We show that gradient descent converges to a local minimizer, almost sur...
02/16/2016 ∙ by Jason D. Lee, et al.

A Kernelized Stein Discrepancy for Goodness-of-fit Tests and Model Evaluation
We derive a new discrepancy statistic for measuring differences between ...
02/10/2016 ∙ by Qiang Liu, et al.

SparkNet: Training Deep Networks in Spark
Training deep networks is a time-consuming process, with networks for ob...
11/19/2015 ∙ by Philipp Moritz, et al.

Optimistic Concurrency Control for Distributed Unsupervised Learning
Research on distributed machine learning algorithms has focused primaril...
07/30/2013 ∙ by Xinghao Pan, et al.

A Linearly-Convergent Stochastic L-BFGS Algorithm
We propose a new stochastic L-BFGS algorithm and prove a linear converge...
08/09/2015 ∙ by Philipp Moritz, et al.

On the accuracy of self-normalized log-linear models
Calculation of the log-normalizer is a major computational obstacle in a...
06/12/2015 ∙ by Jacob Andreas, et al.
Michael I. Jordan
Michael Irwin Jordan is an American scientist and a professor at the University of California, Berkeley, working in machine learning, statistics, and artificial intelligence. He is one of the leading figures in machine learning; in 2016, Science reported him as the most influential computer scientist in the world.
Jordan received his BS degree magna cum laude in Psychology from Louisiana State University in 1978, his MS degree in Mathematics from Arizona State University in 1980, and his PhD in Cognitive Science from the University of California, San Diego in 1985. At UC San Diego, Jordan was a student of David Rumelhart and a member of the PDP Group in the 1980s.
Jordan is currently a full professor in the Department of Statistics and the Department of EECS at the University of California, Berkeley. From 1988 to 1998 he was a professor in the Department of Brain and Cognitive Sciences at MIT.
Jordan began developing recurrent neural networks as a cognitive model in the 1980s. In recent years his work has been driven less by a cognitive perspective and more by the tradition of statistics.
In the machine-learning community, Jordan popularized Bayesian networks and is known for pointing out the links between machine learning and statistics. He was also prominent in formalizing variational methods for approximate inference and in popularizing the expectation-maximization (EM) algorithm in machine learning.
In 2001, Jordan and others resigned from the editorial board of the journal Machine Learning. In a public letter they argued for less restrictive access and pledged support for a new open-access journal, the Journal of Machine Learning Research, created by Leslie Kaelbling to support the development of machine learning.
Jordan has received numerous awards, including the ACM/AAAI Allen Newell Award, the IEEE Neural Networks Pioneer Award, the NSF Young Investigator Award, and a best paper award at the International Conference on Machine Learning (ICML). In 2010 he was named a Fellow of the Association for Computing Machinery for "contributions to the theory and application of machine learning." Jordan is a member of the National Academy of Sciences, the National Academy of Engineering, and the American Academy of Arts and Sciences.
He was named a Neyman Lecturer and a Medallion Lecturer by the Institute of Mathematical Statistics, and in 2015 he received the David E. Rumelhart Prize.
In 2016 Jordan was identified by an analysis of published literature by the Semantic Scholar Project as the “most influential computer scientist.”