
WildWood: a new Random Forest algorithm
We introduce WildWood (WW), a new ensemble algorithm for supervised lear...

About contrastive unsupervised representation learning for classification and its convergence
Contrastive representation learning has been recently proved to be very ...

An improper estimator with optimal excess risk in misspecified density estimation and logistic regression
We introduce a procedure for predictive conditional density estimation u...

ZiMM: a deep learning model for long-term adverse events with non-clinical claims data
This paper considers the problem of modeling long-term adverse events fo...

SCALPEL3: a scalable open-source library for healthcare claims databases
This article introduces SCALPEL3, a scalable open-source framework for s...

AMF: Aggregated Mondrian Forests for Online Learning
Random Forests (RF) is one of the algorithms of choice in many supervise...

Anytime Hedge achieves optimal regret in the stochastic regime
This paper is about a surprising fact: we prove that the anytime Hedge a...

Comparison of methods for early-readmission prediction in a high-dimensional heterogeneous covariates and time-to-event outcome framework
Background: Choosing the most performing method in terms of outcome pred...

Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption
The minimization of convex objectives coming from linear supervised lear...

Minimax optimal rates for Mondrian trees and forests
Introduced by Breiman (2001), Random Forests are widely used as classifi...

ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection
With the increased availability of large databases of electronic health ...

High-dimensional robust regression and outliers detection with SLOPE
The problems of outliers detection and robust regression in a high-dimen...

Universal consistency and minimax rates for online Mondrian Forests
We establish the consistency of an algorithm of Mondrian Forests, a rand...

Sparse inference of the drift of a high-dimensional Ornstein-Uhlenbeck process
Given the observation of a high-dimensional Ornstein-Uhlenbeck (OU) proc...

tick: a Python library for statistical learning, with a particular emphasis on time-dependent modeling
tick is a statistical learning library for Python 3, with a particular e...

Binarsity: a penalization for one-hot encoded features
This paper deals with the problem of large-scale linear supervised learn...

C-mix: a high dimensional mixture model for censored durations, with applications to genetic data
We introduce a mixture model for censored durations (C-mix), and develop...

Uncovering Causality from Multivariate Hawkes Integrated Cumulants
We design a new nonparametric method that allows one to estimate the mat...

SGD with Variance Reduction beyond Empirical Risk Minimization
We introduce a doubly stochastic proximal gradient algorithm for optimiz...

Learning the intensity of time events with changepoints
We consider the problem of learning the inhomogeneous intensity of a cou...

A generalization error bound for sparse and low-rank multivariate Hawkes processes
We consider the problem of unveiling the implicit network structure of u...

Concentration for matrix martingales in continuous time and microscopic activity of social networks
This paper gives new concentration inequalities for the spectral norm of...

Sparse Bayesian Unsupervised Learning
This paper is about variable selection, clustering and estimation in an ...

Link Prediction in Graphs with Autoregressive Features
In the paper, we consider the problem of link prediction in time-evolvin...

Hyper-sparse optimal aggregation
In this paper, we consider the problem of "hyper-sparse aggregation". Na...
Stéphane Gaïffas
Professor Lecturer, Center for Applied Mathematics at Ecole Polytechnique