
On the geometry of solutions and on the capacity of multilayer neural networks with ReLU activations
Rectified Linear Units (ReLU) have become the main model for the neural ...
What the [MASK]? Making Sense of Language-Specific BERT Models
Recently, Natural Language Processing (NLP) has witnessed an impressive ...
"An Image is Worth a Thousand Features": Scalable Product Representations for In-Session Type-Ahead Personalization
We address the problem of personalizing query completion in a digital co...
Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence
Topic models extract meaningful groups of words from documents, allowing...
Cross-lingual Contextualized Topic Models with Zero-shot Learning
Many data sets in a domain (reviews, forums, news, etc.) exist in parall...
Fantastic Embeddings and How to Align Them: Zero-Shot Inference in a Multi-Shop Scenario
This paper addresses the challenge of leveraging multiple embedding spac...
Parle: parallelizing stochastic gradient descent
We propose a new algorithm called Parle for parallel training of deep ne...
Flexible Models for Microclustering with Application to Entity Resolution
Most generative models for clustering implicitly assume that the number ...
A characterization of product-form exchangeable feature probability functions
We characterize the class of exchangeable feature allocations assigning ...
A moment-matching Ferguson and Klass algorithm
Completely random measures (CRM) represent the key building block of a w...
A note on quadratic approximations of logistic log-likelihoods
Quadratic approximations of logistic log-likelihoods are fundamental to ...
Functional ANOVA with Multiple Distributions: Implications for the Sensitivity Analysis of Computer Experiments
The functional ANOVA expansion of a multivariate mapping plays a fundame...
Conjugate Bayes for probit regression via unified skew-normals
Regression models for dichotomous data are ubiquitous in statistics. Bes...
Edgeworth trading on networks
We define a class of pure exchange Edgeworth trading processes that unde...
Scalable inference for crossed random effects models
We analyze the complexity of Gibbs samplers for inference in crossed ran...
Mater certa est, pater numquam: What can Facebook Advertising Data Tell Us about Male Fertility Rates?
In many developing countries, timely and accurate information about birt...
Scalable Importance Tempering and Bayesian Variable Selection
We propose a Monte Carlo algorithm to sample from high-dimensional proba...
Hiding the start of Brownian motion: towards a Bayesian analysis of privacy for GPS trajectories
The diffusion of GPS sensors and the success of applications for sharing...
Mean and dispersion of harmonic measure
In this note, we provide and prove exact formulas for the mean and the t...
Generalized Pareto Copulas: A Key to Multivariate Extremes
This paper reviews generalized Pareto copulas (GPC), which turn out to b...
Reinforced urns and the subdistribution beta-Stacy process prior for competing risks analysis
In this paper we introduce the subdistribution beta-Stacy process, a nov...
Bayesian optimality of testing procedures for survival data
Most statistical tests for treatment effects used in randomized clinical...
Bayesian cumulative shrinkage for infinite factorizations
There are a variety of Bayesian models relying on representations in whi...
Strong Consistency of Nonparametric Bayesian Inferential Methods for Multivariate Max-Stable Distributions
Predicting extreme events is important in many applications in risk anal...
Estimation and uncertainty quantification for extreme quantile regions
Estimation of extreme quantile regions, spaces in which future extreme e...
Recombinator-k-means: Enhancing k-means++ by seeding from pools of previous runs
We present a heuristic algorithm, called recombinator-k-means, that can ...
Conditionally Gaussian Random Sequences for an Integrated Variance Estimator with Correlation between Noise and Returns
Correlation between microstructure noise and latent financial logarithmi...
On the Consistency among Prior, Posteriors, and Information Sets (Extended Abstract)
This paper studies implications of the consistency conditions among prio...
Natural representation of composite data with replicated autoencoders
Generative processes in biology and other fields often produce data that...
Subexponential LPs Approximate Max-Cut
We show that for every ε > 0, the degree-n^ε Sherali-Adams linear progra...
A general Bayesian bootstrap for censored data based on the beta-Stacy process
We introduce a novel procedure to perform Bayesian nonparametric infere...
Computing Shapley Effects for Sensitivity Analysis
Shapley effects are attracting increasing attention as sensitivity measu...
Is Time to Intervention in the COVID-19 Outbreak Really Important? A Global Sensitivity Analysis Approach
Italy has been one of the first countries timewise strongly impacted by ...
Sharp Thresholds for a SIR Model on One-Dimensional Small-World Networks
We study epidemic spreading according to a Susceptible-Infectious-Recove...
