
Tecnologica cosa: Modeling Storyteller Personalities in Boccaccio's Decameron
We explore Boccaccio's Decameron to see how digital humanities tools can...

Equivariant Manifold Flows
Tractably modelling distributions over manifolds has long been an import...

How Low Can We Go: Trading Memory for Error in Low-Precision Training
Low-precision arithmetic trains deep learning models using less energy, ...

Low-Precision Reinforcement Learning
Low-precision training has become a popular approach to reduce computati...

Revisiting BFloat16 Training
State-of-the-art generic low-precision training algorithms use a mix of ...

Meta-Learning for Variational Inference
Variational inference (VI) plays an essential role in approximate Bayesi...

Regulating Accuracy-Efficiency Trade-Offs in Distributed Machine Learning Systems
In this paper we discuss the trade-off between accuracy and efficiency i...

Asymptotically Optimal Exact Minibatch Metropolis-Hastings
Metropolis-Hastings (MH) is a commonly used MCMC algorithm, but it can b...
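As background for the entry above: Metropolis-Hastings accepts or rejects a proposed move based on a density ratio, which is what makes evaluating the full dataset at every step expensive. Below is a minimal random-walk MH sampler, a generic textbook sketch (not the exact minibatch method from the paper), targeting a standard normal:

```python
import math
import random

def metropolis_hastings(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings targeting exp(log_density)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)  # symmetric proposal
        # Accept with probability min(1, pi(proposal) / pi(x)),
        # which only needs the density up to a normalizing constant.
        log_accept = min(0.0, log_density(proposal) - log_density(x))
        if rng.random() < math.exp(log_accept):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal, log pi(x) = -x^2 / 2 up to an additive constant.
samples = metropolis_hastings(lambda x: -x * x / 2, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
```

Because only the ratio of densities appears in the acceptance test, any unnormalized density works; the cost per step is one density evaluation, which the minibatch line of work aims to reduce.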

Neural Manifold Ordinary Differential Equations
To better conform to data geometry, recent deep generative modelling tec...

MixML: A Unified Analysis of Weakly Consistent Parallel Learning
Parallelism is a ubiquitous method for accelerating machine learning alg...

Optimizing JPEG Quantization for Classification Networks
Deep learning for computer vision depends on lossy image compression: it...

Differentiating through the Fréchet Mean
Recent advances in deep representation learning on Riemannian manifolds ...

AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient meth...

Moniqua: Modulo Quantized Communication in Decentralized SGD
Running Stochastic Gradient Descent (SGD) in a decentralized fashion has...

Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
Gibbs sampling is a Markov chain Monte Carlo method that is often used f...

Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators
Outliers in weights and activations pose a key challenge for fixed-point...

PipeMare: Asynchronous Pipeline Parallel DNN Training
Recently there has been a flurry of interest around using pipeline paral...

QPyTorch: A Low-Precision Arithmetic Simulation Framework
Low-precision training reduces computational cost and produces efficient...
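A core ingredient of low-precision simulation frameworks like the one above is the rounding mode used when casting to a narrower format. The sketch below illustrates fixed-point quantization with stochastic rounding in plain Python; it is a generic illustration of the technique, not QPyTorch's actual API (QPyTorch is a PyTorch library):

```python
import math
import random

def quantize_fixed_point(x, rng, wl=8, fl=4):
    """Quantize x to a signed fixed-point grid with stochastic rounding.

    wl is the word length in bits and fl the number of fractional bits,
    so the grid step is 2**-fl. We round up with probability equal to
    the fractional remainder, which makes the rounding unbiased.
    """
    scale = 2.0 ** fl
    hi = (2 ** (wl - 1) - 1) / scale   # largest representable value
    lo = -(2 ** (wl - 1)) / scale      # most negative representable value
    scaled = x * scale
    floor = math.floor(scaled)
    rounded = floor + (1 if rng.random() < scaled - floor else 0)
    return min(max(rounded / scale, lo), hi)

rng = random.Random(0)
# 0.3 is not on the 1/16 grid: a single call returns 0.25 or 0.3125,
# but the average over many draws recovers 0.3.
avg = sum(quantize_fixed_point(0.3, rng) for _ in range(10000)) / 10000
```

Unbiasedness in expectation is the key property: it lets stochastic gradient methods converge even though every individual number is rounded coarsely.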

SWALP: Stochastic Weight Averaging in Low-Precision Training
Low precision operations can provide scalability, memory savings, portab...

SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...

Distributed Learning with Sublinear Communication
In distributed statistical learning, N samples are split across m machin...

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Quantization can improve the execution latency and energy efficiency of ...

Improving Neural Network Quantization using Outlier Channel Splitting
Quantization can improve the execution latency and energy efficiency of ...

Building Efficient Deep Neural Networks with Unitary Group Convolutions
We propose unitary group convolutions (UGConvs), a building block for CN...

Minibatch Gibbs Sampling on Large Graphical Models
Gibbs sampling is the de facto Markov chain Monte Carlo method used for ...
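For readers unfamiliar with the baseline named above: Gibbs sampling updates one variable at a time from its full conditional distribution given all the others. A minimal textbook example (not the minibatch variant the paper studies) is a bivariate normal, whose full conditionals are themselves normal:

```python
import math
import random

def gibbs_bivariate_normal(rho, n_steps, seed=0):
    """Gibbs sampler for a bivariate normal with unit variances and
    correlation rho; each full conditional is available in closed form."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    cond_std = math.sqrt(1.0 - rho * rho)
    samples = []
    for _ in range(n_steps):
        x = rng.gauss(rho * y, cond_std)  # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_std)  # y | x ~ N(rho*x, 1 - rho^2)
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, n_steps=20000)
```

On a large graphical model each conditional depends on a variable's neighbors, and the cost of computing it exactly is what minibatched variants try to avoid.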

Channel Gating Neural Networks
Employing deep neural networks to obtain state-of-the-art performance on...

Representation Tradeoffs for Hyperbolic Embeddings
Hyperbolic embeddings offer excellent quality with few dimensions when e...
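The workhorse behind hyperbolic embeddings is the Poincaré-ball distance, under which distances grow without bound near the boundary of the unit ball, giving tree-like capacity in few dimensions. A small self-contained sketch of the standard formula (the function name here is ours):

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball in the
    Poincare-ball model of hyperbolic space:
        d(u, v) = arcosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    """
    diff2 = sum((a - b) ** 2 for a, b in zip(u, v))
    nu2 = sum(a * a for a in u)
    nv2 = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - nu2) * (1.0 - nv2)))

# Distance from the origin to (0.5, 0) is 2 * artanh(0.5) = ln(3).
d = poincare_distance([0.0, 0.0], [0.5, 0.0])
```

Moving a point slightly closer to the boundary increases its distance to everything else sharply, which is why numerical precision near the boundary matters when embedding deep hierarchies.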

The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine ...

A Kernel Theory of Modern Data Augmentation
Data augmentation, a technique in which a training set is expanded with ...

High-Accuracy Low-Precision Training
Low-precision computation is often used to lower the time and energy cos...

A Formal Framework For Probabilistic Unclean Databases
Traditional modeling of inconsistency in database theory casts all possi...

Accelerated Stochastic Power Iteration
Principal component analysis (PCA) is one of the most powerful tools in ...
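The baseline being accelerated above is classical power iteration, which finds the top eigenvector of a matrix by repeated multiplication and renormalization. A deterministic textbook sketch (not the accelerated stochastic variant from the paper):

```python
import math

def power_iteration(matvec, dim, n_steps=100):
    """Estimate the top eigenpair of a symmetric matrix, accessed only
    through matrix-vector products."""
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_steps):
        w = matvec(v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]  # renormalize every iteration
    # The Rayleigh quotient v^T A v estimates the top eigenvalue.
    eigval = sum(a * b for a, b in zip(v, matvec(v)))
    return eigval, v

# Diagonal test matrix diag(3, 1): top eigenvalue 3, eigenvector e_1.
eigval, v = power_iteration(lambda u: [3.0 * u[0], 1.0 * u[1]], dim=2)
```

Convergence is geometric in the eigengap ratio (here (1/3)^t), which is exactly the dependence that momentum-style accelerated variants improve.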

Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data
A challenge in training discriminative models like neural networks is ob...

Parallel SGD: When does averaging help?
Consider a number of workers running SGD independently on the same pool ...
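The setting in the question above, often called one-shot averaging, has each worker run SGD independently and average only the final iterates. A toy 1-D least-squares illustration with made-up synthetic data (not an experiment from the paper):

```python
import random

def sgd_least_squares(data, lr=0.05, n_steps=2000, seed=0):
    """Plain SGD on f(w) = average over (x, y) of (w*x - y)^2 / 2."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(n_steps):
        x, y = data[rng.randrange(len(data))]
        w -= lr * (w * x - y) * x  # stochastic gradient step
    return w

# Synthetic 1-D data with true parameter w* = 2.
rng = random.Random(42)
data = [(x, 2.0 * x + rng.gauss(0, 0.1))
        for x in (rng.uniform(-1, 1) for _ in range(500))]

# One-shot averaging: workers run SGD independently on the same pool
# of data, then a single average of their final iterates is taken.
workers = [sgd_least_squares(data, seed=s) for s in range(8)]
w_avg = sum(workers) / len(workers)
```

Averaging reduces the variance contributed by each worker's sampling noise, but whether it helps overall depends on the problem, which is the question the paper examines.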

Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that ite...

Data Programming: Creating Large Training Sets, Quickly
Large labeled training sets are the critical building blocks of supervis...

Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms
Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variet...

Incremental Knowledge Base Construction Using DeepDive
Populating a database with unstructured information is a long-standing p...

Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems
Stochastic gradient descent (SGD) on a low-rank factorization is commonl...
read it
Christopher De Sa
Assistant Professor in the Computer Science department at Cornell University