
"Tecnologica cosa": Modeling Storyteller Personalities in Boccaccio's Decameron
We explore Boccaccio's Decameron to see how digital humanities tools can...
Equivariant Manifold Flows
Tractably modelling distributions over manifolds has long been an import...
How Low Can We Go: Trading Memory for Error in Low-Precision Training
Low-precision arithmetic trains deep learning models using less energy, ...
Low-Precision Reinforcement Learning
Low-precision training has become a popular approach to reduce computati...
Revisiting BFloat16 Training
State-of-the-art generic low-precision training algorithms use a mix of ...
Meta-Learning for Variational Inference
Variational inference (VI) plays an essential role in approximate Bayesi...
Regulating Accuracy-Efficiency Trade-Offs in Distributed Machine Learning Systems
In this paper we discuss the trade-off between accuracy and efficiency i...
Asymptotically Optimal Exact Minibatch Metropolis-Hastings
Metropolis-Hastings (MH) is a commonly-used MCMC algorithm, but it can b...
Neural Manifold Ordinary Differential Equations
To better conform to data geometry, recent deep generative modelling tec...
MixML: A Unified Analysis of Weakly Consistent Parallel Learning
Parallelism is a ubiquitous method for accelerating machine learning alg...
Optimizing JPEG Quantization for Classification Networks
Deep learning for computer vision depends on lossy image compression: it...
Differentiating through the Fréchet Mean
Recent advances in deep representation learning on Riemannian manifolds ...
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient meth...
Moniqua: Modulo Quantized Communication in Decentralized SGD
Running Stochastic Gradient Descent (SGD) in a decentralized fashion has...
Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
Gibbs sampling is a Markov chain Monte Carlo method that is often used f...
Overwrite Quantization: Opportunistic Outlier Handling for Neural Network Accelerators
Outliers in weights and activations pose a key challenge for fixed-point...
PipeMare: Asynchronous Pipeline Parallel DNN Training
Recently there has been a flurry of interest around using pipeline paral...
QPyTorch: A Low-Precision Arithmetic Simulation Framework
Low-precision training reduces computational cost and produces efficient...
SWALP: Stochastic Weight Averaging in Low-Precision Training
Low precision operations can provide scalability, memory savings, portab...
SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...
Distributed Learning with Sublinear Communication
In distributed statistical learning, N samples are split across m machin...
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Quantization can improve the execution latency and energy efficiency of ...
Improving Neural Network Quantization using Outlier Channel Splitting
Quantization can improve the execution latency and energy efficiency of ...
Building Efficient Deep Neural Networks with Unitary Group Convolutions
We propose unitary group convolutions (UGConvs), a building block for CN...
Minibatch Gibbs Sampling on Large Graphical Models
Gibbs sampling is the de facto Markov chain Monte Carlo method used for ...
Channel Gating Neural Networks
Employing deep neural networks to obtain state-of-the-art performance on...
Representation Tradeoffs for Hyperbolic Embeddings
Hyperbolic embeddings offer excellent quality with few dimensions when e...
The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine ...
A Kernel Theory of Modern Data Augmentation
Data augmentation, a technique in which a training set is expanded with ...
High-Accuracy Low-Precision Training
Low-precision computation is often used to lower the time and energy cos...
A Formal Framework For Probabilistic Unclean Databases
Traditional modeling of inconsistency in database theory casts all possi...
Accelerated Stochastic Power Iteration
Principal component analysis (PCA) is one of the most powerful tools in ...
Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data
A challenge in training discriminative models like neural networks is ob...
Parallel SGD: When does averaging help?
Consider a number of workers running SGD independently on the same pool ...
Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that ite...
Data Programming: Creating Large Training Sets, Quickly
Large labeled training sets are the critical building blocks of supervis...
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms
Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variet...
Incremental Knowledge Base Construction Using DeepDive
Populating a database with unstructured information is a long-standing p...
Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems
Stochastic gradient descent (SGD) on a low-rank factorization is commonl...
Christopher De Sa
Assistant Professor in the Computer Science department at Cornell University