
-
Stable Weight Decay Regularization
Weight decay is a popular regularization technique for training of deep ...
-
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Deep learning is often criticized by two serious issues which rarely exi...
-
Classification from Ambiguity Comparisons
Labeling data is an unavoidable pre-processing procedure for most machin...
-
Diagnostic Uncertainty Calibration: Towards Reliable Machine Predictions in Medical Domain
Label disagreement between human experts is a common issue in the medica...
-
Adai: Separating the Effects of Adaptive Learning Rate and Momentum Inertia
Adaptive Momentum Estimation (Adam), which combines Adaptive Learning Ra...
-
LFD-ProtoNet: Prototypical Network Based on Local Fisher Discriminant Analysis for Few-shot Learning
The prototypical network (ProtoNet) is a few-shot learning framework tha...
-
γ-ABC: Outlier-Robust Approximate Bayesian Computation based on Robust Divergence Estimator
Making a reliable inference in complex models is an essential issue in s...
-
Similarity-based Classification: Connecting Similarity Learning to Binary Classification
In real-world classification problems, pairwise supervision (i.e., a pai...
-
Sequential Gallery for Interactive Visual Design Optimization
Visual design tasks often involve tuning many design parameters. For exa...
-
Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time
The Gaussian process bandit is a problem in which we want to find a maxi...
-
Few-shot Domain Adaptation by Causal Mechanism Transfer
We study few-shot supervised domain adaptation (DA) for regression probl...
-
A Diffusion Theory for Deep Learning Dynamics: Stochastic Gradient Descent Escapes From Sharp Minima Exponentially Fast
Stochastic optimization algorithms, such as Stochastic Gradient Descent ...
-
Bayesian interpretation of SGD as Ito process
The current interpretation of stochastic gradient descent (SGD) as a sto...
-
Classification from Triplet Comparison Data
Learning from triplet comparison data has been extensively studied in th...
-
Interactive Subspace Exploration on Generative Image Modelling
Generative image modeling techniques such as GAN demonstrate highly conv...
-
Solving NP-Hard Problems on Graphs by Reinforcement Learning without Domain Knowledge
We propose an algorithm based on reinforcement learning for solving NP-h...
-
Directing DNNs Attention for Facial Attribution Classification using Gradient-weighted Class Activation Mapping
Deep neural networks (DNNs) have a high accuracy on image classification...
-
Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization
Pairwise similarities and dissimilarities between data points might be e...
-
Use of Ghost Cytometry to Differentiate Cells with Similar Gross Morphologic Characteristics
Imaging flow cytometry shows significant potential for increasing our un...
-
On Learning from Ghost Imaging without Imaging
Computational ghost imaging is an imaging technique with which an object...
-
On Transformations in Stochastic Gradient MCMC
Stochastic gradient Langevin dynamics (SGLD) is a widely used sampler fo...
-
PAC-Bayes Analysis of Sentence Representation
Learning sentence vectors from an unlabeled corpus has attracted attenti...
-
Online Multiclass Classification Based on Prediction Margin for Partial Feedback
We consider the problem of online multiclass classification with partial...
-
Multi-level Monte Carlo Variational Inference
In many statistics and machine learning frameworks, stochastic optimizat...
-
Semi-Supervised Ordinal Regression Based on Empirical Risk Minimization
We consider the semi-supervised ordinal regression problem, where unlabe...
-
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis
The notion of flat minima has played a key role in the generalization pr...
-
Clipped Matrix Completion: a Remedy for Ceiling Effects
We consider the recovery of a low-rank matrix from its clipped observati...
-
On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions
Data-agnostic quasi-imperceptible perturbations on inputs can severely d...
-
Unsupervised Domain Adaptation Based on Source-guided Discrepancy
Unsupervised domain adaptation is the problem setting where data generat...
-
Frank-Wolfe Stein Sampling
In Bayesian inference, the posterior distributions are difficult to obta...
-
Variational Inference for Gaussian Process with Panel Count Data
We present the first framework for Gaussian-process-modulated Poisson pr...
-
Analysis of Minimax Error Rate for Crowdsourcing and Its Application to Worker Clustering Model
While crowdsourcing has become an important means to label data, crowdwo...
-
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks
High sensitivity of neural networks against malicious perturbations on i...
-
Gaussian Process Classification with Privileged Information by Soft-to-Hard Labeling Transfer
Learning using privileged information is an attractive problem setting t...
-
Variational Inference based on Robust Divergences
Robustness to outliers is a central issue in real-world machine learning...
-
On the Model Shrinkage Effect of Gamma Process Edge Partition Models
The edge partition model (EPM) is a fundamental Bayesian nonparametric m...
-
Expectation Propagation for t-Exponential Family Using Q-Algebra
Exponential family distributions are highly useful in machine learning s...
-
Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling
Analyzing the underlying structure of multiple time-sequences provides i...
-
Stochastic Divergence Minimization for Biterm Topic Model
As the emergence and the thriving development of social networks, a huge...
-
Revisiting Distributionally Robust Supervised Learning in Classification
Distributionally Robust Supervised Learning (DRSL) is necessary for buil...
-
Reparameterization trick for discrete variables
Low-variance gradient estimation is crucial for learning directed graphi...
-
Generative Adversarial Nets from a Density Ratio Estimation Perspective
Generative adversarial networks (GANs) are successful deep generative mo...
-
Quantum Annealing for Variational Bayes Inference
This paper presents studies on a deterministic annealing algorithm based...
-
Quantum Annealing for Dirichlet Process Mixture Models with Applications to Network Clustering
We developed a new quantum annealing (QA) algorithm for Dirichlet proces...
-
Rethinking Collapsed Variational Bayes Inference for LDA
We propose a novel interpretation of the collapsed variational Bayes inf...
-
Restricted Collapsed Draw: Accurate Sampling for Hierarchical Chinese Restaurant Process Hidden Markov Models
We propose a restricted collapsed draw (RCD) sampler, a general Markov c...