Taiji Suzuki

research

∙ 09/07/2023

Gradient-Based Feature Learning under Structured Data

Recent works have demonstrated that the sample complexity of gradient-ba...

0 Alireza Mousavi Hosseini, et al. ∙

research

∙ 08/01/2023

Learning Green's Function Efficiently Using Low-Rank Approximations

Learning the Green's function using deep learning models enables to solv...

0 Kishan Wimalawarne, et al. ∙

research

∙ 06/24/2023

Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective

Graph neural networks (GNNs) have pioneered advancements in graph repres...

0 Wei Huang, et al. ∙

research

∙ 06/12/2023

Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction

The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of...

0 Taiji Suzuki, et al. ∙

research

∙ 05/30/2023

Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input

Despite the great success of Transformer networks in various application...

0 Shokichi Takakura, et al. ∙

research

∙ 05/13/2023

Tight and fast generalization error bound of graph embedding in metric space

Recent studies have experimentally shown that we can achieve in non-Eucl...

0 Atsushi Suzuki, et al. ∙

research

∙ 03/06/2023

Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems

The entropic fictitious play (EFP) is a recently proposed algorithm that...

0 Atsushi Nitanda, et al. ∙

research

∙ 03/03/2023

Diffusion Models are Minimax Optimal Distribution Estimators

While efficient distribution learning is no doubt behind the groundbreak...

0 Kazusato Oko, et al. ∙

research

∙ 02/12/2023

Koopman-Based Bound for Generalization: New Aspect of Neural Networks Regarding Nonlinear Noise Filtering

We propose a new bound for generalization of neural networks using Koopm...

0 Yuka Hashimoto, et al. ∙

research

∙ 02/08/2023

DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

Differential private optimization for nonconvex smooth objective is cons...

0 Tomoya Murata, et al. ∙

research

∙ 09/12/2022

Graph Polynomial Convolution Models for Node Classification of Non-Homophilous Graphs

We investigate efficient learning from higher-order graph convolution an...

0 Kishan Wimalawarne, et al. ∙

research

∙ 09/01/2022

Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning

While variance reduction methods have shown great success in solving lar...

0 Kazusato Oko, et al. ∙

research

∙ 05/30/2022

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods

While deep learning has outperformed other methods for various tasks, th...

0 Shunta Akiyama, et al. ∙

research

∙ 03/30/2022

Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

The stochastic gradient Langevin Dynamics is one of the most fundamental...

0 Yuri Kinoshita, et al. ∙

research

∙ 03/19/2022

Convergence Error Analysis of Reflected Gradient Langevin Dynamics for Globally Optimizing Non-Convex Constrained Problems

Non-convex optimization problems have various important applications, wh...

0 Kanji Sato, et al. ∙

research

∙ 02/12/2022

Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning

In recent centralized nonconvex distributed learning and federated learn...

0 Tomoya Murata, et al. ∙

research

∙ 01/25/2022

Convex Analysis of the Mean Field Langevin Dynamics

As an example of the nonlinear Fokker-Planck equation, the mean field La...

0 Atsushi Nitanda, et al. ∙

research

∙ 12/25/2021

Neural Network Module Decomposition and Recomposition

We propose a modularization method that decomposes a deep neural network...

0 Hiroaki Kingetsu, et al. ∙

research

∙ 08/24/2021

Adaptive and Interpretable Graph Convolution Networks Using Generalized Pagerank

We investigate adaptive layer-wise graph convolution in deep GCN models....

0 Kishan Wimalawarne, et al. ∙

research

∙ 08/05/2021

AutoLL: Automatic Linear Layout of Graphs based on Deep Neural Network

Linear layouts are a graph visualization method that can be used to capt...

8 Chihiro Watanabe, et al. ∙

research

∙ 06/11/2021

On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting

Deep learning empirically achieves high performance in many applications...

0 Shunta Akiyama, et al. ∙

research

∙ 03/26/2021

Deep Two-Way Matrix Reordering for Relational Data Analysis

Matrix reordering is a task to permute the rows and columns of a given o...

0 Chihiro Watanabe, et al. ∙

research

∙ 02/23/2021

Goodness-of-fit Test on the Number of Biclusters in Relational Data Matrix

Biclustering is a method for detecting homogeneous submatrices in a give...

7 Chihiro Watanabe, et al. ∙

research

∙ 02/05/2021

Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning

Federated learning is one of the important learning scenarios in distrib...

0 Tomoya Murata, et al. ∙

research

∙ 12/31/2020

Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis

We propose the particle dual averaging (PDA) method, which generalizes t...

0 Atsushi Nitanda, et al. ∙

research

∙ 12/06/2020

Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods

Establishing a theoretical analysis that explains why deep learning can ...

0 Taiji Suzuki, et al. ∙

research

∙ 09/23/2020

Estimation error analysis of deep learning on the regression problem on the variable exponent Besov space

Deep learning has achieved notable success in various fields, including ...

18 Kazuma Tsuji, et al. ∙

research

∙ 09/19/2020

Neural Architecture Search Using Stable Rank of Convolutional Layers

In Neural Architecture Search (NAS), Differentiable ARchiTecture Search ...

18 Kengo Machida, et al. ∙

research

∙ 07/11/2020

Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics

We introduce a new theoretical framework to analyze deep learning optimi...

5 Taiji Suzuki, et al. ∙

research

∙ 06/22/2020

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime

We analyze the convergence of the averaged stochastic gradient descent f...

14 Atsushi Nitanda, et al. ∙

research

∙ 06/19/2020

Gradient Descent in RKHS with Importance Labeling

Labeling cost is often expensive and is a fundamental limitation of supe...

15 Tomoya Murata, et al. ∙

research

∙ 06/18/2020

When Does Preconditioning Help or Hurt Generalization?

While second order optimizers such as natural gradient descent (NGD) oft...

0 Shun-ichi Amari, et al. ∙

research

∙ 06/15/2020

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

It is known that the current graph neural networks (GNNs) are difficult ...

0 Kenta Oono, et al. ∙

research

∙ 05/27/2020

Selective Inference for Latent Block Models

Model selection in latent block models has been a challenging but import...

21 Chihiro Watanabe, et al. ∙

research

∙ 03/04/2020

Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding Meta-Amortization Error

The ability to learn new concepts with small amounts of data is a crucia...

10 Yusuke Hayashi, et al. ∙

research

∙ 02/29/2020

Dimension-free convergence rates for gradient Langevin dynamics in RKHS

Gradient Langevin dynamics (GLD) and stochastic GLD (SGLD) have attracte...

12 Boris Muzellec, et al. ∙

research

∙ 01/14/2020

Understanding Generalization in Deep Learning via Tensor Methods

Deep neural networks generalize well on unseen data though the number of...

67 Jingling Li, et al. ∙

research

∙ 12/26/2019

Domain Adaptation Regularization for Spectral Pruning

Deep Neural Networks (DNNs) have recently been achieving state-of-the-ar...

11 Laurent Dillard, et al. ∙

research

∙ 11/13/2019

Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features

Although kernel methods are widely used in many learning problems, they ...

12 Shingo Yashima, et al. ∙

research

∙ 10/29/2019

Scalable Deep Neural Networks via Low-Rank Matrix Factorization

Compressing deep neural networks (DNNs) is important for real-world appl...

42 Atsushi Yaguchi, et al. ∙

research

∙ 10/28/2019

Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space

Deep learning has exhibited superior performance for various tasks, espe...

11 Taiji Suzuki, et al. ∙

research

∙ 09/25/2019

Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network

One of biggest issues in deep learning theory is its generalization abil...

8 Taiji Suzuki, et al. ∙

research

∙ 09/09/2019

Understanding the Effects of Pre-Training for Object Detectors via Eigenspectrum

ImageNet pre-training has been regarded as essential for training accura...

8 Yosuke Shinya, et al. ∙

research

∙ 06/26/2019

Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD

Large-batch stochastic gradient descent (SGD) is widely used for trainin...

4 Kosuke Haruki, et al. ∙

research

∙ 06/10/2019

Goodness-of-fit Test for Latent Block Models

Latent Block Models are used for probabilistic biclustering, which is sh...

2 Chihiro Watanabe, et al. ∙

research

∙ 05/29/2019

Accelerated Sparsified SGD with Error Feedback

We study a stochastic gradient method for synchronous distributed optimi...

0 Tomoya Murata, et al. ∙

research

∙ 05/27/2019

On Asymptotic Behaviors of Graph CNNs from Dynamical Systems Perspective

Graph Convolutional Neural Networks (graph CNNs) are a promising deep le...

3 Kenta Oono, et al. ∙

research

∙ 05/23/2019

Refined Generalization Analysis of Gradient Descent for Over-parameterized Two-layer Neural Networks with Smooth Activations on Classification Problems

Recently, several studies have proven the global convergence and general...

2 Atsushi Nitanda, et al. ∙

research

∙ 05/22/2019

On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces

Deep learning has been applied to various tasks in the field of machine ...

6 Satoshi Hayakawa, et al. ∙

research

∙ 03/24/2019

Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks

Convolutional neural networks (CNNs) have been shown to achieve optimal ...

0 Kenta Oono, et al. ∙
