Rio Yokota

research

∙ 08/29/2023

Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library

NVIDIA Tensor Core is a mixed-precision matrix-matrix multiplication and...

0 Hiroyuki Ootomo, et al. ∙

research

∙ 07/27/2023

Pre-training Vision Transformers with Very Limited Synthesized Images

Formula-driven supervised learning (FDSL) is a pre-training method that ...

0 Ryo Nakamura, et al. ∙

research

∙ 06/21/2023

DGEMM on Integer Matrix Multiplication Unit

Deep learning hardware achieves high throughput and low power consumptio...

0 Hiroyuki Ootomo, et al. ∙

research

∙ 05/08/2023

ASDL: A Unified Interface for Gradient Preconditioning in PyTorch

Gradient preconditioning is a key technique to integrate the second-orde...

0 Kazuki Osawa, et al. ∙

research

∙ 04/10/2023

Mixed-Precision Random Projection for RandNLA on Tensor Cores

Random projection can reduce the dimension of data while capturing its s...

0 Hiroyuki Ootomo, et al. ∙

research

∙ 03/15/2023

Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection

Quantum circuit simulation provides the foundation for the development o...

0 Hiroyuki Ootomo, et al. ∙

research

∙ 03/02/2023

Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves

Formula-driven supervised learning (FDSL) has been shown to be an effect...

0 Sora Takashima, et al. ∙

research

∙ 11/18/2022

Informative Sample-Aware Proxy for Deep Metric Learning

Among various supervised deep metric learning methods proxy-based approa...

0 Aoyu Li, et al. ∙

research

∙ 11/15/2022

Empirical Study on Optimizer Selection for Out-of-Distribution Generalization

Modern deep learning systems are fragile and do not generalize well unde...

0 Hiroki Naganuma, et al. ∙

research

∙ 08/23/2022

Scalable Linear Time Dense Direct Solver for 3-D Problems Without Trailing Sub-Matrix Dependencies

Factorization of large dense matrices are ubiquitous in engineering and ...

0 Qianxiang Ma, et al. ∙

research

∙ 08/12/2022

Parallel QR Factorization of Block Low-Rank Matrices

We present two new algorithms for Householder QR factorization of Block ...

0 M. Ridwan Apriansyah, et al. ∙

research

∙ 06/18/2022

Replacing Labeled Real-image Datasets with Auto-generated Contours

In the present work, we show that the performance of formula-driven supe...

11 Hirokatsu Kataoka, et al. ∙

research

∙ 03/07/2022

Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance

Tensor Core is a mixed-precision matrix-matrix multiplication unit on NV...

0 Hiroyuki Ootomo, et al. ∙

research

∙ 09/09/2021

OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching

Inverse Reinforcement Learning (IRL) is attractive in scenarios where re...

29 Hana Hoshino, et al. ∙

research

∙ 04/01/2021

RePOSE: Real-Time Iterative Rendering and Refinement for 6D Object Pose Estimation

The use of iterative pose refinement is a critical processing step for 6...

3 Shun Iwase, et al. ∙

research

∙ 07/30/2020

Epipolar-Guided Deep Object Matching for Scene Change Detection

This paper describes a viewpoint-robust object-based change detection ne...

1 Kento Doi, et al. ∙

research

∙ 02/13/2020

Scalable and Practical Natural Gradient for Large-Scale Deep Learning

Large-scale distributed training of deep neural networks results in mode...

0 Kazuki Osawa, et al. ∙

research

∙ 10/30/2019

Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis

Hierarchical Matrix (H-matrix) is an approximation technique which split...

0 Rise Ooi, et al. ∙

research

∙ 06/06/2019

Practical Deep Learning with Bayesian Principles

Bayesian methods promise to fix many shortcomings of deep learning, but ...

23 Kazuki Osawa, et al. ∙

research

∙ 11/29/2018

Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs

Large-scale distributed training of deep neural networks suffer from the...

0 Kazuki Osawa, et al. ∙

research

∙ 03/27/2018

Extreme Scale FMM-Accelerated Boundary Integral Equation Solver for Wave Scattering

Algorithmic and architecture-oriented optimizations are essential for ac...

0 Mustafa AbdulJabbar, et al. ∙
