Since its inception in "Attention Is All You Need", transformer architec...
Transformers have demonstrated remarkable success in natural language pr...
The growth and diversity of machine learning applications motivate a ret...
The attention mechanism is a central component of the transformer architectu...
Prompt-tuning is an emerging strategy to adapt large language models (LL...
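As a concrete illustration of the idea (a minimal sketch, not the paper's own method), soft prompt-tuning freezes the pretrained model and trains only a short sequence of continuous prompt embeddings prepended to the input. In the PyTorch sketch below, a toy frozen transformer stands in for the LLM, and names such as soft_prompt and head are illustrative:

    import torch
    import torch.nn as nn

    vocab, d_model, prompt_len, batch = 100, 32, 5, 8

    # Toy stand-in for a pretrained LLM; all backbone weights are frozen.
    embed = nn.Embedding(vocab, d_model)
    backbone = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2)
    for p in list(embed.parameters()) + list(backbone.parameters()):
        p.requires_grad = False

    # Only these parameters are updated for the downstream task.
    soft_prompt = nn.Parameter(0.02 * torch.randn(prompt_len, d_model))
    head = nn.Linear(d_model, 2)  # hypothetical 2-class task head

    tokens = torch.randint(0, vocab, (batch, 16))
    x = torch.cat([soft_prompt.expand(batch, -1, -1), embed(tokens)], dim=1)
    loss = nn.functional.cross_entropy(
        head(backbone(x).mean(dim=1)), torch.randint(0, 2, (batch,)))
    loss.backward()  # gradients reach only soft_prompt and head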
Stochastic approximation with multiple coupled sequences (MSA) has found...
Chain-of-thought (CoT) is a method that enables language models to handl...
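As a toy illustration of the technique (in the style of the canonical arithmetic example from Wei et al., 2022, not taken from this paper), a chain-of-thought prompt includes worked intermediate steps in the few-shot exemplar so the model emits its own reasoning before the final answer:

    cot_prompt = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: <new question goes here>\n"
        "A:"  # the model is expected to continue with step-by-step reasoning
    )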
In numerous robotics and mechanical engineering applications, among othe...
Constructing useful representations across a large number of tasks is a ...
The growing interest in complex decision-making and language modeling pr...
In-context learning (ICL) is a type of prompting where a transformer mod...
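A minimal sketch of the ICL setup as it is usually formalized (the linear target and token layout below are illustrative assumptions, not details from the paper): the model sees a sequence of (x_i, f(x_i)) demonstrations followed by a query, and must predict f at the query with no weight updates.

    import torch

    d, n = 4, 10                          # feature dim, number of demonstrations
    w = torch.randn(d)                    # unknown task: f(x) = <w, x>
    xs = torch.randn(n + 1, d)            # n in-context examples + 1 query
    ys = xs @ w

    # Interleave (x_i, f(x_i)) tokens; the query's label slot stays empty.
    tokens = torch.zeros(2 * n + 1, d + 1)
    tokens[0::2, :d] = xs                 # x tokens at even positions
    tokens[1::2, d] = ys[:n]              # y tokens at odd positions
    # A transformer trained for ICL maps `tokens` to a prediction of ys[-1],
    # i.e. it infers f from the demonstrations alone, on the fly.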
Bilinear dynamical systems are ubiquitous in many different domains and ...
Humans are capable of adjusting to changing environments flexibly and qu...
Standard federated optimization methods successfully apply to stochastic...
This paper studies the problem of identifying low-order linear systems v...
In continual learning (CL), the goal is to design models that can learn ...
An overarching goal in machine learning is to build a generalizable mode...
In this paper, we study representation learning for multi-task decision-...
Imbalanced datasets are commonplace in modern machine learning problems....
Learning how to effectively control unknown dynamical systems is crucial...
Estimating how well a machine learning model performs during inference i...
Real-world control applications often involve complex dynamics subject t...
Neural Architecture Search (NAS) is a popular method for automatically d...
Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for...
Label-imbalanced and group-sensitive classification seeks to appropriate...
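One representative loss modification of this kind is logit adjustment (Menon et al., 2021), named here explicitly because the truncated abstract does not show which modification this paper itself proposes: the cross-entropy logits are shifted by scaled log class priors \pi_c,

    \ell\big(y, f(x)\big) \;=\; -\log \frac{e^{\,f_y(x) + \tau \log \pi_y}}
    {\sum_{c} e^{\,f_c(x) + \tau \log \pi_c}}, \qquad \tau > 0,

which encourages larger margins on rare classes than plain cross-entropy does.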
Conventional wisdom dictates that the learning rate should be in the stable ...
Constructing good representations is critical for learning complex tasks...
Deep networks are typically trained with many more parameters than the s...
Active learning is the set of techniques for intelligently labeling larg...
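A standard acquisition rule in this family is margin-based uncertainty sampling; a minimal sketch follows (the pool, seed set, and base model are generic stand-ins, not details of the paper):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def query_indices(model, X_pool, batch=10):
        """Pick the pool points the current model is least sure about."""
        proba = np.sort(model.predict_proba(X_pool), axis=1)
        margin = proba[:, -1] - proba[:, -2]   # gap between top two classes
        return np.argsort(margin)[:batch]      # smallest margin = most uncertain

    X = np.random.randn(200, 5); y = (X[:, 0] > 0).astype(int)
    model = LogisticRegression().fit(X[:20], y[:20])  # small labeled seed set
    idx = query_indices(model, X[20:])                # points to label next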
Contemporary machine learning applications often involve classification ...
Paraphrasing is expressing the meaning of an input sentence in different...
Self-training is a classical approach in semi-supervised learning which ...
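The classical loop is easy to state; here is a minimal sketch (the confidence threshold, base model, and round count are illustrative choices, not the paper's):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def self_train(X_lab, y_lab, X_unlab, threshold=0.95, rounds=5):
        X, y = X_lab, y_lab
        model = LogisticRegression().fit(X, y)
        for _ in range(rounds):
            if len(X_unlab) == 0:
                break
            proba = model.predict_proba(X_unlab)
            keep = proba.max(axis=1) >= threshold   # confident predictions only
            if not keep.any():
                break
            X = np.vstack([X, X_unlab[keep]])       # adopt pseudo-labeled points
            y = np.concatenate([y, model.predict(X_unlab[keep])])
            X_unlab = X_unlab[~keep]
            model = LogisticRegression().fit(X, y)  # retrain on enlarged set
        return model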
Model pruning is an essential procedure for building compact and computa...
Safety-critical applications require machine learning models that output...
We consider the problem of learning stabilizable systems governed by non...
We study the problem of finding the best linear model that can minimize ...
Modern neural network architectures often generalize well despite contai...
Modern neural networks are typically trained in an over-parameterized re...
Many modern neural network architectures are trained in an overparameter...
Many modern learning tasks involve fitting nonlinear models to data whic...
We study discrete-time dynamical systems governed by the state equation ...
We consider the problem of learning a realization for a linear time-inva...
In this paper we study the problem of learning the weights of a deep con...
We study the impact of regularization for learning neural networks. Our ...
For various applications, the relations between the dependent and indepe...
In this paper we study the problem of recovering a structured but unknow...
Dimension reduction is the process of embedding high-dimensional data in...
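PCA is the canonical example of such an embedding; a minimal NumPy sketch (illustrative only, not this paper's method):

    import numpy as np

    def pca_embed(X, k):
        Xc = X - X.mean(axis=0)                         # center the data
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T                            # top-k principal directions

    Z = pca_embed(np.random.randn(500, 50), k=2)        # 50-dim data -> 2-dim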
In this paper we show that for the purposes of dimensionality reduction ...
Finding "densely connected clusters" in a graph is in general an importa...
Nuclear norm minimization (NNM) has recently gained significant attentio...
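For reference, the standard NNM program (the convex relaxation of rank minimization; the specific constraint set studied in this paper is not visible in the truncated abstract) reads

    \min_{X \in \mathbb{R}^{n_1 \times n_2}} \; \|X\|_*
    \quad \text{subject to} \quad \mathcal{A}(X) = b,

where \|X\|_* is the sum of the singular values of X, the tightest convex surrogate of rank(X) over the spectral-norm unit ball, and \mathcal{A} is a linear measurement operator.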