Since its inception in "Attention Is All You Need", transformer architec...
Transformers have demonstrated remarkable success in natural language pr...
The growth and diversity of machine learning applications motivate a ret...
The attention mechanism is a central component of the transformer architectu...
Prompt-tuning is an emerging strategy to adapt large language models (LL...
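As a concrete illustration of the idea (a minimal sketch, not the paper's own method), soft prompt-tuning freezes the pretrained model and trains only a short sequence of continuous prompt embeddings prepended to the input. In the PyTorch sketch below, a toy frozen transformer stands in for the LLM, and names such as soft_prompt and head are illustrative:

    import torch
    import torch.nn as nn

    vocab, d_model, prompt_len, batch = 100, 32, 5, 8

    # Toy stand-in for a pretrained LLM; all backbone weights are frozen.
    embed = nn.Embedding(vocab, d_model)
    backbone = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2)
    for p in list(embed.parameters()) + list(backbone.parameters()):
        p.requires_grad = False

    # Only these parameters are updated for the downstream task.
    soft_prompt = nn.Parameter(0.02 * torch.randn(prompt_len, d_model))
    head = nn.Linear(d_model, 2)  # hypothetical 2-class task head

    tokens = torch.randint(0, vocab, (batch, 16))
    x = torch.cat([soft_prompt.expand(batch, -1, -1), embed(tokens)], dim=1)
    loss = nn.functional.cross_entropy(
        head(backbone(x).mean(dim=1)), torch.randint(0, 2, (batch,)))
    loss.backward()  # gradients reach only soft_prompt and head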
Stochastic approximation with multiple coupled sequences (MSA) has found...
Chain-of-thought (CoT) is a method that enables language models to handl...
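As a toy illustration of the technique (in the style of the canonical arithmetic example from Wei et al., 2022, not taken from this paper), a chain-of-thought prompt includes worked intermediate steps in the few-shot exemplar so the model emits its own reasoning before the final answer:

    cot_prompt = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: <new question goes here>\n"
        "A:"  # the model is expected to continue with step-by-step reasoning
    )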
In numerous robotics and mechanical engineering applications, among othe...
Constructing useful representations across a large number of tasks is a ...
The growing interest in complex decision-making and language modeling pr...
In-context learning (ICL) is a type of prompting where a transformer mod...
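A minimal sketch of the ICL setup as it is usually formalized (the linear target and token layout below are illustrative assumptions, not details from the paper): the model sees a sequence of (x_i, f(x_i)) demonstrations followed by a query, and must predict f at the query with no weight updates.

    import torch

    d, n = 4, 10                          # feature dim, number of demonstrations
    w = torch.randn(d)                    # unknown task: f(x) = <w, x>
    xs = torch.randn(n + 1, d)            # n in-context examples + 1 query
    ys = xs @ w

    # Interleave (x_i, f(x_i)) tokens; the query's label slot stays empty.
    tokens = torch.zeros(2 * n + 1, d + 1)
    tokens[0::2, :d] = xs                 # x tokens at even positions
    tokens[1::2, d] = ys[:n]              # y tokens at odd positions
    # A transformer trained for ICL maps `tokens` to a prediction of ys[-1],
    # i.e. it infers f from the demonstrations alone, on the fly.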
Bilinear dynamical systems are ubiquitous in many different domains and ...
Humans are capable of adjusting to changing environments flexibly and qu...
Standard federated optimization methods successfully apply to stochastic...
This paper studies the problem of identifying low-order linear systems v...
In continual learning (CL), the goal is to design models that can learn ...
An overarching goal in machine learning is to build a generalizable mode...
In this paper, we study representation learning for multi-task decision-...
Imbalanced datasets are commonplace in modern machine learning problems....
Learning how to effectively control unknown dynamical systems is crucial...
Estimating how well a machine learning model performs during inference i...
Real-world control applications often involve complex dynamics subject t...
Neural Architecture Search (NAS) is a popular method for automatically d...
Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for...
Label-imbalanced and group-sensitive classification seeks to appropriate...
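One representative loss modification of this kind is logit adjustment (Menon et al., 2021), named here explicitly because the truncated abstract does not show which modification this paper itself proposes: the cross-entropy logits are shifted by scaled log class priors \pi_c,

    \ell\big(y, f(x)\big) \;=\; -\log \frac{e^{\,f_y(x) + \tau \log \pi_y}}
    {\sum_{c} e^{\,f_c(x) + \tau \log \pi_c}}, \qquad \tau > 0,

which encourages larger margins on rare classes than plain cross-entropy does.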
Conventional wisdom dictates that the learning rate should be in the stable ...
Constructing good representations is critical for learning complex tasks...
Deep networks are typically trained with many more parameters than the s...
Active learning is the set of techniques for intelligently labeling larg...
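A standard acquisition rule in this family is margin-based uncertainty sampling; a minimal sketch follows (the pool, seed set, and base model are generic stand-ins, not details of the paper):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def query_indices(model, X_pool, batch=10):
        """Pick the pool points the current model is least sure about."""
        proba = np.sort(model.predict_proba(X_pool), axis=1)
        margin = proba[:, -1] - proba[:, -2]   # gap between top two classes
        return np.argsort(margin)[:batch]      # smallest margin = most uncertain

    X = np.random.randn(200, 5); y = (X[:, 0] > 0).astype(int)
    model = LogisticRegression().fit(X[:20], y[:20])  # small labeled seed set
    idx = query_indices(model, X[20:])                # points to label next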
Contemporary machine learning applications often involve classification ...
Paraphrasing is expressing the meaning of an input sentence in different...
Self-training is a classical approach in semi-supervised learning which ...
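The classical loop is easy to state; here is a minimal sketch (the confidence threshold, base model, and round count are illustrative choices, not the paper's):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def self_train(X_lab, y_lab, X_unlab, threshold=0.95, rounds=5):
        X, y = X_lab, y_lab
        model = LogisticRegression().fit(X, y)
        for _ in range(rounds):
            if len(X_unlab) == 0:
                break
            proba = model.predict_proba(X_unlab)
            keep = proba.max(axis=1) >= threshold   # confident predictions only
            if not keep.any():
                break
            X = np.vstack([X, X_unlab[keep]])       # adopt pseudo-labeled points
            y = np.concatenate([y, model.predict(X_unlab[keep])])
            X_unlab = X_unlab[~keep]
            model = LogisticRegression().fit(X, y)  # retrain on enlarged set
        return model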
Model pruning is an essential procedure for building compact and computa...
Safety-critical applications require machine learning models that output...
We consider the problem of learning stabilizable systems governed by non...
We study the problem of finding the best linear model that can minimize ...
Modern neural network architectures often generalize well despite contai...
Modern neural networks are typically trained in an over-parameterized re...
Many modern neural network architectures are trained in an overparameter...
Many modern learning tasks involve fitting nonlinear models to data whic...
We study discrete-time dynamical systems governed by the state equation ...
We consider the problem of learning a realization for a linear time-inva...
In this paper we study the problem of learning the weights of a deep con...
We study the impact of regularization for learning neural networks. Our ...
For various applications, the relations between the dependent and indepe...
In this paper we study the problem of recovering a structured but unknow...
Dimension reduction is the process of embedding high-dimensional data in...
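PCA is the canonical example of such an embedding; a minimal NumPy sketch (illustrative only, not this paper's method):

    import numpy as np

    def pca_embed(X, k):
        Xc = X - X.mean(axis=0)                         # center the data
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T                            # top-k principal directions

    Z = pca_embed(np.random.randn(500, 50), k=2)        # 50-dim data -> 2-dim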
In this paper we show that for the purposes of dimensionality reduction ...
Finding "densely connected clusters" in a graph is in general an importa...
Nuclear norm minimization (NNM) has recently gained significant attentio...
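For reference, the standard NNM program (the convex relaxation of rank minimization; the specific constraint set studied in this paper is not visible in the truncated abstract) reads

    \min_{X \in \mathbb{R}^{n_1 \times n_2}} \; \|X\|_*
    \quad \text{subject to} \quad \mathcal{A}(X) = b,

where \|X\|_* is the sum of the singular values of X, the tightest convex surrogate of rank(X) over the spectral-norm unit ball, and \mathcal{A} is a linear measurement operator.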