Mechanistic interpretability seeks to understand the neural mechanisms t...
Learning multimodal representations involves integrating information fro...
In a wide range of multimodal tasks, contrastive learning has become a
p...
Societal biases present in pre-trained large language models are a criti...
In many machine learning systems that jointly learn from multiple modali...
Multimodal fusion of multiple heterogeneous and interconnected signals i...
Self-supervised learning (SSL) and the objective of masking-and-predicti...
Accurately modeling affect dynamics, which refers to the changes and
flu...
The recent explosion of interest in multimodal applications has resulted...
High sample complexity has long been a challenge for RL. On the other ha...
Despite recent progress towards scaling up multimodal vision-language mo...
Pretrained language models have demonstrated extraordinary capabilities ...
Multimodal machine learning is a vibrant multi-disciplinary research fie...
Lecture slide presentations, a sequence of pages that contain text and
f...
Creating artificial social intelligence - algorithms that can understand...
The promise of multimodal models for real-world applications has inspire...
In order for AI to be safely deployed in real-world scenarios such as
ho...
The ability for a human to understand an Artificial Intelligence (AI) mo...
Learning multimodal representations involves discovering correspondences...
Learning multimodal representations involves integrating information fro...
As machine learning methods are deployed in real-world settings such as
...
Mental health conditions remain underdiagnosed even in countries with co...
In many real-world scenarios where extrinsic rewards to the agent are
ex...
Existing approaches to ensuring privacy of user speech data primarily fo...
The natural world is abundant with concepts expressed via visual, acoust...
Mental health conditions remain under-diagnosed even in countries with c...
It has been hypothesized that label smoothing can reduce overfitting and...
As natural language processing methods are increasingly deployed in
real...
Learning continuous representations of discrete objects such as text, us...
Multi-agent trajectory forecasting in autonomous driving requires an age...
Several recent works have found the emergence of grounded compositional
...
Learning in the presence of label noise is a challenging yet important t...
Federated learning is an emerging research paradigm to train models on
p...
The complex world around us is inherently multimodal and sequential
(con...
There has been an increased interest in multimodal language processing
i...
We deal with the selective classification problem
(supervised-learning p...
Human language is often multimodal, which comprehends a mixture of natur...
Human language is a rich multimodal signal consisting of spoken words, f...
Learning a generative model from partial data (data with missingness) is...
Learning a generative model from partial data (data with missingness) is...
The power of randomized algorithms in numerical methods have led to fast...
Multimodal sentiment analysis is a core research area that studies speak...
Humans convey their intentions through the usage of both verbal and nonv...
Computational modeling of human multimodal language is an emerging resea...
Emotion recognition is a core research area at the intersection of artif...
Multimodal machine learning is a core research area spanning the languag...
Learning representations of multimodal data is a fundamentally complex
r...
Multimodal research is an emerging field of artificial intelligence, and...
Multi-view sequential learning is a fundamental problem in machine learn...
With the increasing popularity of video sharing websites such as YouTube...