Ali Farhadi

research

∙ 07/24/2023

On the Connection between Pre-training Data Diversity and Fine-tuning Robustness

Pre-training has been widely adopted in deep learning to improve model p...

Vivek Ramanujan, et al.

research

∙ 07/11/2023

Objaverse-XL: A Universe of 10M+ 3D Objects

Natural language processing and 2D vision models have attained remarkabl...

Matt Deitke, et al.

research

∙ 06/16/2023

Neural Priming for Sample-Efficient Adaptation

We propose Neural Priming, a technique for adapting large pretrained mod...

Matthew Wallingford, et al.

research

∙ 05/31/2023

Bytes Are All You Need: Transformers Operating Directly On File Bytes

Modern deep learning approaches usually transform inputs into a modality...

Maxwell Horton, et al.

research

∙ 05/30/2023

AdANNS: A Framework for Adaptive Semantic Search

Web-scale search systems learn an encoder to embed a given query which i...

Aniket Rege, et al.

research

∙ 04/27/2023

DataComp: In search of the next generation of multimodal datasets

Large multimodal datasets have been instrumental in recent breakthroughs...

Samir Yitzhak Gadre, et al.

research

∙ 04/25/2023

Stable and low-precision training for large-scale vision-language models

We introduce new methods for 1) accelerating and 2) stabilizing training...

Mitchell Wortsman, et al.

research

∙ 04/24/2023

Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics

A common assumption when training embodied agents is that the impact of ...

Kuo-Hao Zeng, et al.

research

∙ 03/15/2023

Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement

We propose Dataset Reinforcement, a strategy to improve a dataset once s...

Fartash Faghri, et al.

research

∙ 03/08/2023

FastFill: Efficient Compatible Model Update

In many retrieval systems the original high dimensional data (e.g., imag...

Florian Jaeckle, et al.

research

∙ 01/10/2023

Neural Radiance Field Codebooks

Compositional representations of the world are a promising step towards ...

Matthew Wallingford, et al.

research

∙ 12/20/2022

RangeAugment: Efficient Online Augmentation with Range Learning

State-of-the-art automatic augmentation methods (e.g., AutoAugment and R...

Sachin Mehta, et al.

research

∙ 12/15/2022

Objaverse: A Universe of Annotated 3D Objects

Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebIm...

Matt Deitke, et al.

research

∙ 12/09/2022

Object Goal Navigation with End-to-End Self-Supervision

A household robot should be able to navigate to target locations without...

So Yeon Min, et al.

research

∙ 12/08/2022

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

Training embodied agents in simulation has become mainstream for the emb...

Matt Deitke, et al.

research

∙ 12/08/2022

Editing Models with Task Arithmetic

Changing how pre-trained models behave – e.g., improving their performan...

Gabriel Ilharco, et al.

research

∙ 10/19/2022

lo-fi: distributed fine-tuning without communication

When fine-tuning large neural networks, it is common to use multiple nod...

Mitchell Wortsman, et al.

research

∙ 09/27/2022

Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents

The perception system in personalized mobile agents requires developing ...

Yao-Hung Hubert Tsai, et al.

research

∙ 09/23/2022

Safe Real-World Reinforcement Learning for Mobile Agent Obstacle Avoidance

Collision avoidance is key for mobile robots and agents to operate safel...

Mario Srouji, et al.

research

∙ 09/07/2022

What does a platypus look like? Generating customized prompts for zero-shot image classification

Open vocabulary models are a promising new paradigm for image classifica...

Sarah Pratt, et al.

research

∙ 08/10/2022

Patching open-vocabulary models by interpolating weights

Open-vocabulary models like CLIP achieve high accuracy across many image...

Gabriel Ilharco, et al.

research

∙ 07/27/2022

Break and Make: Interactive Structural Understanding Using LEGO Bricks

Visual understanding of geometric structures with complex spatial relati...

Aaron Walsman, et al.

research

∙ 06/14/2022

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Massive datasets and high-capacity models have driven many recent advanc...

Matt Deitke, et al.

research

∙ 05/26/2022

Matryoshka Representations for Adaptive Deployment

Learned representations are a central component in modern ML systems, se...

Aditya Kusupati, et al.

research

∙ 03/15/2022

Object Manipulation via Visual Target Localization

Object manipulation is a critical skill required for Embodied AI agents ...

Lucas Taylor, et al.

research

∙ 03/10/2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

The conventional recipe for maximizing model accuracy is to (1) train mu...

Mitchell Wortsman, et al.

research

∙ 01/02/2022

The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents

The last few years have witnessed substantial progress in the field of e...

Sarah Pratt, et al.

research

∙ 12/06/2021

Forward Compatible Training for Representation Learning

In visual retrieval systems, updating the embedding model requires recom...

Vivek Ramanujan, et al.

research

∙ 12/01/2021

Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

Communicating with humans is challenging for AIs because it requires a s...

Christopher Clark, et al.

research

∙ 10/08/2021

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time

When deploying deep learning models to a device, it is traditionally ass...

Elvis Nunez, et al.

research

∙ 09/04/2021

Robust fine-tuning of zero-shot models

Large pre-trained models such as CLIP offer consistent accuracy across a...

Mitchell Wortsman, et al.

research

∙ 07/07/2021

LanguageRefer: Spatial-Language Model for 3D Visual Grounding

To realize robots that can understand human instructions and perform mea...

Junha Roh, et al.

research

∙ 06/04/2021

MERLOT: Multimodal Neural Script Knowledge Models

As humans, we understand events in the visual world contextually, perfor...

Rowan Zellers, et al.

research

∙ 06/02/2021

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Learning binary representations of instances and classes is a classical ...

Aditya Kusupati, et al.

research

∙ 06/01/2021

PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World

We propose PIGLeT: a model that learns physical commonsense knowledge th...

Rowan Zellers, et al.

research

∙ 04/28/2021

Pushing it out of the Way: Interactive Visual Navigation

We have observed significant progress in visual navigation for embodied ...

Kuo-Hao Zeng, et al.

research

∙ 02/20/2021

Learning Neural Network Subspaces

Recent observations have advanced our understanding of the neural networ...

Mitchell Wortsman, et al.

research

∙ 11/18/2020

Layer-Wise Data-Free CNN Compression

We present an efficient method for compressing a trained neural network ...

Maxwell Horton, et al.

research

∙ 10/16/2020

What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions

Learning effective representations of visual data that generalize to a v...

Lucas Taylor, et al.

research

∙ 07/09/2020

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

Autonomous agents must learn to collaborate. It is not scalable to devel...

Unnat Jain, et al.

research

∙ 07/06/2020

In the Wild: From ML Models to Pragmatic ML Systems

Enabling robust intelligence in the wild entails learning systems that o...

Matthew Wallingford, et al.

research

∙ 06/26/2020

Supermasks in Superposition

We present the Supermasks in Superposition (SupSup) model, capable of se...

Mitchell Wortsman, et al.

research

∙ 05/01/2020

Probing Text Models for Common Ground with Visual Representations

Vision, as a central component of human perception, plays a fundamental ...

Gabriel Ilharco, et al.

research

∙ 04/22/2020

Visual Commonsense Graphs: Reasoning about the Dynamic Context of a Still Image

Even from a single frame of a still image, people can reason about the d...

Jae Sung Park, et al.

research

∙ 04/14/2020

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undenia...

Matt Deitke, et al.

research

∙ 04/07/2020

Evaluating Machines by their Real-World Language Use

There is a fundamental gap between how humans understand and use languag...

Rowan Zellers, et al.

research

∙ 03/26/2020

Grounded Situation Recognition

We introduce Grounded Situation Recognition (GSR), a task that requires ...

Sarah Pratt, et al.

research

∙ 03/18/2020

Watching the World Go By: Representation Learning from Unlabeled Videos

Recent single image unsupervised representation learning techniques show...

Daniel Gordon, et al.

research

∙ 02/15/2020

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

Fine-tuning pretrained contextual word embedding models to supervised do...

Jesse Dodge, et al.

research

∙ 02/08/2020

Soft Threshold Weight Reparameterization for Learnable Sparsity

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the ...

Aditya Kusupati, et al.
