Manzil Zaheer

research

∙ 07/26/2023

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex t...

0 Kensen Shi, et al. ∙

research

∙ 05/20/2023

Can Public Large Language Models Help Private Cross-device Federated Learning?

We study (differentially) private federated learning (FL) of language mo...

0 Boxin Wang, et al. ∙

research

∙ 05/04/2023

Adaptive Selection of Anchor Items for CUR-based k-NN search with Cross-Encoders

Cross-encoder models, which jointly encode and score a query-item pair, ...

0 Nishant Yadav, et al. ∙

research

∙ 03/27/2023

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

Dual encoder models are ubiquitous in modern classification and retrieva...

0 Nicholas Monath, et al. ∙

research

∙ 02/03/2023

ResMem: Learn what you can and memorize the rest

The impressive generalization performance of modern neural networks is a...

2 Zitong Yang, et al. ∙

research

∙ 01/27/2023

EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval

Large neural models (such as Transformers) achieve state-of-the-art perf...

12 Seungyeon Kim, et al. ∙

research

∙ 12/09/2022

Multi-Task Off-Policy Learning from Bandit Feedback

Many practical applications, such as recommender systems and learning to...

0 Joey Hong, et al. ∙

research

∙ 12/01/2022

Differentially Private Adaptive Optimization with Delayed Preconditioners

Privacy noise may negate the benefits of using adaptive optimizers in di...

0 Tian Li, et al. ∙

research

∙ 11/09/2022

Large Language Models with Controllable Working Memory

Large language models (LLMs) have led to a series of breakthroughs in na...

6 Daliang Li, et al. ∙

research

∙ 10/31/2022

Learning to Navigate Wikipedia by Taking Random Walks

A fundamental ability of an intelligent web-based agent is seeking out a...

0 Manzil Zaheer, et al. ∙

research

∙ 10/23/2022

Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

Efficient k-nearest neighbor search is a fundamental task, foundational ...

0 Nishant Yadav, et al. ∙

research

∙ 10/07/2022

Longtonotes: OntoNotes with Longer Coreference Chains

Ontonotes has served as the most important benchmark for coreference res...

0 Kumar Shridhar, et al. ∙

research

∙ 10/06/2022

Generalization Properties of Retrieval-based Models

Many modern high-performing machine learning models such as GPT-3 primar...

0 Soumya Basu, et al. ∙

research

∙ 10/05/2022

A Fourier Approach to Mixture Learning

We revisit the problem of learning mixtures of spherical Gaussians. Give...

0 Mingda Qiao, et al. ∙

research

∙ 06/21/2022

Questions Are All You Need to Train a Dense Passage Retriever

We introduce ART, a new corpus-level autoencoding approach for training ...

6 Devendra Singh Sachan, et al. ∙

research

∙ 05/23/2022

StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models

Knowledge and language understanding of models evaluated through questio...

0 Adam Liska, et al. ∙

research

∙ 04/07/2022

Compositional Generalization and Decomposition in Neural Program Synthesis

When writing programs, people have the ability to tackle a new complex t...

7 Kensen Shi, et al. ∙

research

∙ 02/22/2022

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

Question answering (QA) over real-world knowledge bases (KBs) is challen...

7 Rajarshi Das, et al. ∙

research

∙ 02/12/2022

Private Adaptive Optimization with Side Information

Adaptive optimization methods have become the default solvers for many m...

0 Tian Li, et al. ∙

research

∙ 02/03/2022

Deep Hierarchy in Bandits

Mean rewards of actions are often correlated. The form of these correlat...

0 Joey Hong, et al. ∙

research

∙ 02/02/2022

Robust Training of Neural Networks using Scale Invariant Architectures

In contrast to SGD, adaptive gradient methods like Adam allow robust tra...

6 Zhiyuan Li, et al. ∙

research

∙ 01/29/2022

A Context-Integrated Transformer-Based Neural Network for Auction Design

One of the central problems in auction design is developing an incentive...

0 Zhijian Duan, et al. ∙

research

∙ 11/12/2021

Hierarchical Bayesian Bandits

Meta-, multi-task, and federated learning can be all viewed as solving s...

0 Joey Hong, et al. ∙

research

∙ 10/19/2021

When in Doubt, Summon the Titans: Efficient Inference with Large Models

Scaling neural networks to "large" sizes, with billions of parameters, h...

5 Ankit Singh Rawat, et al. ∙

research

∙ 07/13/2021

No Regrets for Learning the Prior in Bandits

We propose AdaTS, a Thompson sampling algorithm that adapts sequentially...

0 Soumya Basu, et al. ∙

research

∙ 06/10/2021

Thompson Sampling with a Mixture Prior

We study Thompson sampling (TS) in online decision-making problems where...

0 Joey Hong, et al. ∙

research

∙ 04/18/2021

Case-based Reasoning for Natural Language Queries over Knowledge Bases

It is often challenging for a system to solve a new complex problem from...

21 Rajarshi Das, et al. ∙

research

∙ 04/14/2021

Exact and Approximate Hierarchical Clustering Using A*

Hierarchical clustering is a critical task in numerous domains. Many app...

7 Craig S. Greenberg, et al. ∙

research

∙ 02/14/2021

Model-Agnostic Graph Regularization for Few-Shot Learning

In many domains, relationships between categories are encoded in the kno...

15 Ethan Shen, et al. ∙

research

∙ 02/11/2021

Meta-Thompson Sampling

Efficient exploration in multi-armed bandits is a fundamental online lea...

0 Branislav Kveton, et al. ∙

research

∙ 12/01/2020

Non-Stationary Latent Bandits

Users of recommender systems often behave in a non-stationary fashion, d...

0 Joey Hong, et al. ∙

research

∙ 12/01/2020

Latent Programmer: Discrete Latent Codes for Program Synthesis

In many sequence learning tasks, such as program synthesis and document ...

1 Joey Hong, et al. ∙

research

∙ 12/01/2020

Modifying Memories in Transformer Models

Large Transformer models have achieved impressive performance in many na...

0 Chen Zhu, et al. ∙

research

∙ 11/17/2020

Federated Composite Optimization

Federated Learning (FL) is a distributed learning paradigm which scales ...

0 Honglin Yuan, et al. ∙

research

∙ 10/22/2020

Scalable Bottom-Up Hierarchical Clustering

Bottom-up algorithms such as the classic hierarchical agglomerative clus...

9 Nicholas Monath, et al. ∙

research

∙ 10/07/2020

Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion

A case-based reasoning (CBR) system solves a new problem by retrieving `...

0 Rajarshi Das, et al. ∙

research

∙ 09/15/2020

Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes

High-quality dialogue-summary paired data is expensive to produce and do...

0 Xinyuan Zhang, et al. ∙

research

∙ 09/08/2020

Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function

In this paper, we study bidirectional LSTM network for the task of text ...

0 Devendra Singh Sachan, et al. ∙

research

∙ 07/28/2020

Big Bird: Transformers for Longer Sequences

Transformers-based models, such as BERT, have been one of the most succe...

73 Manzil Zaheer, et al. ∙

research

∙ 06/25/2020

A Simple Approach to Case-Based Reasoning in Knowledge Bases

We present a surprisingly simple yet accurate approach to reasoning in k...

0 Rajarshi Das, et al. ∙

research

∙ 06/15/2020

Latent Bandits Revisited

A latent bandit problem is one in which the learning agent knows the arm...

0 Joey Hong, et al. ∙

research

∙ 06/15/2020

Piecewise-Stationary Off-Policy Optimization

Off-policy learning is a framework for evaluating and optimizing policie...

0 Joey Hong, et al. ∙

research

∙ 06/09/2020

Differentiable Meta-Learning in Contextual Bandits

We study a contextual bandit setting where the learning agent has access...

0 Branislav Kveton, et al. ∙

research

∙ 04/11/2020

Robust Large-Margin Learning in Hyperbolic Space

Recently, there has been a surge of interest in representation learning ...

12 Melanie Weber, et al. ∙

research

∙ 03/18/2020

Anchor Transform: Learning Sparse Representations of Discrete Objects

Learning continuous representations of discrete objects such as text, us...

0 Paul Pu Liang, et al. ∙

research

∙ 02/29/2020

Adaptive Federated Optimization

Federated learning is a distributed machine learning paradigm in which a...

28 Sashank Reddi, et al. ∙

research

∙ 02/27/2020

Towards Modular Algorithm Induction

We present a modular neural network architecture Main that learns algori...

0 Daniel A. Abolafia, et al. ∙

research

∙ 02/25/2020

Differentiable Reasoning over a Virtual Knowledge Base

We consider the task of answering complex multi-hop questions using a co...

0 Bhuwan Dhingra, et al. ∙

research

∙ 02/17/2020

Differentiable Bandit Exploration

We learn bandit policies that maximize the average reward over bandit in...

22 Craig Boutilier, et al. ∙

research

∙ 01/07/2020

FedDANE: A Federated Newton-Type Method

Federated learning aims to jointly learn statistical models over massive...

0 Tian Li, et al. ∙
