This paper studies a model learning and online planning approach towards...
This extended abstract describes a framework for analyzing the expressiv...
We present a meta-learning framework for learning new visual concepts qu...
We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist ...
We present Temporal and Object Quantification Networks (TOQ-Nets), a new...
Current approaches to video analysis of human motion focus on raw pixels...
We study the problem of dynamic visual reasoning on raw videos. This is ...
We present Language-mediated, Object-centric Representation Learning (LO...
When answering questions about an image, a model not only needs to know what ...
We consider two important aspects in understanding and editing images: m...
We study the inverse graphics problem of inferring a holistic representa...
Humans reason with concepts and metaconcepts: we recognize red and green...
Humans are capable of building holistic representations for images at va...
Most structure inference methods either rely on exhaustive search or are...
We present the Visually Grounded Neural Syntax Learner (VG-NSL), an appr...
We propose the Neural Logic Machine (NLM), a neural-symbolic architectur...
We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that lear...
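The NS-CL entry above is cut off before it describes the model. As a rough, illustrative sketch of the general neuro-symbolic pattern it belongs to (executing a symbolic program over object-centric scene representations with learned concept embeddings), the snippet below shows a soft "filter"/"count" execution. The feature dimensions, concept vocabulary, temperature, and operator definitions are placeholder assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: N objects in a scene, D-dim visual features,
# and one embedding vector per visual concept (e.g., "red", "cube").
N, D = 5, 64
object_feats = torch.randn(N, D)                      # per-object features (e.g., from a detector)
concept_embeddings = {name: torch.randn(D) for name in ["red", "cube", "sphere"]}

def filter_concept(attention, concept):
    """Soft 'filter' operator: keep objects that match `concept`.

    `attention` is a length-N vector of per-object scores in [0, 1];
    the match score is a cosine similarity pushed through a sigmoid.
    """
    sims = F.cosine_similarity(object_feats, concept_embeddings[concept].unsqueeze(0), dim=-1)
    return attention * torch.sigmoid(sims / 0.1)       # temperature 0.1 is an arbitrary choice

def count(attention):
    """Soft 'count' operator: sum of per-object attention."""
    return attention.sum()

# Executing a tiny program for "How many red cubes are there?":
attn = torch.ones(N)                                   # start by attending to every object
attn = filter_concept(attn, "red")
attn = filter_concept(attn, "cube")
print(float(count(attn)))
```

Because every operator is differentiable, gradients from a question-answering loss can flow back into the object features and concept embeddings; that differentiability is the point of the sketch, not the particular operators chosen here.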
We propose the Unified Visual-Semantic Embeddings (Unified VSE) for lear...
We propose Unified Visual-Semantic Embeddings (UniVSE) for learning a jo...
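Both visual-semantic embedding entries above are truncated before the model is described. As generic background on what learning a joint visual-semantic space typically involves (a two-tower encoder trained with a margin-based ranking loss, as in standard VSE setups), here is a minimal sketch; the module sizes, GRU text encoder, and margin value are placeholder assumptions, not the papers' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    """Two-tower model mapping image features and sentence features
    into a shared, L2-normalized embedding space."""
    def __init__(self, img_dim=2048, txt_dim=300, emb_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)
        self.txt_proj = nn.GRU(txt_dim, emb_dim, batch_first=True)

    def forward(self, img_feats, txt_tokens):
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        _, h = self.txt_proj(txt_tokens)                 # last hidden state summarizes the sentence
        txt = F.normalize(h.squeeze(0), dim=-1)
        return img, txt

def ranking_loss(img, txt, margin=0.2):
    """Hinge-based triplet ranking loss over all in-batch negatives."""
    scores = img @ txt.t()                               # cosine similarities (embeddings are unit-norm)
    pos = scores.diag().unsqueeze(1)
    cost_img = (margin + scores - pos).clamp(min=0)      # image as anchor, captions as negatives
    cost_txt = (margin + scores - pos.t()).clamp(min=0)  # caption as anchor, images as negatives
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    return cost_img.masked_fill(mask, 0).mean() + cost_txt.masked_fill(mask, 0).mean()

# Toy usage with random tensors standing in for real image/text features.
model = JointEmbedding()
img, txt = model(torch.randn(8, 2048), torch.randn(8, 12, 300))
ranking_loss(img, txt).backward()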
In this paper, we propose Neural Phrase-to-Phrase Machine Translation (N...
Modern CNN-based object detectors rely on bounding box regression and no...
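The detection abstract above breaks off right after naming bounding box regression. For readers unfamiliar with the term, the sketch below shows the standard box-delta parameterization used by the R-CNN family of detectors; it is generic background, not this paper's contribution.

```python
import numpy as np

def encode_deltas(anchor, gt):
    """Standard (dx, dy, dw, dh) parameterization of a target box `gt`
    relative to an anchor/proposal box, both given as (x1, y1, x2, y2)."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + 0.5 * aw, anchor[1] + 0.5 * ah
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    gx, gy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    return np.array([(gx - ax) / aw, (gy - ay) / ah, np.log(gw / aw), np.log(gh / ah)])

def decode_deltas(anchor, deltas):
    """Inverse transform: apply predicted deltas to an anchor to recover a box."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + 0.5 * aw, anchor[1] + 0.5 * ah
    cx, cy = ax + deltas[0] * aw, ay + deltas[1] * ah
    w, h = aw * np.exp(deltas[2]), ah * np.exp(deltas[3])
    return np.array([cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h])

anchor = np.array([10.0, 10.0, 50.0, 60.0])
gt = np.array([12.0, 8.0, 48.0, 64.0])
assert np.allclose(decode_deltas(anchor, encode_deltas(anchor, gt)), gt)
```

The regression head of a detector predicts these four deltas per anchor, which keeps the targets roughly scale-invariant and centered around zero.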
We study the problem of grounding distributional representations of text...
Aggregating extra features has been considered an effective approach ...