Referenceless metrics (e.g., CLIPScore) use pretrained vision–language
m...
Inductive reasoning is a core problem-solving capacity: humans can ident...
Agents must be able to adapt quickly as an environment changes. We find ...
Language model training in distributed settings is limited by the
commun...
Infants explore their complex physical and social environment in an orga...
Despite recent success in large language model (LLM) reasoning, LLMs sti...
Computer vision (CV) classifiers that distinguish and detect nonverbal
...
Current emotion detection classifiers predict discrete emotions. However...
Automated emotion classification could aid those who struggle to recogni...
World models are self-supervised predictive models of how the world evol...
We introduce ThreeDWorld (TDW), a platform for interactive multi-modal
p...
With most recent estimates giving an incidence rate of 1 in 68 children ...
Humans have a remarkable capacity to understand the physical dynamics of...
Infants are experts at playing, with an amazing ability to generate nove...