Meta-Learning with Temporal Convolutions

07/11/2017 ∙ by Nikhil Mishra, et al. ∙ 0

Deep neural networks excel in regimes with large amounts of data, but tend to struggle when data is scarce or when they need to adapt quickly to changes in the task. Recent work in meta-learning seeks to overcome this shortcoming by training a meta-learner on a distribution of similar tasks; the goal is for the meta-learner to generalize to novel but related tasks by learning a high-level strategy that captures the essence of the problem it is asked to solve. However, most recent approaches to meta-learning are extensively hand-designed, either using architectures that are specialized to a particular application, or hard-coding algorithmic components that tell the meta-learner how to solve the task. We propose a class of simple and generic meta-learner architectures, based on temporal convolutions, that is domain- agnostic and has no particular strategy or algorithm encoded into it. We validate our temporal-convolution-based meta-learner (TCML) through experiments pertaining to both supervised and reinforcement learning, and demonstrate that it outperforms state-of-the-art methods that are less general and more complex.



There are no comments yet.


page 12

page 13

Code Repositories


My personal toolkit for PyTorch development.

view repo
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.