AutoCaption: Image Captioning with Neural Architecture Search

12/16/2020
by Xinxin Zhu, et al.

Image captioning transforms complex visual information into abstract natural language, which helps computers understand the world quickly. However, due to the complexity of real environments, the task must identify key objects, capture their relationships, and then generate natural language. The whole process involves a visual understanding module and a language generation module, which makes network design more challenging than in many other tasks. Neural Architecture Search (NAS) has proven valuable in a variety of image recognition tasks, and RNNs play an essential role in image captioning. We introduce AutoCaption, a method that uses NAS to automatically design the decoder module of an image captioning model; the resulting decoder is called AutoRNN. We use a reinforcement learning method based on shared parameters to design AutoRNN efficiently. The search space of AutoCaption includes both the connections between layers and the operations within layers, allowing AutoRNN to express a wider variety of architectures; in particular, the standard RNN is a subset of our search space. Experiments on the MSCOCO dataset show that AutoCaption achieves better performance than traditional hand-designed methods: it obtains the best published CIDEr score of 135.8, and with ensemble technology CIDEr is boosted up to 139.5.
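The search space described above pairs two choices per decoder node: which earlier node to connect from, and which operation to apply at that node. The following is a minimal sketch of such a cell sampler; the operation list, function names, and node structure are illustrative assumptions, not the paper's actual implementation.

```python
import random

# Illustrative candidate operations; the paper's actual operation set may differ.
OPS = ["tanh", "relu", "sigmoid", "identity"]

def sample_cell(num_nodes, rng=random):
    """Sample one candidate recurrent cell as a list of (prev_node, op) pairs.

    Node 0 is the input node; every later node picks one earlier node to
    connect from (the connection choice) and one activation (the operation
    choice), so both wiring and operations are searchable, as in AutoRNN.
    """
    cell = []
    for node in range(1, num_nodes):
        prev = rng.randrange(node)   # connection: any earlier node
        op = rng.choice(OPS)         # operation applied at this node
        cell.append((prev, op))
    return cell

# With n nodes, node i has i * len(OPS) choices, so the space holds
# (n-1)! * len(OPS)**(n-1) distinct cells; a plain chain RNN (each node
# connected to its immediate predecessor) is one point in this space.
```

In an ENAS-style search, a controller would score these (prev, op) decisions instead of sampling them uniformly, and all sampled cells would share one set of trained weights.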

