Video2Shop: Exactly Matching Clothes in Videos to Online Shopping Images

04/14/2018
by Zhi-Qi Cheng, et al.

In recent years, both online retail and video hosting services have grown exponentially. In this paper, we explore a new cross-domain task, Video2Shop, which aims to match clothes that appear in videos to the exact same items in online shops. A novel deep neural network, called AsymNet, is proposed to address this problem. For the image side, well-established methods are used to detect clothing patches of arbitrary size and extract their features. For the video side, deep visual features are extracted from detected object regions in each frame and then fed into a Long Short-Term Memory (LSTM) framework for sequence modeling, which captures the temporal dynamics in videos. To conduct exact matching between videos and online shopping images, the LSTM hidden states, which represent the video, and the image features, which represent static object images, are jointly modeled by a similarity network with a reconfigurable deep tree structure. Moreover, an approximate training method is proposed to make training efficient. Extensive experiments conducted on a large cross-domain dataset demonstrate the effectiveness and efficiency of the proposed AsymNet, which outperforms state-of-the-art methods.
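The asymmetric matching idea in the abstract (a sequence of video hidden states compared against a single shop-image feature) can be illustrated with a rough, standard-library-only sketch. This is not the authors' AsymNet: the bias-free LSTM cell, the random weights, the cosine similarity, and the softmax-weighted fusion are all assumptions standing in for the learned similarity network and its reconfigurable tree structure, and every function name here is hypothetical. It also assumes the image feature has the same dimension as the LSTM hidden state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def random_params(input_size, hidden_size, seed=0):
    # Hypothetical untrained weights; a real model would learn these.
    rng = random.Random(seed)
    cols = input_size + hidden_size
    return {
        name: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
               for _ in range(hidden_size)]
        for name in ("Wi", "Wf", "Wo", "Wg")
    }


def lstm_hidden_states(frames, params, hidden_size):
    # Run a single-layer, bias-free LSTM over per-frame feature vectors
    # and return the hidden state at every time step.
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for x in frames:
        xh = list(x) + h  # concatenate input and previous hidden state
        i = [sigmoid(z) for z in matvec(params["Wi"], xh)]    # input gate
        f = [sigmoid(z) for z in matvec(params["Wf"], xh)]    # forget gate
        o = [sigmoid(z) for z in matvec(params["Wo"], xh)]    # output gate
        g = [math.tanh(z) for z in matvec(params["Wg"], xh)]  # candidate
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def video_to_shop_score(states, image_feature):
    # Assumption: fuse per-step similarities with softmax weights, a simple
    # stand-in for the paper's learned tree-structured fusion.
    sims = [cosine(h, image_feature) for h in states]
    m = max(sims)
    weights = [math.exp(s - m) for s in sims]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, sims)) / total


# Toy usage: 5 frames of 4-dimensional features, hidden size 3.
rng = random.Random(1)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
params = random_params(input_size=4, hidden_size=3)
states = lstm_hidden_states(frames, params, hidden_size=3)
shop_images = {
    "item_a": [rng.uniform(-1.0, 1.0) for _ in range(3)],
    "item_b": [rng.uniform(-1.0, 1.0) for _ in range(3)],
}
scores = {name: video_to_shop_score(states, feat)
          for name, feat in shop_images.items()}
best_match = max(scores, key=scores.get)
```

With untrained weights the ranking is meaningless. The sketch only shows the data flow: many hidden states on the video side, one feature vector on the image side, and a fusion step that turns the per-step similarities into a single matching score.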


