Video2Shop: Exact Matching Clothes in Videos to Online Shopping Images

05/02/2020
by   Zhi-Qi Cheng, et al.

In recent years, both online retail and video hosting services have grown exponentially. This paper proposes a novel deep neural network, AsymNet, to explore a new cross-domain task, Video2Shop, which aims to match clothes appearing in videos to exactly the same items in online shops. On the image side, well-established methods are used to detect clothing patches of arbitrary size and extract their features. On the video side, deep visual features are extracted from the detected object regions in each frame and fed into a Long Short-Term Memory (LSTM) framework for sequence modeling, which captures the temporal dynamics in videos. To perform exact matching between videos and online shopping images, the LSTM hidden states for videos and the image features extracted from static images are jointly modeled by a similarity network with a reconfigurable deep tree structure. Moreover, an approximate training method is proposed to make training efficient. Extensive experiments on a large cross-domain dataset demonstrate the effectiveness and efficiency of the proposed AsymNet, which outperforms state-of-the-art methods.
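The asymmetric pipeline described in the abstract (a sequence model on the video side, a single feature vector on the image side, and a similarity score between them) can be illustrated with a minimal sketch. This is not the authors' AsymNet. The function names (`init_lstm`, `encode_video`, `match_score`), the random untrained weights, the omission of bias terms, and the softmax-weighted cosine fusion are all assumptions made for illustration. In particular, the fusion step here is a simple stand-in for the reconfigurable tree-structured similarity network, whose details the abstract does not give.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_lstm(input_dim, hidden_dim, rng):
    """Random, untrained weights for one LSTM cell (biases omitted for brevity)."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: mat() for gate in ("i", "f", "o", "g")}


def lstm_step(x, h, c, W):
    """One standard LSTM update: gates are computed from [x; h]."""
    z = x + h  # list concatenation of the input and the previous hidden state
    i = [sigmoid(a) for a in matvec(W["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(W["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(W["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(W["g"], z)]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def encode_video(frame_features, W, hidden_dim):
    """Run the LSTM over per-frame features and keep every hidden state."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    states = []
    for x in frame_features:
        h, c = lstm_step(x, h, c, W)
        states.append(h)
    return states


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def match_score(states, image_feature):
    """Hypothetical fusion: softmax-weighted mean of per-frame cosine similarities.

    Assumes the image feature has the same dimension as the LSTM hidden state.
    """
    sims = [cosine(h, image_feature) for h in states]
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return sum((e / total) * s for e, s in zip(exps, sims))
```

A usage example under the same assumptions: `W = init_lstm(4, 3, random.Random(0))` builds a cell for 4-dimensional frame features and a 3-dimensional hidden state, `states = encode_video(frames, W, 3)` yields one hidden state per frame, and `match_score(states, shop_feature)` ranks candidate shop images by score. Keeping all hidden states, rather than only the last, lets frames that resemble the shop image contribute more to the final score.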


research
04/14/2018

Video2Shop: Exactly Matching Clothes in Videos to Online Shopping Images

In recent years, both online retail and video hosting service are expone...
research
04/07/2015

Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification

Classifying videos according to content semantics is an important proble...
research
09/28/2021

Information Elevation Network for Fast Online Action Detection

Online action detection (OAD) is a task that receives video segments wit...
research
09/04/2015

Object Recognition from Short Videos for Robotic Perception

Deep neural networks have become the primary learning technique for obje...
research
05/09/2019

Feature Extraction and Classification Based on Spatial-Spectral ConvLSTM Neural Network for Hyperspectral Images

In recent years, deep learning has presented a great advance in hyperspe...
research
04/26/2022

ROMA: Cross-Domain Region Similarity Matching for Unpaired Nighttime Infrared to Daytime Visible Video Translation

Infrared cameras are often utilized to enhance the night vision since th...
research
09/21/2021

VPN: Video Provenance Network for Robust Content Attribution

We present VPN - a content attribution method for recovering provenance ...
