Due to the boom of online shopping services, the clothing business is one of the fastest-growing and most promisingly profitable ventures in industry and technology today. The contributing factors behind this tremendous growth include consumers' widespread adoption of broadband networks and mobile devices, changes in internet content, accumulated online shopping experience, and the constant improvement of the online shopping process and its convenience. eMarketer reported that e-commerce sales would reach $1.922 trillion in 2016 and increase nearly 23% to $2.356 trillion in 2018. Nielsen showed that the most popular e-commerce categories growing in prominence include clothing as well as airline and hotel reservations. Effectively determining best-selling clothing items has become of great interest to the industry because of its promising profit opportunities for online shopping and its potential to boost emerging applications such as clothing recommendation and brand-associated advertising. A traditional way to discover clothing selling trends and favorable style elements is to rely on manual observation by experts or on user surveys. However, this is very time consuming, and the results vary with the season.
Within academia, there has been increasing interest in clothing product analysis from the computer vision and multimedia communities. The research most closely related to our work falls into two main categories: clothing fashion analysis and product feature analysis from customer reviews. For clothing fashion analysis, most existing works focus on the investigation of clothing attributes, such as clothing parsing, fashion trends, and clothing retrieval. For product feature analysis from customer reviews, prior studies considered customer reviews and proposed systems to summarize all the reviews of a product. However, customer reviews can be noisy, ambiguous, and inconsistent from a clothing producer's perspective.
In contrast to prior work, we focus on analyzing and learning profitable clothing features through the discovery of popular and attractive clothing features (cf. Fig. 1). Moreover, to the best of our knowledge, this is the first work to address profitable clothing features on a major large-scale clothing shopping website. In this paper, we first organize a large-scale Alibaba Taobao Clothing Dataset: a large collection of clothing data with customers' transaction histories from a real-world large-scale online shopping website, Taobao. We then exploit and analyze attractive and profitable clothing features in this large-scale clothing dataset. The clothing features are extracted by automatically analyzing clothing images; more specifically, for every image we automatically extract 60 clothing attributes such as collar, necktie, and color. Using semantic clothing attributes to represent clothing products can tell online sellers which clothing elements are most popular, which not only provides a concrete reference for understanding customers' preferences but also helps clothing designers choose clothing elements from an industry perspective. In our experimental results, we demonstrate the effectiveness of the clothing attributes and further analyze the profitable clothing features.
We presented the preliminary results in an earlier work. In this paper, we propose to 1) prune noisy images with a deep learning approach (cf. Section 4.1), 2) incorporate product sales information from an online business platform to measure the true impact of clothing elements on consumers (cf. Section 4.3), 3) investigate the effects of the clothing attribute representation on a large-scale online shopping dataset (cf. Section 3), and 4) provide more details on the proposed approaches (cf. Section 4.2), experimental results, and discussions (cf. Section 5).
The primary contributions of this paper include:
Proposing a framework that facilitates the investigation of consumers’ clothing preference in a fine-grained manner (Section 2).
Conducting empirical analysis of a large-scale online shopping dataset collected between June 2014 and June 2015 (Section 3).
Implementing an effective and efficient method for pruning noisy images in the online shopping dataset (Section 4).
Mining attractive and profitable clothing features in a large number of clothing data with customers’ transaction history (Section 4).
Discovering significant insights using the proposed framework from real-world large-scale data (Section 5).
2 Overview of the framework
To discover popular and attractive clothing features, an effective framework for analyzing clothing shopping transactions and drawing profitable clothing features from a large-scale dataset is beneficial to the online shopping industry. The proposed system diagram is shown in Fig. 3. The core algorithms include:
(a) Noisy image pruning model learning. We observed that some images in a clothing product listing are not clothing items themselves but may be shown alongside an attractive clothing item, such as an advertisement, a pet, or the logo of a clothing brand. These unrelated and inappropriate images are referred to as noisy images in our work (cf. Fig. 2). The intuitive way to tackle this problem is to browse the whole dataset and manually filter out noisy images. However, this is very time consuming for a large-scale dataset and restricts the scalability of the system. The pruning of noisy images can instead be treated as a binary classification problem. Inspired by deep learning architectures, which have achieved very promising results in handwritten digit recognition, image classification, speech recognition, and computer vision, we learn a deep-learning-based classifier to automatically prune noisy images (cf. Section 4.1).
(b) Clothing feature learning. An appropriate clothing representation for exploring clothing style characteristics is required to offer a semantic and intuitive way to determine which clothing elements people would like to purchase. Inspired by prior work, a learning-based clothing attribute approach is used to describe clothing style. Chen et al. detected only 42 upper-body clothing attributes, but we observed that clothing information in the lower body is an essential clue for clothing style understanding (e.g., pants vs. skirt). Motivated by this, we utilize New York Fashion Show images for learning clothing style features: 3914 images from the 2014 spring/summer New York Fashion Show and 4000 images from the 2015 spring/summer show. These 7914 images are used to extract features and to learn a 60-attribute semantic representation that describes both upper- and lower-body clothing features (cf. Section 4.2).
(c) Profitable clothing feature mining. First, we eliminate noisy and unrelated clothing images in the online shopping dataset using the noisy image pruning model. Then, we split clothing items into different bins of a category histogram. To measure the popularity of clothing items, we extract the selling frequency of each clothing item from the user transaction history table. Next, we exploit and analyze the popularity of clothing items in different seasons, followed by clothing feature extraction as the representation of clothing style features (cf. Section 4.3). In the following, we first describe the clothing datasets and then the adopted approaches for mining profitable clothing features.
Table 1 (excerpt). Clothing product categories include: Short; Casual Shoes; Rainy Shoes; Sports Shoes; Boots; Slipper; Whole body (Suit, Pajamas, Sport, Sun Protection, Uniform, Wedding, Cheongsam, Dress); Work (Server); Work (Doctor); Activewear (Cheer); Activewear (Performance).
3 Clothing Dataset Collection
In this work, we conduct our experiments on two datasets.
1. Online Clothing Shopping Dataset. In order to study feasible and popular clothing features, we mainly exploit profitable clothing features on a large-scale clothing shopping platform. Taobao is one of the largest online shopping websites in China, similar to eBay and Amazon. In 2015, Taobao released a large-scale clothing dataset which includes clothing collocations from fashion experts, image data of Taobao items, and user behavior data. The item data table, the item images, and the user transaction history are utilized in this work. Examples of clothing product images are shown in Fig. 1 and Fig. 3. In particular, (1) the item data table contains about half a million clothing products sold on Taobao during the 13 months from June 2014 to June 2015. This table has four fields: item_id is a unique id for each product, cat_id is the id of the category the product belongs to, name_arr is an array that contains the name of the product, and img_data is the image information of each product. Note that the category id in this table is only a number, and a large number of irrelevant clothing items fall into the same category; therefore, we define new clothing categories with more semantic meaning in this work (cf. Table 1). (2) The item images contain images for each product in the item data table. Some images present only a single item, and some show models wearing the items to demonstrate the try-on style. (3) The user history table contains around 10 million user transactions. This table has three fields: user_id is the user's unique id in one transaction, item_id is the id of the specific product the user purchases in this transaction, and date is the time information of this transaction. The number of transactions in each month is shown in Fig. 4.
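The per-month transaction counts behind Fig. 4 can be reproduced with a few lines. A minimal sketch, assuming each row of the user history table is a (user_id, item_id, date) triple and that dates parse as YYYY-MM-DD (the exact date format in the released table is an assumption):

```python
from collections import Counter
from datetime import datetime

def transactions_per_month(transactions):
    """Count transactions per (year, month) bucket.

    Each transaction is a (user_id, item_id, date_str) tuple;
    the 'YYYY-MM-DD' date format is an assumption about the raw table.
    """
    counts = Counter()
    for user_id, item_id, date_str in transactions:
        d = datetime.strptime(date_str, "%Y-%m-%d")
        counts[(d.year, d.month)] += 1
    return counts

# Toy records covering two months of the June 2014 - June 2015 window.
toy = [
    ("u1", "i10", "2014-06-03"),
    ("u2", "i11", "2014-06-20"),
    ("u1", "i12", "2014-07-01"),
]
print(transactions_per_month(toy))  # two transactions in June 2014, one in July
```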
2. New York Fashion Show. In order to learn clothing feature detection models, we manually labeled the entire clothing dataset with complete fine-grained clothing attribute annotations. This dataset contains 3914 images from the 2014 spring/summer New York Fashion Show and 4000 images from the 2015 spring/summer New York Fashion Show. These 7914 images are used to extract features and to learn clothing attributes.
4 Discriminative Mining of Best-Selling Clothing Features
4.1 Noisy Image Pruning
As illustrated in Fig. 3, we propose to prune noisy images from the clothing shopping dataset. This is based on the observation that some images are not clothing items but may be shown alongside an attractive clothing item for a clothing product, such as an advertisement or the logo of a clothing brand (cf. Fig. 2). The intuitive way to tackle this problem is to browse the whole dataset and manually filter out noisy images. However, this is very time consuming for a large-scale dataset. Taking the scalability and generalization of the proposed system into consideration, we instead learn a classifier for automatically filtering noisy images.
Deep learning is considered a promising direction by the research community and has proven effective in various classification tasks, including handwritten digit recognition, image classification, speech recognition, computer vision, and natural language processing. Noisy image pruning can likewise be treated as a binary classification problem.
The network structure we employ is similar to VGG-16, which has been demonstrated to be powerful in various computer vision tasks. First, each image is resized to 256×256, and the resized images are processed by five convolutional layers, each followed by a max-pooling layer. Max-pooling is performed over a 2×2 pixel window with a stride of 2 pixels. The stack of five convolutional layers is followed by three fully connected layers. The first two fully connected layers have 4096 kernels each and are followed by dropout regularization. The final fully connected layer applies a softmax activation over 2 kernels to turn the real-valued output into a vector of probabilities. We use rectified linear unit (ReLU) activations for the five convolutional layers and the first two fully connected layers. Furthermore, we use the cross-entropy loss during training, the preferred loss function for binary classification problems, together with the efficient Adam optimization algorithm for gradient descent. The model for noisy image pruning is trained and implemented with the TensorFlow backend using a batch size of 128. We manually labeled noisy images and randomly split them into 80% training, 10% validation, and 10% testing. Our noisy image pruning model achieves an accuracy of 75.5%, a recall of 70%, and a precision of 78.6%.
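The reported accuracy, recall, and precision follow directly from the confusion matrix of the binary (noisy vs. clean) classifier. A minimal sketch of this evaluation step, with toy labels standing in for real model predictions:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for a binary labeling.

    Positive class (1) = noisy image to be pruned.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Toy test split: 4 noisy and 4 clean images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
acc, prec, rec = binary_metrics(y_true, y_pred)
print(acc, prec, rec)  # 0.75 0.75 0.75
```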
4.2 Clothing Feature Learning
4.2.1 Pose Estimation and Body Region Extraction
In order to learn clothing features, we first need to extract visual features to train classifiers for every clothing attribute. We apply the pose estimation software of Eichner et al. to detect the pose of a human body and retrieve the body region of the model. We briefly describe the pose estimation method. First, a human upper body is detected by a pre-trained upper-body detector: the approximate location and scale of the person, and where the torso and head should lie, are roughly determined by sliding-window detection based on Histograms of Oriented Gradients. Next, the structure of the detection window is used to initialize a GrabCut segmentation. A human body can be represented as a pictorial structure composed of body parts tied together in a tree structure. Therefore, given an image, the location and orientation of each body part can be inferred from the posterior over configurations of human body parts using a log-linear model:
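The equation itself did not survive in this draft; following the standard pictorial-structure formulation used by Eichner et al., it can be reconstructed as below, where L = {l_i} is the configuration of body parts, I the image, E the set of tree edges, Ψ the binary potential, and Φ the unary potential (a reconstruction, not a verbatim copy of the original equation):

```latex
P(L \mid I) \;\propto\; \exp\!\Big( \sum_{(i,j) \in E} \Psi(l_i, l_j) \;+\; \sum_{i} \Phi(l_i \mid I) \Big)
```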
where the binary potential corresponds to a spatial prior on the relative position of parts (e.g., the upper arms must be attached to the torso), and the unary potential corresponds to the likelihood of the local image evidence for a part in a particular position. More specifically, the pose estimation process segments the body into nine parts: the torso, the upper and lower left and right arms, and the upper and lower left and right legs. Furthermore, four kinds of visual features are computed in each body part: color in the LAB space, texture descriptors, SIFT local features, and skin probabilities. Finally, the features are aggregated by average or max pooling to generate a visual feature vector over all the parts of the body. An example is illustrated in Fig. 3.
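The pooling step above can be sketched as follows. This is a minimal illustration, not the actual pipeline: the part names and descriptor dimensions are hypothetical, and real per-part descriptors would come from the LAB/texture/SIFT/skin extractors.

```python
import numpy as np

def body_feature_vector(part_features, pooling="avg"):
    """Aggregate per-part local descriptors into one visual feature vector.

    part_features: dict mapping each body part name to a 2-D array of
    shape (n_local_descriptors, dim). The pooled per-part vectors are
    concatenated in a fixed (sorted) part order.
    """
    pooled = []
    for part in sorted(part_features):
        feats = np.asarray(part_features[part], dtype=float)
        if pooling == "avg":
            pooled.append(feats.mean(axis=0))
        else:  # max pooling
            pooled.append(feats.max(axis=0))
    return np.concatenate(pooled)

# Toy example: 2 body parts, 3 local descriptors of dimension 2 each.
parts = {
    "torso": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    "upper_left_arm": [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]],
}
v = body_feature_vector(parts, pooling="avg")
print(v)  # average-pooled torso features followed by upper_left_arm features
```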
4.2.2 Clothing Attribute Learning
The most intuitive way to train each clothing attribute is to concatenate all 72 features (i.e., 9 human body parts × 4 kinds of visual features × 2 aggregation methods) into one long full-body visual feature vector. However, the influence of different types of visual features on each attribute may vary; for example, texture features might have a strong effect on pattern-based attributes. Consequently, we compute the classification performance of each feature and use it as a weight representing that feature's importance for an individual attribute. We therefore adopt a Support Vector Machine (SVM) with a chi-square kernel to learn the 60 clothing attribute models, and weighting parameters are applied to the vectors from different features to emphasize their relative importance.
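A sketch of the two ingredients above: a chi-square kernel (one common exponential form; the gamma value here is arbitrary) and per-feature-type weighting before concatenation. This is only an illustration of the idea, not the trained LIBSVM pipeline:

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-square kernel:
    K(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)).

    Inputs are assumed non-negative (e.g. histogram features);
    a small epsilon guards against division by zero for empty bins.
    """
    X = np.asarray(X, dtype=float)[:, None, :]
    Y = np.asarray(Y, dtype=float)[None, :, :]
    eps = 1e-10
    d = np.sum((X - Y) ** 2 / (X + Y + eps), axis=-1)
    return np.exp(-gamma * d)

def weighted_feature_vector(feature_blocks, weights):
    """Scale each feature block by its performance-derived weight
    before concatenation (the weighting scheme described above)."""
    return np.concatenate([w * np.asarray(f, dtype=float)
                           for f, w in zip(feature_blocks, weights)])

# Identical histograms have zero chi-square distance, so K = 1.
K = chi2_kernel([[0.2, 0.8]], [[0.2, 0.8]])
print(K)  # [[1.]]
```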
Table 2 (excerpt). Classic/attractive clothing features: white, black, multicolor, lower_solid, round_neckline. The full table additionally lists the popular and unpopular features for spring and winter.
4.2.3 Attribute Relation Inference
In Section 4.2.2, the clothing features are treated as isolated attributes. However, it is highly possible that some attributes appear in pairs (or groups); for example, we observe that a plaid shirt usually has more than two colors. Note that inter-attribute dependencies are not always symmetric: while a plaid shirt strongly suggests the presence of more than two colors, more than two colors do not necessarily suggest that a shirt is plaid. We adopt a Conditional Random Field (CRF) approach to infer the relations between attributes. More specifically, each clothing attribute acts as a node in the CRF, and the edge connecting every two nodes models the joint probability of these two attributes. We build a fully connected CRF with all the attributes pairwise connected. The conditional probability of two clothing attributes given the features is maximized by:
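The equation is missing from this draft; in the style of Chen et al.'s attribute CRF, the pairwise conditional being maximized can be reconstructed as below, where a_i and a_j are attribute assignments, x the visual features, φ the unary potential derived from the attribute classifier scores, and ψ the learned pairwise co-occurrence potential (a reconstruction, not a verbatim copy of the original equation):

```latex
P(a_i, a_j \mid x) \;\propto\; \exp\!\big( \phi(a_i \mid x) + \phi(a_j \mid x) + \psi(a_i, a_j) \big)
```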
4.3 Profitable Clothing Feature Mining
To exploit attractive and profitable clothing features, we extract and analyze popular clothing features in a large-scale online clothing shopping dataset. First, we split clothing items into groups using category information; that is, the clothing items are separated into different category bins. Table 1 shows the details of the clothing product categories. To measure the popularity of clothing items, we then extract the selling frequency of each clothing item from the user transaction history table and integrate the popularity information into the category bins. In addition, the popularity of clothing styles and transactions may vary across seasons, so we further take this context (e.g., spring and winter) into consideration in our system. Next, all clothing items are sorted by selling frequency in each category bin, and the top proportion (the top 10% in our work) of selling items in each bin is picked per season as the popular clothing items. The clothing features extracted from these popular clothing items serve as profitable attribute references (cf. Section 4.2), which can be utilized not only to maximize the revenues of an online clothing shopping system but also to provide sellers and designers with a popular clothing style reference. Note that price might be another factor influencing popularity; due to the lack of price information in this dataset, we tentatively consider only selling frequency, but price information could be flexibly combined into this framework. Finally, we adopt the FP-growth algorithm to extract frequent item sets of clothing features, which we use to further discuss and analyze popular clothing features in different seasons.
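The selection and mining steps above can be sketched as follows. This is a minimal illustration under stated assumptions: item ids and attribute names are hypothetical, and FP-growth is replaced by brute-force pair counting purely for readability:

```python
from collections import Counter
from itertools import combinations

def popular_items(sales_by_item, top_fraction=0.10):
    """Pick the top fraction (10%, as in the text) of items by selling
    frequency inside one category bin for one season."""
    ranked = sorted(sales_by_item, key=sales_by_item.get, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[:k]

def frequent_attribute_pairs(item_attributes, popular, min_support=2):
    """Brute-force stand-in for FP-growth: count attribute pairs that
    co-occur in at least `min_support` of the popular items."""
    counts = Counter()
    for item in popular:
        for pair in combinations(sorted(item_attributes[item]), 2):
            counts[pair] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

# Toy category bin with 10 items; "i0" sells the most.
sales = {f"i{k}": 100 - 10 * k for k in range(10)}
print(popular_items(sales))  # the single top-10% item of this bin
```

In the real pipeline, FP-growth would replace the pairwise counting so that larger frequent item sets are found without enumerating all combinations.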
5 Experimental Results
We conduct several experiments to assess the performance of the clothing feature detection models. The overall accuracy of the models is 62.6%. An interesting observation is that some categories suffer worse results (e.g., 42% accuracy for belt and 56% for accessories) because these objects are relatively small compared to the entire body. In the future, we could segment the body into more parts to improve the accuracy of the feature detection models; for example, we could segment the middle body for belts and the neck region for accessories. We divide these models into three categories: color, pattern, and clothing style. We observed that the results reflect the fashion shows' styles very well. For example, in upper-body color, both the 2014 and 2015 fashion images contain large amounts of white, gray, and black. In upper- and lower-body patterns, solid is the classic pattern, and solid-pattern clothes dominate every year's fashion shows. In the style category, many images in the spring/summer fashion shows contain skirts. These examples indicate that the clothing feature detection models are reasonable and effective for clothing feature representation. It is worth noting that we only summarize the experimental results of the clothing detection models in this paper; a detailed discussion of the quantitative results is given in our earlier work.
Table 2 shows the classic/attractive, popular, and unpopular clothing features on the online clothing shopping dataset. We observed that some colors and styles have a large number of product items among both the frequently and the seldom sold clothing items in the 2014 winter and 2015 spring: for example, white, black, and multicolor in both the upper and lower body, a solid pattern in the lower body, and a round neckline. The prominence of white and black is reasonable because these two colors are all-purpose and easily matched with other colors. Therefore, these two clothing features are likely to appear throughout the whole year and are referred to as classic clothing features in our experiments. In addition, multicolor, lower_solid, and round_neckline appeared in large numbers among both popular and unpopular clothing items in the 2014 winter and 2015 spring as well. These clothing features can be regarded as attractive and safe features in selling products and are referred to as attractive clothing features in our experiments.
Fig. 5 (a) compares popular clothing features with unpopular clothing features in spring. Note that we remove the classic clothing features (black and white) and the attractive clothing features (multicolor, lower_solid, round_neckline) to present the changes more clearly. Interestingly, people tend to wear blue and red in the upper body and are less likely to wear gray and brown in spring. A reasonable explanation might be that light-colored clothes absorb less sunlight than darker clothes and therefore keep people cooler in spring. Besides, graphics and floral patterns are popular clothing features. Spring is a time of renewal and refreshment, so these lovely patterns are in line with the spring theme (cf. Fig. 6 (a)). These clothing features, which can boost customers' shopping behavior, are referred to as popular clothing features; visualizations of the popular clothing features are shown in Fig. 9. Clothing features such as brown and gray, which led to a decrease in customers' consumption, are referred to as unpopular clothing features (cf. Fig. 6 (b)).
Fig. 5 (b) compares popular with unpopular clothing features in winter. In contrast to spring, darker colors (e.g., brown and gray) and clothing styles that keep people warmer (e.g., collars) are more popular. This phenomenon is typical of the colder season of the year. Interestingly, the yellow color encourages more consumption in winter. We observed that yellow is usually a small portion compared to the main color of an entire clothing product. Such combinations of clothing features (e.g., mainly black or darker colors mixed with a small portion of a light color, as shown in Fig. 8 (a), or an unpopular red mixed with large portions of different colors, as in Fig. 6 (c)) can be regarded as unique clothing styles of the time and can be provided to sellers as references for sourcing the right clothing products. Another interesting observation is that red appears in a specific popular clothing item (i.e., the wedding dress) in winter; a red dress with graphics is a typical wedding dress in China (cf. Fig. 6 (c)). This observation motivates us to embed geographic information into the system in the future to enable a more comprehensive framework.
Furthermore, we are also interested in the changes in clothing feature trends, as such changes can indicate special clothing features for a specific season. The comparison is shown in Fig. 7. In the style category, bags and accessories, belts, and plackets increased substantially, while collars decreased markedly. In the color category, blue in the upper body and red in the upper and lower body showed an upward trend, while brown and gray in the lower body and yellow in the upper body trended downward. An interesting observation is that more v-shaped necklines appear in spring; the clothing products in Fig. 6 (a) are good demonstrations. Another interesting observation is the popularity of a short skirt and a coat in winter (cf. Fig. 8 (b)); a reasonable explanation is that people tend to wear layers to make quick adjustments between different indoor and outdoor environments. Such particular clothing outfits can be discovered through our proposed framework. In summary, mining the selling clothing features of an online shopping website indeed provides a big picture of clothing element preferences, which can influence both clothing production and clothing consumption in a timely fashion. Furthermore, our experimental results and observations suggest that social and natural conditions, such as weather and culture, are important factors in what people are likely to wear and purchase.
6 Conclusions
In this work, we organize and exploit a large-scale online shopping dataset in order to investigate popular and attractive clothing features. In addition, we have developed machine-learning-based methods to automatically prune noisy images and detect clothing features as the representation of popular clothing style features. We conduct our experiments on two datasets and further demonstrate that the proposed framework is effective in the discriminative mining of best-selling clothing features. In the future, we plan to integrate more clothing information (e.g., price or customer profiles) and more clothing-related datasets to obtain a more comprehensive view of the popularity of clothing products. Moreover, there is also a keen interest in exploring proper clothing outfits to recommend to users by aggregating user preferences in clothing style. One future direction is to incorporate these features into the existing model; the scope can also be extended to emerging applications such as clothing advertising.
We gratefully acknowledge the Taiwan Government MOST Study Abroad Program grants 105-2917-I-564-060 and the support of New York State through the Goergen Institute for Data Science.
-  In vogue: How does catwalk influence the main street. http://www.bbc.com/news/magazine-14984468.
-  eMarketer. Worldwide ecommerce sales to increase nearly 20% in 2014. http://www.emarketer.com/.
-  The 14 best trends of new york fashion week. http://www.marieclaire.com/fashion/g2359/new-york-fashion-week-spring-2015-trends/.
-  Nielsen. global online purchase intentions have doubled since 2011 for ebooks, toys, sporting goods; online market for pet and baby supplies, other consumable products also growing. http://www.nielsen.com/.
-  The new york city economic development corporation. http://www.nycedc.com/.
-  The fashion and style news section in new york times. http://www.nytimes.com/section/fashion.
-  Vogue. http://www.vogue.com/.
-  Zappos: An online clothing shopping website. zappos: http://www.zappos.com/.
-  M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
-  L. D. Bourdev, S. Maji, and J. Malik. Describing people: A poselet-based approach to attribute classification. In IEEE ICCV 2011, Barcelona, Spain, November 6-13, 2011, pages 1543–1550, 2011.
-  C.-C. Chang and C.-J. Lin. Libsvm: A library for support vector machines. ACM Trans. Intell. Syst. Technol., 2(3):27:1–27:27, May 2011.
-  H. Chen, A. Gallagher, and B. Girod. Describing clothing by semantic attributes. In ECCV 2012, volume 7574, pages 609–623, 2012.
-  K. Chen, K. Chen, P. Cong, W. H. Hsu, and J. Luo. Who are the devils wearing prada in new york city? In Proceedings of the 23rd ACM International Conference on Multimedia, MM ’15, pages 177–180, New York, NY, USA, 2015. ACM.
-  R. Collobert and J. Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, pages 160–167. ACM, 2008.
-  G. Dahl, A.-r. Mohamed, G. E. Hinton, et al. Phone recognition with the mean-covariance restricted Boltzmann machine. In Advances in Neural Information Processing Systems, pages 469–477, 2010.
-  M. Eichner and V. Ferrari. Appearance sharing for collective human pose estimation. In Computer Vision - ACCV 2012, pages 138–151, 2012.
-  J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD Rec., 29(2):1–12, May 2000.
-  R. He and J. McAuley. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In Proceedings of the 25th International Conference on World Wide Web, pages 507–517. International World Wide Web Conferences Steering Committee, 2016.
-  S. C. Hidayati, K.-L. Hua, W.-H. Cheng, and S.-W. Sun. What are the fashion trends in new york? In Proceedings of the ACM International Conference on Multimedia, MM ’14, pages 197–200. ACM, 2014.
-  M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '04, pages 168–177, New York, NY, USA, 2004. ACM.
-  M. Hu and B. Liu. Mining opinion features in customer reviews. In Proceedings of the 19th National Conference on Artifical Intelligence, AAAI’04, pages 755–760. AAAI Press, 2004.
-  V. Y. Karkare and S. R. Gupta. Product evaluation using mining and rating opinions of product features. In Electronic Systems, Signal Processing and Computing Technologies (ICESC), 2014 International Conference on, pages 382–385, Jan 2014.
-  M. H. Kiapour, X. Han, S. Lazebnik, A. C. Berg, and T. L. Berg. Where to buy it: Matching street clothing photos in online shops. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 3343–3351, Dec 2015.
-  A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
-  V. R. Kumar and K. Raghuveer. Web user opinion analysis for product features extraction. International Journal of Web & Semantic Technology, 3:382–385, Nov 2012.
-  Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
-  S. Liu, J. Feng, C. Domokos, H. Xu, J. Huang, Z. Hu, and S. Yan. Fashion parsing with weak color-category labels. IEEE Transactions on Multimedia, 16(1):253–265, 2014.
-  S. Liu, J. Feng, Z. Song, T. Zhang, H. Lu, C. Xu, and S. Yan. Hi, magic closet, tell me what to wear! In Proceedings of the 20th ACM Multimedia Conference, MM ’12, Nara, Japan, 2012, pages 619–628, 2012.
-  S. Liu, Z. Song, G. Liu, C. Xu, H. Lu, and S. Yan. Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set. In IEEE CVPR 2012, pages 3330–3337, 2012.
-  V. Nair and G. E. Hinton. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 807–814, 2010.
-  T. V. Nguyen, S. Liu, B. Ni, J. Tan, Y. Rui, and S. Yan. Sense beauty via face, dressing, and/or voice. In Proceedings of the 20th ACM Multimedia Conference, MM ’12, pages 239–248, 2012.
-  C. Rother, V. Kolmogorov, and A. Blake. Grabcut: Interactive foreground extraction using iterated graph cuts. In ACM transactions on graphics (TOG), volume 23, pages 309–314. ACM, 2004.
-  O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
-  K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
-  N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958, 2014.
-  Z. Liu, P. Luo, S. Qiu, X. Wang, and X. Tang. Deepfashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of IEEE CVPR, 2016.