
Personalizing Similar Product Recommendations in Fashion E-commerce

by Pankaj Agarwal, et al.

In fashion e-commerce platforms, product discovery is one of the key components of a good user experience. People find the products they desire in numerous ways. Similar product recommendations are one of the popular ways in which users find products that resonate with their intent. Generally these recommendations are not personalized to a specific user. Traditionally, collaborative filtering based approaches have been popular in the literature for recommending non-personalized products given a query product. There has also been a focus on personalizing the product listing for a given user. In this paper, we marry these approaches so that users are recommended personalized similar products. Our experimental results on a large fashion e-commerce platform (Myntra) show that we can improve the key metrics by applying personalization to similar product recommendations.



1. Introduction

In e-commerce, the number of products in the shelf space is practically infinite. Users therefore have to navigate through a plethora of options in any category before making a purchase, and they often lose interest early in the process. This problem is more prominent in fashion than in other e-commerce domains such as movies, books and electronics. In other categories, users generally have a crisp understanding of what they want to buy. In fashion, users mostly don’t know what they want. Even if they know, it is hard for them to explain it to a search engine that understands a product through a limited taxonomy of attributes. Limited real estate on mobile screens aggravates the problem further. The faster we can assist a user in finding the right product, the higher the chances of user conversion. Hence, personalization becomes an important lever to cater to diverse users’ needs, allowing for better product discovery and customer experience. In this paper, we propose an approach to personalize the similar products that are shown to the user for a given product. We perform our experiments on data collected from Myntra, a large e-commerce platform in India, and show that our approach performs better than non-personalized recommendations.

Figure 1. Example screenshots of Myntra mobile app showing how similar products are typically displayed to the user on our platform.

Similar Products is a great way to recommend certain products to users based on the current context (Linden et al., 2003). These recommendations are highly useful when a user has liked a certain kind of product and may buy if presented with a few more slightly varied products. Figure 1 shows how similar products are displayed to the user for a given query product on our platform.

Typically, similar product recommendations are solved through either content based or collaborative filtering based approaches. Content based filtering approaches recommend products by using the attributes of the products. Collaborative filtering approaches use historical user-product interactions. In fashion, products can be represented by attributes like colour, pattern, fabric, sleeve type, collar type etc. These attributes are seldom fixed and usually change with new trends. Another challenge is tagging attributes for all the products manually. Further, even after tagging, users’ tastes are often complex and hard to explain in terms of the limited set of attributes. Thus collaborative filtering based approaches are preferred over content based ones.

Typically, algorithms for similar products recommend a non-personalized set of products to all users, i.e. the result set is completely agnostic of the user (Amber Madvariya, 2017) (Linden et al., 2003). Though the results are derived from the browsing behavior of all the users, the recommended results tend to favour the choices of the majority of the population while ignoring an individual’s subtle preferences. Figure 2 depicts the general recommendations for a query product, an orange solid shirt dress. Non-personalized recommendations are shown in the first row. However, the recommendations can be re-ranked if we have certain information about the user. Below are two possible examples:

Figure 2. Illustration of how user level personalization can improve similar product recommendations. On the left hand side, we have a query product. On the right hand side, the first row shows the non-personalized similar product recommendations. The second row shows what an ideal ranking would look like if the user generally likes floral dresses. The third row shows the ranking for a user who has an affinity towards lighter colours.

  • Let’s say we know that the user has a strong affinity towards floral patterns compared to solid ones. Then recommending floral shirt dresses at the top should result in better recommendations. This is depicted in the 2nd row of Figure 2.

  • If we know that the user has a strong affinity towards light colours, then it makes sense to recommend the white dress as a top result. This is depicted in the 3rd row of Figure 2.

In the literature, we find solutions for product recommendations with the query being either a product or a user (Rendle et al., 2009) (Linden et al., 2003) (Ricci et al., 2011). In this paper, we propose an approach to solve the problem of personalizing similar products, where the query is both a user and a product.

We had to overcome a few challenges to incorporate the user’s taste into the system. Firstly, our platform, Myntra, has about 6 million products available at any given time, and the largest category, T-shirts for Men, has about 50k items. Further, the data suffers from a heavy long tail, because of which the interaction signals are sparse. For instance, a typical user-item matrix on our platform has fewer than 0.1% of its entries filled. On our platform, 20% of products lead to more than 80% of revenue on a daily basis. Secondly, it is very hard to find out a user’s affinity towards all the possible attributes in a particular category. Representing a user’s taste with a few commonly known attributes would not be sufficient (Sagar Arora, 2017).

Our approach combines the solutions of finding similar product recommendations and user level personalization. We use matrix factorization based approaches for this purpose and thereby overcome the challenges mentioned above.

In the following sections, we describe our approach to the problem and discuss a few experiments that show how personalizing similar product recommendations improves key metrics.

2. Related Work

Our work is related to two areas: recommendation systems and personalization systems. There has already been significant work on recommendation systems in various domains (Ricci et al., 2011) such as e-commerce (Linden et al., 2003), news (Das et al., 2007) and music (Van den Oord et al., 2013).

Collaborative filtering based systems have been very popular for recommendations (Linden et al., 2003) (Hu et al., 2008) (Rendle et al., 2009) (Koren et al., 2009). In (Das et al., 2007), a large scale collaborative filtering based system is proposed for personalizing news to a given user. In (Van den Oord et al., 2013), a deep content based music recommendation system is proposed to tackle the lack of user interactions data.

Further improvements to the recommendation algorithms were presented in (He and McAuley, 2016), (Kang et al., 2017), (Lian et al., 2018) and (Cheng et al., 2016). Our work is focused on using these approaches for personalizing similar product recommendations.

Context driven recommendation systems have been shown to improve the performance of recommendations in (Adomavicius and Tuzhilin, 2015). In (Rendle et al., 2011), the authors propose a way of incorporating context into recommendation systems, specifically how, when and why a rating was given by a user. There is also a set of works that address the cold-start problem; for example, in (He and McAuley, 2016) visual features are used.

In (Covington et al., 2016), a deep learning based video recommendation system is proposed which marries personalization with recommendation; it is one of the closest works to ours in terms of the objective. In (Trivedi and Trivedi, 2018), it is shown that personalization is one of the prominent factors affecting key metrics in online shopping.

Note that while our earlier work (Amber Madvariya, 2017) focuses on non-personalized similar product recommendations, this paper’s primary focus is on personalized similar product recommendations.

Figure 3. Overview of the approach. We first generate a candidate set using non-personalized similar product generation algorithm and then refine the results using user preferences.

3. Methodology

Given a user and the product currently being viewed by that user (the query product), our objective is to compute an affinity score for each candidate product in the catalogue. This score is used to rank and display the products to the user. We compute it as a linear combination of the following:

  • A product-product score representing the similarity of the candidate product to the query product.

  • A user-product score representing the similarity of the candidate product to the user’s taste.

Figure 3 summarizes our approach. We first explain the input data that is being collected and then we describe how the above two scores are calculated separately and then combined to generate final ranking.

Interactions data:

In the absence of explicit ratings for products by users on our platform, we depend on implicit signals from the users. These implicit signals include product list_views (number of times a product is seen by the user in the product search listing), clicks (number of times a user has viewed the product details page), add to carts & orders. We assign a rating for a product by the user as a weighted sum of these signals. We consider these weights also as hyperparameters for our approach. Using this we form a user interactions matrix, where each row corresponds to a user and each column corresponds to a product. We use this data as an input for our approach.
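The weighted-sum rating described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the production pipeline: the event tuple layout, the signal names and the `build_ratings` helper are hypothetical, and the default weights simply mirror the (0.25, 1, 1, 1) setting reported later in Section 4.3.

```python
from collections import defaultdict

# Assumed signal names; the weights mirror the best setting from Section 4.3.
WEIGHTS = {"list_view": 0.25, "click": 1.0, "add_to_cart": 1.0, "order": 1.0}


def build_ratings(events, weights=WEIGHTS, use_frequency=False):
    """Return {user: {product: rating}} from (user, product, signal) events.

    With use_frequency=False each signal type counts at most once per
    user-product pair; with True every occurrence adds its weight.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, product, signal in events:
        counts[(user, product)][signal] += 1

    ratings = defaultdict(dict)
    for (user, product), signal_counts in counts.items():
        total = 0.0
        for signal, n in signal_counts.items():
            multiplier = n if use_frequency else 1
            total += weights[signal] * multiplier
        ratings[user][product] = total
    return {user: dict(items) for user, items in ratings.items()}


# Toy events, invented purely for illustration.
events = [
    ("u1", "p1", "list_view"),
    ("u1", "p1", "click"),
    ("u1", "p1", "click"),
    ("u1", "p2", "list_view"),
    ("u2", "p1", "order"),
]
ratings = build_ratings(events)
ratings_with_frequency = build_ratings(events, use_frequency=True)
```

Each outer key plays the role of a row of the user interactions matrix and each inner key a column; pairs that never appear are implicitly zero, which keeps the representation sparse.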

For new products and new users, we use the content based methods for recommending similar products and don’t personalize for them.

3.1. Non-personalized similar products (Candidate Generation)

For generating non-personalized similar recommendations, we use a collaborative filtering based approach. In our earlier work (Amber Madvariya, 2017), we have already shown that collaborative filtering approaches perform better than content based approaches. Further, we have experimented with an item-item collaborative filtering approach and matrix factorization based approaches (Rendle et al., 2009) (Hu et al., 2008) (Koren et al., 2009). From our experiments, we found that item-item collaborative filtering performs better on our dataset.

In the item-item collaborative filtering approach, we use the vector corresponding to each product from the user-item interactions matrix. The similarity between two given products is then calculated as the cosine similarity between their vectors. We sort the products by this similarity and choose the top results, which act as our candidate set for the next step. Restricting the next step to the top results helps achieve faster response times in production systems.
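The candidate generation step can be sketched in a few lines of plain Python. This is an illustrative brute-force version under assumptions: the nested-dict rating format and the helper names `item_vectors`, `cosine` and `top_candidates` are hypothetical, and a catalogue of millions of products would need a sparse or approximate nearest-neighbour implementation instead.

```python
import math


def item_vectors(ratings):
    """Transpose {user: {product: rating}} into {product: {user: rating}}."""
    vectors = {}
    for user, items in ratings.items():
        for product, rating in items.items():
            vectors.setdefault(product, {})[user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_candidates(query, vectors, n=100):
    """Return the n products most similar to `query` as (product, score)."""
    scored = [
        (product, cosine(vectors[query], vec))
        for product, vec in vectors.items()
        if product != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy interaction data, invented for illustration.
ratings = {
    "u1": {"shirt_a": 1.0, "shirt_b": 1.0},
    "u2": {"shirt_a": 1.0, "shirt_b": 1.0, "shirt_c": 1.0},
    "u3": {"shirt_c": 1.0},
}
candidates = top_candidates("shirt_a", item_vectors(ratings), n=2)
```

Here `shirt_b` shares both of its users with `shirt_a` and so ranks first, while `shirt_c` shares only one and ranks second.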

3.2. User level personalization

For personalization, we have considered two popular matrix factorization based approaches, specifically Implicit Alternating Least Squares (ALS) (Hu et al., 2008) and Bayesian Personalized Ranking (BPR) (Rendle et al., 2009). Note that these results can be further improved by using other, more sophisticated approaches.

These algorithms work by transforming the sparse user interaction matrix into low dimensional dense vectors in a latent space, one for each user and each product. The user vector captures the user’s fashion taste in the latent space and the product vector captures the product’s hidden attributes in the same space. We briefly explain both of these approaches below:

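Implicit Alternating Least Squares (ALS-MF): This algorithm (Hu et al., 2008) is designed to work on implicit ratings and optimizes a modified cost function compared to traditional MF approaches (Koren et al., 2009). Following (Hu et al., 2008), the cost function for this method is written as:

$$\min_{x_*, y_*} \sum_{u,i} c_{ui} \left(p_{ui} - x_u^{T} y_i\right)^2 + \lambda \left(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\right)$$

In the above equation, $x_u$ represents the latent user vector and $y_i$ represents the latent product vector, both in $f$ dimensions. The estimated preference of user $u$ for product $i$ is given by $x_u^{T} y_i$. $p_{ui}$ represents the observed preference score obtained from the implicit signals for user $u$ and product $i$. $c_{ui}$ represents the confidence in the implicit signals and $\lambda$ is the regularization parameter. In (Hu et al., 2008) the confidence takes the form $c_{ui} = 1 + \alpha r_{ui}$ for a raw rating $r_{ui}$; $\alpha$ and $\lambda$ are hyper-parameters and their exact values are determined by cross-validation.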

The cost function minimizes the confidence-weighted squared difference between the estimated score and the observed score across all user and product combinations.
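To make the objective concrete, the sketch below evaluates an implicit-feedback cost of the form used by Hu et al. (2008) on a toy example. It is an assumption-laden illustration: the binary preference (1 when any interaction exists), the confidence `1 + alpha * r_ui`, the parameter names `alpha` and `lam`, and their default values are taken from or modelled on that paper rather than confirmed settings of this work. The function only evaluates the cost; it does not perform the alternating least squares updates.

```python
def als_cost(ratings, user_vecs, item_vecs, alpha=40.0, lam=0.1):
    """Implicit-feedback cost summed over every user-product pair.

    ratings:   {user: {product: r_ui}}, missing pairs mean r_ui = 0
    user_vecs: {user: [floats]}
    item_vecs: {product: [floats]}
    """
    cost = 0.0
    for user, x_u in user_vecs.items():
        for product, y_i in item_vecs.items():
            r_ui = ratings.get(user, {}).get(product, 0.0)
            p_ui = 1.0 if r_ui > 0 else 0.0   # assumed binary preference
            c_ui = 1.0 + alpha * r_ui         # assumed confidence form
            estimate = sum(a * b for a, b in zip(x_u, y_i))
            cost += c_ui * (p_ui - estimate) ** 2
    # L2 regularisation over all latent vectors
    reg = sum(v * v for vec in user_vecs.values() for v in vec)
    reg += sum(v * v for vec in item_vecs.values() for v in vec)
    return cost + lam * reg


# Toy latent vectors that reproduce the observed preferences exactly,
# so only the regularisation term contributes to the cost.
ratings = {"u1": {"p1": 1.0}}
user_vecs = {"u1": [1.0, 0.0]}
item_vecs = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
cost = als_cost(ratings, user_vecs, item_vecs, alpha=1.0, lam=0.5)
```

Note that the sum runs over all pairs, including unobserved ones, which is what distinguishes this objective from explicit-rating matrix factorization.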

Bayesian Personalized Ranking (BPR-MF): In the ALS approach the focus is on estimating the pointwise score correctly, whereas BPR (Rendle et al., 2009) works on correctly optimizing the pairwise ranking of the products for a user. For this purpose, the model optimizes a loss function that considers pairs of products for each user. Following (Rendle et al., 2009), the loss function for BPR in general is written as:

$$L = -\sum_{(u,i,j) \in D_S} \ln \sigma\left(\hat{x}_{uij}\right) + \lambda_\Theta \lVert \Theta \rVert^2$$

where $(u,i,j) \in D_S$ are the triplets of user and product pairs available in the interactions dataset such that user $u$ likes product $i$ over product $j$, and $\sigma$ is the logistic sigmoid function. $\hat{x}_{uij} = \hat{x}_{ui} - \hat{x}_{uj}$ denotes the difference of the estimated preference scores of user $u$ for product $i$ and product $j$. $\Theta$ is the model parameter vector and $\lambda_\Theta$ are model specific regularization parameters. In the case of matrix factorization, the model parameters are the user and item vectors.
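The pairwise objective can be illustrated with a single-triplet stochastic gradient step, in the spirit of the learning procedure of Rendle et al. (2009). This is a sketch under assumptions: the loss is written in its minimized (negated) form, and the helper names, learning rate `lr`, regularization `lam` and random initialisation are hypothetical choices for the toy example.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bpr_loss(triplets, user_vecs, item_vecs, lam=0.01):
    """Negated BPR objective: -sum ln sigmoid(x_uij) + lam * ||theta||^2."""
    loss = 0.0
    for u, i, j in triplets:
        x_uij = dot(user_vecs[u], item_vecs[i]) - dot(user_vecs[u], item_vecs[j])
        loss += math.log(1.0 + math.exp(-x_uij))  # equals -ln sigmoid(x_uij)
    reg = sum(v * v for vec in user_vecs.values() for v in vec)
    reg += sum(v * v for vec in item_vecs.values() for v in vec)
    return loss + lam * reg


def bpr_sgd_step(u, i, j, user_vecs, item_vecs, lr=0.05, lam=0.01):
    """One gradient step on a single (user, preferred, other) triplet."""
    x_u, y_i, y_j = user_vecs[u], item_vecs[i], item_vecs[j]
    x_uij = dot(x_u, y_i) - dot(x_u, y_j)
    g = 1.0 / (1.0 + math.exp(x_uij))  # equals 1 - sigmoid(x_uij)
    for f in range(len(x_u)):
        xu, yi, yj = x_u[f], y_i[f], y_j[f]  # read old values before updating
        x_u[f] += lr * (g * (yi - yj) - 2.0 * lam * xu)
        y_i[f] += lr * (g * xu - 2.0 * lam * yi)
        y_j[f] += lr * (-g * xu - 2.0 * lam * yj)


# Toy setup: one user who prefers "liked" over "other".
random.seed(0)
users = {"u1": [random.uniform(-0.1, 0.1) for _ in range(4)]}
items = {
    name: [random.uniform(-0.1, 0.1) for _ in range(4)]
    for name in ("liked", "other")
}
triplets = [("u1", "liked", "other")]
before = bpr_loss(triplets, users, items)
for _ in range(200):
    bpr_sgd_step("u1", "liked", "other", users, items)
after = bpr_loss(triplets, users, items)
```

After training, the user's score for the preferred product exceeds the score for the other product, which is exactly the ordering the loss rewards; the absolute score values carry no meaning on their own.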

We feed the user interactions data to the BPR-MF and ALS-MF approaches explained above to generate the user and product latent vectors. Using these vectors we compute a score for each user and product. The score represents the affinity of the user towards the product. We refer to this as the user-product similarity score. Sorting a given set of products by these scores gives us a personalized product listing.

3.3. Personalized similar products (Final Ranking)

Once we obtain the non-personalized similar products set, we need to incorporate personalization to re-rank these products. One way to get a personalized product listing is to directly use the user-product similarity scores, assuming that all the products are equally similar to the query product. But this is seldom the case. We therefore use a combination of product-product similarity scores and user-product similarity scores, so that the resulting list preserves both the context (the query product) and the user’s taste. We combine the scores as a weighted linear combination:

$$S(u, q, p) = (1 - w)\, S_{pp}(q, p) + w\, S_{up}(u, p)$$

where $S_{pp}(q, p)$ is the product-product similarity score between the query product $q$ and candidate product $p$, $S_{up}(u, p)$ is the user-product similarity score between user $u$ and product $p$, and the weight $w$ is a hyper-parameter which is determined using cross-validation.
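A possible shape for the final re-ranking step is sketched below. Several details are assumptions rather than facts from the paper: the user-product score is taken as the dot product of latent vectors, both score lists are min-max normalised before blending so that they share a scale, the two weights are assumed to sum to one, and the `rerank` helper and its default weight `w` are hypothetical.

```python
def rerank(candidates, user_vec, item_vecs, w=0.5, k=15):
    """Blend product-product and user-product scores, return the top k.

    candidates: list of (product, product_similarity) from candidate generation
    user_vec:   latent vector of the user
    item_vecs:  {product: latent vector}
    w:          weight of the user-level score (assumed to lie in [0, 1])
    """
    def min_max(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0.0 for _ in values]
        return [(v - lo) / (hi - lo) for v in values]

    products = [p for p, _ in candidates]
    sim_scores = min_max([s for _, s in candidates])
    user_scores = min_max(
        [sum(a * b for a, b in zip(user_vec, item_vecs[p])) for p in products]
    )
    blended = [
        (p, (1.0 - w) * s + w * t)
        for p, s, t in zip(products, sim_scores, user_scores)
    ]
    blended.sort(key=lambda pair: pair[1], reverse=True)
    return blended[:k]


# Toy example: a user whose latent vector points at the "floral" direction.
candidates = [("solid", 0.9), ("floral", 0.8), ("striped", 0.7)]
item_vecs = {"solid": [1.0, 0.0], "floral": [0.0, 1.0], "striped": [0.5, 0.5]}
user_vec = [0.0, 1.0]

non_personalized = [p for p, _ in rerank(candidates, user_vec, item_vecs, w=0.0)]
personalized = [p for p, _ in rerank(candidates, user_vec, item_vecs, w=0.6)]
```

With `w=0.0` the original similarity order is preserved, while `w=0.6` moves the floral product to the top, matching the re-ranking idea illustrated in Figure 2.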

4. Experiments & Results

4.1. Dataset

For all our experiments, we use the clickstream and purchase data of users. We split our user interactions data into two non overlapping sets, a training set, which is used to generate the user and product vectors in latent space, and a test set, which is used to evaluate the approaches. Training set consists of data for months and test set consists of data from next month. The training set consists of ~ products with ~ unique users. Sparsity of the interactions matrix is . Note that the users and products will be common in both the training and test set. We present our experimental results for the category “Men-Tshirts”. The results were seen to be consistent across all the other categories.

4.2. Evaluation Metric

As this is a ranking problem, we use standard Information Retrieval metrics, namely Precision, Recall and Mean Average Precision at K (MAP@K). The value of K is chosen to be 15 in our case, since that is the number of recommendations we show to the end user. Precision and recall are single-value metrics that gauge the performance of all the results together, irrespective of their order. However, the order of results matters in information retrieval, and mean average precision (MAP) takes that into account. For computing MAP, the predicted result set is the top 15 products recommended by our algorithm for a given query product. The ground truth set is obtained from the test set: the query product is chosen as the first product a user interacted with, and the remaining products are assigned to the ground truth set.
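The evaluation protocol can be sketched as follows. Average precision has several normalisation conventions; the version here divides by min(|ground truth|, K), which is one common choice and an assumption, since the paper does not state which variant it uses. The session format and the `recommend` callback are hypothetical, and recommendation lists are assumed to contain no duplicates.

```python
def precision_recall_at_k(recommended, relevant, k=15):
    """Order-agnostic precision and recall over the top k recommendations."""
    top_k = recommended[:k]
    hits = sum(1 for p in top_k if p in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision_at_k(recommended, relevant, k=15):
    """AP@K: precision accumulated at each rank where a hit occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, product in enumerate(recommended[:k], start=1):
        if product in relevant:
            hits += 1
            total += hits / rank
    return total / min(len(relevant), k)


def map_at_k(sessions, recommend, k=15):
    """Mean AP@K over users.

    sessions:  list of (user, [products in interaction order]) from the test set
    recommend: callable (user, query_product) -> ranked list of products
    """
    scores = []
    for user, products in sessions:
        if len(products) < 2:
            continue  # need a query product and at least one ground-truth item
        query, truth = products[0], set(products[1:])
        scores.append(average_precision_at_k(recommend(user, query), truth, k))
    return sum(scores) / len(scores) if scores else 0.0


# Toy check with a fixed recommendation list.
recommended = ["a", "b", "c", "d"]
precision, recall = precision_recall_at_k(recommended, {"a", "c"}, k=4)
ap = average_precision_at_k(recommended, {"a", "c"}, k=4)
mean_ap = map_at_k([("u1", ["q", "a", "c"])], lambda user, query: recommended, k=4)
```

In the toy check the hits fall at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2, whereas precision and recall would be unchanged if the same hits appeared at ranks 3 and 4.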

4.3. Computing the rating function

As explained earlier, we don’t have explicit ratings on our platform, so we use clickstream data to derive a rating for each user and product. The rating is computed as a weighted linear combination of the following quantities: list_views, clicks, add_to_carts and purchases. Each event type can contribute a different weight to the rating. The frequency of an event, i.e. the number of times a user performed it (say, clicked on the product display page), can also be factored into the rating function. To understand how the evaluation metric changes with varying weights, we performed a grid search. Table 1 shows a few examples from the results. We conclude that the weight set (0.25, 1, 1, 1) gives the best MAP. The event frequency was found not to be useful for the model, and hence we ignore it in the rest of the paper.
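The grid search over signal weights can be organised as in the sketch below. The `grid_search` helper and the grid values are hypothetical, and `toy_evaluate` is only a stand-in objective so that the example runs; in practice the callback would rebuild the ratings with the given weights, retrain the model and return MAP@15 on held-out data.

```python
import itertools


def grid_search(evaluate, grid):
    """Try every weight combination and return (best_weights, best_score).

    evaluate: callable taking a dict of weights and returning a score
    grid:     {signal_name: [candidate weights]}
    """
    names = list(grid)
    best_weights, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        weights = dict(zip(names, values))
        score = evaluate(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


def toy_evaluate(weights):
    """Stand-in objective, NOT a real metric: peaks at list_view=0.25, click=1."""
    return -abs(weights["list_view"] - 0.25) - abs(weights["click"] - 1.0)


grid = {"list_view": [0.0, 0.25], "click": [1.0, 10.0]}
best_weights, best_score = grid_search(toy_evaluate, grid)
```

Because every combination requires a full train-and-evaluate cycle, the grid is usually kept small, as with the handful of settings shown in Table 1.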

4.4. Non Personalized Similar Products (Candidate Generation)

To generate similar products (the candidate set), we experimented with item-item collaborative filtering, ALS-MF and BPR-MF. From the resulting product vectors, we computed the cosine similarity between all pairs of products. For each product, we sort the other products by these scores to get the similar products ranking. We found that item-item collaborative filtering performs best for this purpose. We choose the top 100 products for a query product as our candidate set.

4.5. Algorithm and Confidence Optimization

For ALS-MF, the confidence parameter determines the weight given to the implicit rating. In Table 1, we show how the weights of different implicit signals impact the MAP value. Figure 5 shows the variation in MAP with varying confidence and choice of algorithm. Based on the results, we decided to use the BPR-MF algorithm. In our implementation, we used a map-reduce framework for large scale data processing and the library of (Frederickson, [n. d.]) for the ALS-MF and BPR-MF algorithms.

4.6. Finding personalized similar products (Final Ranking)

The objective was to ensure that the results obtained are relevant to the current context as well as suited to the taste of the user. As explained in Section 3.3, we use a linear combination of non-personalized similarity scores and user level scores. We experimented with the weight of the user level score as a hyper-parameter. Figure 4 shows the effect of changing this weight. The optimal MAP@K is obtained by giving weight to both the similar-product score and the user-preference score.

Figure 4. Graph shows how the performance varies with change in weight of user level score.
Figure 5. Performance variation with change in the confidence parameter for ALS-MF. We also plot the MAP@K obtained using BPR-MF.
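| ListViews | Clicks | Carts | Orders | Freq | MAP@15 |
|-----------|--------|-------|--------|------|--------|
| 0         | 1      | 1     | 1      | 0    | 0.0432 |
| 0         | 1      | 1     | 1      | 1    | 0.0378 |
| 0.25 *    | 1      | 1     | 1      | 0    | 0.0437 |
| 0.25      | 1      | 1     | 1      | 1    | 0.0393 |
| 0.25      | 10     | 4     | 1      | 0    | 0.0296 |
| 0.25      | 10     | 4     | 1      | 1    | 0.0295 |

Table 1. Performance variation when changing the weights of different implicit signals using BPR-MF. We show a few data points in this table. The row marked with * performs best.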

5. Conclusion

We have presented a method to personalize similar product recommendations. We have shown how our approach improves the mean average precision metric on a large dataset collected from our e-commerce platform, Myntra. Next, we will deploy this solution in production and validate it by performing an A/B test. As future work, we plan to combine this with approaches that use visual features.

6. Acknowledgments

The authors would like to thank Sabbarish R, Ankul Batra, Sagar Arora and Ghani Mohammed Abdulla for their contributions in reviewing this work and for providing valuable inputs to algorithm design, implementation and evaluation.


  • Adomavicius and Tuzhilin (2015) Gediminas Adomavicius and Alexander Tuzhilin. 2015. Context-aware recommender systems. In Recommender systems handbook. Springer, 191–226.
  • Amber Madvariya (2017) Sumit Borar Amber Madvariya. 2017. Discovering Similar Products in Fashion E-commerce. In SIGIR Workshop on eCommerce.
  • Cheng et al. (2016) Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 7–10.
  • Covington et al. (2016) Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems. New York, NY, USA.
  • Das et al. (2007) Abhinandan S Das, Mayur Datar, Ashutosh Garg, and Shyam Rajaram. 2007. Google news personalization: scalable online collaborative filtering. In Proceedings of the 16th international conference on World Wide Web. ACM, 271–280.
  • Frederickson ([n. d.]) Ben Frederickson. [n. d.]. Fast Python Collaborative Filtering for Implicit Feedback Datasets.
  • He and McAuley (2016) Ruining He and Julian McAuley. 2016. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.
  • Hu et al. (2008) Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative filtering for implicit feedback datasets. In Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on. Ieee, 263–272.
  • Kang et al. (2017) Wang-Cheng Kang, Chen Fang, Zhaowen Wang, and Julian McAuley. 2017. Visually-Aware Fashion Recommendation and Design with Generative Image Models. arXiv preprint arXiv:1711.02231 (2017).
  • Koren et al. (2009) Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009).
  • Lian et al. (2018) Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, and Guangzhong Sun. 2018. xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. arXiv preprint arXiv:1803.05170 (2018).
  • Linden et al. (2003) Greg Linden, Brent Smith, and Jeremy York. 2003. Amazon. com recommendations: Item-to-item collaborative filtering. IEEE Internet computing 7, 1 (2003), 76–80.
  • Rendle et al. (2009) Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, 452–461.
  • Rendle et al. (2011) Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2011. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 635–644.
  • Ricci et al. (2011) Francesco Ricci, Lior Rokach, and Bracha Shapira. 2011. Introduction to recommender systems handbook. In Recommender systems handbook. Springer, 1–35.
  • Sagar Arora (2017) Divya Alok Sumit Borar Sagar Arora, Amber Madvariya. 2017. Deciphering Fashion Sensibility Using Community Detection. In KDD workshop: Machine Learning meets fashion.

  • Trivedi and Trivedi (2018) Jay P. Trivedi and Hemant Trivedi. 2018. Investigating the Factors That Make a Fashion App Successful: The Moderating Role of Personalization. Journal of Internet Commerce 17, 2 (2018), 170–187.
  • Van den Oord et al. (2013) Aaron Van den Oord, Sander Dieleman, and Benjamin Schrauwen. 2013. Deep content-based music recommendation. In Advances in neural information processing systems. 2643–2651.