SweetRS: Dataset for a recommender systems of sweets

09/10/2017 ∙ by Łukasz Kidziński, et al. ∙ Stanford University

Benchmarking recommender system and matrix completion algorithms could be greatly simplified if the entire matrix were known. We built the sweetrs.org platform with 77 candies and sweets to rate. Over 2000 users submitted over 44000 ratings, resulting in a matrix with 28% coverage. In this report, we give a full description of the environment and benchmark the Soft-Impute algorithm on the dataset.

1 Context

One of the problems in building any machine learning system is limited access to ground truth. This problem is particularly prevalent in matrix completion when the matrices are very sparse, as in product recommendation. In many situations, a large dataset can be trimmed to a dense matrix by choosing specific users and items, yet this selection may introduce additional bias. In this project, we attempted to collect a dense matrix by asking users to rate internationally known commercial products, such as candy bars and sweets.

We built a basic website, sweetrs.org, where users can both rate and add new products. Participants were asked to rate sweets on an integer scale or to click "Never tried" if they did not know or had not tasted the product. To date, we have collected over 44000 ratings from over 2000 users on 77 items, giving a coverage of about 28% of matrix coefficients. Moreover, we identified a subset of users and products with the coverage of over matrix coefficients.
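The coverage figure can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `coverage`, the NaN encoding of "Never tried", and the toy rating values are assumptions, and the last line uses the rounded counts quoted in the report rather than the exact ones.

```python
import numpy as np

def coverage(ratings: np.ndarray) -> float:
    """Fraction of matrix entries that hold an observed (non-NaN) rating."""
    return float(np.mean(~np.isnan(ratings)))

# Toy matrix of 3 users x 4 items; NaN stands for "Never tried".
# The rating values themselves are arbitrary placeholders.
toy = np.array([
    [5.0, np.nan, 3.0, np.nan],
    [np.nan, 2.0, np.nan, np.nan],
    [4.0, 4.0, np.nan, 1.0],
])
toy_coverage = coverage(toy)  # 6 observed entries out of 12

# Rough check with the rounded counts from the report:
# about 44000 ratings spread over about 2000 users x 77 items.
approx_coverage = 44000 / (2000 * 77)
```

With these rounded counts the ratio comes out slightly above 0.28, which agrees with the coverage stated in the abstract.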

The project was developed as part of a master's thesis at the University of Warsaw [Kid11].

2 Benchmark

Let $X$ be an $n \times m$ matrix representing $n$ users rating $m$ items on the integer scale. Let $\Omega \subseteq \{1, \dots, n\} \times \{1, \dots, m\}$ be the set of all observed indices. We attempt to approximate the unobserved ratings, and we estimate the prediction error using cross-validation.

As our benchmark method we choose Soft-Impute [MHT10] due to its speed, efficiency and simplicity. Soft-Impute iteratively replaces the missing values with the current estimate and applies a soft-thresholded SVD.
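A minimal NumPy sketch of the Soft-Impute iteration described in [MHT10] is given below for orientation. It is not the code used in this report (the published sample code is in R); the function name `soft_impute`, the NaN encoding of missing entries, and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def soft_impute(X, lam, max_iter=200, tol=1e-5):
    """Sketch of Soft-Impute: fill missing entries with the current estimate,
    then shrink the singular values of the filled matrix by lam."""
    observed = ~np.isnan(X)            # mask of observed entries
    Z = np.zeros_like(X, dtype=float)  # current low-rank estimate
    for _ in range(max_iter):
        # Keep observed ratings, use the current estimate elsewhere.
        filled = np.where(observed, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-thresholding: subtract lam and clip at zero.
        s_thr = np.maximum(s - lam, 0.0)
        Z_new = (U * s_thr) @ Vt
        # Relative squared change between consecutive estimates.
        change = np.linalg.norm(Z_new - Z) ** 2 / max(np.linalg.norm(Z) ** 2, 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z
```

Larger values of `lam` zero out more singular values and therefore yield lower-rank, more heavily shrunk estimates; a sufficiently large `lam` returns the all-zero matrix.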

We investigate how the prediction depends on the size of the training set and on the regularization parameter $\lambda$ in Soft-Impute. We test several sizes of the training set, $|\Omega_{\text{train}}| = (1 - r)\,|\Omega|$, where $|\Omega|$ is the number of observed entries and $r$ is the ratio of the observed set used for testing.

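We use cross-validation to estimate the Normalized Mean Squared Error (NMSE) of the predictions on the held-out entries $\Omega \setminus \Omega_{\text{train}}$, where $\Omega$ is the set of observed indices, $\Omega_{\text{train}} \subset \Omega$ is the set on which we trained the algorithm, and $\setminus$ denotes the set difference.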
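For reference, one common form of the NMSE that is consistent with this description is sketched below. The symbols $X$ (rating matrix), $\hat{X}$ (Soft-Impute prediction), $\Omega$ (observed indices) and $\Omega_{\text{train}}$ (training indices) are notation assumed here, and the choice of denominator (the sum of squared held-out ratings, which plays the role of total variance once items are centered and scaled) is an assumption rather than a definition confirmed by the report.

```latex
\mathrm{NMSE}
  = \frac{\sum_{(i,j) \in \Omega \setminus \Omega_{\text{train}}}
            \bigl( X_{ij} - \hat{X}_{ij} \bigr)^{2}}
         {\sum_{(i,j) \in \Omega \setminus \Omega_{\text{train}}} X_{ij}^{2}}
```

Under this form, $1 - \mathrm{NMSE}$ can be read as the fraction of variance explained on the held-out entries.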

In our preliminary experiments, we found that analyzing a restricted set of candidate values is sufficient for finding the best integer value of the regularization parameter $\lambda$. Thus, we train the Soft-Impute algorithm on each combination of training-set size and $\lambda$. We center and scale each item before fitting. The NMSE for a given combination is then estimated as follows:

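  1. randomly choose the training elements $\Omega_{\text{train}} \subset \Omega$,

  2. fit the Soft-Impute model for $X$ on $\Omega_{\text{train}}$ given the parameter $\lambda$,

  3. predict the elements for $\Omega \setminus \Omega_{\text{train}}$,

  4. record the NMSE.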
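The four steps above can be sketched in Python as follows. This is an illustrative sketch, not the report's R code: the names `soft_impute`, `nmse_one_split` and `mean_nmse`, the NaN encoding of missing ratings, the use of training entries only for centering and scaling, and the NMSE denominator (sum of squared held-out values) are all assumptions.

```python
import numpy as np

def soft_impute(X, lam, max_iter=100, tol=1e-5):
    """Compact Soft-Impute sketch: impute, SVD, soft-threshold, repeat."""
    observed = ~np.isnan(X)
    Z = np.zeros_like(X, dtype=float)
    for _ in range(max_iter):
        U, s, Vt = np.linalg.svd(np.where(observed, X, Z), full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt
        done = np.linalg.norm(Z_new - Z) ** 2 <= tol * max(np.linalg.norm(Z) ** 2, 1e-12)
        Z = Z_new
        if done:
            break
    return Z

def nmse_one_split(X, lam, test_ratio, rng):
    """Run steps 1-4 once for a given (test_ratio, lam) pair."""
    rows, cols = np.where(~np.isnan(X))
    n_test = int(round(test_ratio * rows.size))
    if not 0 < n_test < rows.size:
        raise ValueError("test_ratio must leave both training and test entries")
    held_out = rng.permutation(rows.size)[:n_test]
    r, c = rows[held_out], cols[held_out]

    # Step 1: the training set is the observed set minus the held-out entries.
    X_train = X.astype(float)  # astype returns a copy, X is left untouched
    X_train[r, c] = np.nan
    train_mask = ~np.isnan(X_train)

    # Center and scale each item (column) using training entries only (assumption).
    counts = np.maximum(train_mask.sum(axis=0), 1)
    mu = np.where(train_mask, X_train, 0.0).sum(axis=0) / counts
    var = np.where(train_mask, (X_train - mu) ** 2, 0.0).sum(axis=0) / counts
    sd = np.sqrt(var)
    sd[sd == 0] = 1.0  # avoid dividing by zero for constant or empty columns

    # Step 2: fit Soft-Impute on the standardized training entries.
    fitted = soft_impute((X_train - mu) / sd, lam)

    # Step 3: predict the held-out entries on the standardized scale.
    truth = (X[r, c] - mu[c]) / sd[c]
    pred = fitted[r, c]

    # Step 4: record the NMSE (assumed form: squared error over squared truth).
    return float(np.sum((truth - pred) ** 2) / np.sum(truth ** 2))

def mean_nmse(X, lam, test_ratio, n_repeats, seed=0):
    """Average the NMSE over repeated random splits."""
    rng = np.random.default_rng(seed)
    return float(np.mean([nmse_one_split(X, lam, test_ratio, rng)
                          for _ in range(n_repeats)]))
```

Calling `mean_nmse` over a grid of `test_ratio` and `lam` values would produce the kind of curves summarized in Figure 1. When `lam` is so large that every singular value is thresholded to zero, all predictions are zero and this NMSE equals 1, which gives a convenient sanity check.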

We repeat the procedure multiple times for every combination of training-set size and $\lambda$. We present the mean NMSE in Figure 1.

Figure 1: NMSE as a function of $\lambda$ for different sizes of the training set. Values were estimated by repeated cross-validation for each combination of parameters.

3 Discussion

In this report, we aimed to provide a benchmark and a description of the dataset that are convenient for testing new recommender system techniques. We achieved over of variance explained in the case when of matrix coefficients are observed. We published the dataset and the sample R code in a GitHub repository: https://github.com/kidzik/sweetrs-analysis/.

References

  • [Kid11] Łukasz Kidziński. Statistical foundations of recommender systems. Master's thesis, Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, 2011.
  • [MHT10] Rahul Mazumder, Trevor Hastie, and Robert Tibshirani. Spectral regularization algorithms for learning large incomplete matrices. Journal of Machine Learning Research, 11:2287–2322, 2010.