1. Introduction and Context
Kids and adolescents use computers mainly for learning and entertainment (Chiasson and Gutwin, 2005), and this age group is also highly engaged with music (IPSOS and IFPI, 2016). It is therefore no surprise that music streaming portals are growing in popularity, especially among users aged 25 years or younger (IPSOS and IFPI, 2016). Music service providers offering integrated music recommender systems thus have to be prepared for this young user group.
Considering user properties, including demographics such as gender, age, or country (e.g., Al-Shamri, 2016; Zhao et al., 2014), is a widely adopted approach for recommender systems and has been a focus of research in the past few years. In the field of music recommender systems, relying on listening histories or ratings is nevertheless still the most common approach (Schedl et al., 2014). Still, recent work (e.g., Shi et al., 2014; Schedl and Hauger, 2015) shows that integrating different listener or listening information can substantially improve the quality of music recommendations.
Studies investigating the relationship between age and music preferences are particularly rare. Most researchers draw their samples from the population of university students; hence, samples are mostly homogeneous with respect to age (Laplante, 2014). The few studies that allow conclusions to be drawn with respect to age, though, found that it is substantially associated with music preferences, particularly in terms of genre (Ter Bogt et al., 2011; Harrison and Ryan, 2010), and suggest considering the relationship between age and music taste in music recommender systems (Laplante, 2014).
Against this background, the contribution of this paper is two-fold. First, we analyze music preferences of kids and adolescents based on the LFM-1b dataset (Schedl, 2016), which aggregates information about more than one billion listening events by more than 120,000 Last.fm users. Second, exploiting the same dataset, we study the performance of a collaborative filtering approach that tailors its recommendations to particular age groups, ranging from 6 to 18 years.
To allow for a clear structure, we first describe the methods and material used for our studies in Section 2. In Section 3, we present our findings on the music taste of kids and adolescents, detailing differences with respect to gender, country, and fine-grained age groups. In Section 4, we report and discuss our findings on the recommendation experiments. Finally, we conclude with a summary and outlook to future work in Section 5.
2. Methods and Material
In the following, we describe the dataset, our approach to model music preferences on the user level, and how we investigate a user group’s preferences and homogeneity of these preferences.
For our analysis, we exploit the LFM-1b dataset (Schedl, 2016) of 1,088,161,692 individual listening events created by 120,175 users of the music platform Last.fm, who listened to 585,095 unique artists after data cleansing as described in (Schedl, 2016). Out of these 120,175 users, 46,120 (38.4%) provide age information in their profile. Considering only those who provide their age, users from 6 to 18 years (inclusive) represent 5,953 (12.9%). Including users up to age 25, this number increases to almost two thirds of the population (30,404 users or 65.9%). Figure 1 shows the distribution of age groups for the top countries in the dataset, i.e., those with at least 100 users. (The country abbreviations comply with the ISO 3166 standard: https://www.iso.org/iso-3166-country-codes.html.) The stacked bars are sorted according to median age from young to old (left to right). For instance, more than half of the users in Estonia, Poland, Brazil, Belarus, India, and Lithuania are younger than 22.
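The reported shares can be re-derived from the user counts given above. The following is a minimal sketch of that arithmetic; the helper name `share` and the variable names are illustrative and not part of the dataset or its tooling.

```python
# User counts as stated in the text for the LFM-1b dataset.
total_users = 120_175
users_with_age = 46_120
users_6_to_18 = 5_953
users_up_to_25 = 30_404


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all users who provide age information.
age_share = share(users_with_age, total_users)
# Shares relative to the users who provide age information.
young_share = share(users_6_to_18, users_with_age)
up_to_25_share = share(users_up_to_25, users_with_age)

print(age_share, young_share, up_to_25_share)
```

Note that the last two shares are computed relative to the 46,120 users with age information, not relative to all 120,175 users.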
2.2. Modeling and Analyzing Music Preferences and Homogeneity
To model music preferences on a user level, we gather the top user-generated tags for each artist in the LFM-1b dataset, using the Last.fm API endpoint artist.getTopTags. We index the tags using a dictionary of 20 main genres from Allmusic, casefold tags and index terms, and describe each artist by a bag-of-words representation of genres. Considering each user’s playcount vector over artists, we compute his or her genre profile. To this end, each artist’s genre occurrence is multiplied by the user’s playcount value for that artist. Summing up these playcount-weighted artists’ genre occurrences on the genre level for each user results in a 20-dimensional feature vector over the 20 genres. We normalize these vectors for each user, so that the user’s genre profile contains the percentage of music listened to from each of the 20 genres. Based on the genre profiles, we measure music preferences for a given user group (e.g., users aged 6 to 12 years) by computing the arithmetic mean over all group members’ genre profiles. We further quantify the homogeneity of preferences within a given user group using Krippendorff’s α score of inter-rater agreement (Krippendorff, 2013).
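The genre-profile computation described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the three-entry `GENRES` list stands in for the 20 Allmusic main genres, exact string matching after casefolding is an assumption about how tags are indexed, and all function names and the toy data are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the dictionary of 20 Allmusic main genres.
GENRES = ["rock", "pop", "jazz"]


def artist_genre_vector(tags):
    """Bag-of-words genre vector: count casefolded tags that match a genre.

    Assumption: a tag matches a genre only if the casefolded strings are equal.
    """
    folded = [tag.casefold() for tag in tags]
    return np.array([folded.count(g.casefold()) for g in GENRES], dtype=float)


def genre_profile(playcounts, artist_tags):
    """Sum playcount-weighted artist genre vectors and normalize to sum to 1."""
    profile = np.zeros(len(GENRES))
    for artist, count in playcounts.items():
        profile += count * artist_genre_vector(artist_tags.get(artist, []))
    total = profile.sum()
    # Users without any genre-tagged artist keep an all-zero profile.
    return profile / total if total > 0 else profile


def group_preferences(profiles):
    """Arithmetic mean over the genre profiles of all members of a user group."""
    return np.mean(np.vstack(profiles), axis=0)


# Toy data: tags per artist and playcounts per user (both hypothetical).
artist_tags = {"A": ["Rock", "pop"], "B": ["Jazz"], "C": ["rock", "seen live"]}
user1 = genre_profile({"A": 10, "C": 30}, artist_tags)
user2 = genre_profile({"B": 5, "A": 5}, artist_tags)
group = group_preferences([user1, user2])
print(dict(zip(GENRES, group.round(3))))
```

Because each user profile is normalized before averaging, every user contributes equally to the group mean regardless of how much music he or she listens to.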
2.3. Recommender Systems Evaluation
To investigate whether music recommender systems perform better when tailoring recommendations to particular age groups, we conduct rating or preference prediction experiments, which is a common evaluation approach in recommender systems research. We analyze the performance of a model-based collaborative filtering approach tailoring the recommendations to age groups from 6 to 18 years, and compare results with those realized for adults (aged 19 to 60 years) and the overall population. To this end, we first normalize and scale the playcount values in the user-artist-matrix of the LFM-1b dataset to the range [0, 1000] for each user individually, assuming that higher numbers of playcounts indicate higher user preference for an artist (for the relation between implicit and explicit feedback see, e.g., Parra and Amatriain, 2011; Jawaheer et al., 2010). We apply singular value decomposition according to (Salakhutdinov and Mnih, 2007), equivalent to probabilistic matrix factorization, to factorize the user-artist-matrix and in turn effect rating prediction. In 5-fold cross-validation experiments with random shuffle across all users, we use root mean square error (RMSE) and mean absolute error (MAE) as performance measures.
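The per-user scaling and the two error measures can be illustrated with a short sketch. The text states only the target range [0, 1000]; the linear max-scaling used here (the user's largest playcount maps to 1000) is an assumption, and the function names are hypothetical. The matrix factorization step itself is not shown.

```python
import numpy as np


def scale_playcounts(playcounts, upper=1000.0):
    """Scale one user's playcounts linearly so that the largest maps to `upper`.

    Assumption: max-scaling; the exact normalization is not specified in the text.
    """
    counts = np.asarray(playcounts, dtype=float)
    peak = counts.max() if counts.size else 0.0
    return counts * (upper / peak) if peak > 0 else counts


def rmse(actual, predicted):
    """Root mean square error between actual and predicted preference values."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(actual, predicted):
    """Mean absolute error between actual and predicted preference values."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(diff)))


# Toy example: one user's playcounts for three artists and made-up predictions.
scaled = scale_playcounts([2, 5, 10])
predicted = [250.0, 450.0, 900.0]
print(scaled, rmse(scaled, predicted), mae(scaled, predicted))
```

For any set of predictions, MAE is never larger than RMSE, because RMSE weights large individual errors more heavily.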
3. Music Preferences of the Young
In this section, we discuss the overall preferences of young listeners (Section 3.1). Then, we further detail these preferences by considering gender (Section 3.2) and country (Section 3.3) information. Finally, we delve into details on music preferences of various age groups within the young listener population (Section 3.4).
3.1. Overall Music Preferences
Table 1 shows the arithmetic means and standard deviations (in parentheses) of the genre profiles for the entire Last.fm population (first row), for all young listeners up to 18 years (second row), for all adult listeners aged 19 and older (third row), and for categories of different user groups (e.g., all user groups distinguished according to their country or according to their age). Blue and red font is used to indicate, respectively, the highest and lowest value per genre within each category of user groups. For instance, when categorizing our target group of young listeners (aged 0 to 18) with respect to country, metal is listened to least in the US (3.20%) and most in Poland (9.12%) and Finland (8.87%). The last column of the table contains Krippendorff’s agreement score α, which quantifies homogeneity. Please note that we only show results for genres with an overall share among all users’ listening events of at least 3%. Detailed results for all genres can be provided by the authors upon request.
The first row of the table contains the overall genre distribution of the entire population (irrespective of age). It reveals that the top genres listened to by the entire LFM-1b sample are rock, alternative, and pop. The second row, aggregating young listeners (up to 18 years, inclusive), shows that the top genres are the same as for the overall population, though the preferences for rock and alternative are even more pronounced than in the overall population, while the opposite holds for pop. Furthermore, much higher preferences among the young are observed for metal and punk, whereas substantially lower preferences exist for rnb, jazz, and blues.
A comparison of the preferences of young listeners up to 18 years (second row) and listeners aged 19 and above (third row) shows a comparable picture: The genres that are preferred more by young listeners than by adults are rock, alternative, pop, metal, rap, and rnb, whereas the genres preferred more by adults than by young listeners are electronic, folk, jazz, and blues.
The overall agreement score indicates moderate homogeneity in genre preferences for the entire user population, according to the interpretation of (Landis and Koch, 1977). Compared to most other analyzed user groups (with respect to age and/or country), this overall agreement for genre preference is rather low.
Generally, our data suggests that rock is the most preferred genre across all considered user groups except for young listeners in the United Kingdom, who slightly prefer alternative to rock. Taking this general perspective, alternative ranks second across all user groups except for young listeners in the United Kingdom. Blues appears to be the least preferred genre, which holds true for most considered user groups except for young listeners from Poland, who like rnb less, from the Netherlands, who like jazz less, and from Brazil, who appreciate rap less than blues. Further, in contrast to the overall population and young listeners, adults aged 19 years and older like rnb less than blues.
3.2. Gender-specific Music Preferences
According to our data, rock is the genre most listened to by both male and female young listeners. Blues, rnb, and jazz are the least preferred genres for male users in this age group; for female users, these are blues, jazz, and rap. Further, our data suggests a substantial male preference for metal and rap, whereas pop is particularly preferred by female users.
The homogeneity of music taste with respect to genre is substantially higher for females than for males. In fact, the homogeneity is higher for the female user group than for any other user group considered in our analysis (Table 1).
3.3. Country-specific Music Preferences
The general preference for rock music among young listeners seems to be consistent across all analyzed countries. A similar picture is shown for the genres alternative and pop. Country-specific differences can be seen for other genres, though.
For instance, the liking of metal is particularly high in Poland (9.12%) and Finland (8.87%) compared to other countries. Metal is also the genre that shows the largest gap in preference between countries, with Polish listeners showing the strongest affinity (9.12%) and US listeners liking this genre least (3.20%). Other substantial discrepancies between countries are observed for pop, with the highest share in Sweden and the lowest in Russia; for electronic, with the highest share in Russia and the lowest in Brazil; for alternative, with the highest share in Poland and the lowest in Finland; for rnb, with the highest share in the United Kingdom and the lowest in Russia; and for rap, with the highest share in Germany and the lowest in Brazil.
The highest homogeneity of music preferences can be found for the United Kingdom and Sweden; both scores are higher than those of the overall group of young users and of the overall user population.
3.4. Music Preferences in Different Age Groups
Comparing the genre preferences of different age groups within the young listener population, our data suggests that young listeners’ high preference for rock music and rather low preference for blues also hold for the more fine-grained user groups.
Our data further suggests that rnb, rap, blues, and jazz are most liked by the youngest age group (6,12), although overall with rather low listening shares compared to other genres. The youngest age group (6,12) also appreciates electronic music the most in comparison to the other age groups, in this case with considerable preference scores. In contrast, rock, folk, punk, alternative, and metal are least liked by the youngest group, compared to the older groups. A preference for these genres evolves, however, with increasing age up to 16 years; then it steadily decreases.
Furthermore, results indicate that the preference for folk music tends to rise with increasing age. The liking of rnb, rap, and pop reaches its peak for the age group (13,14). Preference for rock, punk, alternative music, and metal peaks for the age group (15,16); then, the preference scores decrease with increasing age. The opposite is observed for other genres: Preference for electronic music and jazz is lowest for the age group (15,16), and preference for blues is lowest for the age group (13,14); the preference for these genres tends to rise with increasing age.
4. Music Recommendation Experiments
We conduct preference prediction experiments for various age groups as described in Section 2.3 and report error measures in Table 2. An overall performance score is obtained using all user playcounts of the dataset, independent of the users’ age (first row). To assess to what extent tailoring recommendations to different age groups affects recommendation performance, we create subsets of users according to their membership in age groups 6-12, 13-14, 15-16, and 17-18; then we perform the same experiment as described above individually on these subsets. The results for a subset comprising the entire group of users aged 6 to 18 can be found in the second row of Table 2. The third row contrasts these results with those for the user group of adults (19 to 60 years). The results for the more fine-grained age ranges can be found in the bottom rows. Our discussion focuses on the RMSE values; the insights gained from the RMSE values correspond to the ones gained via MAE.
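Creating the age-group subsets amounts to a simple partition of users by age. The sketch below assumes a hypothetical mapping from user identifiers to ages; the function names and the toy data are illustrative, and only the group boundaries are taken from the text.

```python
from collections import defaultdict

# Fine-grained age groups (inclusive bounds) as used in the experiments.
AGE_GROUPS = [(6, 12), (13, 14), (15, 16), (17, 18)]


def age_group(age):
    """Return the label of the age group containing `age`, or None if outside."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def split_by_age_group(user_ages):
    """Partition user ids into age-group subsets; other ages are skipped."""
    subsets = defaultdict(list)
    for user_id, age in user_ages.items():
        label = age_group(age)
        if label is not None:
            subsets[label].append(user_id)
    return dict(subsets)


# Hypothetical user ids and ages.
subsets = split_by_age_group({"u1": 11, "u2": 14, "u3": 17, "u4": 30, "u5": 13})
print(subsets)
```

Each resulting subset would then be evaluated separately with the same cross-validation procedure as the full user set.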
Our results suggest that the general performance for the whole young group (0,18) substantially differs from that of the overall population. In addition, RMSE is smaller for all age groups up to 18 years than for the overall population. This indicates that kids and adolescents aged 6 to 18 benefit substantially from like-minded peers when recommending items with collaborative filtering, as underpinned by their low RMSE values. This observation is in line with findings from developmental psychology that music is considered a means for socializing with peers during adolescence (Laiho, 2004). The recommendations work particularly well for the youngest age group (6,12) and for users late in their adolescence (17,18).
|User group|Users|RMSE|MAE|
|All young users (0,18)|6101|7.766|2.940|
|All adult users (19,60)|39514|77.548|76.131|
5. Conclusions and Future Work
We analyzed the music preferences of kids and adolescents aged 6 to 18 years in terms of genre preferences and homogeneity of these preferences, based on the LFM-1b dataset of Last.fm users. We uncovered substantial differences in both preferences and homogeneity between young users, adult users, and the overall user population. Such differences were also found between countries and genders within the young population and between fine-grained age groups. In recommender systems experiments, we found that preference predictions were substantially more accurate for the young user groups than for the adult population. We conclude that tailoring a collaborative filtering system to users aged 18 years or younger is beneficial.
A limitation of our approach is that the LFM-1b dataset may not necessarily generalize to the population at large, in particular in terms of age distribution. Still, as listeners up to 18 years are well represented in the dataset and this age group is known to use social media platforms frequently (Chassiakos et al., 2016), we assume that the dataset provides a good indicator. Further in-depth investigation is necessary, especially with respect to the highly varying “music listening culture” in different countries. We will integrate more data sources and deploy additional research instruments (e.g., surveys).
This research is supported by the Austrian Science Fund (FWF): V579 and P25655.
- Al-Shamri (2016) Mohammad Yahya H Al-Shamri. 2016. User profiling approaches for demographic recommender systems. Knowledge-Based Systems 100 (2016), 175–187.
- Chassiakos et al. (2016) Yolanda Linda Reid Chassiakos, Jenny Radesky, Dimitri Christakis, Megan A Moreno, Corinn Cross, et al. 2016. Children and adolescents and digital media. Pediatrics 138, 5 (2016).
- Chiasson and Gutwin (2005) Sonia Chiasson and Carl Gutwin. 2005. Testing the Media Equation with Children. In Proc. of CHI. Portland, OR, 829–838.
- Harrison and Ryan (2010) Jill Harrison and John Ryan. 2010. Musical taste and ageing. Ageing & Society 30, 4 (2010), 649–669.
- IPSOS and IFPI (2016) IPSOS and IFPI. 2016. Music Consumer Insight Report 2016. Technical Report. http://www.ifpi.org/downloads/Music-Consumer-Insight-Report-2016.pdf
- Jawaheer et al. (2010) Gawesh Jawaheer, Martin Szomszor, and Patty Kostkova. 2010. Comparison of implicit and explicit feedback from an online music recommendation service. In Proc. of HetRec. Barcelona, Spain, 47–51.
- Krippendorff (2013) Klaus Krippendorff. 2013. Content Analysis: An Introduction to Its Methodology. SAGE.
- Laiho (2004) Suvi Laiho. 2004. The psychological functions of music in adolescence. Nordic Journal of Music Therapy 13, 1 (2004), 47–63.
- Landis and Koch (1977) J. Richard Landis and Gary G. Koch. 1977. The measurement of observer agreement for categorical data. Biometrics 33, 1 (1977), 159–174.
- Laplante (2014) Audrey Laplante. 2014. Improving music recommender systems: what can we learn from research on music tags?. In Proc. of ISMIR. Taipei, Taiwan, 451–456.
- Parra and Amatriain (2011) Denis Parra and Xavier Amatriain. 2011. Walk the Talk: Analyzing the Relation Between Implicit and Explicit Feedback for Preference Elicitation. In Proc. UMAP. Girona, Spain, 255–268.
- Salakhutdinov and Mnih (2007) Ruslan Salakhutdinov and Andriy Mnih. 2007. Probabilistic Matrix Factorization. In Proc. of NIPS. Vancouver, BC, Canada, 1257–1264.
- Schedl (2016) Markus Schedl. 2016. The LFM-1b Dataset for Music Retrieval and Recommendation. In Proc. of ICMR. New York, NY, 103–110.
- Schedl et al. (2014) M. Schedl, E. Gómez, and J. Urbano. 2014. Music Information Retrieval: Recent Developments and Applications. Foundations and Trends in Information Retrieval 8, 2–3 (2014), 127–261.
- Schedl and Hauger (2015) Markus Schedl and David Hauger. 2015. Tailoring Music Recommendations to Users by Considering Diversity, Mainstreaminess, and Novelty. In Proc. of SIGIR. Santiago, Chile, 947–950.
- Shi et al. (2014) Yue Shi, Martha Larson, and Alan Hanjalic. 2014. Collaborative Filtering Beyond the User-Item Matrix: A Survey of the State of the Art and Future Challenges. Comput. Surveys 47, 1 (2014), 3:1–3:45.
- Ter Bogt et al. (2011) Tom FM Ter Bogt, Marc JMH Delsing, Maarten van Zalk, Peter G Christenson, and Wim HJ Meeus. 2011. Intergenerational continuity of taste: Parental and adolescent music preferences. Social Forces 90, 1 (2011), 297–319.
- Zhao et al. (2014) Xin Wayne Zhao, Yanwei Guo, Yulan He, Han Jiang, Yuexin Wu, and Xiaoming Li. 2014. We know what you want to buy: a demographic-based system for product recommendation on microblogs. In Proc. of SIGKDD. New York, NY, 1935–1944.