Personalization, Privacy, and Me

by Reshma Narayanan Kutty, et al.

News recommendation and personalization is not a solved problem. People are growing concerned that their data are collected in excess in the name of personalization and used for purposes other than those they would consider reasonable. Our experience in building personalization products for publishers while safeguarding user privacy led us to investigate the user perspective on privacy and personalization. We conducted a survey to explore people's experience with personalization and privacy and the viewpoints of different age groups. In this paper, we share our major findings with publishers and the community to inform the algorithmic design and implementation of the next generation of news recommender systems, which must put the human at their core and balance personalization experiences with privacy to reap the benefits of both.




1. Introduction

Personalization today comes at a high price: users hand over more personal information than necessary, and that information is at risk of exposure through data breaches and other unauthorized access. As businesses increasingly adopt personalization technology, the data they collect and how they use it are drawing more attention from users and regulators. With Google and other big players banning from their browsers third-party cookies that target personal information, awareness of user privacy, and the need to consider it, is greater than ever for news outlets and digital publishers.

Data breaches and information leaks from companies have made privacy a major concern in recent years. Laws such as the General Data Protection Regulation (GDPR) (European Parliament and Council of the European Union, 2016) and the California Consumer Privacy Act (CCPA) (State of California, USA, 2018) are in place to strictly govern how companies use customer data. Even when user identities are anonymized, users' activities can identify them (Brasher, 2018), revealing protected categories such as race, gender, sexual orientation, or religion.

To design better personalization products we need to understand how growing privacy concerns affect reader experiences (Wadle et al., 2019). In particular, we need to know whether different age groups take action to protect their privacy online, how enjoyable it is to receive personalized content online if privacy is not (fully) respected (Hanson et al., 2020), and how aware people are of how much of their data are collected and/or necessary in exchange for a personalized experience (Stuart A. Thompson and Charlie Warzel, 2019).

To strengthen our study, beyond our quantitative research, we also conducted a survey to understand more about consumer behavior, identify user pain points, and learn how users currently access digital content in relation to their privacy concerns. The purpose of the study is to understand user awareness of data privacy and of the transparency in the usage of user data.

In the rest of the paper we report the main lessons learned, with the aim of helping the community, both in academia and industry, to reach a balance between personalization experiences and privacy protection for the next generation of recommender systems we envision.

2. Results

The survey was shared across social media, including LinkedIn and Twitter, and garnered 270 responses over a period of 1 month during July 2020. 84% of the respondents were Millennials and Generation Z (Gen Z), and 16% were over the age of 35.
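As a rough illustration of the sample composition, the reported shares can be converted into approximate respondent counts. The sketch below assumes that 84% and 16% are rounded shares of the 270 responses; the resulting counts are estimates for orientation only, not figures reported by the survey, and the function and variable names are hypothetical.

```python
# Approximate respondent counts from the reported shares.
# Assumption: 84% and 16% are rounded shares of the 270 responses,
# so the counts below are estimates, not reported figures.
TOTAL_RESPONSES = 270

reported_shares = {
    "Millennials and Gen Z": 0.84,
    "Over 35": 0.16,
}


def approximate_counts(total, shares):
    """Convert fractional shares into rounded head counts."""
    return {group: round(total * share) for group, share in shares.items()}


counts = approximate_counts(TOTAL_RESPONSES, reported_shares)
for group, count in counts.items():
    print(f"{group}: about {count} respondents")
```

Under these assumptions the older group is small in absolute terms, which is worth keeping in mind when comparing percentages across age groups.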

Survey recruitment strategy: The survey mainly targeted digital users who access content online and are used to giving out personal information in exchange for content personalization. Social media platforms were used to distribute the survey, but no personal information was captured through the survey itself.

The results of our recent survey shed light on what readers today really care about when they access media online. The main insights are as follows.

Personalization remains a favorite among younger age groups

When asked if they like seeing personalized content recommendations while visiting an online magazine or news media website, 52% of Millennials/Gen Z responded that their preference for personalized content is very high or high (see Figure 1), compared with 30% in older age groups.

Figure 1. Personalization preference of Millennials and Generation Z respondents.

Pie chart showing the results for the personalization preference of Millennials and Generation Z respondents.

Privacy, a recurring concern

We asked people whether data privacy is a concern to them and, if so, what aspect(s) of it worries them.

Figure 2. Readers’ thoughts on privacy online.

Pie chart showing the results for readers’ thoughts on privacy online.

The answers reveal that personal data privacy remains a common concern, with 78% of the respondents worried about their privacy on digital platforms (see Figure 2). The major concerns were that personal data were being collected by apps and websites and then sold to or shared with third-party apps.

More data than necessary collected

When asked if they thought companies collect more or less data from them than required to offer a good personalization experience, our respondents felt that more data were collected than necessary in the name of personalization.

Over 63% of the respondents believed that more data than necessary were collected, while only 10% thought that the amount collected was adequate for personalization.

23% did not know how much data were being collected, which leads us to believe that there is little transparency in the relationship between data collection and its use for personalization (see Figure 3).

Figure 3. What readers think about data used in the name of personalization.

Pie chart showing the results for what readers think about data used in the name of personalization.

Stop stalking

Cross-website cookie usage was identified as a common privacy concern among many respondents because it enables targeted advertisements on social media based on browsing patterns. Another common concern was being asked for mandatory sign-ins or unnecessary personal information. We asked readers about a time they felt their privacy was violated, and most of the answers centered on targeted ads.

Examples of anonymous responses received from our participants:

“One search for a product results in me being pestered with ads for the same product in all web sites I visit”

“My search results from an e-commerce site shows up on other websites”

“Mostly the ads whenever I search something on chrome appears on my ads, it’s scary like someone is stalking me always”

“Companies need to act ethically, but for that to happen, regulations must be put in place by governments. A balance absolutely must be found between innovation and personal privacy.” – Anonymous respondent.

Ad-blockers to the rescue

When asked if they run ad-blockers in their browsers, 62% of the respondents said they do, alongside other tools such as Tor or VPN software to preserve privacy (see Figure 4).

Figure 4. Usage of ad-blockers.

Pie chart showing the results for the usage of ad-blockers.

Key Takeaways

We can summarize the key takeaways as follows:

  • Millennials and Gen Z readers favor personalization while older groups are still wary of it.

  • Privacy online is a concern across all age groups, not only in older age brackets.

  • Respondents see little transparency in how collected data are used for personalization and believe that more data are collected than necessary.

  • People use their own tools to protect themselves online, such as ad-blockers, VPNs, and privacy-preserving browsers.

  • There is a need to reach a balance between personalization experiences and privacy to reap the benefits of both.

3. Conclusion

People should hold the power in the tech industry, particularly in deciding what data they want to share and how those data are to be used.

With digitalization, publishers and brands have realized that they need to become customer-centric with their offerings and move away from ad-based monetization that targets user information and cookies. Building trust and transparency has been shown to help publishers increase customer loyalty. Regulations such as GDPR and CCPA have made indiscriminate data collection risky for companies, forcing them to stop or restrict services in some cases. The need to offer personalized recommendations, however, is still on the rise.

As our research shows, people are becoming more aware of the need to safeguard their personal data from being collected in the name of personalization, and of the dangers and implications of exposure. Readers enjoy personalization but expect an experience where they need not worry about their data being misused.

Techniques like Federated Learning of Cohorts (FLoC) do not require personal information but use cohort behavior analysis to personalize content. On-device machine learning can enable personalization while giving power to users by keeping all processing on the user’s device. This opens new doors for digital publishers, for example, to engage with their readers while building trust and adhering to privacy regulations.
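To make the on-device idea concrete, the following is a minimal sketch in which a reader's interest profile is built and used for ranking entirely on the local device, so no reading history is sent to the publisher. It is a simple topic-count heuristic chosen for illustration, not a machine learning model and not FLoC; the function names, article fields, and sample data are all hypothetical.

```python
from collections import Counter


def update_profile(profile, article_topics):
    """Record the topics of an article the reader opened.

    The profile is a local Counter; in this sketch it never leaves the device.
    """
    profile.update(article_topics)
    return profile


def rank_articles(profile, candidates):
    """Order candidate articles by topic overlap with the local profile."""
    def score(article):
        # Topics missing from the Counter contribute a score of 0.
        return sum(profile[topic] for topic in article["topics"])

    return sorted(candidates, key=score, reverse=True)


# Hypothetical reading history, stored only on the device.
local_profile = Counter()
update_profile(local_profile, ["politics", "climate"])
update_profile(local_profile, ["climate", "science"])

# Hypothetical candidate list, identical for every reader.
candidates = [
    {"id": "a1", "topics": ["sports"]},
    {"id": "a2", "topics": ["climate", "science"]},
    {"id": "a3", "topics": ["politics"]},
]

ranked_ids = [article["id"] for article in rank_articles(local_profile, candidates)]
print(ranked_ids)
```

In this arrangement the publisher serves the same candidate list to everyone and the device reorders it, which is one way the personalization benefit could be kept without collecting individual browsing data.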


  • E. Brasher (2018) Addressing the failure of anonymization: guidance from the European Union’s General Data Protection Regulation. Columbia Business Law Review.
  • European Parliament and Council of the European Union (2016) General Data Protection Regulation – GDPR. Note: 2021-07-10.
  • J. Hanson, M. Wei, S. Veys, M. Kugler, L. Strahilevitz, and B. Ur (2020) Taking data out of context to hyper-personalize ads: crowdworkers’ privacy perceptions and decisions to disclose private information. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13. ISBN 9781450367080.
  • State of California, USA (2018) California Consumer Privacy Act – CCPA. Note: 2021-07-10.
  • Stuart A. Thompson and Charlie Warzel (2019) Twelve Million Phones, One Dataset, Zero Privacy. Note: 2021-07-13.
  • L. Wadle, N. Martin, and D. Ziegler (2019) Privacy and personalization: the trade-off between data disclosure and personalization benefit. In Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization, UMAP’19 Adjunct, New York, NY, USA, pp. 319–324. ISBN 9781450367110.