Fig. 2 shows the proportion of users who mentioned children in their public posts at least once in 2016. The proportion of women increases sharply until 31-32 years old and then gradually falls. The peak matches the average age of women at first childbirth, which is 30 years in Saint Petersburg interfax20176highest . The proportion of men who mention children is significantly lower and steadily increases with age.
In almost all cohorts of users, sons are mentioned by a larger proportion of both men and women. This difference cannot be explained by the sex ratio at birth alone (1.06 in Russia) and thus indicates gender preference in sharing information about children. Users who mention children at least once also write slightly more posts about sons: on average, women write 2.3 posts about sons and 2.1 posts about daughters, and men write 1.7 posts about sons and 1.5 posts about daughters. As a result of these two tendencies, there are more posts about sons than about daughters on the social network. The exact estimate of the gap in the number of posts depends on the set of words that are chosen as synonyms for the words “son” and “daughter” (see SI Text for detailed analysis). By our most conservative estimate, women write 15% more posts about sons than about daughters, and men write 43% more posts about sons.
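A rough back-of-the-envelope check illustrates why the sex ratio at birth alone does not account for these gaps. The sketch below assumes, purely for illustration, that in the absence of any preference posts would be split in proportion to the number of boys and girls, so a ratio of 1.06 would predict about 6% more posts about sons. The helper name `excess_pct` is hypothetical; all figures are the ones reported above.

```python
SEX_RATIO_AT_BIRTH = 1.06  # boys per girl in Russia, as cited in the text


def excess_pct(ratio: float) -> float:
    """Convert a son/daughter ratio into 'percent more about sons'."""
    return (ratio - 1.0) * 100.0


# Excess expected from demographics alone (illustrative assumption).
baseline = excess_pct(SEX_RATIO_AT_BIRTH)

# Reported excess of posts about sons (most conservative estimate in the text).
observed = {"women": 15.0, "men": 43.0}

# Average posts per user among those who mention children: (sons, daughters).
posts_per_user = {"women": (2.3, 2.1), "men": (1.7, 1.5)}

for group, pct in observed.items():
    sons, daughters = posts_per_user[group]
    per_user_excess = excess_pct(sons / daughters)
    print(
        f"{group}: observed {pct:.0f}% vs baseline {baseline:.0f}%; "
        f"per-user excess {per_user_excess:.1f}%"
    )
```

Under this simplifying assumption, both the overall gaps and the per-user gaps exceed the demographic baseline.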
We also found that posts featuring sons are more rewarded, that is, they get more “likes” than those featuring daughters. Average numbers of “likes” are presented in Table 1. Three patterns can be distinguished. First, women “like” posts more often than men. Second, there is gender homophily in “likes”, i.e. women prefer posts written by women and men prefer those written by men. Third, both women and men more often “like” posts that mention sons.
Table 1: Average number of “likes” received by posts written by women and by posts written by men, shown separately for “likes” given by women and by men.
Studies of gender preference in parental practices usually have to rely on self-reports, e.g. reports about time spent with children baker2016boy ; harris1991fathers ; aldous1998fathering ; lundberg2005division . Self-report studies have some benefits, but their results are affected by various biases including social desirability bias or recall bias. Mentions in posts are directly observable and present a clear and simple metric, which can be used on easily accessible data to measure parents’ gender bias. We used this metric on a large dataset of public posts of more than six hundred thousand users and found that both men and women exhibited son preference on the social networking site: sons were mentioned significantly more often than daughters. This result is remarkably stable, and holds true across age cohorts, different measures, and sets of words. We also found that writing posts in which sons are mentioned is more rewarded: these posts get around 1.5 times more likes than stories featuring daughters.
Son preference in traditional societies and developing countries is a well-known phenomenon. Our results confirm that son preference is also prevalent in countries not immediately associated with gender disparity (Russia is above average in the ranking of countries by gender parity wef2016global ).
Gender preference in “sharenting” may seem quite harmless in comparison with such layers of gender disparity as sex-selective abortions or underinvestment in girls. However, son bias online may affect girls, who may feel underappreciated and less visible. It may also have broader effects on gender parity. Even moderate bias might accumulate given the widespread popularity of social media. Son preference in “likes” can additionally amplify the bias, acting as social media’s built-in positive feedback loop. Millions of users are exposed to a gender-biased news feed every day and, without even noticing, receive reaffirmation that it is normal to pay more attention to sons.
Previous studies have shown that children’s books are dominated by male central characters mccabe2011gender ; hamilton2006gender . In textbooks, females get fewer lines of text, fewer named characters, and fewer mentions than men blumberg2008invisible . Additionally, in movies there are on average twice as many male characters as female ones in front of the camera smith2015inequality . While female coverage on Wikipedia compares favorably with some other lists of notable people wagner2015s , there are still 4 times more articles about men than women wiki2018gender . Gender imbalance in public posts may send yet another message that girls are less important and interesting than boys and deserve less attention, thus presenting an invisible obstacle to gender equality.
Counting mentions of children
We used the API of VK to download all public posts made in 2016 by users from Saint Petersburg. We then computed vector representations of Russian words by training a fastText bojanowski2016enriching model on the collected corpus. We used this model to identify words similar to “son” and “daughter”, namely the closest words in the vector space as measured by cosine distance, and manually excluded unrelated words. For instance, the words “son” and “granddaughter” are, unsurprisingly, semantically close to the word “daughter” according to the model. However, they are not synonyms of “daughter”, and we do not treat them as mentions of daughters. After excluding unrelated words, we obtained a list of the 30 closest synonyms of the word “daughter” and the 30 closest synonyms of the word “son”. Posts that included at least one of these words were counted as posts mentioning children. Using word embeddings trained on the VK corpus allowed us to capture words and word forms that cannot be found in dictionaries but are used on the social network, e.g. “sooon” instead of “son”. We performed an additional analysis to make sure that our results were not driven by a particular choice of words (see SI Text). We also removed potentially fake accounts and filtered out posts that were not written by the users themselves (see SI Text for details on data preprocessing). Finally, we computed the proportion of users who mentioned children at least once in their posts and the average number of such mentions per user.
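The word-selection and counting steps can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the two-dimensional vectors, the English stand-in words, and the example posts are all invented, and the helper names (`nearest_words`, `mentions`) are hypothetical. In practice the vectors would come from the fastText model trained on the corpus.

```python
import math
import re


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_words(target, vectors, k, excluded=frozenset()):
    """Return up to k words closest to `target`, skipping manually excluded ones."""
    scored = [
        (cosine_similarity(vectors[target], vec), word)
        for word, vec in vectors.items()
        if word != target and word not in excluded
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


# Toy vectors standing in for embeddings trained on the corpus (made up).
toy_vectors = {
    "daughter": (1.0, 0.1),
    "daughterrr": (0.95, 0.12),   # informal spelling variant
    "granddaughter": (0.9, 0.3),  # close in the space, but not a synonym
    "son": (0.8, 0.6),            # close in the space, but not a synonym
    "car": (-0.2, 1.0),
}
daughter_words = {"daughter"} | set(
    nearest_words("daughter", toy_vectors, k=1, excluded={"granddaughter", "son"})
)


def mentions(text, vocabulary):
    """True if any token of the post is in the chosen word list."""
    return any(tok in vocabulary for tok in re.findall(r"\w+", text.lower()))


posts = [  # (user_id, text) pairs, invented for illustration
    (1, "My daughterrr turned five today"),
    (1, "New car!"),
    (2, "Walking with my granddaughter"),
    (3, "Proud of my daughter"),
]
all_users = {uid for uid, _ in posts}
users_mentioning = {uid for uid, text in posts if mentions(text, daughter_words)}
share_of_users = len(users_mentioning) / len(all_users)
```

The manual exclusion list plays the role of the hand filtering described above: “granddaughter” is near “daughter” in the vector space but is not counted as a mention of a daughter.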
Support from the Basic Research Program of the National Research University Higher School of Economics is gratefully acknowledged.
-  Xu Tian, Xiaohua Yu, and Stephan Klasen. Gender discrimination in China revisited: a perspective from family welfare. Journal of Chinese Economic and Business Studies, 16(1):95–115, 2018.
-  Karsten Hank and Hans-Peter Kohler. Gender preferences for children in Europe: Empirical results from 17 FFS countries. Demographic research, 2(1), 2000.
-  Pauline Rossi and Léa Rouanet. Gender preferences in Africa: A comparative analysis of fertility choices. World Development, 72:326–345, 2015.
-  Sex preference in South Asia: Sri Lanka an outlier. Asia-Pacific Population Journal, 10(3):5–16, 1995.
-  John Bongaarts. The implementation of preferences for male offspring. Population and Development Review, 39(2):185–208, 2013.
-  Edgar Dahl, Manfred Beutel, Burkhart Brosig, and Klaus-Dieter Hinsch. Preconception sex selection for non-medical reasons: a representative survey from Germany. Human Reproduction, 18(10):2231–2234, 2003.
-  Edgar Dahl, Ruchi S Gupta, Manfred Beutel, Yve Stoebel-Richter, Burkhard Brosig, Hans-Rudolf Tinneberg, and Tarun Jain. Preconception sex selection demand and preferences in the United States. Fertility and sterility, 85(2):468–473, 2006.
-  Sara Raley and Suzanne Bianchi. Sons, daughters, and family processes: Does gender of children matter? Annual Review of Sociology, 32:401–421, 2006.
-  Géraldine Duthé, France Meslé, Jacques Vallin, Irina Badurashvili, and Karine Kuyumjyan. High sex ratios at birth in the Caucasus: Modern technology to satisfy old desires. Population and Development Review, 38(3):487–501, 2012.
-  Barbara D Miller. Female-selective abortion in Asia: Patterns, policies, and debates. American anthropologist, 103(4):1083–1095, 2001.
-  World Development Report 2012: Gender equality and development. The World Bank, 2012.
-  Batool Zaidi and S Philip Morgan. In the pursuit of sons: Additional births or sex-selective abortion in Pakistan? Population and development review, 42(4):693–710, 2016.
-  Onur Altindag. Son preference, fertility decline, and the nonmissing girls of Turkey. Demography, 53(2):541–566, 2016.
-  Silvia Helena Barcellos, Leandro S Carvalho, and Adriana Lleras-Muney. Child gender and parental investments in India: Are boys and girls treated differently? American Economic Journal: Applied Economics, 6(1):157–89, 2014.
-  Vani K Borooah. Gender bias among children in India in their diet and immunisation against disease. Social science & medicine, 58(9):1719–1731, 2004.
-  Lina Song. In search of gender bias in household resource allocation in rural China. IZA Discussion papers, (3464), 2008.
-  Bela Ganatra and Siddhivinayak Hirve. Male bias in health care utilization for under-fives in a rural community in western India. Bulletin of the World Health Organization, 72(1):101, 1994.
-  Michael Baker and Kevin Milligan. Boy-girl differences in parental time investments: Evidence from three countries. Journal of Human Capital, 10(4):399–441, 2016.
-  Kathleen Mullan Harris and S Philip Morgan. Fathers, sons, and daughters: Differential paternal involvement in parenting. Journal of Marriage and the Family, 53(3):531–544, 1991.
-  Joan Aldous, Gail M Mulligan, and Thoroddur Bjarnason. Fathering over time: What makes the difference? Journal of Marriage and the Family, 60(4):809–820, 1998.
-  Shelly J Lundberg. The division of labor by new parents: does child gender matter? IZA Discussion papers, 2005.
-  Gordon B Dahl and Enrico Moretti. The demand for sons. The Review of Economic Studies, 75(4):1085–1120, 2008.
-  Francine D Blau, Lawrence M Kahn, Peter Brummund, Jason Cook, and Miriam Larson-Koester. Is there still son preference in the United States? Technical report, National Bureau of Economic Research, 2017.
-  Shelly Lundberg, Sara McLanahan, and Elaina Rose. Child gender and father involvement in fragile families. Demography, 44(1):79–92, 2007.
-  Andreas Diekmann and Kurt Schmidheiny. Do parents of girls have a higher risk of divorce? An eighteen-country study. Journal of Marriage and Family, 66(3):651–660, 2004.
-  Hans-Peter Kohler, Jere R Behrman, and Axel Skytthe. Partner + children = happiness? The effects of partnerships and fertility on well-being. Population and development review, 31(3):407–445, 2005.
-  Alicia Blum-Ross and Sonia Livingstone. “Sharenting,” parent blogging, and the boundaries of the digital self. Popular Communication, 15(2):110–125, 2017.
-  Anna Brosch. When the child is born into the Internet: Sharenting as a growing trend among parents on Facebook. 43(1):225–235, 2016.
-  Interfax. The highest average age of women at first childbirth in Russia was recorded in Saint Petersburg. http://www.interfax.ru/russia/570614, 2017. [Accessed 21.03.2018].
-  Global Gender Gap Report 2016. The World Economic Forum, 2016.
-  Janice McCabe, Emily Fairchild, Liz Grauerholz, Bernice A Pescosolido, and Daniel Tope. Gender in twentieth-century children’s books: Patterns of disparity in titles and central characters. Gender & society, 25(2):197–226, 2011.
-  Mykol C Hamilton, David Anderson, Michelle Broaddus, and Kate Young. Gender stereotyping and under-representation of female characters in 200 popular children’s picture books: A twenty-first century update. Sex Roles, 55(11-12):757–765, 2006.
-  Rae Lesser Blumberg. The invisible obstacle to educational equality: Gender bias in textbooks. Prospects, 38(3):345–361, 2008.
-  Stacy L Smith, Marc Choueiti, Katherine Pieper, Traci Gillig, Carmen Lee, and Dylan DeLuca. Inequality in 700 popular films: Examining portrayals of gender, race & LGBT status from 2007 to 2014. USC Annenberg, 2015.
-  Claudia Wagner, David Garcia, Mohsen Jadidi, and Markus Strohmaier. It’s a man’s Wikipedia? Assessing gender inequality in an online encyclopedia. In ICWSM, pages 454–463, 2015.
-  Swedish Ministry for Foreign Affairs. Wikigap. https://meta.wikimedia.org/wiki/WikiGap, 2017. [Accessed 21.03.2018].
-  Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606, 2016.
Sample and data preprocessing
VK is the largest European social networking site, with more than 100 million active users. It was launched in September 2006 in Russia and provides functionality similar to Facebook. According to VK’s Terms of Service: “Publishing any content on his / her own personal page, including personal information, the User understands and accepts that this information may be available to other Internet users taking into account the architecture and functionality of the Site”. VK provides an application programming interface (API) that enables downloading of information systematically from the site. In particular, it is possible to download user profiles from particular educational institutions and within selected age ranges. For each user, it is possible to obtain a list of their public posts. VK’s support team confirmed to us that the data downloaded via their API may be used for research purposes.
One limitation of VK is that its API returns no more than 1000 users for any request. To collect data on users from Saint Petersburg on a larger scale, we created a list of all high schools in Saint Petersburg and then accessed the IDs of users from each age cohort (from 18 to 50 years) who graduated from each of these schools. As each of these requests returns fewer than 1000 users, we were able to collect information about all users who indicated their high school on VK. Note that not all users in the sample currently live in Saint Petersburg, and not everyone on VK indicated their (former) high school in their profile. However, this approach allowed us to collect a large sample of VK users in a systematic way. Another advantage of our approach is that it provides an opportunity to effectively remove fake profiles: we excluded from the final sample users who had no friends from their high school on the social networking site. The data were collected as part of the Digital Trace project, and the data collection procedure was approved by the Institutional Review Board of the National Research University Higher School of Economics.
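The collection strategy amounts to splitting one large query into many small (school, age) queries, each of which stays under the request cap, and then dropping isolated profiles. The sketch below illustrates the idea with an in-memory stand-in for the API. The callable `fetch_users`, the dictionary keys, and the toy data are hypothetical and do not correspond to actual VK API methods. It also assumes that a "friend from their high school" is a friend who appears in the collected sample with the same school.

```python
API_LIMIT = 1000  # maximum number of users returned per request, per the text


def collect_users(school_ids, ages, fetch_users):
    """Query every (school, age) cell; `fetch_users` is a hypothetical callable."""
    collected = {}
    truncated_cells = []
    for school in school_ids:
        for age in ages:
            batch = fetch_users(school, age)
            if len(batch) >= API_LIMIT:
                # The cell hit the cap, so its result may be incomplete.
                truncated_cells.append((school, age))
            for user in batch:
                collected[user["id"]] = user
    return collected, truncated_cells


def drop_isolated(users, friends_of):
    """Keep only users with at least one collected friend from the same school."""
    kept = {}
    for uid, user in users.items():
        has_school_friend = any(
            fid in users and users[fid]["school"] == user["school"]
            for fid in friends_of.get(uid, ())
        )
        if has_school_friend:
            kept[uid] = user
    return kept


# Toy stand-in for the remote service (invented data).
fake_db = {
    ("school_1", 30): [{"id": 1, "school": "school_1"}, {"id": 2, "school": "school_1"}],
    ("school_2", 30): [{"id": 3, "school": "school_2"}],
}


def fake_fetch(school, age):
    return fake_db.get((school, age), [])


friends = {1: [2], 2: [1, 3], 3: [1]}
users, truncated = collect_users(["school_1", "school_2"], range(18, 51), fake_fetch)
sample = drop_isolated(users, friends)
```

Recording the cells that reach the cap makes it possible to check that no request was silently truncated.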
We made sure that only posts with authentic content were included in the analysis. We excluded reposts and posts with exactly the same content made by multiple users. To account for potential automatic posting and advertisements by websites or VK applications (e.g. invitations to visit a website or to beat the score in a game), we also excluded posts containing URLs. Not all posts in the resulting sample are necessarily about users’ own children: some mention the children of friends or relatives, and some are jokes. By our estimate, the proportion of such posts is around 9%, and their removal does not affect the observed son bias (see SI Topic analysis).
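The three exclusion rules can be expressed as a simple filter. The following is a minimal sketch under stated assumptions: the post fields (`user`, `text`, `is_repost`) and the URL pattern are hypothetical choices for illustration, and exact-match comparison of stripped text stands in for "exactly the same content".

```python
import re
from collections import defaultdict

# Simple URL detector; an illustrative assumption, not the rule used in the study.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def filter_authentic(posts):
    """Drop reposts, texts shared verbatim by several users, and posts with URLs."""
    authors_by_text = defaultdict(set)
    for post in posts:
        authors_by_text[post["text"].strip()].add(post["user"])

    kept = []
    for post in posts:
        text = post["text"].strip()
        if post["is_repost"]:
            continue
        if len(authors_by_text[text]) > 1:  # same content from multiple users
            continue
        if URL_PATTERN.search(text):
            continue
        kept.append(post)
    return kept


posts = [  # invented examples
    {"user": 1, "text": "My son started school today", "is_repost": False},
    {"user": 2, "text": "Happy birthday!", "is_repost": False},
    {"user": 3, "text": "Happy birthday!", "is_repost": False},
    {"user": 4, "text": "Beat my score http://example.com/game", "is_repost": False},
    {"user": 5, "text": "Our daughter's first steps", "is_repost": True},
]
authentic = filter_authentic(posts)
```

Note that a text repeated by the same user is kept in this sketch; only content shared by more than one distinct user is treated as non-authentic.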
The exact estimate of son bias might depend on the selection of words that are counted as mentions of children. However, we found that the observed preference for sons holds irrespective of the choice of words. To show this, we selected the N most frequent synonyms and forms of the words “daughter” and “son” from our corpus. We then used these sets of words to compute the proportion of users who mentioned children at least once (Fig. S1a), as well as the total number of posts with mentions of children (Fig. S1b). The son bias holds for every N examined, and any changes for larger N are negligible. The list of the most frequent words along with the number of occurrences of each word is presented in Table S1.
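This robustness check is essentially a sweep over N: for each N, keep the N most frequent candidate words for each gender and recount. A minimal sketch follows, with invented English stand-in words and posts; the function names and the tie-breaking rule (alphabetical order among equally frequent words) are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def top_n(candidates, token_counts, n):
    """The n most frequent candidate words; ties broken alphabetically."""
    ranked = sorted(candidates, key=lambda w: (-token_counts[w], w))
    return set(ranked[:n])


def posts_by_n(posts, son_candidates, daughter_candidates, max_n):
    """For each N, count posts mentioning sons and posts mentioning daughters."""
    token_counts = Counter(tok for text in posts for tok in tokenize(text))
    results = {}
    for n in range(1, max_n + 1):
        son_words = top_n(son_candidates, token_counts, n)
        daughter_words = top_n(daughter_candidates, token_counts, n)
        son_posts = sum(any(t in son_words for t in tokenize(p)) for p in posts)
        daughter_posts = sum(any(t in daughter_words for t in tokenize(p)) for p in posts)
        results[n] = (son_posts, daughter_posts)
    return results


posts = [  # invented examples
    "my son won",
    "our son and sonny",
    "sonny sleeps",
    "my daughter sings",
    "little daughterling",
]
results = posts_by_n(posts, {"son", "sonny"}, {"daughter", "daughterling"}, max_n=2)
```

If the sign of the gap is the same for every N, the conclusion does not hinge on any particular word list.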
We identified the main topics of posts with mentions of children by analyzing a sample of posts from one age cohort (30 years old). We coded all the men’s posts (879 posts) from this cohort and randomly selected 20% (1521) of the women’s posts. At the first stage of coding, for each post we wrote down the category which most fully grasped the post’s content. At the second stage we collapsed similar categories into broader ones.
Only 9% of the posts are not related to the users’ own children (8% among women’s posts and 18% among men’s). These posts include mentions of other people’s kids as well as jokes, news, and stories about pets. After filtering out the irrelevant posts, the son bias for women was unchanged, and for men it remained significant: women wrote 15% more posts about sons than about daughters and men wrote 34% more posts about sons in this age group.
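The overall figure of 9% is consistent with the per-gender shares once the sampling design is taken into account. The short calculation below is our reading rather than something stated explicitly in the text: it assumes that the women's share is scaled up to the full cohort, since only 20% of the women's posts were coded, while all the men's posts were coded.

```python
men_posts_coded = 879        # all men's posts in the cohort
women_posts_coded = 1521     # a 20% random sample of women's posts
women_sampling_fraction = 0.20

# Estimated total number of women's posts in the cohort.
women_posts_total = women_posts_coded / women_sampling_fraction

irrelevant_share = {"women": 0.08, "men": 0.18}

# Share of irrelevant posts weighted by cohort totals.
weighted_share = (
    irrelevant_share["women"] * women_posts_total
    + irrelevant_share["men"] * men_posts_coded
) / (women_posts_total + men_posts_coded)

# Share obtained by simply pooling the coded posts, for comparison.
pooled_share = (
    irrelevant_share["women"] * women_posts_coded
    + irrelevant_share["men"] * men_posts_coded
) / (women_posts_coded + men_posts_coded)

print(f"weighted: {weighted_share:.1%}, pooled: {pooled_share:.1%}")
```

The weighted share comes out at about 9%, matching the reported figure, whereas pooling the coded posts without weighting would give closer to 12%.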
Among relevant posts, the most common topics were reports of spending time with children (27% of posts), expressions of positive feelings, mostly love, affection, or pride (26%), and celebrations of births and birthdays (19%; see Fig. S2 for examples). These three categories accounted for 72% of all posts about users’ own children. Note that the distribution of topics most likely depends on the age of the child and might be different for other cohorts.