Judging a Book by its Description : Analyzing Gender Stereotypes in the Man Bookers Prize Winning Fiction

07/25/2018 ∙ by Nishtha Madaan, et al. ∙ IBM

The presence of gender stereotypes in many aspects of society is a well-known phenomenon. In this paper, we focus on studying and quantifying such stereotypes and bias in Man Booker Prize winning fiction. We consider 275 books shortlisted for the Man Booker Prize between 1969 and 2017. The gender bias is analyzed by semantic modeling of book descriptions on Goodreads. This reveals the pervasiveness of gender bias and stereotypes in the books across features such as the occupations, introductions and actions associated with the characters in the book.




1 Introduction

Gender, racial and ethnic stereotypes are an undesirable yet pervasive phenomenon in many aspects of society. In this work, we analyze and quantify gender-based stereotypes in the descriptions (summaries) of fiction books shortlisted for the Man Booker Prize from 1969 to 2017. We show how gender bias and stereotyping are present in these books. The trends are improving but still far from optimal. We also look at the gender of the authors, which points to a gender imbalance problem, shown in Figure 1. However, in this work we focus on the gender of characters more than the gender of authors.

The motivation for considering books is threefold:

  1. The data is very diverse in nature. Hence finding how gender stereotypes exist in this data becomes an interesting study.

  2. The data-set is large. We analyze 275 books, which cover all the books shortlisted for the Man Booker Prize since 1969. It is therefore a good first step towards developing computational tools that analyze the existence of stereotypes over a period of time.

  3. These books are a reflection of society. Looking for gender bias in this data is a good first step, so that necessary steps can be taken to remove these biases.

While many regular book readers would have had similar hunches, to the best of our knowledge we are the first to use Text Analytics, NLP and Graph based algorithms to study this computationally.

We focus on the following tasks to study gender bias in Man Booker Prize winning fiction.

  1. Occupations and Gender Stereotypes - How are males portrayed in their jobs versus females? How do the levels of these jobs differ? How does this correlate with gender bias and stereotype? As shown in previous studies of Hollywood and Bollywood story plots and scripts, gender stereotyping with respect to occupations is one of the most pervasive biases, cutting across countries and age groups. This is evidenced by our previous work analyzing Bollywood movie story-lines Madaan et al. (2018a).

  2. Appearance - How are males and females described on the basis of their appearance? How do the descriptions differ between the two? How does that indicate gender stereotyping?

  3. Mentions - How many males and females are mentioned in the fiction?

  4. Descriptions - How do the descriptions of a male and a female differ in the books?

Detection of such bias is only the first step. We are also developing various algorithms to debias such text. We have developed a focused de-biaser with respect to gender stereotyping in occupations Madaan et al. (2018b). Occupation bias has also recently been noted in machine translation systems Caliskan-Islam et al. (2016).

In parallel, we are also working on a generalized reasoning-based algorithm, DeCogTeller, to debias complete text. Early results are positive and encouraging.

Figure 2: Adjectives used with males and females
Figure 3: Verbs used with males and females
Figure 4: Occupations of males and females

2 Past Work

Analysis of gender bias in machine learning in recent years has not only revealed the prevalence of such biases but also motivated much of the recent interest and work in de-biasing of ML models. Zhao et al. (2017) have pointed to the presence of gender bias in structured prediction from images. Fast et al. (2016) and Madaan et al. (2018a) notice these biases in online fiction and movies respectively, while Gooden and Gooden (2001) and Millar (2008) notice the same in children's books and music lyrics.

While there are recent works where gender bias has been studied in different walks of life Soklaridis et al. (2017), MacNell et al. (2015), Carnes et al. (2015), Terrell et al. (2017), the analysis mostly involves information retrieval tasks that build on a wide variety of prior work in this area. Fast et al. (2016) have worked on gender stereotypes in English fiction, particularly in an online fiction writing community. The work deals primarily with the analysis of how males and females behave and are described in this online fiction. Furthermore, this work also shows that males are over-represented and finds that traditional gender stereotypes are common throughout every genre in the online fiction data used for analysis.
Apart from this, there are various works where Hollywood movies have been analyzed for such gender bias Anderson and Daniels (2017). Similar analysis has been done on children's books Gooden and Gooden (2001) and music lyrics Millar (2008), which found that men are portrayed as strong and violent while women are associated with the home and are considered to be gentle and less active compared to men. These studies have been very useful in uncovering the trend, but they were carried out on very small data sets. In some works, gender drives the decision to hire in corporate organizations Dobbin and Jung (2012). Beyond hiring, it has been shown that human resource professionals' decisions on whether an employee should get a raise are also driven by gender stereotypes, with raise requests from females being put down. When it comes to the consideration of opinions, the views of females are weighted less than those of men Otterbacher (2015). On social media and dating sites, women are judged by their appearance while men are judged mostly by how they behave Rose et al. (2012); Otterbacher (2015); Fiore et al. (2008). When considering occupation, females are often assigned lower level roles than their male counterparts in image search results for occupations Kay et al. (2015).

2.1 Debiasing Algorithms

De-biasing the training algorithm as a way to remove the biases focuses on training paradigms that would result in fair predictions by an ML model. In the Bayesian network setting, Kushner et al. have proposed a latent-variable based approach to ensure counter-factual fairness in ML predictions. Another interesting technique (Beutel et al. (2013) and Zhang et al. (2016)) is to train a primary classifier while simultaneously trying to "deceive" an adversarial classifier that tries to predict gender from the predictions of the primary classifier.

De-biasing the model after training as a way to remove bias focuses on "fixing" the model after training is complete. Bolukbasi et al. (2016), in their famous work on gender bias in word embeddings, take this approach to "fix" the embeddings after training.

De-biasing the data at the source fixes the data set before it is consumed for training. This is the approach we take in this paper, by trying to de-bias the data or by suggesting the possibility of de-biasing the data to a human-in-the-loop. A related task is to modify or paraphrase text data to obfuscate gender, as in Reddy and Knight (2016). Another closely related work changes the style of the text to different levels of formality, as in Rao and Tetreault (2018).

Please note that most of these approaches are proposed for numerical data. Detecting and de-biasing text is an emerging area with very little work so far.

3 Data and Experimental Study

3.1 Data

Figure 5: Total Character Mentions showing mentions of male and female characters. Female mentions are presented in pink and Male mentions in blue

The data-set contains 275 books from the 1969-2017 time period. For each year we consider the shortlisted books. The data-set consists of textual descriptions, taken from Goodreads, of books shortlisted for the Man Booker Prize. This text data is analysed using text analytics algorithms, as explained in the next section.

3.2 Task and Approach

In this section, we discuss the tasks we perform on the books data extracted from Goodreads. We then define the approach we adopt for each task and study the inferences.

To make the books data ready for analysis, we used OpenIE Fader et al. (2011) to perform co-reference resolution on the books text. The co-reference task involves finding all expressions in a text which map to the same entity. For example, consider a small snippet: John went to market. He bought fruits. In these sentences, co-reference resolution will map He to John. The co-referenced textual description is used for the following analyses.

  1. Character Mentions in the Book Descriptions - We extracted mentions of male and female characters in the book descriptions. The motivation is to compare how many times males are referred to in a book with how many times females are referred to. This helps us identify whether the female has an important role in the book or not. In Figure 5 it is observed that a male is mentioned around 30 times in a book while a female is mentioned only around 15 times. Moreover, this ratio is consistent from 1969 to 2017 (almost 50 years).

  2. Character Appearance in Books Data - We analyzed how male characters and female characters have been addressed. This involves extracting adjectives associated with male characters and female characters. To extract adjectives linked to a particular character, we use the IBM Watson Natural Language Understanding API Machines (2017). In Figure 2 we present the adjectives associated with males and females. When we look at adjectives, males are often represented as rich and wealthy while females are represented as beautiful and attractive in the book descriptions.

  3. Character Descriptions in Books Data - We analyze how the male characters and the female characters have been introduced in the textual description. This involves extracting verbs associated with the male and female characters. To extract verbs linked to a particular character, we use the Stanford POS tagger De Marneffe et al. (2006). In Figure 3 we present the verbs associated with males and females. When we look at verbs, males are often represented as powerful while females are represented as fearful.

  4. Occupation as a stereotype - We perform a study on how occupations of males and females are represented. To perform this analysis, we collated an occupation list from multiple sources over the web comprising 350 occupations. We then extracted the "noun" tag attached to each character of the story using the Stanford Dependency Parser De Marneffe et al. (2006) and matched it against the occupation list. In this way, we extract occupations for each character. We group these occupations by male and female characters over all the collated books data. Figure 4 shows the occupation distribution of males and females. The figure shows that males are given higher level occupations than females. Our analysis shows that for occupations like "teacher" or "whore", females are high in number, whereas for "professor" and "doctor" the opposite holds. Detailed occupations are shown in Table 1.

    Top Occupations in Males                 Top Occupations in Females
    Doctor/Physician/Surgeon/Psychologist    Teacher/Lecturer
    Professor/Scientist                      Nurse
    Business/Director                        Whore/Hooker
    Church Agent/Clergymen                   Child Wife/Child Bride
    Poet                                     Maid
    Thief                                    Secretary
    Table 1: Occupations in Male and Female Characters in Books
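
The mention counting and occupation matching described above can be illustrated with a deliberately simplified sketch. It is not the pipeline used in this work: instead of co-reference resolution and dependency parsing, it assumes tiny hand-written word lists (`MALE_TERMS`, `FEMALE_TERMS`, `OCCUPATIONS`, all hypothetical) and attaches each occupation word to the nearest gendered token within a hypothetical `window` of tokens.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lexicons. The paper instead relies on
# co-reference resolution, a dependency parser and a list of 350 occupations.
MALE_TERMS = {"he", "him", "his", "man", "boy", "father", "son", "husband"}
FEMALE_TERMS = {"she", "her", "hers", "woman", "girl", "mother", "daughter", "wife"}
OCCUPATIONS = {"doctor", "professor", "teacher", "nurse", "maid", "poet"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def gender_of(token):
    """Return 'male', 'female' or None for a single token."""
    if token in MALE_TERMS:
        return "male"
    if token in FEMALE_TERMS:
        return "female"
    return None


def count_mentions(description):
    """Count gendered tokens in one book description."""
    counts = Counter()
    for token in tokenize(description):
        gender = gender_of(token)
        if gender:
            counts[gender] += 1
    return counts


def occupations_by_gender(description, window=4):
    """Attach each occupation word to the nearest gendered token.

    The search moves outward one token at a time (left neighbour first)
    and stops at the first gendered token within `window` positions.
    """
    tokens = tokenize(description)
    found = {"male": Counter(), "female": Counter()}
    for i, token in enumerate(tokens):
        if token not in OCCUPATIONS:
            continue
        for offset in range(1, window + 1):
            for j in (i - offset, i + offset):
                gender = gender_of(tokens[j]) if 0 <= j < len(tokens) else None
                if gender:
                    found[gender][token] += 1
                    break
            else:
                continue  # nothing at this distance, look further out
            break  # a gendered token was found, stop searching
    return found


sample = ("He is a doctor in a London hospital. "
          "She works as a nurse and cares for her mother.")
mentions = count_mentions(sample)
occupations = occupations_by_gender(sample)
```

Aggregating `mentions` and `occupations` over all descriptions and grouping by year would give counts comparable in spirit to Figure 5 and Table 1, although a proximity heuristic like this one is far noisier than parser-based extraction.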

4 Wind of Change

Our system discovered at least 6-7 books in the last four years where females play a central role in the textual description of the story. A few notable examples are Do Not Say We Have Nothing by Madeleine Thien, How to be Both by Ali Smith, We Are All Completely Beside Ourselves by Karen Joy Fowler, Eileen by Ottessa Moshfegh, We Need New Names by NoViolet Bulawayo, A Spool of Blue Thread by Anne Tyler, and The Lowland by Jhumpa Lahiri. We also note that such biases are decreasing over time: they are still far from neutral, but the trend is encouraging. Incidentally, all these books are written by female authors.

5 Bias Removal Tool- DeCogTeller

The system enables the user to enter some biased text and generates an unbiased version of that text snippet. For this task, we take a news articles data set and train word embeddings using Google word2vec Mikolov et al. (2013). This data acts as fact data, which is used later to check the gender specificity of a particular action against the facts. Apart from interchanging the actions, we have developed a specialized module to handle occupations. Very often, gender bias shows in assigned occupations such as {(Male, Doctor), (Female, Nurse)} or {(Male, Boss), (Female, Assistant)}.

We give a holistic view of our system, which is described in detail as follows.

  1. Data Pre-processing - We perform data pre-processing of the words in the fact data. (a) We search WordNet Miller (1995) to check whether each word in the fact data is present, and remove the word if it is not found. (b) We perform word stemming using the Stanford stemmer.

  2. Generating word vectors - We train Google word2vec on the pre-processed data and generate word embeddings.

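  3. Extraction of analogical pairs - The next task is to find analogical pairs from the fact data which are analogous to the (man, woman) pair. E.g., if we take an analogical word pair $(x, y)$ and associate a vector $P_{(x,y)}$ with the pair, then, representing man as $m$ and woman as $w$, the task is to find

    $$P_{(x,y)} = (\vec{m} - \vec{w}) - (\vec{x} - \vec{y}) \qquad (1)$$

    In Equation 1, if we replace the man and woman vectors by he ($\vec{h}$) and she ($\vec{s}$) respectively, the above equation becomes

    $$P_{(x,y)} = (\vec{h} - \vec{s}) - (\vec{x} - \vec{y}) \qquad (2)$$

    The intent is to capture word pairs such as (doctor, nurse) where, in most of the data, doctor is closer to he and nurse is closer to she. Therefore for $(x, y) = (\text{doctor}, \text{nurse})$, we get

    $$P_{(\text{doctor},\text{nurse})} = (\vec{h} - \vec{s}) - (\overrightarrow{\text{doctor}} - \overrightarrow{\text{nurse}})$$

    Another example of $(x, y)$ found in our data is (king, queen). We generate all such pairs and store them in our knowledge base. To refine the pairs, we used a scoring mechanism to filter important pairs. If

    $$\lVert P_{(x,y)} \rVert \le \tau$$

    where $\tau$ is the threshold parameter, then we add the word pair to the knowledge base; otherwise we ignore it. Equivalently, after normalizing $\vec{h} - \vec{s}$ and $\vec{x} - \vec{y}$, we calculated the cosine distance as $1 - \cos(\vec{h} - \vec{s}, \vec{x} - \vec{y})$; thresholding it is algebraically equivalent to the above inequality, since $\lVert \vec{a} - \vec{b} \rVert^2 = 2 - 2\cos(\vec{a}, \vec{b})$ for unit vectors $\vec{a}$ and $\vec{b}$.

    In our system, we extract plausible analogical word pairs by selecting candidates (the $x$ and $y$ described in Equation 2) for each character appearing in the sentence, jointly using IBM's and UIUC's semantic role labelers Machines (2017), Punyakanok et al. (2008), and picking the objects associated with that character via some labeled role.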

  4. Classifying word pairs -

    Introducing word pair interchangeability - A pair of words is interchangeable for gender if their roles, actions or relationships can be exchanged without breaking gender-related practical plausibility. For instance, in the pair (doctor, nurse), being a doctor and being a nurse are gender-neutral roles, so the word pair can be interchanged. In contrast, in (king, queen), such an interchange is not plausible (male queens and female kings are not plausible).

    Performing interchange - In order to perform word pair interchange, we determine which pairs extracted in the above step are gender-neutral and which are gender-specific. To do this, we first take the words from the knowledge base extracted from test data and find how close they are to the different genders. We find the cosine distance of each word in the word pair from $\vec{h}$ and $\vec{s}$ respectively, and if any word is within a threshold of either $\vec{h}$ or $\vec{s}$, we label that word gender-specific. If both words are far from both, we label the pair gender-neutral.

  5. Action and Relationship Extraction from Test Data - After we have gender-specific and gender-neutral words from the fact data, we extract actions and relationships associated with the book characters from the test data. We extract the gender of each character in the books using baby names census lists, and using this information we perform co-referencing on the textual description using Stanford OpenIE Fader et al. (2011). Next, we collate the actions and relationships corresponding to each character.

  6. Bias detection using Actions - At this point we have the actions extracted from the biased data for each gender. We can now compare this data against the fact data to check for bias, which is shown in our demo.

  7. Bias Removal - To ensure practically meaningful exchanges (e.g., exchanging a prominent male character with a prominent female character), we construct a knowledge graph for each character using relations from the Stanford dependency parser. We use this graph to calculate the betweenness centrality of each character, and interchange only pairs whose centrality scores are within an empirically set threshold.
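
As an illustration of steps 3 and 4, the sketch below scores candidate word pairs against the (he, she) direction and then labels them gender-specific or gender-neutral. It rests on several assumptions that do not come from the system itself: the 3-dimensional vectors in `EMBEDDINGS` are invented stand-ins for trained word2vec vectors, the thresholds `tau` and `threshold` are arbitrary, difference vectors are normalized before the norm is taken, and closeness to he/she is measured by cosine similarity.

```python
import math

# Hypothetical 3-d embeddings standing in for trained word2vec vectors.
EMBEDDINGS = {
    "he": (1.0, 0.0, 0.0),
    "she": (-1.0, 0.0, 0.0),
    "doctor": (0.3, 1.0, 0.0),
    "nurse": (-0.3, 1.0, 0.0),
    "king": (0.9, 0.0, 1.0),
    "queen": (-0.9, 0.0, 1.0),
    "table": (0.0, 0.5, 0.2),
    "chair": (0.0, 0.4, 0.3),
}


def subtract(a, b):
    """Component-wise difference a - b."""
    return tuple(p - q for p, q in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(sum(p * p for p in a))


def unit(a):
    """Scale a vector to unit length."""
    length = norm(a)
    return tuple(p / length for p in a)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(p * q for p, q in zip(a, b)) / (norm(a) * norm(b))


def pair_score(x, y, emb=EMBEDDINGS):
    """Norm of unit(he - she) - unit(x - y); small means (x, y) mirrors (he, she)."""
    gender_axis = unit(subtract(emb["he"], emb["she"]))
    pair_axis = unit(subtract(emb[x], emb[y]))
    return norm(subtract(gender_axis, pair_axis))


def is_analogical(x, y, tau=0.5):
    """Keep the pair for the knowledge base when its score is within tau."""
    return pair_score(x, y) <= tau


def classify_pair(x, y, threshold=0.5, emb=EMBEDDINGS):
    """Label a pair gender-specific if either word lies close to he or she."""
    for word in (x, y):
        closeness = max(cosine(emb[word], emb["he"]), cosine(emb[word], emb["she"]))
        if closeness >= threshold:
            return "gender-specific"
    return "gender-neutral"


kept = [pair for pair in [("doctor", "nurse"), ("king", "queen"), ("table", "chair")]
        if is_analogical(*pair)]
labels = {pair: classify_pair(*pair) for pair in kept}
```

With these toy vectors, (doctor, nurse) and (king, queen) pass the filter while (table, chair) does not, and only (doctor, nurse) is labeled gender-neutral and therefore interchangeable. Real embeddings are much noisier, so both thresholds would need to be tuned empirically.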

Figure 6: Text is de-biased and knowledge-graph is visualized.
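
The centrality-based filter in the bias removal step can be sketched as follows. This is an assumption-laden illustration rather than the actual implementation: the character graph, the gender labels and the threshold are invented, the graph is treated as unweighted and undirected, and betweenness centrality is computed with Brandes' algorithm without normalization.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each node to the set of its neighbours.
    Returns unnormalized betweenness scores.
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        stack = []
        predecessors = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}
        sigma[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Back-propagate dependencies, farthest nodes first.
        delta = {node: 0.0 for node in graph}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in centrality.items()}


def swappable_pairs(centrality, gender, threshold):
    """Male/female pairs whose centrality scores differ by at most `threshold`."""
    males = sorted(n for n in centrality if gender[n] == "male")
    females = sorted(n for n in centrality if gender[n] == "female")
    return [(m, f) for m in males for f in females
            if abs(centrality[m] - centrality[f]) <= threshold]


# Hypothetical character graph: an edge means two characters are related in the text.
graph = {
    "Anna": {"Ben", "Clara", "David"},
    "Ben": {"Anna"},
    "Clara": {"Anna"},
    "David": {"Anna", "Eve"},
    "Eve": {"David"},
}
gender = {"Anna": "female", "Ben": "male", "Clara": "female",
          "David": "male", "Eve": "female"}

scores = betweenness_centrality(graph)
pairs = swappable_pairs(scores, gender, threshold=2.0)
```

In this toy graph the central character Anna is only paired with David, the next most central character, while peripheral characters are paired with each other, which mirrors the intent of exchanging characters of similar prominence.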

6 Conclusion

This paper presents an analysis which aims to extract existing gender stereotypes and biases from Man Booker Prize winning fiction data containing 275 books. The analysis is performed at the sentence and multi-sentence level, studying the bias in the data. We observed that, when analyzing occupations for males and females, higher level roles are assigned to males while lower level roles are assigned to females. We use this rich information extracted from Goodreads to study the dynamics of the data and to further define new ways of removing such biases present in the data. As part of future work, we aim to extract summaries from this data which are bias-free. In this way, the next generations would stop inheriting bias from previous generations.


References

  • Anderson and Daniels (2017) Hanah Anderson and Matt Daniels. 2017. https://pudding.cool/2017/03/film-dialogue/.
  • Beutel et al. (2013) Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, and Christos Faloutsos. 2013. Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In Proceedings of the 22nd international conference on World Wide Web, pages 119–130. ACM.
  • Bolukbasi et al. (2016) Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. In Advances in Neural Information Processing Systems, pages 4349–4357.
  • Caliskan-Islam et al. (2016) Aylin Caliskan-Islam, Joanna J Bryson, and Arvind Narayanan. 2016. Semantics derived automatically from language corpora necessarily contain human biases. arXiv preprint arXiv:1608.07187.
  • Carnes et al. (2015) Molly Carnes, Patricia G Devine, Linda Baier Manwell, Angela Byars-Winston, Eve Fine, Cecilia E Ford, Patrick Forscher, Carol Isaac, Anna Kaatz, Wairimu Magua, et al. 2015. Effect of an intervention to break the gender bias habit for faculty at one institution: a cluster randomized, controlled trial. Academic medicine: journal of the Association of American Medical Colleges, 90(2):221.
  • De Marneffe et al. (2006) Marie-Catherine De Marneffe, Bill MacCartney, Christopher D Manning, et al. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC, volume 6, pages 449–454. Genoa Italy.
  • Dobbin and Jung (2012) Frank Dobbin and Jiwook Jung. 2012. Corporate board gender diversity and stock performance: The competence gap or institutional investor bias?
  • Fader et al. (2011) Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1535–1545. Association for Computational Linguistics.
  • Fast et al. (2016) Ethan Fast, Tina Vachovsky, and Michael S Bernstein. 2016. Shirtless and dangerous: Quantifying linguistic signals of gender bias in an online fiction writing community. In ICWSM, pages 112–120.
  • Fiore et al. (2008) Andrew T Fiore, Lindsay Shaw Taylor, Gerald A Mendelsohn, and Marti Hearst. 2008. Assessing attractiveness in online dating profiles. In Proceedings of the SIGCHI conference on human factors in computing systems, pages 797–806. ACM.
  • Gooden and Gooden (2001) Angela M Gooden and Mark A Gooden. 2001. Gender representation in notable children’s picture books: 1995–1999. Sex roles, 45(1-2):89–101.
  • Kay et al. (2015) Matthew Kay, Cynthia Matuszek, and Sean A Munson. 2015. Unequal representation and gender stereotypes in image search results for occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 3819–3828. ACM.
  • Machines (2017) International Business Machines. 2017. https://www.ibm.com/watson/developercloud/developer-tools.html.
  • MacNell et al. (2015) Lillian MacNell, Adam Driscoll, and Andrea N Hunt. 2015. What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4):291.
  • Madaan et al. (2018a) Nishtha Madaan, Sameep Mehta, Taneea Agrawaal, Vrinda Malhotra, Aditi Aggarwal, Yatin Gupta, and Mayank Saxena. 2018a. Analyze, detect and remove gender stereotyping from bollywood movies. In Conference on Fairness, Accountability and Transparency, pages 92–105.
  • Madaan et al. (2018b) Nishtha Madaan, Gautam Singh, Sameep Mehta, Aditya Chetan, and Brihi Joshi. 2018b. Generating clues for gender based occupation de-biasing in text. arXiv preprint arXiv:1804.03839.
  • Mikolov et al. (2013) Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  • Millar (2008) Brett Millar. 2008. Selective hearing: gender bias in the music preferences of young adults. Psychology of music, 36(4):429–445.
  • Miller (1995) George A Miller. 1995. Wordnet: a lexical database for english. Communications of the ACM, 38(11):39–41.
  • Otterbacher (2015) Jahna Otterbacher. 2015. Linguistic bias in collaboratively produced biographies: crowdsourcing social stereotypes? In ICWSM, pages 298–307.
  • Punyakanok et al. (2008) V. Punyakanok, D. Roth, and W. Yih. 2008. The importance of syntactic parsing and inference in semantic role labeling. Computational Linguistics, 34(2).
  • Rao and Tetreault (2018) Sudha Rao and Joel Tetreault. 2018. Dear sir or madam, may i introduce the yafc corpus: Corpus, benchmarks and metrics for formality style transfer. arXiv preprint arXiv:1803.06535.
  • Reddy and Knight (2016) Sravana Reddy and Kevin Knight. 2016. Obfuscating gender in social media writing. In Proceedings of the First Workshop on NLP and Computational Social Science, pages 17–26.
  • Rose et al. (2012) Jessica Rose, Susan Mackey-Kallis, Len Shyles, Kelly Barry, Danielle Biagini, Colleen Hart, and Lauren Jack. 2012. Face it: The impact of gender on social media images. Communication Quarterly, 60(5):588–607.
  • Soklaridis et al. (2017) Sophie Soklaridis, Ayelet Kuper, Cynthia Whitehead, Genevieve Ferguson, Valerie Taylor, and Catherine Zahn. 2017. Gender bias in hospital leadership: a qualitative study on the experiences of women ceos. Journal of Health Organization and Management, 31(2).
  • Terrell et al. (2017) Josh Terrell, Andrew Kofink, Justin Middleton, Clarissa Rainear, Emerson Murphy-Hill, Chris Parnin, and Jon Stallings. 2017. Gender differences and bias in open source: Pull request acceptance of women versus men. PeerJ Computer Science, 3:e111.
  • Zhang et al. (2016) Fei Zhang, Patrick PK Chan, Battista Biggio, Daniel S Yeung, and Fabio Roli. 2016. Adversarial feature selection against evasion attacks. IEEE transactions on cybernetics, 46(3):766–777.
  • Zhao et al. (2017) Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang. 2017. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457.