Evaluating gender portrayal in Bangladeshi TV

11/14/2017
by Md. Naimul Hoque, et al.
Eastern University
MIT

Computer vision and machine learning methods have previously been used to reveal the screen presence of genders in TV and movies. In this work, using head pose, gender detection, and skin color estimation techniques, we demonstrate that the gender disparity in TV in a South Asian country such as Bangladesh exhibits unique characteristics and sometimes runs counter to popular perception. We demonstrate a noticeable discrepancy in female screen presence in Bangladeshi TV advertisements and political talk shows. Further, contrary to popular hypotheses, we demonstrate that lighter-toned skin colors are less prevalent than darker complexions, and that quantifiable body language markers do not provide conclusive insights about gender dynamics. Overall, these gender portrayal parameters reveal the different layers of onscreen gender politics and can help direct incentives to address existing disparities in a nuanced and targeted manner.

1 Introduction

Defining a global framework for change, representatives of 189 nations hammered out comprehensive commitments under 12 critical areas of concern in the Beijing Declaration and Platform for Action (BPfA), and media is one of those 12 areas. BPfA emphasizes strategies for changing the stereotyping of women and the inequality in women’s access to and participation in all forms of communication (UN (1996)), but how far have we been able to address gender disparity in South Asian media? This paper assesses gender parity in Bangladeshi TV using a set of quantitative markers. We use machine learning based techniques to revisit media as a global critical area of concern.

Previous qualitative research on different TV media (e.g. (GSDRC, 2014), (Tomlinson, 2017), (Ahmed, 2011)) suggests that screen time, skin color, and body language provide substantial markers of gender discrimination. Body language, its variability, and its expressiveness are sometimes referred to as markers of stereotypical gender portrayal in media. Women are assumed to be more submissive or more expressive depending on the genre of media. For example, Bangladeshi advertisements are hypothesized to feature highly expressive and light-skinned women as a way to attract more consumers (Krishen et al., 2014), (Xie & Zhang, 2013), while many TV dramas are hypothesized to portray women as submissive (both in roles and in body language). We investigate head pose and eye gaze direction and variability as a preliminary way to explore the body language used in Bangladeshi media. Classifiers trained to detect face, gender, and skin color can be used to analyze some of the other markers. Using our results, we demonstrate the potential of machine learning to provide a commentary on the current state of gender portrayal in Bangladeshi TV.

2 Dataset and feature extraction

For our initial analysis of the subject matter, we have collected a dataset of 202 videos. Among these, 82 are TV dramas, 70 are political talk shows, and 50 are TV advertisements. The videos are collected from different time periods (from 1986 to 2017) and across different genres of drama and advertisements. A larger dataset would yield better results, but comprehensive video archives are hard to find in the context of Bangladesh. For our current dataset, we have extracted features from the TV dramas and talk shows using a frame sampling rate of 1 fps. For advertisements, we have used 4 fps, as they are much shorter in duration.
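The sampling rates above can be illustrated with a small sketch. It is not the authors' extraction code: the function name `sample_frame_indices` and the idea of selecting frame indices from a video's native frame rate are assumptions made here for illustration.

```python
def sample_frame_indices(total_frames, native_fps, target_fps):
    """Return the indices of frames to keep so that roughly `target_fps`
    frames are kept per second of video (hypothetical helper)."""
    if native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample more often than the video provides frames.
    step = max(native_fps / target_fps, 1.0)
    indices = []
    position = 0.0
    while int(position) < total_frames:
        indices.append(int(position))
        position += step
    return indices


# A 4-second clip at 25 fps sampled at 1 fps (dramas, talk shows).
drama_frames = sample_frame_indices(100, 25, 1)
# A 2-second clip at 25 fps sampled at 4 fps (advertisements).
ad_frames = sample_frame_indices(50, 25, 4)
```

Sampling short advertisements more densely keeps the number of analyzed frames per video from becoming too small.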

2.1 Face detection

Face detection is the most important part of our process, as most of our analysis is based on attributes extracted from the face. For face detection, we have used dlib (dlib.net), an open source Python/C++ library. Dlib's facial landmark detector is built on the work of (Kazemi & Sullivan, 2014) and returns a total of 68 points, which in turn can be used to locate the mouth, eyes, jaw, and nose.
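As a rough illustration of how the 68 points map to facial regions, the sketch below splits a list of landmarks into named groups. The index ranges follow the 0-based convention commonly used with the 68-point layout; they are an assumption here rather than something stated in the paper, and `split_landmarks` is a hypothetical helper, not a dlib function.

```python
# Commonly used 0-based index ranges for the 68-point facial landmark layout
# (assumed convention; verify against the landmark model actually used).
LANDMARK_REGIONS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}


def split_landmarks(points):
    """Group a list of 68 (x, y) landmark points by facial region."""
    if len(points) != 68:
        raise ValueError("expected exactly 68 landmark points")
    return {name: [points[i] for i in indices]
            for name, indices in LANDMARK_REGIONS.items()}


# Dummy landmarks standing in for a detector's output.
dummy_points = [(i, i) for i in range(68)]
regions = split_landmarks(dummy_points)
```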

2.2 Face color estimation

For face color estimation, we first extract the jaw region of the detected face from the frame. We do this to minimize noise, i.e. dominant colors around the face. We then apply k-means clustering to the pixels of the jaw region. The pixels covering the jaw will fall into the largest cluster, so we take the center of the largest cluster as our estimated face color.
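The "center of the largest cluster" idea can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: pixels are assumed to be (R, G, B) tuples, the number of clusters `k` is left as a parameter because this step's value is not assumed here, and the deterministic initialization is a simplification chosen to make the example reproducible.

```python
from collections import Counter


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)),
                key=lambda c: squared_distance(p, centers[c]))
            for p in points]


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points)
    # Deterministic start: k evenly spaced points from the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def dominant_color(pixels, k=2):
    """Center of the cluster that contains the most pixels."""
    centers, labels = kmeans(pixels, k)
    largest = Counter(labels).most_common(1)[0][0]
    return centers[largest]


# Synthetic jaw crop: 70 skin-like pixels and 30 dark background pixels.
pixels = [(200, 150, 120), (210, 160, 130)] * 35 + [(20, 20, 20)] * 30
face_color = dominant_color(pixels, k=2)
```

In practice a library implementation of k-means with several random restarts would be more robust than this fixed initialization.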

2.3 Head pose and eye gaze estimation

We next extract the head pose (up, down, left, right) of a person with respect to the camera. We used the Perspective-n-Point algorithm (OpenCV's PnP function). The 2D points are taken from dlib's 68 facial landmarks, and the corresponding 3D model is taken from an OpenGL anthropometric model (http://aifi.isr.uc.pt/Downloads/OpenGL/glAnthropometric3DModel.cpp). Eye gaze vectors are detected using OpenFace (Baltrušaitis et al., 2016), based on the work of (Wood et al., 2015).

2.4 Gender detection

For gender detection from images, we have used a trained convolutional neural network model (Levi & Hassner, 2015). The authors report an accuracy of 86% on the Adience benchmark (Eidinger et al., 2014). We created our own test set of 1000 faces extracted from our video sources, on which the model reaches an accuracy of 89%.

3 Analysis and discussion

Gender studies experts have previously stated that cross-border consumerism differs according to geographic, cultural, religious, and political context. The Bangladeshi entertainment sector largely portrays women in traditional identities and roles in the private domain, mostly in family life (Ahmed, 2011). This contrasts sharply with the portrayal of female characters in Western media, where the presence of women is usually lower than in the South Asian media industry (Tomlinson, 2017).

This notion about Bangladeshi women’s media presence, however, changes with the type of media, as our analysis shows. Figure 1 shows the aggregated screen time for males and females in ads, political talk shows, and drama. Women are given disproportionately more screen time in advertisements and much less in talk shows. Screen time in ads makes a case for the consumerism-driven portrayal of women in Bangladeshi TV. The talk show results show an absence of women in political commentary, which is traditionally a male-dominated field in South Asia. Males and females share similar screen time in the TV drama category. As a marker of gender balance, screen time thus reflects a larger demand for women in entertainment formats such as advertisements and drama than in active, positive roles such as participation in political talk shows.
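One plausible way to aggregate screen time from per-frame gender predictions is sketched below. The paper does not spell out its aggregation rule, so this assumes that every detected face in a sampled frame contributes 1 / (sampling rate) seconds to its gender's total; the function names are hypothetical.

```python
from collections import Counter


def screen_time_seconds(frame_genders, sample_fps):
    """Aggregate screen time per gender.

    `frame_genders` holds one list per sampled frame, containing the
    predicted gender label of every face detected in that frame.
    """
    if sample_fps <= 0:
        raise ValueError("sample_fps must be positive")
    counts = Counter()
    for genders in frame_genders:
        counts.update(genders)
    return {gender: n / sample_fps for gender, n in counts.items()}


def screen_time_share(times):
    """Convert absolute screen times into proportions that sum to 1."""
    total = sum(times.values())
    if total == 0:
        return {}
    return {gender: t / total for gender, t in times.items()}


# One second of an advertisement sampled at 4 fps; the last frame has no face.
frames = [["female"], ["female"], ["female", "male"], []]
times = screen_time_seconds(frames, sample_fps=4)
shares = screen_time_share(times)
```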

Figure 1: Male and Female aggregate screen time.

Figure 4(a) shows that both men and women in Bangladeshi media tend to pose their faces upwards in ads and serials. However, women performers consistently look downwards more (and upwards less) than men across all of our chosen categories of media. "Up" is defined as any direction above the horizontal axis, and "Down" as any direction below it. The aggregated counts of both directions from all frames are used to calculate the percentages here. We additionally investigated the nature of eye gaze in the three categories, as shown in Figure 4(b). This provides another lens for the analysis: even though performers’ faces tend to point upwards more than downwards, their eyes tend to look downwards more often. Females also consistently look downwards more than males.
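The up/down percentages can be computed from per-frame vertical direction components along these lines. This is a sketch under assumptions: positive values are taken to mean "above the horizontal axis", and frames that are exactly horizontal are left out of the percentage base; neither convention is specified in the text.

```python
def direction_percentages(y_values):
    """Percentage of frames looking up vs. down.

    Assumes y > 0 means above the horizontal axis and y < 0 below it;
    exactly horizontal frames (y == 0) are excluded from the totals.
    """
    up = sum(1 for y in y_values if y > 0)
    down = sum(1 for y in y_values if y < 0)
    total = up + down
    if total == 0:
        return {"up": 0.0, "down": 0.0}
    return {"up": 100.0 * up / total, "down": 100.0 * down / total}


# Vertical head pose components from five sampled frames (made-up values).
print(direction_percentages([0.2, 0.4, 0.1, -0.3, 0.0]))
```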

(a) Head pose proportions among males and females.
(b) Eye gaze directions for males and females.
Figure 4: Head pose and eye gaze proportions.

The aggregate pose direction counts are not a definitive marker of gender representation, since the differences in each direction are not significant, although they do show a general trend in head pose and eye gaze among all performers and speakers. To explore the nature of such body language and expressiveness among actors and speakers, we investigate the variability in these poses in figure 7. The variations in head pose reveal a slightly greater level of activity and expression among females, with higher variability in the ads and drama categories compared to males (figure 7(a)). Eye gaze directions vary considerably among both males and females in the advertisement videos we collected.
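The variability shown in a box plot can be summarized numerically, as in the sketch below. It assumes that the "normalized Y vector" is the vertical component of a direction vector scaled to unit length, and it uses the standard library's inclusive quartile method; both are illustrative choices rather than details taken from the paper.

```python
import math
import statistics


def normalized_y(vector):
    """Vertical component of a direction vector scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0:
        raise ValueError("zero-length direction vector")
    return vector[1] / norm


def box_summary(values):
    """Five-number summary plus interquartile range, as drawn in a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values), "iqr": q3 - q1}


# Made-up (x, y, z) head pose vectors from a few frames.
vectors = [(3, 4, 0), (1, 0, 0), (0, -2, 0), (1, 1, 1)]
summary = box_summary([normalized_y(v) for v in vectors])
```

A larger interquartile range for one group indicates more varied vertical movement, which is the sense of "variability" used in the comparison above.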

Figure 7: Box plots of the distribution of the normalized Y vector (up, down) of (a) head pose and (b) eye gaze.
(a) Silhouette vs. k plot for face color clusters.

Gender   Cluster   Drama (P, B)   Ads (P, B)     Talk show (P, B)
Female   1         60%, 34.9%     57%, 36.5%     46%, 38.3%
Female   2         40%, 66.7%     43%, 69.2%     54%, 65.4%
Male     1         58%, 33.3%     50%, 36.3%     39%, 38.1%
Male     2         42%, 63.2%     50%, 69.0%     61%, 61.6%

C = Cluster centroid color (not reproduced here), P = Percentage of data points in the cluster, B = Brightness value in HSB

(b) Cluster centroids of each face color cluster (k = 2).

Figure 10: Face color cluster analysis. The cluster centroid colors are followed by the proportion of faces in each cluster, and the brightness values of the centroid colors in HSB space.

Clustering the face colors (to find the aggregate lighter and darker skin tones present) gives us a way to test the qualitative hypothesis that light-skinned women and/or light-toned makeup are prevalent in Bangladeshi media. Figure 10(a) shows the silhouette vs. k (number of clusters) plot when using the k-means algorithm. For each category, k = 2 gives the best results, which splits the face colors into a darker and a lighter tone partition. In Figure 10(b), we show the centroid of each cluster as a representative color of that cluster. Contrary to some qualitative remarks (e.g. (Krishen et al., 2014)), our results show that darker tones are more prevalent in TV, with the exception of talk shows.
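Choosing the number of clusters by silhouette score can be sketched as follows. This is a plain-Python illustration of the standard silhouette definition, not the authors' code: it takes cluster labels produced by any k-means run as input, uses Euclidean distance, and the helper `best_k` is hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for name, other in clusters.items() if name != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, labelings):
    """Pick the k whose labeling has the highest silhouette score."""
    return max(labelings, key=lambda k: silhouette_score(points, labelings[k]))


# Two well separated groups of brightness values (made-up data).
points = [(0,), (1,), (10,), (11,)]
labelings = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
chosen = best_k(points, labelings)
```

On data with two clear tone groups, as in the example, splitting further lowers the mean silhouette, which is the behavior the plot in Figure 10(a) is used to detect.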

4 Conclusion

In this paper, we have demonstrated the potential of machine learning to test some qualitative markers popularly used by gender researchers to identify gender discrimination in TV media. We extracted quantifiable markers from videos in three different categories of Bangladeshi TV media, leveraging face, gender, and pose detection classifiers. We then analyzed these features to explore the nature of disparity in gender representation. We found that screen time for different genders varies across categories, and that darker-toned skin is more prevalent, contrary to existing remarks. We also found that head pose and eye gaze (two metrics used to understand body language) are not definitive markers for understanding gender disparity, but they reveal some consistent patterns for Bangladeshi women performers that deserve a comprehensive exploration. With a bigger dataset, we should be able to understand the detailed and longitudinal nature of gender discrimination in media using different classifiers. This also calls for better strategies for preserving TV shows in proper archives, as it is difficult to find curated and comprehensive compilations of Bangladeshi TV shows.

References

  • Ahmed (2011) Ahmed, H. S. (2011), ‘Bangladesh: media marketing of beauty and female stereotypes [online]’.
    http://www.violenceisnotourculture.org/content/bangladesh-media-martketing-beauty-female-stereotypes
  • Baltrušaitis et al. (2016) Baltrušaitis, T., Robinson, P. & Morency, L.-P. (2016), Openface: an open source facial behavior analysis toolkit, in ‘Applications of Computer Vision (WACV), 2016 IEEE Winter Conference on’, IEEE, pp. 1–10.
  • Eidinger et al. (2014) Eidinger, E., Enbar, R. & Hassner, T. (2014), ‘Age and gender estimation of unfiltered faces’, IEEE Transactions on Information Forensics and Security 9(12), 2170–2179.
  • Google (2016) Google (2016), ‘Using technology to address gender bias in film’.
    https://www.google.com/intl/en/about/main/gender-equality-films/
  • GSDRC (2014) GSDRC (2014), ‘Gender and media [online]’.
    http://www.gsdrc.org/go/topic-guides/gender/gender-and-media
  • Kazemi & Sullivan (2014) Kazemi, V. & Sullivan, J. (2014), One millisecond face alignment with an ensemble of regression trees, in ‘Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition’, pp. 1867–1874.
  • Krishen et al. (2014) Krishen, A. S., LaTour, M. S. & Alishah, E. J. (2014), ‘Asian females in an advertising context: exploring skin tone tension’, Journal of Current Issues & Research in Advertising 35(1), 71–85.
  • Levi & Hassner (2015) Levi, G. & Hassner, T. (2015), Age and gender classification using convolutional neural networks, in ‘IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) workshops’.
    http://www.openu.ac.il/home/hassner/projects/cnn_agegender
  • Tomlinson (2017) Tomlinson, K. (2017), ‘Pointing the needle to a future in which gender is not destiny [online]’.
    http://www.bbc.co.uk/blogs/mediaactioninsight/entries/2fd5dda8-c0ca-4f96-95f0-798f88ad2ac9
  • United Nations (1996) United Nations (1996), ‘Report of the fourth world conference on women’, United Nations publication, New York, Sales No. 96.IV.13 pp. 99–103.
    https://www.un.org/esa/gopher-data/conf/fwcw/off/a–20.en
  • Wood et al. (2015) Wood, E., Baltrusaitis, T., Zhang, X., Sugano, Y., Robinson, P. & Bulling, A. (2015), Rendering of eyes for eye-shape registration and gaze estimation, in ‘Proceedings of the IEEE International Conference on Computer Vision’, pp. 3756–3764.
  • Xie & Zhang (2013) Xie, Q. & Zhang, M. (2013), ‘White or tan? a cross-cultural analysis of skin beauty advertisements between china and the united states’, Asian Journal of Communication 23(5), 538–554.