Researchers at Carnegie Mellon University (CMU) have conducted a six-month study analysing lyrics and subtitles in 1400 Bollywood movies spanning seven decades. The results suggest that while the representation of women and minorities has improved in quantitative terms, there remains a strong bias towards fair-skinned, upper-caste and male figures in films.
The study was authored by Kunal Khadilkar and his mentor, AI researcher Ashique R KhudaBukhsh, both of CMU's Language Technologies Institute (LTI), together with Tom Mitchell, Founders University Professor at CMU's School of Computer Science. Their analysis drew on 100 of the highest-grossing Bollywood films from each of the last seven decades.
Using predictive technology and a new language analysis model called BERT, the researchers were able to identify common stereotypes in Bollywood films.
"If you feed thousands of sentences to BERT and then ask the system to perform fill-in-the-blank tests, it outputs a list of possible completions ranked by probability...After we fed Bollywood movie subtitles to BERT and performed [a] Cloze test: "A beautiful woman should have ___ skin", the top prediction was "fair" across all eras," says KhudaBukhsh in a statement to Hindustan Times.
The model identified 'poverty', 'love', 'war', 'hunger' and 'unemployment' as common themes in older Bollywood movies, while more recent releases tended to throw up 'poverty', 'Pakistan', 'Kashmir', 'terrorism' and 'corruption'. Another finding was that about 70% of babies born in films from 1950 to 1999 were boys, whereas in more recent films 46% of newborns were girls.
Researchers also found that the AI identified a change in the words associated with dowry after the Dowry Prohibition Act of 1961, with fewer direct mentions of dowry and more mentions of 'trouble', 'divorce' and 'consent'.
Non-Hindu representation in films was shown to be increasing, with the share of Muslim characters rising from 6.16% to 7.81%, Sikhs from 7.26% to 8.06% and Christians from 0.22% to 0.49%.
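To put those shares in perspective, the short sketch below works out both the percentage-point change and the relative change for each group. It uses only the figures quoted above; the names `SHARES` and `change` are illustrative and do not come from the study.

```python
# Character shares (%) quoted in the article: (earlier, recent).
SHARES = {
    "Muslims": (6.16, 7.81),
    "Sikhs": (7.26, 8.06),
    "Christians": (0.22, 0.49),
}


def change(old, new):
    """Return (percentage-point change, relative change in %)."""
    points = new - old
    relative = points / old * 100
    return round(points, 2), round(relative, 1)


changes = {group: change(old, new) for group, (old, new) in SHARES.items()}
for group, (points, relative) in changes.items():
    print(f"{group}: +{points} points, +{relative}% relative")
```

The two measures tell different stories: the share of Christian characters more than doubles in relative terms yet stays below half a percent, while the Muslim share gains the most percentage points.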
The researchers acknowledge that the study has limitations: it is purely quantitative and considers only dialogue, not visuals. KhudaBukhsh nonetheless remains hopeful about the results.
"Our methods allow us to quantify and compare biases across timespans, genres, and movie industries, to analyse biases commonly known to already exist in Bollywood films," he said.