Zeyneb Kaya

Saratoga High School

Women in the Workplace: Analyzing Gender Biases in Corporate Email Communications

INTRODUCTION

Gender disparities in the workplace hinder women's career advancement and equity. Communication within companies reflects gender norms that shape corporate structures and perpetuate demeaning discrimination. Gender bias takes many forms, including unequal treatment, associations of gender with particular concepts, and stereotyping language. Text analysis of language use offers methods for building inclusive and diverse organizations. We study gender bias in corporate interactions through the Enron Corpus, a uniquely available database of ~500K real workplace emails from Enron employees.

METHODS

Our research measured linguistic gender bias from multiple angles to provide a comprehensive analysis of the role of gender in the workplace. We first determined whether a receiver's gender affected the language used in an email: computational text analysis with LIWC generated linguistic and psychological features from email bodies, enabling analysis of imbalances in language use, and model explainability methods were used to extract the characteristics distinguishing emails sent to men from those sent to women. We then examined gender disparities through asymmetric associations between gender and profession words, using word embeddings to measure representation biases and the organizational flaws they reflect. Finally, we identified sexist and otherwise problematic emails in the workplace and created a tool that flags such phrases during email composition; applying this tool to the Enron Corpus let us observe the prevalence of biased statements in emails.

RESULTS

We found a significant presence of bias in workplace emails from all three angles. Linguistic properties of emails held high predictive power for receiver gender, with our model attaining an F1-score of 0.86. Emails directed to men were more likely to contain achievement-oriented language, while those directed to women were more likely to contain prosocial language. Occupation words also carried substantial genderedness in the emails, with broader implications for how roles are distributed in the workplace and how stereotypes affect women's careers. Our model was highly effective at detecting sexism in workplace statements, attaining a test F1-score of 0.94. Applied to the Enron dataset, it identified a significant share of toxic statements, with nearly 10% of statements showing bias. Given this pervasiveness, integrating the tool into email workflows offers a practical way to address sexism in emails and reduce its frequency.

CONCLUSION

Our analysis reveals gender biases on multiple levels that affect women's careers. We establish the pervasiveness of gender bias, present methods crucial to understanding the problem, and propose an NLP tool for building more inclusive workplaces.
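The abstract does not specify the receiver-gender model. As a minimal sketch of that step, the code below trains a logistic regression on a stand-in feature matrix and reads feature importance from its coefficients; the feature names, random data, model choice, and coefficient-based explainability are all illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch: predict receiver gender from LIWC-style features (hypothetical data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
feature_names = ["achieve", "prosocial", "work", "affect"]  # placeholder LIWC categories
X = rng.random((1000, len(feature_names)))   # stand-in for per-email LIWC scores
y = rng.integers(0, 2, 1000)                 # 0 = female receiver, 1 = male receiver

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))

# Simple explainability proxy: rank features by coefficient magnitude.
for name, coef in sorted(zip(feature_names, clf.coef_[0]), key=lambda p: -abs(p[1])):
    print(f"{name}: {coef:+.3f}")
```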
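One common way to quantify the "genderedness" of occupation words in embeddings, sketched here under the assumption that vectors are available as a word-to-array mapping, is to project each word onto a gender direction built from definitional pairs. The toy vectors below are placeholders; whether the authors used this projection method is an assumption.

```python
# Sketch: genderedness of occupation words via projection onto a gender direction.
import numpy as np

rng = np.random.default_rng(1)
vecs = {w: rng.normal(size=50) for w in
        ["he", "she", "man", "woman", "engineer", "secretary", "trader"]}  # toy embeddings

def unit(v):
    return v / np.linalg.norm(v)

# Gender direction: mean of difference vectors over definitional pairs.
pairs = [("he", "she"), ("man", "woman")]
gender_dir = unit(np.mean([vecs[m] - vecs[f] for m, f in pairs], axis=0))

# Positive score = male-leaning, negative = female-leaning in this toy setup.
for occ in ["engineer", "secretary", "trader"]:
    score = float(unit(vecs[occ]) @ gender_dir)
    print(f"{occ}: {score:+.3f}")
```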
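The flagging tool could take many forms; this sketch assumes a simple sentence-level classifier (TF-IDF plus logistic regression, with a tiny placeholder training set and an arbitrary 0.5 threshold) that scores each sentence of a draft and surfaces likely biased phrasing. The abstract does not state the actual architecture or training data.

```python
# Sketch: flag potentially biased sentences in a draft email (hypothetical setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in training set (label 1 = biased/sexist phrasing).
texts = ["she is too emotional to lead", "great job closing the deal",
         "women can't handle pressure", "please review the attached report"]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

def flag_sentences(draft, threshold=0.5):
    """Return (sentence, probability) pairs whose bias score exceeds threshold."""
    sentences = [s.strip() for s in draft.split(".") if s.strip()]
    probs = model.predict_proba(sentences)[:, 1]
    return [(s, p) for s, p in zip(sentences, probs) if p > threshold]

print(flag_sentences("Thanks for the update. She is too emotional to lead."))
```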

Bio: Zeyneb Kaya is a high school student at Saratoga High School and has conducted research at institutions such as San Jose State University, Stanford University, and UC Santa Barbara in the areas of natural language processing and computational linguistics. Her research focuses on applications of NLP in the social sciences, including bias, communications, and low-resource languages. She is the founder of the nonprofit organization Romeyka Everlasting, where she works on developing computational methods for the documentation and analysis of the endangered dialect Romeyka. She is also a Stanford Women in Data Science Ambassador and continues to study what language reflects about society. She was a winner of the Congressional App Challenge (2021) and a UCSB GRITx Talk speaker (2022).