About the Project

We spent our term replicating the 2016 paper by Bolukbasi et al., “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.” That team set out to expose the implicit gender biases in everyday language use, as captured by word embeddings, and to determine whether a soft or a hard debiasing algorithm is more effective at removing that bias from the embeddings.

What's a Word Embedding?

A word embedding is a tool for text analysis and text generation that maps each word in a text to a vector. Machine learning positions these vectors so that relationships between words and their surrounding text show up in the geometry. Applications include consumer feedback parsing, spam detection, and information retrieval (e.g., search engines). For our purposes, we use vector mathematics to expose biased relationships between words that other forms of text analysis leave unrecognized or unproven. We focus specifically on gender bias.
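To make the vector mathematics concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the names `toy_embedding` and `gender_score` are invented for illustration; real embeddings have hundreds of dimensions, and the single "he" minus "she" difference used here is only a simple stand-in for a gender direction.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
toy_embedding = {
    "he":       [1.0, 0.2, 0.1],
    "she":      [-1.0, 0.2, 0.1],
    "engineer": [0.4, 0.9, 0.3],
    "nurse":    [-0.5, 0.8, 0.4],
}

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def unit(u):
    """Scale a non-zero vector to length 1."""
    n = norm(u)
    return [a / n for a in u]

def cosine(u, v):
    """Cosine similarity: 1 means same direction, -1 means opposite."""
    return dot(u, v) / (norm(u) * norm(v))

# Assumption: approximate the gender direction by the difference he - she.
gender_direction = unit(
    [a - b for a, b in zip(toy_embedding["he"], toy_embedding["she"])]
)

def gender_score(word):
    """Cosine between a word vector and the gender direction.

    Positive values lean toward "he", negative values toward "she".
    """
    return cosine(toy_embedding[word], gender_direction)

for word in ("engineer", "nurse"):
    print(word, round(gender_score(word), 3))
```

In this toy data, "engineer" lands on the "he" side of the direction and "nurse" on the "she" side, which is the kind of biased relationship the project looks for in real embeddings.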

Hard Debiasing

Hard debiasing is a method for debiasing word embeddings in three steps: identify a direction within the embedding that represents gender, neutralize the bias in gender-neutral words, and equalize gender-specific word pairs.
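The neutralize and equalize steps can be sketched as follows. This is a simplified illustration and not our project code: it assumes unit-length word vectors and a single unit-length gender direction `g`, and the function names and toy vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def scale(u, c):
    return [a * c for a in u]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def unit(u):
    return scale(u, 1.0 / norm(u))

def project(w, g):
    """Component of w along the unit vector g."""
    return scale(g, dot(w, g))

def neutralize(w, g):
    """Remove the gender component of w, then rescale to unit length."""
    return unit(sub(w, project(w, g)))

def equalize(w1, w2, g):
    """Make a gendered pair differ only along g, at equal distance.

    Assumes w1 and w2 are unit vectors with different projections onto g.
    """
    mu = scale(add(w1, w2), 0.5)      # midpoint of the pair
    mu_b = project(mu, g)             # gender part of the midpoint
    nu = sub(mu, mu_b)                # gender-free part of the midpoint
    coeff = math.sqrt(max(0.0, 1.0 - dot(nu, nu)))
    result = []
    for w in (w1, w2):
        offset = unit(sub(project(w, g), mu_b))  # +g or -g side
        result.append(add(nu, scale(offset, coeff)))
    return tuple(result)

# Toy data, invented for illustration.
g = [1.0, 0.0, 0.0]
programmer = neutralize(unit([0.4, 0.9, 0.3]), g)
man, woman = equalize(unit([0.9, 0.3, 0.1]), unit([-0.7, 0.5, 0.2]), g)

print(dot(programmer, g))                          # gender component after neutralizing
print(dot(man, programmer), dot(woman, programmer))  # similarities after equalizing
```

After both steps, the neutralized word has no component along `g`, and both words of the equalized pair are equally similar to it.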

Analogies

Analogies are used to identify gender stereotypes within the embedding and to detect changes in gender bias after running neutralize and equalize on the embedding. They are expressed in the form "He is to king as she is to queen."
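One way to generate such analogies is to score candidate word pairs by how closely their difference lines up with the "he" minus "she" difference. The sketch below follows that idea under stated assumptions: the vectors are invented toy data, `delta` is an illustrative threshold that rejects pairs whose words lie too far apart, and `analogy_score` and `best_pair` are hypothetical names rather than our project's API.

```python
import math
from itertools import permutations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def unit(u):
    n = norm(u)
    return [a / n for a in u]

# Toy vectors, invented for illustration and scaled to unit length.
toy_embedding = {
    word: unit(vec)
    for word, vec in {
        "he":    [1.0, 0.2, 0.1],
        "she":   [-1.0, 0.2, 0.1],
        "king":  [0.4, 0.9, 0.1],
        "queen": [-0.4, 0.9, 0.1],
        "apple": [0.0, 0.1, 0.9],
    }.items()
}

def analogy_score(a, b, x, y, delta=1.0):
    """Cosine between (a - b) and (x - y); 0 if x and y are too far apart."""
    seed, diff = sub(a, b), sub(x, y)
    if norm(diff) == 0 or norm(diff) > delta:
        return 0.0
    return dot(seed, diff) / (norm(seed) * norm(diff))

def best_pair(emb, a_word, b_word, delta=1.0):
    """Return the ordered pair (x, y) that best completes 'a is to x as b is to y'."""
    candidates = [w for w in emb if w not in (a_word, b_word)]
    return max(
        permutations(candidates, 2),
        key=lambda p: analogy_score(
            emb[a_word], emb[b_word], emb[p[0]], emb[p[1]], delta
        ),
    )

x, y = best_pair(toy_embedding, "he", "she")
print(f"he is to {x} as she is to {y}")
```

Running the same scoring before and after debiasing is one way to see whether stereotyped pairs drop out of the top results.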

- Our Team -

Computer Science seniors in the Class of 2023 at Carleton College

Aishwarya Varma

Linear SVM

Aldo Polanco

Debiasing

Angela Ellis

Analogy Generation

Darryl York III

Data Visualization