It turns out that even language models “think” they are biased. When prompted, ChatGPT responded as follows: “Yes, language models can be biased, because the training data reflects the biases present in the society from which the data was collected. For example, racial and gender biases are prevalent in many real-world data sets, and if a language model is trained on that, it can perpetuate and amplify these biases in its predictions.” A known but dangerous issue.
Humans typically dabble in both logical and stereotypical reasoning when learning. Still, language models mostly mimic the latter, an unfortunate narrative we’ve seen play out ad nauseam when the ability to employ reasoning and critical thinking is absent. So would injecting logic into the fray be enough to mitigate such behavior?
Scientists at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) had a hint that this might be the case, so they set out to examine whether logic-aware language models could avoid significantly more harmful stereotypes. They trained a language model to predict the relationship between two sentences, based on context and semantic meaning, using a dataset of text pairs labeled according to whether a second sentence “entails,” “contradicts,” or is neutral with respect to the first. Using this natural language inference (NLI) dataset, they found that the newly trained models were significantly less biased than other baselines, without any additional data, data editing, or extra training algorithms.
For example, given the premise “the person is a doctor” and the hypothesis “the person is male,” these logic-trained models would classify the relationship as “neutral,” since nothing logically entails that the person is a man. With more common language models, the two sentences may appear correlated due to bias in the training data; “doctor” could be associated with “male” even when there is no evidence that the statement is true.
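To make the task format concrete, here is a minimal sketch of natural language inference on that exact premise/hypothesis pair, using an off-the-shelf MNLI-trained model (roberta-large-mnli) from the Hugging Face transformers library. This is not the CSAIL model described in the article; it only illustrates how an entailment classifier consumes a sentence pair and outputs entailment, contradiction, or neutral.

```python
# Sketch: NLI classification with a publicly available MNLI model.
# Not the authors' model -- just an illustration of the task format.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "The person is a doctor."
hypothesis = "The person is male."

# NLI models take the premise and hypothesis as a sentence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit back to its label
# (contradiction / neutral / entailment for MNLI-trained models).
predicted = model.config.id2label[logits.argmax(dim=-1).item()]
print(predicted)  # ideally "NEUTRAL": gender is not entailed by "doctor"
```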
By this point, the ubiquitous nature of language models is well known: applications in natural language processing, speech recognition, conversational AI, and generative tasks abound. While not a fledgling field of research, growing pains may feature prominently as they increase in complexity and capability.
“Current language models suffer from issues of fairness, computational resources, and privacy,” says MIT CSAIL postdoc Hongyin Luo, lead author of a new paper on the work. “Many estimates say that the CO2 emitted by training a language model can be higher than the lifetime emissions of a car. Running these large language models is also very expensive because of the number of parameters and the computational resources they require. With privacy, state-of-the-art language models like ChatGPT or GPT-3 have APIs where you have to upload your language data, but there’s no place for sensitive information regarding things like health care or finance. To solve these challenges, we proposed a logical language model that we qualitatively measure as fair, is 500 times smaller than state-of-the-art models, can be deployed locally, and requires no human-annotated training samples for downstream tasks. Our model uses 1/400 of the parameters of larger language models, performs better on some tasks, and significantly saves computing resources.”
This model, which has 350 million parameters, outperformed some very-large-scale language models with 100 billion parameters on logical language understanding tasks. The team evaluated, for example, popular BERT pretrained language models against their textual-entailment counterparts on stereotype, profession, and emotion bias tests. The entailment-trained models outperformed the others with significantly less bias, while retaining language modeling ability. “Fairness” was evaluated with something called idealized context association test (iCAT) scores, where higher iCAT scores mean fewer stereotypes. The model achieved iCAT scores above 90 percent, while other strong language understanding models scored between 40 and 80.
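For readers unfamiliar with the metric, the score referenced here appears to be the idealized CAT (iCAT) score from the StereoSet benchmark. Assuming that definition, it combines a language modeling score with a stereotype score, rewarding models that are both capable and unbiased; the sketch below reflects that assumed formula rather than anything stated in the article.

```python
def icat_score(lms: float, ss: float) -> float:
    """Idealized CAT score, as defined (we assume) in the StereoSet benchmark.

    lms: language modeling score, 0-100 (how often the model prefers a
         meaningful continuation over a meaningless one).
    ss:  stereotype score, 0-100 (how often the model prefers the
         stereotypical continuation; 50 means no preference either way).
    """
    return lms * min(ss, 100.0 - ss) / 50.0

# An unbiased, fully capable model would score 100:
print(icat_score(lms=100.0, ss=50.0))  # 100.0
# A strongly stereotyped model scores low even with good language modeling:
print(icat_score(lms=90.0, ss=80.0))   # 36.0
```

Under this definition, a score near 100 requires both strong language modeling and a near-even split between stereotypical and anti-stereotypical preferences.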
Luo co-wrote the paper with MIT Senior Research Scientist James Glass. They will present the work at the Conference of the European Chapter of the Association for Computational Linguistics in Croatia.
Unsurprisingly, the original pretrained language models the team examined were rife with bias, as confirmed by a slew of reasoning tests demonstrating that professional and emotional terms are significantly skewed toward feminine or masculine words in gendered vocabulary.
With professions, one biased language model thinks that “flight attendant,” “secretary,” and “medical assistant” are female jobs, while “fisherman,” “lawyer,” and “judge” are male. As for emotions, it thinks that “anxious,” “depressed,” and “devastated” are feminine.
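As a rough illustration of how such associations can surface (this is not the paper’s evaluation protocol, and the model name and prompts below are illustrative choices), one can probe an off-the-shelf masked language model and compare the probability it assigns to gendered pronouns in a profession template:

```python
# Sketch: probing a masked language model for gendered associations.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for profession in ["secretary", "judge"]:
    # Compare the probability mass assigned to "he" vs. "she"
    # when filling the masked subject slot.
    preds = fill(f"[MASK] works as a {profession}.", targets=["he", "she"])
    scores = {p["token_str"]: round(p["score"], 4) for p in preds}
    print(profession, scores)
```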
While we may still be far from a neutral language model utopia, this research continues that pursuit. For now, the model is for language understanding only, so it reasons over existing sentences. Unfortunately, it can’t yet generate sentences, so the researchers’ next step is to target the hugely popular generative models with logical learning to ensure more fairness alongside computational efficiency.
“Although stereotypical reasoning is a natural part of human cognition, fairness-conscious people reason with logic rather than stereotypes when necessary,” says Luo. “We show that language models have similar properties. A language model without explicit logical learning does a lot of biased reasoning, but adding logical learning can significantly mitigate such behavior. Furthermore, with its demonstrated strong zero-shot adaptability, the model can be directly deployed to different tasks with more fairness, better privacy, and better speed.”