It's Time to Be Frank About the Biases Perpetuated by AI
Bryan, a 17-year-old aspiring scientist, asks ChatGPT—the world's most popular artificial intelligence (AI) chatbot—about hot topics to research in AI. The response? ChatGPT notes that he might find investigating "potential racial biases" in criminal justice interesting. Bryan mentions in his prompt that he is Black, and ChatGPT makes a recommendation that assumes certain beliefs, perhaps stereotypical beliefs, about what informs his research interests.
Bryan is confused by the suggestion; he expected ideas on more popular topics such as generative AI or biotechnology. So he returns to the drawing board, removing all identifying information from his prompt in hopes of a more general recommendation about AI research, one not informed by stereotypes.
Bryan is not a real person; he is the fictitious subject of my research project testing for bias in AI. But the scenario is very much real and reproducible.
I experienced this firsthand when I used AI to help write a biography of myself for a research conference. I was surprised when ChatGPT constructed a paragraph filled with falsehoods, claiming I was an immigrant from India pursuing a PhD in computer science at Northeastern. The chatbot seemingly concluded all of this solely from my Indian-origin name.
We often assume these chatbots know nothing about us, but the simplest details mentioned in a conversation can completely alter the contents of the AI's response.
This suggests the potential for bias in AI, which researchers commonly formalize as misalignment—the tendency of an algorithm's behavior to diverge from its intended outcomes. For instance, gender and racial biases hidden in AI algorithms are often cited as among these systems' most significant safety issues. Such biases reinforce existing social norms—a major problem in scientific research, where we aim to explore new perspectives and answer questions objectively.
Turing Award winner and AI researcher Yoshua Bengio believes that this discrimination "belong[s] to the larger concern about alignment: we build AI systems and the corporations around them whose goals and incentives may not be well aligned with the needs and values of society and humanity."
OpenAI, the company that developed ChatGPT, launched its 'superalignment' project to address these concerns—attempting to "ensure AI systems much smarter than humans follow human intent." However, the effectiveness of superalignment and OpenAI's ability to build aligned systems are up in the air. This is an inherently complex problem—the billions of parameters and vast datasets used to train and optimize AI models make it incredibly difficult to detect and fix existing biases. The very definition of misalignment suggests that these biases are unintended consequences—OpenAI's researchers are not instructing their models to steer researchers of different races in different directions. Instead, the generative AI models that power technologies such as ChatGPT are trained on human-created online data, which is inherently entrenched with societal biases.
Yet the uses of AI chatbots in research extend beyond asking for guidance on research topics or drafting an "About Me" for a conference. People are also using ChatGPT to write their research papers—something that may have profound consequences given the potential for biases to creep into science.
The prevalence and use of AI in research papers are not well studied. In my own analysis, searching research databases turned up 104 papers that list ChatGPT as a co-author. The papers that don't, however, are even more problematic. Querying for "As an AI language model, I…" in research manuscripts yields at least 81 papers whose authors left a telltale ChatGPT signature sentence in the text. They didn't even bother to proofread "their" writing and remove it, let alone acknowledge that they used AI to help write their paper.
In scientific research, we are thorough about our affiliations, conflicts of interest, and contributions not just for the sake of transparency; we do it so our peers and the public can recognize the biases and perspectives in our work. Of course, we do our best to be objective. But if we cannot even acknowledge ChatGPT as an assistant in compiling a research paper, that is cause for concern. The omission indicates that some are not adhering to a basic ethical requirement of publishing: to fully disclose the technology used and, more importantly, to be transparent about any potential biases in their work.
Should we allow research on intercultural communication or academic performance assessment to use ChatGPT without declaring it? On socially sensitive topics, where AI increasingly shapes our society, it is crucial to consider potential biases, talk about them, and openly address their implications.
As was the case with Bryan, however, bias affects researchers themselves too—not just their published work. Generation Z scientists like me are growing up with generative AI—79% of us have used these tools and will likely continue to. We will increasingly find this technology to be a natural and integral part of our lives. Using ChatGPT is our generation's version of "just Google it!"
I see my peers at Harvard—many of whom are future scientists—frequently use AI to guide their thinking and research. I wonder if we are getting steered in different directions by ChatGPT because we are Harvard students. Or, for that matter, because we are young—and ageist stereotypes may influence the bot's recommendations.
When OpenAI releases the next version of its chatbot—say, GPT-5—will it be a more "superaligned" tool? Or will its biases be amplified further by a never-ending feedback loop of training on research written in part by biased machines? The future shaped by AI is uncertain, and it is unlikely to become any clearer soon.
The natural response is to introspect—to take the burden upon ourselves. We should be diligent in recognizing, mitigating, and, where possible, preventing any latent bias that AI weaves into our work. Research papers ought to disclose any biased tools used, acknowledge the biases inherent in those tools, and, where required, present opposing perspectives. I want to live in a society where I can continue to trust research as an objective source amid today's information overload.
It is time for us to be frank about the biases perpetuated by AI.