Stochastic Parrots: The Hidden Bias of Large Language Model AI

The AI video and illustrations in this article were all created, written and directed by Ralph Losey. The video is followed by a citation to the underlying article and a transcript. Click on the image below to see the YouTube video.

Image by Ralph Losey with Midjourney.

Article Underlying the Video. The seminal article on the dangers of relying on stochastic parrots, On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21), was written by a team of AI experts: Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell. It was presented at the 2021 ACM Conference on Fairness, Accountability, and Transparency (ACM Digital Library).

Transcript of Video

GPTs do not think anything like we do. They just parrot back pre-existing human word patterns with no actual understanding. The text generated by a GPT in response to prompts is sometimes called the speech of a stochastic parrot.

According to the Oxford dictionary, stochastic is an adjective meaning “randomly determined; having a random probability distribution or pattern that may be analyzed statistically but may not be predicted precisely.”

Wikipedia explains that stochastic is derived from the ancient Greek word stókhos, meaning ‘aim’ or ‘guess,’ and today refers to “the property of being well-described by a random probability distribution.”

Wikipedia also explains the meaning of a stochastic parrot.

In machine learning, the term stochastic parrot is a metaphor to describe the theory that large language models, though able to generate plausible language, do not understand the meaning of the language they process.
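To make the metaphor concrete, here is a minimal Python sketch of the kind of word statistics a language model relies on. The word pairs and probabilities are invented for illustration; a real model learns billions of such patterns from its training text, but the principle is the same: the next word is drawn at random from a learned probability distribution, with no understanding attached.

```python
import random

# Toy "language model": hypothetical next-word probabilities, standing in
# for the patterns a real model learns by counting word sequences in its
# training text. The numbers are invented for illustration.
NEXT_WORD_PROBS = {
    ("the", "parrot"): {"talks": 0.5, "squawks": 0.3, "flies": 0.2},
    ("parrot", "talks"): {"loudly": 0.6, "back": 0.4},
}

def sample_next_word(context, temperature=1.0):
    """Pick the next word at random, weighted by the learned probabilities.

    Higher temperature flattens the distribution (more surprising picks);
    lower temperature concentrates it (more predictable picks).
    """
    probs = NEXT_WORD_PROBS[context]
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(words, weights=weights, k=1)[0]

# Plausible-sounding output, produced from statistics alone: the "model"
# has no idea what a parrot is.
print("the parrot", sample_next_word(("the", "parrot")))
```

Run it a few times and the output varies. That is exactly what stochastic means here: statistically describable, but not precisely predictable.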

The stochastic parrot characteristics are a source of concern when it comes to the fairness and bias of GPT speech. That is because the words GPTs are trained on, which they parrot back to you in clever fashion, come primarily from the internet. We all know how messy and biased that source is.

In the words of one scholar, Ruha Benjamin, “Feeding AI systems on the world’s beauty, ugliness, and cruelty, but expecting it to reflect only the beauty is a fantasy.”

Keep both of your ears wide open. Talk to the AI parrot on your shoulder, for sure, but keep your other ear alert. It is dangerous to only listen to a stochastic parrot, no matter how smart it may seem.

The subtle biases of GPTs can be an even greater danger than the more obvious problems of AI errors and hallucinations. We need to improve the diversity of the underlying training data, the curation of that data, and Reinforcement Learning from Human Feedback (RLHF). It is not enough to just keep adding more and more data, as some contend.
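For readers who want a feel for what curation can mean in practice, here is a minimal, hypothetical sketch of one screening step: dropping training documents that trip a simple blocklist before they reach the model. The term list and function names are placeholders; real curation pipelines combine trained classifiers, deduplication, documentation of data sources, and human review.

```python
# Hypothetical curation step: filter training documents against a simple
# blocklist before they are used for training. This is only a sketch, not
# how any particular model's data pipeline actually works.
BLOCKED_TERMS = {"placeholder_slur", "placeholder_threat"}  # stand-in terms

def passes_screen(document: str) -> bool:
    """Return True if the document contains none of the blocked terms."""
    words = set(document.lower().split())
    return words.isdisjoint(BLOCKED_TERMS)

def curate(corpus: list[str]) -> list[str]:
    """Keep only the documents that pass the screen."""
    return [doc for doc in corpus if passes_screen(doc)]
```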

The view that more data alone is not enough was forcefully argued in 2021 in an article I recommend, On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (FAccT ’21, 3/1/21), by AI ethics experts Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell.

We need to do everything we can to make sure that AI is a tool for good, for fairness and justice, not a tool for dictators, lies and oppression.

Let’s keep the parrot’s advice safe and effective. For, like it or not, this parrot will be on our shoulders for many years to come! Don’t let it fool you! There’s more to life than the crackers that Polly wants!

Written by:

EDRM - Electronic Discovery Reference Model
