
AIs Reflect the Ideology of their Creators - Why does it Matter?

ChatGPT doesn't like when you talk about welfare states...

Summary

Researchers in Spain and Belgium have found that large language models carry inherent ideological bias, meaning their creators either intentionally or (more likely) unintentionally let their own leanings seep into the training data. Given this notable issue and potential conflict of interest, how can we mitigate these concerns in practice? Is your LLM of choice problematic?

Intro

I've talked at decent length so far about the amount of bias present in some models when it comes to generating content regarding certain underrepresented and marginalized groups.

When I saw this paper from Ghent University in Belgium and the Public University of Navarre in Spain, I knew I had to cover it and discuss why it matters. It's something I truly believe anyone somewhat knowledgeable in AI topics is already aware of, but having a study put into science what we intrinsically know is refreshing. Plus, an homage to Spain as I intend to travel there this month doesn't hurt. Let's dive in.

How do you test ideology?

As you can imagine, testing ideology is difficult. The experiment approached it by using a large set of different political personalities and topics. You can see some of these included below, as well as an example output from Claude 3 regarding Edward Snowden.

Figure 1 from the paper

Several popular models, including Claude, Llama, Gemini, and GPT-4o, were used in this paper. The researchers presented common ideological themes in both English and Chinese and had the models assign a rating. Notably, the general "ideology" of a model changed based on the prompting language: when talking about involvement in corruption, for example, English prompting tended to rate people associated with corruption more highly than Chinese prompting did.
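To make that setup concrete, here's a minimal sketch of a two-stage probe in that spirit: ask a model to describe a figure, then have it rate its own description, in each language. This assumes the OpenAI Python SDK; the prompts, the 1-5 scale, and the `score_figure` helper are my own illustrative stand-ins, not the paper's exact protocol.

```python
# Minimal sketch of a two-stage ideology probe: get an open-ended
# description of a political figure, then ask the model to rate how
# positively that description portrays them. Prompts and scale are
# illustrative, not the paper's exact protocol.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPTS = {
    "en": "Tell me about {name}.",
    "zh": "请介绍一下{name}。",
}

RATE_PROMPT = (
    "On a scale of 1 (very negative) to 5 (very positive), how does the "
    "following text portray {name}? Reply with a single digit.\n\n{text}"
)

def score_figure(name: str, lang: str, model: str = "gpt-4o") -> int:
    """Stage 1: open-ended description. Stage 2: self-rated sentiment."""
    description = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPTS[lang].format(name=name)}],
    ).choices[0].message.content

    rating = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": RATE_PROMPT.format(name=name, text=description)}],
    ).choices[0].message.content
    return int(rating.strip()[0])  # naive parse; real code should validate

# Compare how prompting language shifts the rating for the same figure.
for lang in ("en", "zh"):
    print(lang, score_figure("Edward Snowden", lang))
```

Averaged over many figures and many runs, a gap between the two languages is exactly the kind of signal the researchers were measuring.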

In that sense, we can see that AI outputs are non-deterministic in the moment yet somewhat predictable in aggregate: the same model changes its output based on prompting language, yet it trends toward certain views all the same. Fun, right?

What about between Western LLMs?

The paper also draws an interesting comparison between OpenAI and the other Western LLM makers (Anthropic, Google's Gemini, and Mistral). The researchers found that OpenAI's models were much more critical of welfare policies, notably on topics like "European Union" or "Welfare State," which could mean nothing. There's also an analysis of how Anthropic's, Google's, and Mistral's models rated topics relative to one another, if you're curious.

So, what now? Is ChatGPT racist?

Maybe? I don't think it's as simple as slapping a tag like that on OpenAI. Could it produce racist or otherwise harmful rhetoric? Of course it can. But at the end of the day, all these models are tools, and people build these tools. We're better off asking: do Sam Altman and what's left of OpenAI have our best interests in mind if their biases are present in their models? Do any companies in the AI space have our best interests in mind? How can we hold companies that produce AI products accountable for their biases and the harm they cause? A topic of resounding debate, to be sure.

What I see from this data and experiment is that we need to be much, much more mindful of how we're using certain popular models. If you hold a liberal point of view, your vision and voice when generating content may be muddied by a model with a more conservative data (or "mind") set. Your native language may also affect what comes back when translating! How pronounced this is depends in part on the temperature (how "creative" the output is) the model is set to, but for the average person it could easily create gaps between what you say and what you mean.
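If you want to see that variability for yourself, temperature is usually a single parameter on the API call. A rough sketch, again assuming the OpenAI SDK (the model name and prompt are placeholders of mine):

```python
from openai import OpenAI

client = OpenAI()
prompt = "Summarize the pros and cons of a welfare state in one sentence."

# temperature=0 keeps output near-deterministic; higher values add variety
# (and, potentially, more room for a model's ideological lean to surface).
for temp in (0.0, 1.0):
    reply = client.chat.completions.create(
        model="gpt-4o",          # placeholder model name
        temperature=temp,
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    print(f"temperature={temp}: {reply}")
```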

It hearkens back to the notion of taking model output with a grain of salt and being very vigilant about the content you're producing and/or consuming, especially in an age where more AI-generated content is making the rounds. The internet may be dead, but that doesn't stop the bots from coming back from the grave to bite us in the ass somehow. Be mindful of how biases may be affecting your outputs, and you'll likely be a great deal better off than those who blindly trust their models.

Today's Cha Cha picture is him enjoying some much-needed lap time with his mom.
