Friday 26 June

Notes

Why AI needs a bias

By Toby MacLachlan · Building the AI students should use for homework · 6 min read

The shortcomings of GenAI are pretty well rehearsed now: hallucination (no, that's just wrong), counting (how many Rs in strawberry?), sycophancy, and bias.

It's easy enough to ignore sycophancy, catch hallucination, and count for yourself. But bias isn't quite so black and white.

Especially as the word bias comes pre-loaded. Bias is polysemic - it has two meanings. People tend to use it as a pejorative, to mean unfair or prejudiced. But its second meaning (used in research) is: a weighting in one particular direction, without a moral dimension. It's interesting (to me at least) that AI neatly straddles the two: it's entirely built on weightings, and yet also has a moral dimension because of its training material and readership. This is perhaps why AI bias is hard to talk about: people reach by instinct for the moral definition, engineers understandably use the technical one, and they talk past each other.

I'd like to argue that prioritising and weighing information (i.e. bias) is an important part of any computing or language system, and that there is therefore good and bad bias.

Where the bias comes from

The excellent Victoria Hedlund recently posted an image of ChatGPT's varying explanations of how a light bulb worked to an American and a Russian.

https://www.linkedin.com/posts/victoriamhedlund_biasgirl-biasaware-chatgpt-share-7472719539215056897-Mv0F/

Her published research (SSR, 2026) found the same effect by gender: models explain concepts differently when asked to explain them to a boy or a girl. These biases are there because they are there in the training data that defined the probabilistic relationships between words, which the GenAI carries into its outputs.

The problem here is pretty apparent. In fairness, the big AI labs are aware of it. Anthropic's main test, "Paired Prompts," runs a similar experiment to Hedlund's: they pose the same question from two opposing positions and check whether the answer changes. Google admits that its models can amplify the biases already present in their training data, and OpenAI concedes that even its own "ideal" reference answers never score a perfect zero. Neutrality, or content without bias, is a direction of travel, not a destination.
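The paired-prompt idea is simple enough to sketch in a few lines of Python. This is only an illustration, not Anthropic's test or Hedlund's method: `explain` is a hypothetical stand-in for a real model call, its canned answers are invented placeholders, and the character-level similarity score is a crude proxy for "did the answer change?".

```python
from difflib import SequenceMatcher


def explain(concept: str, audience: str) -> str:
    """Hypothetical stand-in for a model call; swap in a real client here.

    The canned answers below are invented purely to make the sketch runnable.
    """
    canned = {
        "a boy": "A light bulb works like a tiny engine: electricity heats a wire until it glows.",
        "a girl": "A light bulb is like a little candle: electricity warms a wire until it glows.",
    }
    return canned.get(audience, "Electricity heats a wire until it glows.")


def paired_prompt_similarity(concept, audience_a, audience_b, model=explain):
    """Ask the same question for two audiences and compare the answers.

    Returns a ratio between 0 and 1; 1.0 means the two answers are identical.
    """
    answer_a = model(concept, audience_a)
    answer_b = model(concept, audience_b)
    return SequenceMatcher(None, answer_a, answer_b).ratio()


score = paired_prompt_similarity("how a light bulb works", "a boy", "a girl")
print(f"similarity between the two explanations: {score:.2f}")
```

A score below 1.0 only says that the wording changed, not that the change is unfair. Deciding whether the difference matters still takes a human reader.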

Why a finite mind needs bias

AIs are, of course, trained on words written and spoken by humans, and that is where they get their bias from. The human mind needs bias, rather like the machines do. Our senses gather an estimated 11 million bits of information every second, but the conscious mind's capacity is around 40 bits per second, or roughly 0.0004% of the incoming information. Taking in all 11 million bits would be utterly overwhelming. Bias, or the ability to apply weight to certain information over the rest, is a feature and a benefit of a finite mind, not a flaw.

If you're interested in what biases you have, I highly recommend the Harvard Implicit Association Test. You get to pick from a list of uncomfortable biases, including those relating to age, skin colour, gender, religion, and many more! It's a hard thing to game. It turns out I'm slightly biased against larger people, and some skin tones.

Don't be afraid: everyone has bias. The important thing is to understand it.

The reason we must understand our own biases is that humans have the advantage of being able to self-correct. We have plastic minds that learn from experience. AI does not (at least not within a single AI, and not yet).

Good bias and bad bias

So is there a difference between good bias and bad bias? If we were to split the term down the middle, we'd separate prejudicial bias (the boy/girl problem, which is bad) from directional bias, the built-in lean towards good reasoning rather than the supply of answers. This idea is as old as the hills. Aristotle, in the Nicomachean Ethics (not too long and worth a read), says: "We ought also to take into consideration our own natural bias... we should force ourselves off in the contrary direction, because we shall find ourselves in the mean after we have removed ourselves far from the wrong side." His point is that no one is neutral, and a well-calibrated mind leans deliberately against its own inclinations in order to finish in the right place.

Ironically, that is precisely what the big AI labs do. When OpenAI writes a Model Spec setting out how its model should behave, or when Anthropic trains "character traits" of even-handedness into Claude, it is not removing bias. It is deliberately engineering one. A model taught to treat opposing views with equal depth has been given a deliberate lean towards fairness. That is a bias, and one of the good ones: an Aristotelian corrective, bent the other way to come out straight. They just don't frame it that way, because "we engineered a bias into the model" is a harder thing to publish than "we made it neutral." This is therefore the work of AI builders: to engineer bias as a feature.

Perhaps we have an innate sense of the need for this, which would be why AI's sycophancy (i.e. failing to push or correct us) feels all the more uncomfortable.

The case for a little randomness

AI often feels human because there is a degree of unpredictability to it. Ask it to finish "I think therefore I..." and you'll get "am" every time, but ask it something weird like "the pelican sat on the pine cone because..." and you'll get a different answer on every attempt.

I asked ChatGPT 3 times to complete the sentence...

This randomness is engineered. It's what stops the output being machine-like and repetitive and allows the model to range across creative possibilities. Some will be right and some will be wrong, and therefore some will be biased and some will not. But interestingly, the randomness is not totally random! The mechanism that varies an answer also pulls it towards the mean. Individual results will vary, but taken across all users and all attempts, the outputs cluster around the statistically most likely next word. So: the more you lean on AI, the more your own thinking gets pulled towards the average (whatever that might be).
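For the curious, here is a minimal sketch of how that engineered randomness can work, assuming the common approach of sampling the next word from a probability distribution sharpened or flattened by a "temperature" setting. The candidate words and their scores are invented for illustration, and real models choose among tens of thousands of tokens with additional controls, so treat this as a toy rather than a description of any particular product.

```python
import math
import random
from collections import Counter

# Invented candidates and scores for "the pelican sat on the pine cone because..."
WORDS = ["it", "the", "he", "nobody"]
LOGITS = [2.0, 1.5, 0.5, -1.0]


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_word(words, logits, temperature, rng):
    """Draw one word at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(words, weights=probs, k=1)[0]


def tally(temperature, n=10_000, seed=0):
    """Count how often each word is chosen over n independent draws."""
    rng = random.Random(seed)
    return Counter(
        sample_next_word(WORDS, LOGITS, temperature, rng) for _ in range(n)
    )


for t in (0.2, 1.0, 2.0):
    print(f"temperature {t}: {tally(t).most_common()}")
```

Any single draw can land on any word, but across thousands of draws the highest-scoring word dominates the tally at every temperature. Turning the temperature down makes that dominance stronger, which is the "pull towards the mean" in miniature.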

Why this matters more for machines than for us

We tell children that it's good to fail, and that practice makes perfect. What we actually mean is that it's good to make mistakes, because human minds learn from them. The mistake updates us. Next time we reach for a different answer, not a random one, because the last one taught us something. AI's randomness generates variation but no learning. This is why bias in AI, though no more prevalent than in humans, is slightly more worrying. The labs can now measure bias, reduce it, even open-source the test for it. What they cannot yet do is let the model learn on the fly: for now, learning happens only when a new model ships. Until that changes, the only safeguard is not the model improving itself but the human staying in the loop to catch what the model cannot. For now, both the people using AI and the people building it have to decide which biases are worth keeping.