AI Is A Magic Trick: To Use It Productively, You Need To Stop Being Fooled!
General

Laurence
March 14, 2026
11 min read
probability, workflows, reinforcement-learning, testing, llms, automation, software-development, artificial-intelligence

AI is a magic trick.

AI is being communicated in the wrong way, and it is not helping anyone. It is not magic, it is probability, and once you see it that way, you can actually build systems that make it useful.

Marketing first, applications later?

AI has often been presented from a marketing-centric viewpoint, much like how magicians of old would showcase their illusions on stage to captivate audiences, leading them to believe in extraordinary abilities. However, the truth is, it's often just a clever deception. AI is logical and comprehensible once you understand the mechanisms at play.

And I believe that's the primary issue. Not the models themselves, nor the GPUs, or even the latest benchmark figures. It's the framing of the narrative.

Because when something is framed as magic, people react accordingly. They either worship it, fear it, or they outsource their thinking to it. Then they act surprised when it behaves like what it actually is, which is a probabilistic machine.

AI should be spoken about as probabilistic outcomes

AI should be spoken about in terms of probabilistic outcomes. That needs to be the focus.

A good way to think about it is this: if you give it a prediction to make, it will use previous data to predict what is going to happen next time.

So if you say to the AI, what do you predict, heads or tails, 50% of the time it will say heads, and 50% of the time it will say tails.

People get annoyed when you say that, because they want it to be more mystical than that. They want the magician. They want the stage. They want the smoke machine. But the coin flip example is the cleanest way I know to get the point across, because it strips away the theatre and leaves you with the mechanism.
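To make the coin flip concrete, here is a minimal sketch (plain Python, nothing to do with any real model) of a predictor that can only reproduce the base rates of its "training data":

```python
import random

def predict_coin_flip():
    """A predictor with no extra signal can only reproduce the base rates:
    heads half the time, tails half the time, just like the data it saw."""
    return random.choice(["heads", "tails"])

# Over many predictions, the split converges on the training distribution.
flips = [predict_coin_flip() for _ in range(10_000)]
heads_rate = flips.count("heads") / len(flips)
print(f"heads: {heads_rate:.2%}")  # hovers around 50%
```

No mysticism: the output is entirely determined by the distribution it was given.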

Now scale that up, and you have the whole game

All you need to think about now is scaling that to something that is more complex.

When something is more complex, we typically think about it as having more dimensions. Now specifically, it is more embedding dimensions in a vector space.

All that really means is: if one coin has 2 outcomes and one die has 6, then two coins have 4 outcomes and two dice have 36. A language model has roughly 50,000 to 100,000 outcomes (its token vocabulary), and it makes a probabilistic prediction about which outcome should come next, based on what came before it.

That is it. That is the whole trick. More possible next steps, more context, more dimensions, more ways the probability can distribute itself. The words sound fancy, but the mental model stays the same.
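A toy sketch of what "more outcomes" means: the model scores every entry in its vocabulary, a softmax turns those scores into probabilities, and one outcome is sampled. The five-word vocabulary and the scores below are made up for illustration; a real model does the same thing over tens of thousands of tokens:

```python
import math
import random

# Toy "vocabulary" — a real model has 50,000 to 100,000 entries.
vocab = ["the", "cat", "sat", "on", "mat"]

# Hypothetical scores the model assigns given some context.
scores = [0.1, 0.2, 3.0, 0.5, 0.4]

# Softmax turns raw scores into a probability distribution over outcomes.
exps = [math.exp(s) for s in scores]
total = sum(exps)
probs = [e / total for e in exps]

# One outcome is sampled according to that distribution.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print({w: round(p, 3) for w, p in zip(vocab, probs)}, "->", next_token)
```

The mechanism is identical to the dice: a distribution over possible outcomes, and a draw from it.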

Language models are just next-token prediction, but on steroids

In the circumstances where we are using language or code, it is predicting the next thing to say in a sentence, very similar to autocomplete. But it is autocomplete with vastly increased complexity.

This is where people get confused, because they think, “Well, it is writing paragraphs, it is reasoning, it is doing my job.” But if you keep the probabilistic framing, it becomes much easier to understand both why it can be brilliant and why it can be catastrophically wrong.

It is predicting what comes next. Over and over again. With a huge amount of context. That produces outputs that look like intelligence, and maybe sometimes it is, kind of. Really, though, it is just a very convincing statistical continuation. As a footnote, there is a debate about whether this is how human intelligence works too, but I doubt it. Read Daniel Kahneman's Thinking, Fast and Slow, or speak to any human for more than 30 seconds, and you'll realise we run on feelings, not probability or logic.
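The "predicting what comes next, over and over again" loop can be sketched with a toy bigram table. A real model conditions on far more context than one word; the table and probabilities here are invented:

```python
import random

# A toy bigram "model": given the last word, a distribution over the next one.
bigrams = {
    "the": (["cat", "dog"], [0.6, 0.4]),
    "cat": (["sat", "ran"], [0.7, 0.3]),
    "dog": (["sat", "ran"], [0.5, 0.5]),
    "sat": (["down", "."], [0.8, 0.2]),
    "ran": (["away", "."], [0.8, 0.2]),
}

def continue_text(start, steps, seed=0):
    """Predict the next word, over and over, each step conditioned on the last."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(steps):
        options = bigrams.get(words[-1])
        if options is None:
            break  # no continuation known for this word
        choices, weights = options
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

print(continue_text("the", 3))
```

Scale the table up to a whole vocabulary with a whole context window, and you have the shape of a language model.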

So where is AI useful?

If we think about it in a probabilistic way, where is it useful?

It is useful, in its entirety, when you only need the answer to probably be right.

This is the bit that most marketing material conveniently forgets to mention. They sell certainty. They sell “replace your team” energy. They sell the magician. However, the real strength is that it can do the bulk of the work most of the time, and be accurate enough often enough that you can leverage its speed and ability to run without sleep, and without taking holidays or lunch.

The issue with large context and increased complexity

An artefact of complexity is that the more of it you add, the more answers exist that sit close together in probability.

It could go slightly this way or slightly that way, which is going to be largely determined now by the context or the prompts that you have already given. This will tweak it slightly to have slightly different probabilistic outcomes.

This is why tiny changes in prompting can produce wildly different results. It is not because the model has mood swings, but because you are nudging the probability distribution. You are changing the context, so you are changing what "next" most likely means. Furthermore, models are run with random seeds. All this means is that providers like OpenAI and Anthropic add a random number generator along with your prompt, to try and create more interesting outputs (another part of the magician's repertoire).
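A hedged illustration of the seed point: with near-tied probabilities, the same prompt plus a different random seed can land on a different continuation, while the same seed reproduces the same one. The words and weights below are made up:

```python
import random

vocab = ["ship it", "hold off", "rewrite it"]
weights = [0.36, 0.34, 0.30]  # near-tied: tiny nudges change the winner

def sample(seed):
    """Same 'prompt', same probabilities — only the seed differs."""
    rng = random.Random(seed)
    return rng.choices(vocab, weights=weights, k=1)[0]

assert sample(42) == sample(42)  # same seed, identical output
print(sample(1), "|", sample(2), "|", sample(3))  # different seeds may diverge
```

This is also why "it gave me a different answer this time" is not evidence of moods, just evidence of sampling.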

How do we account for the complexity error rate?

AI can do the bulk of the work most of the time, but it will always have a certain level of error. Over time, that level of error can be incredibly impactful.

It can either cascade, meaning that if introduced at a fundamental level of what you are doing, by the end of what you are doing, you get almost complete nonsense, complete hallucinations.

This is the part where people get burned. They let it do something foundational, it gets one key thing wrong early on, and then everything built on top of that wrong thing becomes more wrong. Not always obviously wrong either. Sometimes it is wrong in a way that looks coherent. Which is worse.

About “hallucinations”

When someone says hallucination, what they mean is: the AI gave a probabilistic outcome that they were not expecting, or deem to be incorrect, when in fact it is just spurting out a dark corner of its training data.

It is not strictly a hallucination. That was probabilistically the right outcome given the input, but it just was not the right outcome in terms of the expectations of the person of what they wanted to achieve.

This is why the word “hallucination” annoys me. It frames the model as if it is seeing things that are not there, like it is broken or delusional. Most of the time it is not broken. It is doing exactly what it is designed to do: produce a plausible continuation. The mismatch is between probabilistic plausibility and human requirements.

Use AI all day long at 85%

In terms of usefulness, if your work only needs to be something like 85% accurate, use AI all day long. End-to-end, no guardrails, let it fly.

Examples could be... Drafting. Brainstorming. Producing first passes. Getting the shape of something. Doing the heavy lifting when perfection is not required, or writing questions for the 1% club (ITV game show)... wallop.

If you need specificity, you need rules and parameters

If you need specificity, you have to introduce rules, and you have to introduce parameters that force the AI to loop until it gets a probabilistic outcome that can be deemed acceptable.

This is the real shift from “AI as a tool you chat to” into “AI as a component inside a system”. People want prompts. What you actually want is constraints, checks, gates, and retries.

Reinforcement learning without a human in the loop (a simple example)

Here's an example of how the labs are doing reinforcement learning without a human in the loop: mathematics.

It has an answer, so you can go back to the AI and say, "Nope, try again, nope, try again," until it gets that answer.

Next, you bottleneck the AI, and once it gives you the right answer, you let it through the gates, so it can do the next task.

That is the key idea: you do not need a human to say, "This is good" if you can define "good" deterministically.
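A minimal sketch of that gate, with a deliberately dumb stand-in "model" that just guesses numbers: the loop keeps saying "nope, try again" until the deterministic check passes, and only then lets the answer through:

```python
import random

def mock_model(question, rng):
    """Stand-in for an LLM: produces an answer, sometimes wrong."""
    return rng.randint(1, 10)

def checked_answer(question, correct, max_attempts=1000):
    """Deterministic gate: retry until the answer verifies, then let it through."""
    rng = random.Random(0)
    for attempt in range(1, max_attempts + 1):
        answer = mock_model(question, rng)
        if answer == correct:  # "good" is defined deterministically, no human needed
            return answer, attempt
    raise RuntimeError("never converged — widen the attempts or fix the check")

answer, attempts = checked_answer("what is 3 + 4?", correct=7)
print(f"accepted {answer} after {attempts} attempt(s)")
```

The check here is trivial equality, but any deterministic verifier (tests passing, output parsing, a maths proof checker) plays the same role.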

I Speak From Experience. My Productivity is 100x.

I've been building, experimenting with, and refining frameworks and workflows using this exact method, and when you think about AI in this way, the kinds of constraints you can build become clearer and clearer.

Essentially, I build a set of deterministic rules that encompass the AI and force it to try again until it gets it right.

Watchers, scripts, and checking the ducking code

For example, you get the AI to do a large piece of work. It gets it around 85% right, and with the right rules and constraints, you identify the 15% that it got wrong.

If the things it got wrong can be analysed deterministically, using something like pattern recognition (for example: have you labelled all of the buttons in the code, or does the answer appear in the question?), then you can close the loop.

You write a deterministic watcher script that runs once the AI has finished and checks the ducking code: has it got all the labels? If it has not, the script goes back to the AI and says, here are the mistakes that you have made, correct them.

When you do this the AI has a much smaller, less complex set of probabilities that it needs to focus on (that incorrect 15%).

This is the practical version of the whole philosophy. Let it generate. Then verify. Then feed back the deltas. Reduce the search space. Constrain the next attempt. Repeat until you get something you can ship.
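As a sketch of that kind of watcher, here is a deterministic check (using a naive regex, not a real HTML parser) that flags buttons with no label, so only those get sent back to the AI:

```python
import re

def find_unlabelled_buttons(html):
    """Deterministic watcher: flag <button> tags that have no visible text
    and no aria-label, so the AI only has to re-attempt those."""
    issues = []
    for match in re.finditer(r"<button([^>]*)>(.*?)</button>", html, re.S):
        attrs, inner = match.groups()
        if not inner.strip() and "aria-label" not in attrs:
            issues.append(match.group(0))
    return issues

page = '<button aria-label="Close"></button><button></button><button>Save</button>'
print(find_unlabelled_buttons(page))  # only the unlabelled middle button
```

The output of the watcher becomes the next, much narrower prompt: "here are the mistakes, correct them."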

Bulk work first, then grind down the error

At the beginning you say... Here is everything. Very complex. Lots of things can go wrong.

Then you move onto... OK, that's 85% correct. Here is the 15% that you got wrong. Go fix that.

Then your probability space is smaller, so it will get 90% of that 15% correct, leaving only 1.5% error left, with an even smaller probability space.

You keep doing that until your work is practical enough to be good enough.

This is how you stop the cascade. You do not let the errors sit there and compound. You force the system to surface them, isolate them, and then re-run the model on a narrower task where the probability distribution is easier to control.
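The arithmetic of grinding down the error looks like this, assuming each pass fixes 90% of whatever mistakes remain (the same numbers as above):

```python
# 85% right on the first bulk pass leaves 15% error.
error = 0.15

# Each follow-up pass runs on a smaller, simpler problem and
# corrects 90% of whatever mistakes remain.
for pass_number in range(1, 5):
    error *= 0.10
    print(f"after pass {pass_number}: {error:.4%} error remaining")
```

After one pass you are at 1.5%, after two at 0.15%, and so on, until the residual error is below whatever "good enough" means for the task.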

In programming and maths, you can get close to 100%

In fields like programming and maths, you can achieve a working product that is practically 100% perfect. Since everything on a computer involves programming and maths, all computer-based tasks can eventually be automated, FYI.

And that's the crux of it. That's what I've been doing.

I'm building workflows that make the AI, or force the AI, to go back and fix its mistakes. I don't have to do it myself. It just loops around until it's done.

This writing was a WhatsApp voice note...

What you're reading right now... it's a voice note I did on WhatsApp. I've built a system where you drop in the voice note, and out pops the blog. My point: you're reading an example of what I'm saying right now.

So how have I done it? Scripts, a programmable browser, and quite a bit of experimenting

I've been building and refining in a few ways. First, you need to be clear on what you want; second, you need to close the loop.

Essentially I use a collection of CRON jobs, scripts, pattern recognition, and technologies like a programmable browser.

This allows the AI to go into the browser and try its own application. If the application does not work, it goes back again and loops around until it works, seeing its own errors and doing its own debugging.

On top of that, there's a script that enforces a todo list. For example: create a profile, sign into the account, create a new post, leave a comment. It does it all in a loop until it is done, and it runs 24/7 autonomously.

This is where it gets interesting, because now you are not just generating text. You are running a system. The model is embedded inside an automated loop that interacts with real interfaces, checks outcomes, and retries. That is the difference between a demo and something you can actually rely on.
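A stripped-down sketch of the todo-list enforcement loop. The `attempt` function is a hypothetical stand-in for "the AI drives the browser and tries the task"; here it just succeeds randomly, but the structure is the point: no moving on until the current task verifies:

```python
import random

TODO = ["create a profile", "sign into the account",
        "create a new post", "leave a comment"]

def attempt(task, rng):
    """Stand-in for 'AI drives a programmable browser and tries the task'.
    Here it simply succeeds 70% of the time."""
    return rng.random() < 0.7

def run_until_done(todo):
    rng = random.Random(0)
    log = []
    for task in todo:
        tries = 0
        while True:  # bottleneck: no moving on until this task verifies
            tries += 1
            if attempt(task, rng):
                log.append((task, tries))
                break
    return log

for task, tries in run_until_done(TODO):
    print(f"{task}: done after {tries} attempt(s)")
```

In the real system the verification step is the watcher checking actual outcomes in the browser, not a coin flip, but the enforce-retry-advance skeleton is the same.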

There is a downside: it is incredibly energy expensive

This I must say is an incredibly energy expensive way to do things.

In some tasks, it would be much less energy expensive if you just used a deterministic approach in the first place, or a human, fuelled by coffee and peanut butter and jam sandwiches.

However, it is very, very difficult to do things deterministically in that way, because the resources are in short supply, and ultimately it is much slower.

It is easier, more reliable, and quicker to do something probabilistically and then refine it deterministically, if you know what you're doing.

That is the trade. You spend energy and compute to 100x human time and human effort. You accept that the model is sloppy, then you wrap it in rules that are not sloppy. You let probability do the exploration, then you let determinism do the enforcement.

And that is it

That is all I am going to say. There you go.

If you like that, leave a comment and let us have a discussion about it.

Wallop.

This article was created with AI Blog Builder
