AI

Want to learn how AI works? Read my guide.

I've been working with AI for over a decade.

On this page, I write about applying AI in real products: what works, what doesn't, and how teams operate when systems are probabilistic rather than deterministic. My background includes Master's-level study in Artificial Intelligence and leading AI integrations such as text-to-speech at The Economist.

AI is more ubiquitous now, but the fundamentals haven't changed.

AI and Product

AI is arriving inside products built on decades of traditional software thinking. In the old world, if something doesn't work, you fix it.

Right now we're in a hybrid moment. AI is the 'new flatmate'. The rest of the house still runs on deterministic software, but this new resident behaves differently.

Stakeholders, engineers and product managers alike have spent their careers working with systems where behaviour is predictable and defects can be traced to a clear cause. AI works on probabilities, applying a best guess to a problem.

Product managers today are the first people in history navigating this shift. We're integrating systems that don't conform to the models teams have relied on for years. Applying those expectations leads to misdiagnosis and the wrong solutions.

The work now includes redefining what "working" means, communicating limitations clearly and helping teams make decisions within those constraints.

Case Study: Integrating AI-generated text-to-speech at The Economist

[Image: "Listen to this story" AI-narrated audio player interface]

At The Economist, I worked on integrating text-to-speech into our publishing workflow to improve accessibility and offer new ways to consume journalism.

We were already recording human audio for every article in the weekly edition. However, articles were published online before the weekly edition, and some articles were published online-only. We knew that many users consumed our content in audio format, but there was a gap in our product: users who wanted to listen to online-first or online-only articles couldn't.

When an article was published online, we sent a payload of pruned article text to our text-to-speech system to generate audio on publish. When the human-narrated audio was ready, we switched out the audio file.
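The publish-then-swap flow can be sketched roughly as follows. The function names, the pruning rules and the `tts_client` interface are hypothetical simplifications, not the actual Economist pipeline:

```python
import re

def prune_article(html_body: str) -> str:
    """Strip markup before sending text to TTS (a hypothetical,
    minimal version of the pruning step)."""
    text = re.sub(r"<[^>]+>", " ", html_body)   # drop HTML tags
    text = re.sub(r"\s+", " ", text).strip()    # normalise whitespace
    return text

def on_publish(article_id: str, html_body: str, tts_client) -> str:
    """On publish, request interim AI-generated audio for the article."""
    payload = {"article_id": article_id, "text": prune_article(html_body)}
    return tts_client.synthesise(payload)       # assumed to return an audio URL

def on_human_audio_ready(article_id: str, human_audio_url: str, catalogue: dict) -> None:
    """When the human narration arrives, swap it in for the AI audio."""
    catalogue[article_id] = human_audio_url
```

The key design point is that listeners get audio immediately at publish time, and the better human recording replaces it transparently later.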

We monitored output quality, gathered user feedback and deployed targeted fixes where possible. Just as importantly, we built shared understanding across editorial, product and engineering about what the system could reliably do and where its limits were.

What we learned

  1. Fixes are constrained by tooling
    Some pronunciation issues could be resolved through alias substitutions. Others introduced regressions or couldn't be solved reliably.
  2. Literal processing creates edge cases
    The system reads raw text. It does not infer meaning or context.
  3. Proper nouns are hard at scale
    Names, brands and places vary widely in formatting and frequency, making consistent correction difficult.
  4. Changes can degrade other cases
    A fix that improves one example may worsen another. Changes require broad testing.
  5. Quality detection is reactive
    At scale, issues are often surfaced through listener feedback rather than automated detection.
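Lesson 1's alias substitutions can be illustrated with a small sketch. The alias table and respellings here are invented examples; the point is that word boundaries keep a fix from leaking into other tokens, which is exactly the kind of regression described above:

```python
import re

# Hypothetical pronunciation aliases: written form -> respelling the TTS
# engine should read aloud. Purely illustrative, not editorial guidance.
ALIASES = {
    "EU": "E U",        # spell out the initialism
    "Qatar": "Cutter",  # approximate respelling
}

def apply_aliases(text: str) -> str:
    """Substitute each alias only on whole-word matches, so that
    'EU' inside 'NEUTRAL' is left untouched."""
    for written, spoken in ALIASES.items():
        text = re.sub(rf"\b{re.escape(written)}\b", spoken, text)
    return text
```

A naive substring replace would turn "NEUTRAL" into "N E U TRAL"; the `\b` boundaries prevent that, but as the lessons note, even boundary-aware rules can't be guaranteed safe without broad testing.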

A practical example: when fixes backfire

We found values like 10m and 50m were read letter by letter ("ten em") instead of as "ten million", which conflicted with our editorial style.

A blanket rule to convert number-"m" to "million" seemed straightforward. We warned it could create unintended side effects. It did. The system began reading "I'm" as "I million".

We rolled the change back.

We're now exploring upstream approaches, such as expanding "5m" to "5 million" in the payload sent to TTS while preserving the original text. In other contexts the same token may mean metres or be part of a contraction.
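The difference between the blanket rule and a narrower one can be shown in a few lines. This is a sketch of the idea, not our production logic: requiring digits immediately before the "m" avoids the "I'm" regression, though it still cannot distinguish millions from metres on its own:

```python
import re

def expand_millions(text: str) -> str:
    """Expand '5m' -> '5 million' only when 'm' directly follows digits
    and ends the token. A blanket number-to-'million' rule caught
    contractions like "I'm"; this narrower rule does not, but it still
    can't tell millions from metres without more context."""
    return re.sub(r"(\d+(?:\.\d+)?)m\b", r"\1 million", text)
```

Even this safer rule would wrongly expand "a 100m sprint", which is why we prefer upstream expansion in the payload, where editorial context is available.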

Aligning stakeholders around AI reality

The biggest challenge wasn't technical.

AI systems fail differently from traditional software. Some issues can be improved, mitigated or worked around, but not eliminated.

Helping stakeholders understand this changes the conversation. Instead of asking why something is broken, teams focus on acceptable quality thresholds, mitigation strategies and user impact.

That shift leads to better decisions and more realistic expectations.

How I work with AI today

I use AI tools daily and build workflows around them, but I treat them as components with strengths and limits rather than drop-in solutions.

The value comes from:

  • understanding system behaviour
  • designing workflows around constraints
  • evaluating outcomes rather than impressions
  • keeping human judgement where it matters

What interests me now

I'm particularly interested in:

  • integrating AI appropriately
  • understanding where AI falls short
  • communicating AI behaviour to non-technical teams
  • scaling AI systems responsibly
  • why vibe-coded software shouldn't be used in production
  • the subtlety of good AI integration