When to Call the LLM?

My AI pipeline was slower than it needed to be. When I was building a Book-to-Audio converter, the obvious culprit was the LLM cleaning every paragraph of OCR-mangled text. The fix seemed simple: add a gate that skips clean paragraphs entirely. It worked. But then the timing data revealed a second slowdown, one that had stayed hidden until the gate exposed it.


Introduction to Genetic Programming

What if you could write a program without actually writing it? Genetic Programming does exactly that — it evolves code the same way nature evolves species: randomly, ruthlessly, and surprisingly effectively. In this post I walk through how it works, build up a toy version from scratch, and then turn it loose on a hidden mathematical function to see if it can find the answer on its own. Spoiler: it can. But the more interesting question is how.


The No Free Lunch Theorem: Why No Learning Algorithm Is Universally Best

There's a mathematical proof that says no algorithm — no matter how clever, how sophisticated, or how well-designed — can outperform random guessing when averaged across all possible problems. Not A*, not neural networks, not humans in the loop. Every advantage on one class of problems is paid for, dollar for dollar, somewhere else. It sounds like it should be false. It isn't. And once you understand why, you'll never look at machine learning the same way again.


Inside the Qwen3-TTS Engine Code (Qwen3-TTS, Part 2)

What does it actually take to plug a brand-new AI voice engine into an existing codebase without breaking everything else? This post pulls back the curtain on the code behind Book2Audio's Qwen3-TTS integration — from the two-method abstraction that makes any TTS engine swappable, to the GPU memory tricks that squeeze a 1.7B model onto a consumer laptop.


Refactoring the Book2Audio Parsers

The unglamorous work of software is making things consistent — and this update to Book2Audio is exactly that. Two parsers that did the same job differently have been brought into alignment, sharing a single text-cleaning pipeline and a unified paragraph accumulation strategy. It's the kind of refactor that doesn't change what the tool does today, but makes possible something much more interesting tomorrow: an LLM-based cleaning step that can fix the OCR errors, broken page splits, and stray footnotes that rule-based cleaning can never quite catch.


What Exactly Is an Inductive Bias?

Every learning algorithm is making a bet. It can't prove its predictions from the data alone — it's sneaking in assumptions, whether it admits them or not. Name those assumptions precisely enough, and something surprising emerges: there's no such thing as induction. It's deduction in disguise. This post unpacks what that means, why stronger assumptions lead to better generalization and more spectacular failures, and what it reveals about neural networks that most people never think to ask.


Adding EPUB Support to Book2Audio

Book2Audio started as a PDF converter — but the best books often come as EPUBs. This post walks through how we added EPUB support, what it took to strip a RAG-focused parser down to something clean enough for audio, and the small details that turn out to matter when you're reading a book aloud: chapter titles that actually get spoken, footnote markers that don't interrupt the flow, and a debug mode that shows you exactly what the parser heard before you commit to a three-hour conversion run.


Implementing Qwen3-TTS in My PDF-to-Audiobook Pipeline (Qwen3-TTS, Part 1)

What if you could turn any PDF into an audiobook and fine-tune the narrator's voice with nothing but a plain English instruction? Alibaba just open-sourced Qwen3-TTS, and it's worth paying attention to — nine built-in speakers, natural language style control, and weights you can run on a laptop GPU. But does it actually sound good enough to listen to for hours? The answer might surprise you.


Machine Learning 101: The Key Concepts Behind Every Learning Algorithm

Machine learning textbooks have their own vocabulary. But behind the jargon lies a process that would be deeply familiar to Karl Popper: conjecture and refutation. This post is a short reference guide


Book2Audio: Reviving My PDF-to-Audiobook Project (and Fighting Dependency Hell Along the Way)

Converting PDFs to audiobooks sounds simple. It is not. There were memory crashes, dependency hell, and software updates that broke more than they fixed, but the result is a working open-source pipeline that turns any PDF into audio you'd actually want to listen to. And the best parts are still coming.

