<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="https://www.mindfiretechnology.com/blog/rss/xslt"?>
<rss xmlns:a10="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Mindfire Technology</title>
    <link>https://www.mindfiretechnology.com/blog/</link>
    <description>Welcome to our blog, where we share technical and business knowledge based on real life experiences.</description>
    <generator>Articulate, blogging built on Umbraco</generator>
    <item>
      <guid isPermaLink="false">2756</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/when-to-call-the-llm/</link>
      <category>System.String[]</category>
      <title>When to Call the LLM?</title>
      <description>&lt;p&gt;I've been building a Book-to-Audio pipeline. See, for example, &lt;a href="https://www.mindfiretechnology.com/blog/archive/refactoring-the-book2audio-parsers/"&gt;here.&lt;/a&gt; The idea is simple: take a PDF or
EPUB, parse it into clean paragraphs, and feed those paragraphs to a
text-to-speech engine. Simple enough. But scanned books are messy. OCR artifacts
like &lt;code&gt;Whenin&lt;/code&gt; (a word join from &amp;quot;When in&amp;quot;), hyphenated line breaks carried over
from print, footnote markers embedded mid-sentence — the raw text often needs
cleaning before it's speakable.&lt;/p&gt;
&lt;p&gt;So I added an LLM-based cleaner. You pass it a paragraph as well as page context, it returns a cleaned version and a classification: body, footnote, or drop. It works well. The problem is I was calling it on every paragraph, which took forever using a local LLM.&lt;/p&gt;
&lt;p&gt;The Book2Audio repo is found &lt;a href="https://github.com/brucenielson/Book2Audio"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Was That Actually a Problem?&lt;/h2&gt;
&lt;p&gt;If I was just burning tokens for Claude or ChatGPT, time wouldn't be an issue. But token budget would be.&lt;/p&gt;
&lt;p&gt;Either way, a typical chapter has hundreds of paragraphs. The LLM takes roughly
five seconds per call. Most paragraphs — the ones that say &amp;quot;He has refused to
pass Laws of immediate importance&amp;quot; — are already perfectly clean. Calling the
LLM on those is waste, pure and simple, because any 'improvement' from the LLM would necessarily be a departure from what the book actually said. For an audiobook converter, that's not desirable.&lt;/p&gt;
&lt;p&gt;The more interesting question is: &lt;em&gt;when do we actually need the LLM?&lt;/em&gt; The answer
is when the text contains something the parser couldn't fix — OCR artifacts,
broken word joins, corrupted characters. Not when it contains the word &amp;quot;justice.&amp;quot;&lt;/p&gt;
&lt;p&gt;So I built a gate.&lt;/p&gt;
&lt;h2&gt;The Gate: &lt;code&gt;_all_words_valid&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The idea is straightforward. Before calling the LLM, check whether every word in
the paragraph is already valid English. If they all are, skip the LLM entirely.
The paragraph is clean enough.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def _all_words_valid(text: str) -&amp;gt; bool:
    for token in text.split():
        stripped = re.sub(r&amp;quot;[,;:.!?()'\&amp;quot;—–]&amp;quot;, '', token.lower())
        if not stripped or not word_validator.is_valid_word(stripped):
            return False
    return True
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;word_validator.is_valid_word()&lt;/code&gt; is doing real work here: it checks the NLTK
English words corpus, then falls back to Porter stemming and WordNet
lemmatization. So &amp;quot;running&amp;quot;, &amp;quot;armies&amp;quot;, &amp;quot;endeavoured&amp;quot; — these all pass. Just
checking a raw dictionary wouldn't cut it. English inflection is too irregular.&lt;/p&gt;
&lt;p&gt;The result: the Declaration of Independence, which has clean prose, now sends
only a handful of paragraphs to the LLM — the ones with genuine OCR artifacts —
instead of all of them.&lt;/p&gt;
&lt;h2&gt;What Was Actually Slow?&lt;/h2&gt;
&lt;p&gt;Once I had the gate in place and added timing debug statements, the results were
revealing:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;[TIMING] validation=6.24s (1 skipped) | llm=9.26s (1 calls) | total_timed=15.50s&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Six seconds on word validation for &lt;em&gt;one paragraph&lt;/em&gt;? That can't be right. The
validation logic is fast. What was happening is that NLTK loads its word corpus,
stemmers, and WordNet lemmatizer lazily — on first use. The first call to
&lt;code&gt;is_valid_word()&lt;/code&gt; was triggering the entire NLTK initialization chain mid-paragraph,
and that takes six seconds.&lt;/p&gt;
&lt;p&gt;The fix is a warm-up call at the start of processing:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if self._cleaner is not None:
    word_validator.is_valid_word('warm')
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After that, the timing looked quite different:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;[TIMING] validation=0.01s (14 skipped) | llm=31.11s (7 calls) | total_timed=31.12s&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Validation: effectively free. LLM: the clear bottleneck, as expected.&lt;/p&gt;
&lt;h2&gt;What the Remaining LLM Calls Reveal&lt;/h2&gt;
&lt;p&gt;Of the seven paragraphs still going to the LLM, only two were genuine OCR
artifacts. The other five were false positives — cases where the validator
correctly identified a token it didn't recognise, but the token was actually fine:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;self-evident&lt;/code&gt; — the hyphen confused the token stripper&lt;/li&gt;
&lt;li&gt;&lt;code&gt;endeavoured&lt;/code&gt; (twice) — valid British spelling, not in the American NLTK corpus&lt;/li&gt;
&lt;li&gt;&lt;code&gt;offences&lt;/code&gt; — same problem&lt;/li&gt;
&lt;li&gt;&lt;code&gt;compleat&lt;/code&gt; — archaic spelling, borderline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is worth sitting with for a moment. The gate is working correctly in the
sense that it's catching tokens it genuinely can't validate. The problem is that
&amp;quot;can't validate&amp;quot; and &amp;quot;needs LLM cleaning&amp;quot; are not the same thing. A British
spelling is not an OCR artifact. We're using the wrong signal.&lt;/p&gt;
&lt;p&gt;A better gate would know about British English. A simpler gate would split
hyphenated words and check each part. Neither fix is difficult. I haven't made
them yet.&lt;/p&gt;
&lt;h2&gt;What I'd Do Differently&lt;/h2&gt;
&lt;p&gt;The timing instrumentation taught me something I should have measured from the
start: &lt;em&gt;where is the time actually going?&lt;/em&gt; I had assumed the LLM was the
bottleneck. It was — but only after fixing a completely separate problem (NLTK
lazy loading) that was masking everything else.&lt;/p&gt;
&lt;p&gt;This is a mundane lesson dressed up as a profound one: don't optimise until you
measure. But there's a sharper version of it. The six-second NLTK startup cost
was invisible in normal use because it happened once per run. It only became
visible when we added the gate — because the gate called &lt;code&gt;is_valid_word()&lt;/code&gt; on the
&lt;em&gt;first paragraph&lt;/em&gt; instead of waiting for the LLM to warm up later. The
optimisation revealed the bottleneck it was supposed to solve.&lt;/p&gt;
&lt;p&gt;That's not irony. That's just what measurement does. You don't know what's slow
until something makes the slow thing visible.&lt;/p&gt;
&lt;h2&gt;Where This Leaves Things&lt;/h2&gt;
&lt;p&gt;The pipeline is meaningfully faster. Most paragraphs skip the LLM entirely.
The remaining LLM calls are concentrated on the paragraphs that actually need
attention. The test suite now runs in about 30 seconds instead of... considerably
longer.&lt;/p&gt;
&lt;p&gt;What remains: British spellings, hyphenated words, and the general question of
what &amp;quot;valid English&amp;quot; really means for a validator that's supposed to be catching
OCR artifacts, not judging orthographic conventions. That's a problem worth
thinking about carefully before reaching for a fix.&lt;/p&gt;
&lt;p&gt;However, some caution is warranted. The gate is a heuristic. It will miss things.
A paragraph full of correctly-spelled words can still be semantically broken in
ways no word list will catch. The LLM is still there for the hard cases. That's
probably the right architecture — use it as a last resort, not a default.&lt;/p&gt;
&lt;h2&gt;Lesson's Learned&lt;/h2&gt;
&lt;p&gt;Of course this would all go a lot faster if I wasn't using  a local LLM, in this case &lt;code&gt;llama3.1:8b&lt;/code&gt;. That is a future idea. But I wanted to be able to run this locally if needs be. &lt;/p&gt;
&lt;p&gt;The idea of a gate to reduce calls to the LLM sped things up quite a bit. The checking if the LLM didn't make a mistake will be a topic for a future post. But let's just say that LLMs can sometimes get creative. So a shorter leash may be necessary. Using NLTK -- the poor thing seems almost defunct since the advent of LLMs -- really does save some time on the verification that the LLM didn't hallucinate. Overall, I'm pleased with the progress.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Wed, 27 May 2026 09:12:39 -0600</pubDate>
      <a10:updated>2026-05-27T09:12:39-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2004</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/introduction-to-genetic-programming/</link>
      <category>System.String[]</category>
      <title>Introduction to Genetic Programming</title>
      <description>&lt;blockquote&gt;
&lt;p&gt;Genetic programming addresses the problem of automatic programming,
namely, the problem of how to enable a computer to do useful things
without instructing it, step by step, on how to do it. (John Koza in
&lt;a href="https://www.amazon.com/gp/product/155860510X/ref=as_li_qf_asin_il_tl?ie=UTF8&amp;amp;tag=thelightrebor-20&amp;amp;creative=9325&amp;amp;linkCode=as2&amp;amp;creativeASIN=155860510X&amp;amp;linkId=c41aac0cef5626f12c27973ab8867839"&gt;Genetic Programming: An Introduction&lt;/a&gt;, p. vii)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Genetic Programming is a type of Machine Learning that builds programs via a simulated form of Darwin's evolution by natural selection. Starting with a code base I got from Toby Segaran's excellent book &lt;a href="https://www.amazon.com/gp/product/0596529325/ref=as_li_qf_asin_il_tl?ie=UTF8&amp;amp;tag=thelightrebor-20&amp;amp;creative=9325&amp;amp;linkCode=as2&amp;amp;creativeASIN=0596529325&amp;amp;linkId=d7617eec991cfcd38512b71857914b03"&gt;Collective Intelligence&lt;/a&gt; (which might be one of the best and briefest books on Machine Learning I've seen) I will layout how Genetic Programming works and then perform a couple of experiments to see if I can improve the starting code base. I hope to go further in future posts.&lt;/p&gt;
&lt;p&gt;The actual code I used can be found in this &lt;a href="https://colab.research.google.com/drive/1tmLp2-bJF5MuY78qBFGL8Xu_4DQQdJnU"&gt;Python Notebook on Google Colaboratory&lt;/a&gt; and you can even tweak and run the code yourself.&lt;/p&gt;
&lt;h2&gt;Darwin's Program&lt;/h2&gt;
&lt;p&gt;In Genetic Programming we have a hidden function we want to automatically write a program to solve. All of life is a function, so this might be anything: predicting chances of a heart attack, find the best strategy playing a game, etc. For our purposes, we're going to be using a simple hidden function (taken from Segaran's book):&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;X^2 + 2Y + 3X + 5&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;What we'll do is write a function to randomly feed various X and Y parameters into that function and get back the correct solution. These results will all be stored in a table. That table will look something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/gptable.png" alt="Hidden Function Table" /&gt;&lt;/p&gt;
&lt;p&gt;Of course in reality it will have a lot more rows. But the goal of our Genetic Programming algorithm is to build up a population of code that, over generations, gets better and better at predicting the correct result using the data in that table. With some luck, the Genetic Programming algorithm will find the actual hidden function and perfectly predict the results.&lt;/p&gt;
&lt;h2&gt;Our Toy Programming Language&lt;/h2&gt;
&lt;p&gt;For this experiment, we're going to stick with a very simple programming model that is made up of the following functions: add, multiply, if, &amp;gt;, subtract. In addition, it will be able to specify (for each of these functions) a parameter or a constant. While this is hardly a complete programming language, it is sufficient to find our hidden function and it should give you a pretty good idea how genetic programming works.&lt;/p&gt;
&lt;h2&gt;Programming Trees&lt;/h2&gt;
&lt;p&gt;Now you might be wondering how in the world can we evolve a program? If you're asking this, you probably have in mind typing a program into a text editor or Visual Studio and then randomly throwing statements together. Of course, if you did this, there would be primarily just bad syntax. To avoid this problem, we're going to use a trick where the program has to exist in a tree. So imagine a program that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/gptree.png" alt="Programming Tree" /&gt;&lt;/p&gt;
&lt;p&gt;This programming tree is equivalent to our hidden function and would perfectly predict each result without error.&lt;/p&gt;
&lt;h2&gt;Building a Population&lt;/h2&gt;
&lt;p&gt;So how do we actually evolve toward a tree like that? We start with a population of randomly generated trees. Most of them will be terrible — they'll wildly mispredict the results in the table. But &amp;quot;terrible&amp;quot; is still useful information. We score each program by running it against the data and measuring how far off its predictions are. This score is called the &lt;strong&gt;fitness&lt;/strong&gt; of the program. Low error = high fitness.&lt;/p&gt;
&lt;p&gt;Now here's where Darwin comes in.&lt;/p&gt;
&lt;p&gt;We take the fittest programs and use them to breed the next generation. There are two main operations for doing this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mutation&lt;/strong&gt; — Take a program tree and randomly change one of its nodes. Maybe swap a &lt;code&gt;multiply&lt;/code&gt; for an &lt;code&gt;add&lt;/code&gt;, or replace a constant with a parameter. This is the equivalent of a random genetic mutation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Crossover&lt;/strong&gt; — Take two program trees and swap a random subtree between them. This is the equivalent of sexual reproduction — combining the best features of two successful programs in hopes of producing something even better.&lt;/p&gt;
&lt;p&gt;Run enough generations of selection, mutation, and crossover, and the population tends to converge on better and better solutions. Sometimes it finds the exact hidden function. Sometimes it finds a different function that happens to produce the same outputs for all the data in the table (a perfectly valid solution, even if it isn't the &amp;quot;original&amp;quot; one).&lt;/p&gt;
&lt;h2&gt;The Experiment&lt;/h2&gt;
&lt;p&gt;I ran the algorithm against our hidden function &lt;strong&gt;X^2 + 2Y + 3X + 5&lt;/strong&gt; with a population of 500 programs over 100 generations. The results were encouraging. Within a few dozen generations the best program in the population had dramatically reduced its error score, and by the end it had converged on a solution that perfectly predicted every row in the table.&lt;/p&gt;
&lt;p&gt;You can see the exact results — and run the code yourself — in the &lt;a href="https://colab.research.google.com/drive/1tmLp2-bJF5MuY78qBFGL8Xu_4DQQdJnU"&gt;Python Notebook&lt;/a&gt;. I'd encourage you to play with the population size and generation count and see how it affects both accuracy and speed.&lt;/p&gt;
&lt;h2&gt;What's Next&lt;/h2&gt;
&lt;p&gt;This is just the starting point. A few obvious questions worth exploring in future posts: Can we improve the algorithm's speed by being smarter about how we generate the initial population? Does a larger population always produce better results, or does it just produce slower ones? And perhaps most interestingly — how does Genetic Programming hold up against other Machine Learning approaches on harder problems?&lt;/p&gt;
&lt;p&gt;All theories, including this algorithm, are only as good as the problems we test them against. So the next step is to make the problem harder.&lt;/p&gt;
</description>
      <pubDate>Wed, 20 May 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-05-20T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2749</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/the-no-free-lunch-theorem-why-no-learning-algorithm-is-universally-best/</link>
      <category>System.String[]</category>
      <title>The No Free Lunch Theorem: Why No Learning Algorithm Is Universally Best</title>
      <description>&lt;p&gt;In my &lt;a href="https://www.mindfiretechnology.com/blog/archive/what-exactly-is-an-inductive-bias/"&gt;previous post on inductive bias&lt;/a&gt;, I ended with an open question: is there an &amp;quot;optimal&amp;quot; inductive bias -- one whose search space is universal but whose search strategy is still efficient and tractable for any problem?&lt;/p&gt;
&lt;p&gt;Aren't humans an &amp;quot;optimal&amp;quot; general learner compared to, say, existing machine learning algorithms? So intuitively it seems like the answer should be &amp;quot;yes, there is an optimal learner.&amp;quot;&lt;/p&gt;
&lt;p&gt;As it turns out, there is a mathematical proof known as the &amp;quot;No Free Lunch Theorem&amp;quot; that proves the answer is actually &amp;quot;no.&amp;quot; It is one of the most important results in the theory of optimization.&lt;/p&gt;
&lt;p&gt;In 1997, David Wolpert and William Macready published a paper called &lt;a href="https://www.cs.ubc.ca/~hutter/earg/papers07/00585893.pdf"&gt;&amp;quot;No Free Lunch Theorems for Optimization&amp;quot;&lt;/a&gt; that proved something remarkable: averaged over all possible problems, no optimization strategy performs better than any other. Including random search. Including random guessing. Including humans in the loop. &lt;em&gt;Every&lt;/em&gt; strategy that gains an advantage on some class of problems pays for it with equal disadvantage on another class.&lt;/p&gt;
&lt;p&gt;And when they say &amp;quot;strategy,&amp;quot; they mean it broadly. As Ho and Pepyne put it in their &lt;a href="https://faculty.cc.gatech.edu/~isbell/reading/papers/nfl-optimization-explanation.pdf"&gt;accessible explanation of the theorem&lt;/a&gt;: &amp;quot;Strategies include methods involving search, adaptation, learning, voting, feedback, dynamic programming, evolution, randomization, and even humans in the loop. In short, the concept of strategy covers any method for coming up with a solution to an optimization problem. Nothing can be more general or more inclusive&amp;quot; (Ho and Pepyne, 2001).&lt;/p&gt;
&lt;p&gt;This sounds absurd. We know that some algorithms work better than others in practice. How can all strategies be equally good? The answer lies in those three words: &amp;quot;all possible problems.&amp;quot; Let me show you why.&lt;/p&gt;
&lt;h2&gt;A Simple Pathfinding Problem&lt;/h2&gt;
&lt;p&gt;Imagine a robot that needs to find the shortest path from point A to point D through a small network:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;    B
   / \
  A   D
   \ /
    C
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There are two possible routes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Path 1:&lt;/strong&gt; A → B → D&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Path 2:&lt;/strong&gt; A → C → D&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each path has a total distance. The robot does not know the distances in advance -- it must pick a path and find out. The goal is to pick the shorter one.&lt;/p&gt;
&lt;h2&gt;The Universe of All Possible Problems&lt;/h2&gt;
&lt;p&gt;Let us say each path's total distance can be Short (1), Medium (5), or Long (10). A &amp;quot;problem&amp;quot; is a specific assignment of distances to both paths. In the formal framework, each such assignment is a function -- labeled f0, f1, and so on -- that maps each path to a distance. Since each path can independently be 1, 5, or 10, there are 3 x 3 = 9 possible functions. Here they all are:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The P-Matrix: All 9 Possible Problems&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Problem:    f0   f1   f2   f3   f4   f5   f6   f7   f8
Path 1:      1    5   10    1    5   10    1    5   10
Path 2:      1    1    1    5    5    5   10   10   10
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This table is what Ho and Pepyne call the &lt;strong&gt;P-matrix&lt;/strong&gt;. The rows represent the available choices. The columns represent every possible problem -- every possible assignment of distances to paths. The entries are the distances.&lt;/p&gt;
&lt;p&gt;Most of these nine problems will never occur in the real world. Some of them might correspond to a real map. Others are pure mathematical fiction -- worlds where both paths are equally short, or where the path that looks longer on a map is actually shorter. The P-matrix does not care about physical plausibility. It enumerates &lt;em&gt;everything&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Now consider two strategies:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strategy 1:&lt;/strong&gt; Always take Path 1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strategy 2:&lt;/strong&gt; Always take Path 2.&lt;/p&gt;
&lt;p&gt;Strategy 1 gets the Path 1 distance on every problem: 1, 5, 10, 1, 5, 10, 1, 5, 10. &lt;strong&gt;Total: 48.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Strategy 2 gets the Path 2 distance on every problem: 1, 1, 1, 5, 5, 5, 10, 10, 10. &lt;strong&gt;Total: 48.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The totals are identical. Averaged over all nine possible problems, neither strategy is better than the other.&lt;/p&gt;
&lt;h2&gt;Where Each Strategy Wins&lt;/h2&gt;
&lt;p&gt;The totals are the same, but the individual problems tell a more interesting story.&lt;/p&gt;
&lt;p&gt;Strategy 1 wins on f3, f6, and f7 -- problems where Path 1 is shorter than Path 2. Strategy 2 wins on f1, f2, and f5 -- problems where Path 2 is shorter. They tie on f0, f4, and f8.&lt;/p&gt;
&lt;p&gt;On the problems where Strategy 1 wins, it wins by a combined total of (5-1) + (10-1) + (10-5) = 18.&lt;/p&gt;
&lt;p&gt;On the problems where Strategy 2 wins, it wins by a combined total of (5-1) + (10-1) + (10-5) = 18.&lt;/p&gt;
&lt;p&gt;The gains and losses cancel perfectly. This is not a coincidence.&lt;/p&gt;
&lt;h2&gt;Why the P-Matrix Makes This Inevitable&lt;/h2&gt;
&lt;p&gt;Look at the P-matrix again. The columns enumerate &lt;em&gt;every possible&lt;/em&gt; combination of distances. In the Path 1 row, every possible distance (1, 5, 10) appears exactly three times. The same is true for the Path 2 row.&lt;/p&gt;
&lt;p&gt;Ho and Pepyne point out that mathematically this is a &lt;strong&gt;counting matrix&lt;/strong&gt; -- a matrix whose columns count through all possible value assignments. The key property of a counting matrix is that &lt;strong&gt;all row sums are equal&lt;/strong&gt;. You can verify this by inspection: both rows sum to 48. No row can have a higher total than any other when the columns enumerate every possible assignment. This is a mathematical certainty. We showed this for our small 2 x 9 matrix, but the property holds for any size. As long as the columns enumerate every possible combination of values, the matrix is a counting matrix and the row sums will always be equal.&lt;/p&gt;
&lt;p&gt;And &lt;em&gt;that&lt;/em&gt; is the No Free Lunch theorem. No matter what strategy you use -- no matter how sophisticated, how clever, how well-informed -- if you sum its performance across &lt;em&gt;all possible problems&lt;/em&gt;, you get the same total as any other strategy. The P-matrix is a counting matrix, and counting matrices have equal row sums. There is no way around this.&lt;/p&gt;
&lt;h2&gt;But Real Algorithms Do Work Better&lt;/h2&gt;
&lt;p&gt;If all strategies are truly equal, why do real algorithms outperform random guessing in practice?&lt;/p&gt;
&lt;p&gt;Because real problems are not drawn uniformly from all possible problems. The real world has structure.&lt;/p&gt;
&lt;p&gt;Consider the A-star algorithm, one of the most effective pathfinding algorithms ever developed. A-star uses a heuristic to decide which paths to explore first. In our network, if B is geographically close to D, A-star's heuristic estimates that the path through B is likely shorter. It explores that path first.&lt;/p&gt;
&lt;p&gt;This heuristic relies on a specific assumption about the world: that geographic proximity correlates with travel distance. In Euclidean space, this is guaranteed. If B is close to D as the crow flies, then the travel distance from B to D cannot be wildly longer than the straight-line distance. This property -- essentially the triangle inequality -- is what makes A-star's heuristic &lt;em&gt;admissible&lt;/em&gt;, meaning it never overestimates the true remaining distance.&lt;/p&gt;
&lt;p&gt;In the real world, this assumption holds. Roads may be winding, but a point that is one mile away as the crow flies is never a thousand miles away by road. A-star exploits this structure to prune bad paths without exploring them, which is what makes it fast and effective.&lt;/p&gt;
&lt;p&gt;But now imagine a world with a trans-dimensional hopper -- a device that warps space so that two points that are far apart as the crow flies can have a travel distance of nearly zero--or even a physically impossible negative distance. In this world, node C might be geographically far from D, but the hopper road from C to D is absurdly short. A-star's heuristic looks at C, estimates a long remaining distance based on the straight-line measurement, and concludes &amp;quot;that direction is not worth exploring.&amp;quot; It prunes the path through C -- the path that, thanks to the hopper, is actually the shortest.&lt;/p&gt;
&lt;p&gt;A-star does not just miss the optimal path. It &lt;em&gt;confidently excludes&lt;/em&gt; it. Its heuristic, which is so reliable in the real world, becomes actively misleading in a world where the relationship between straight-line distance and travel distance is broken. But a random search strategy -- which assigns no meaning to geographic proximity and just tries paths arbitrarily -- would have an equal chance of stumbling onto the hopper path.&lt;/p&gt;
&lt;p&gt;This is the No Free Lunch theorem made concrete. A-star's inductive bias is the assumption that geometry is well-behaved. That assumption makes it brilliant in our world and blind in the hopper world. The P-matrix contains both kinds of worlds, and the gains and losses cancel.&lt;/p&gt;
&lt;h2&gt;The Connection to Neural Networks&lt;/h2&gt;
&lt;p&gt;The same logic applies to every learning algorithm we have discussed in this series.&lt;/p&gt;
&lt;p&gt;Recall from the &lt;a href="https://www.mindfiretechnology.com/blog/archive/what-exactly-is-an-inductive-bias/"&gt;inductive bias post&lt;/a&gt; that Mitchell characterized backpropagation's inductive bias as &amp;quot;smooth interpolation between data points.&amp;quot; A neural network trained with backpropagation assumes that the underlying function is smooth -- that nearby inputs produce nearby outputs. This is what allows it to generalize from training data to new examples.&lt;/p&gt;
&lt;p&gt;But the NFL theorem tells us: for every smooth function where this assumption helps, there exists an anti-smooth function where it hurts by exactly the same amount. A function where nearby inputs map to wildly different outputs will fool the neural network into confidently predicting smooth transitions that do not exist. On that function, random guessing would do just as well.&lt;/p&gt;
&lt;p&gt;The neural network's situation is identical to A-star's. Its inductive bias -- smoothness -- makes it powerful on the kinds of problems the real world actually presents. But that power comes at a cost: poor performance on problems that violate the assumption. The P-matrix contains both kinds, and the row sums are equal.&lt;/p&gt;
&lt;h2&gt;The Connection to Popper&lt;/h2&gt;
&lt;p&gt;This is Mitchell's &amp;quot;futility of bias-free learning&amp;quot; -- from our &lt;a href="https://www.mindfiretechnology.com/blog/archive/induction-is-a-myth-the-futility-of-unbiased-learning/"&gt;first post in this series&lt;/a&gt; -- generalized to all of optimization.&lt;/p&gt;
&lt;p&gt;Mitchell showed that a single learning algorithm with no inductive bias cannot generalize at all. The NFL theorem shows something broader: not only do you need an inductive bias to generalize, but &lt;strong&gt;no single inductive bias is universally best&lt;/strong&gt;. Every bias helps on some problems and hurts on others. There is no free lunch.&lt;/p&gt;
&lt;p&gt;And this is Karl Popper's point yet again. There is no universal method of discovery. There is no algorithm that works for everything. Every act of learning, every act of optimization, every act of scientific discovery requires prior assumptions about the structure of the problem. Those assumptions are what make progress possible -- but they are also what make us fallible. For some class of problems, they necessarily fail.&lt;/p&gt;
&lt;p&gt;The question is never whether your algorithm has assumptions. It always does. The question is whether those assumptions match the world you are actually in.&lt;/p&gt;
&lt;p&gt;The No Free Lunch Theorem was first proved by &lt;a href="https://www.cs.ubc.ca/~hutter/earg/papers07/00585893.pdf"&gt;Wolpert and Macready (1997)&lt;/a&gt;. The P-matrix framework and counting matrix explanation is from &lt;a href="https://faculty.cc.gatech.edu/~isbell/reading/papers/nfl-optimization-explanation.pdf"&gt;Ho and Pepyne (2001)&lt;/a&gt;. All references to Mitchell are from &lt;a href="https://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf"&gt;&lt;em&gt;Machine Learning&lt;/em&gt; (McGraw-Hill, 1997)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Thu, 14 May 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-05-14T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2751</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/inside-the-qwen3-tts-engine-code-qwen3-tts-part-2/</link>
      <category>System.String[]</category>
      <title>Inside the Qwen3-TTS Engine Code (Qwen3-TTS, Part 2)</title>
      <description>&lt;p&gt;This is the follow-up to my &lt;a href="https://www.mindfiretechnology.com/blog/archive/implementing-qwen3-tts-in-my-pdf-to-audiobook-pipeline-qwen3-tts-part-1/"&gt;previous post on adding Qwen3-TTS to Book2Audio&lt;/a&gt;. That post covered how to use Qwen3-TTS from the command line and how I refactored the code to support multiple TTS engines. This post walks through the actual engine code — how it's structured, what each piece does, and how Qwen3-TTS works under the hood.&lt;/p&gt;
&lt;p&gt;All the code discussed here is &lt;a href="https://github.com/brucenielson/Book2Audio/tree/8e7e547b8f97625c62d82f55bbe43d286daceb73"&gt;available in my GitHub repo&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;The Engine Abstraction&lt;/h2&gt;
&lt;p&gt;The starting point is a simple abstract base class that defines what any TTS engine needs to do:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from abc import ABC, abstractmethod
import numpy as np

class TTSEngine(ABC):
    @abstractmethod
    def generate(self, text: str) -&amp;gt; np.ndarray:
        ...

    @property
    @abstractmethod
    def sample_rate(self) -&amp;gt; int:
        ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Two methods, that's it. &lt;code&gt;generate&lt;/code&gt; takes a string of text and returns a numpy array of audio samples. &lt;code&gt;sample_rate&lt;/code&gt; returns the sample rate in Hz so the caller knows how to save the audio correctly. Any TTS backend — Kokoro, Qwen, or something else entirely — just needs to implement these two things.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;AudioGenerator&lt;/code&gt; then wraps any engine and handles the model-agnostic parts:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class AudioGenerator:
    def __init__(self, engine: TTSEngine) -&amp;gt; None:
        self._engine = engine

    def generate(self, text: str) -&amp;gt; np.ndarray:
        return self._engine.generate(text)

    def save(self, audio: np.ndarray, output_file: str) -&amp;gt; None:
        sf.write(output_file, audio, self._engine.sample_rate)

    def generate_and_save(self, text: str, output_file: str) -&amp;gt; None:
        self.save(self.generate(text), output_file)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The key thing here is &lt;code&gt;save&lt;/code&gt; — it pulls &lt;code&gt;sample_rate&lt;/code&gt; from the engine rather than hardcoding it. Different engines could theoretically produce audio at different sample rates, and this handles that transparently. &lt;code&gt;generate_and_save&lt;/code&gt; is just a convenience method that chains the two together.&lt;/p&gt;
&lt;h2&gt;Loading the Model&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;QwenCustomVoiceEngine&lt;/code&gt; constructor handles model loading. If you don't pass in a pre-loaded model, it figures everything out from the &lt;code&gt;model_size&lt;/code&gt; parameter:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;QWEN_MODEL_SIZES = {
    '0.6b': 'Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice',
    '1.7b': 'Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice',
}

def __init__(self,
             speaker: str = 'vivian',
             language: str = 'Auto',
             instruct: str | None = None,
             model_size: str = '0.6b',
             model: Qwen3TTSModel | None = None) -&amp;gt; None:
    if model is None:
        model_id = QWEN_MODEL_SIZES.get(
            model_size.lower(), QWEN_MODEL_SIZES['0.6b']
        )
        attn_impl = 'sdpa'
        try:
            import flash_attn
            attn_impl = 'flash_attention_2'
        except ImportError:
            pass
        device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
        model = Qwen3TTSModel.from_pretrained(
            model_id,
            device_map=device,
            dtype=torch.bfloat16,
            attn_implementation=attn_impl,
        )
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;QWEN_MODEL_SIZES&lt;/code&gt; dictionary maps friendly size names to full Hugging Face model identifiers. This way the rest of the code just passes around &amp;quot;0.6b&amp;quot; or &amp;quot;1.7b&amp;quot; instead of long model strings. It also means that if Qwen releases new checkpoints, there's only one place to update. Note to self: I should really allow you to pass a full model name here and only use this dictionary for shorthands. I need to implement that still. &lt;/p&gt;
&lt;p&gt;There are a few things worth noting about the &lt;code&gt;from_pretrained&lt;/code&gt; call.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;device_map&lt;/code&gt; controls where the model runs. The code checks &lt;code&gt;torch.cuda.is_available()&lt;/code&gt; and uses the GPU if present, falling back to CPU otherwise. This is the same pattern that the Kokoro engine uses, so both engines behave consistently.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;dtype=torch.bfloat16&lt;/code&gt; halves the memory footprint compared to full float32 precision with negligible quality loss. For the 1.7B model, this is the difference between fitting on a consumer GPU and not fitting at all.&lt;/p&gt;
&lt;p&gt;The attention implementation check is about GPU memory efficiency. FlashAttention 2 is an optimized attention algorithm that reduces VRAM usage during inference. But it requires the &lt;code&gt;flash-attn&lt;/code&gt; package, which compiles from source and needs the CUDA Toolkit and a C++ compiler installed — a nontrivial setup on Windows. If it's not available, the engine falls back to PyTorch's built-in scaled dot product attention (&lt;code&gt;sdpa&lt;/code&gt;), which works fine but uses a bit more VRAM. The code handles this gracefully: try the import, use it if it's there, move on if it's not. To be frank, I never got flash attention working — getting the CUDA Toolkit and C++ compiler set up on Windows was more than I wanted to take on right now. So the code is there for when I get around to it, but for now I use sdpa.&lt;/p&gt;
&lt;p&gt;The constructor also accepts a pre-loaded model via the &lt;code&gt;model&lt;/code&gt; parameter. This is useful for testing — you can inject a mock — and it also means you could share a single model instance across multiple engine objects if you needed to.&lt;/p&gt;
&lt;h2&gt;Generating Speech&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;generate&lt;/code&gt; method is where text actually becomes audio:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def generate(self, text: str) -&amp;gt; np.ndarray:
    kwargs = {
        'text': text,
        'language': self._language,
        'speaker': self._speaker,
    }
    if self._instruct is not None:
        kwargs['instruct'] = self._instruct

    wavs, sr = self._model.generate_custom_voice(**kwargs)
    self._sample_rate = sr
    return wavs[0]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It assembles the keyword arguments for &lt;code&gt;generate_custom_voice&lt;/code&gt;, conditionally including the &lt;code&gt;instruct&lt;/code&gt; parameter, makes the call, and returns the first waveform.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;instruct&lt;/code&gt; parameter is only included when it's not &lt;code&gt;None&lt;/code&gt;. This matters because the 0.6B model doesn't support instruction control — only the 1.7B CustomVoice model does. Passing &lt;code&gt;instruct&lt;/code&gt; to the 0.6B model won't cause an error, but it will be silently ignored.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;generate_custom_voice&lt;/code&gt; returns a list of waveforms because the Qwen3-TTS API supports batched generation — you can pass a list of strings and get back multiple audio arrays in one call. For our paragraph-at-a-time use case, we always pass a single string and take &lt;code&gt;wavs[0]&lt;/code&gt;. However, I should really change things to allow this code to handle everything at once as an option. As I mentioned in the previous post, Qwen3-TTS produces a somewhat different voice each time you call it, which makes for a questionable audiobook experience when you're generating paragraph by paragraph. Batching everything into a single call might help with that consistency.&lt;/p&gt;
&lt;p&gt;The sample rate is captured from the return value rather than hardcoded, though in practice Qwen3-TTS always returns 24000 Hz. By reading it from the response, the code stays correct even if a future model version changes the rate.&lt;/p&gt;
&lt;h2&gt;Wiring It Together&lt;/h2&gt;
&lt;p&gt;The CLI entry point in &lt;code&gt;book_to_audio.py&lt;/code&gt; ties everything together. When the user passes &lt;code&gt;--engine qwen&lt;/code&gt;, a small factory function creates the right engine:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def _create_engine(args):
    if args.engine == 'qwen':
        return QwenCustomVoiceEngine(
            speaker=args.speaker,
            language=args.language,
            instruct=args.instruct,
            model_size=args.model_size,
        )
    else:
        return KokoroEngine(voice=args.voice)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That engine gets wrapped in an &lt;code&gt;AudioGenerator&lt;/code&gt;, which gets handed to &lt;code&gt;BookToAudio&lt;/code&gt;, which does the actual document processing. &lt;code&gt;BookToAudio&lt;/code&gt; doesn't know or care whether it's using Kokoro or Qwen — it just calls &lt;code&gt;generate&lt;/code&gt; and gets audio back.&lt;/p&gt;
&lt;p&gt;This is the payoff of the strategy pattern. Adding a third engine later — say, for voice cloning with the Qwen3-TTS Base model — means writing a new engine class, adding an option to &lt;code&gt;_create_engine&lt;/code&gt;, and nothing else changes.&lt;/p&gt;
&lt;h2&gt;What's Next&lt;/h2&gt;
&lt;p&gt;Voice cloning is the natural next step. The Qwen3-TTS Base model can clone a voice from just a few seconds of reference audio, which opens up the possibility of generating an entire audiobook in a specific narrator's voice. That will be a separate engine class since it uses a different model and a different API (&lt;code&gt;generate_voice_clone&lt;/code&gt; instead of &lt;code&gt;generate_custom_voice&lt;/code&gt;), but the abstraction is already in place to support it.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Thu, 07 May 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-05-07T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2754</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/refactoring-the-book2audio-parsers/</link>
      <category>System.String[]</category>
      <title>Refactoring the Book2Audio Parsers</title>
      <description>&lt;p&gt;This is a progress update on &lt;a href="https://github.com/brucenielson/Book2Audio/tree/af9fb73f1f1be8b68465b7d73dec5c35efd4a45a"&gt;Book2Audio&lt;/a&gt; — a tool that converts PDF and EPUB books into audio files using text-to-speech.&lt;/p&gt;
&lt;h2&gt;What We Did&lt;/h2&gt;
&lt;p&gt;Book2Audio has two parsers: &lt;code&gt;DoclingParser&lt;/code&gt; for PDFs and &lt;code&gt;EpubParser&lt;/code&gt; for EPUBs. Both do similar things — extract text from a document, clean it up, and chunk it into paragraphs — but they were implemented quite differently under the hood.&lt;/p&gt;
&lt;p&gt;This update brings them into full alignment across several dimensions.&lt;/p&gt;
&lt;h3&gt;Shared Paragraph Accumulation&lt;/h3&gt;
&lt;p&gt;We extracted the shared paragraph accumulation logic into a new &lt;code&gt;TextProcessor&lt;/code&gt; class. Both parsers now produce &lt;code&gt;RawChunk&lt;/code&gt; objects and hand them off to &lt;code&gt;TextProcessor&lt;/code&gt;, which handles the decisions about when to accumulate and when to emit a paragraph. Chunking behavior is now consistent across PDF and EPUB sources.&lt;/p&gt;
&lt;h3&gt;Unified Text Cleaning&lt;/h3&gt;
&lt;p&gt;Previously, PDF cleaning happened in &lt;code&gt;DoclingParser&lt;/code&gt; and EPUB cleaning happened in &lt;code&gt;EpubParser&lt;/code&gt;, with two separate cleaning functions that overlapped significantly. We merged these into a single &lt;code&gt;clean_text&lt;/code&gt; function in &lt;code&gt;general_utils.py&lt;/code&gt; and moved all cleaning into &lt;code&gt;TextProcessor&lt;/code&gt; itself. Parsers now pass completely raw text — they extract and label, nothing more.&lt;/p&gt;
&lt;p&gt;Cleaning now happens in a single upfront pass over all chunks before the accumulation loop runs. This gives the cleaner full context and sets up nicely for the LLM cleaning step described below.&lt;/p&gt;
&lt;p&gt;We also took the opportunity to fix a long-standing bug in &lt;code&gt;is_sentence_end&lt;/code&gt; — it wasn't recognizing curly quote characters as sentence-ending, which caused incorrect paragraph accumulation in some cases.&lt;/p&gt;
&lt;h3&gt;Consistent Parser Design&lt;/h3&gt;
&lt;p&gt;We also made &lt;code&gt;EpubParser&lt;/code&gt; consistent with &lt;code&gt;DoclingParser&lt;/code&gt; in several other ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Both now accept either a file path or a pre-loaded document object, making unit testing much simpler and eliminating the need for &lt;code&gt;patch()&lt;/code&gt; in tests&lt;/li&gt;
&lt;li&gt;Both are configured entirely at construction time — no configuration passed into &lt;code&gt;run()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Both live in a &lt;code&gt;parsers/&lt;/code&gt; package and inherit from a shared &lt;code&gt;BaseParser&lt;/code&gt; abstract base class&lt;/li&gt;
&lt;li&gt;The CSV-based section skipping was removed from &lt;code&gt;EpubParser&lt;/code&gt; — the caller loads the CSV and passes the result in, consistent with how &lt;code&gt;DoclingParser&lt;/code&gt; handles page ranges&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Why It Matters&lt;/h2&gt;
&lt;p&gt;The immediate benefit is cleaner, more testable code and consistent behavior between the two parsers. But the real motivation is what comes next: an LLM-based text cleaning step.&lt;/p&gt;
&lt;p&gt;The plan is to add a DSPy module that operates on a sliding window of raw chunks before accumulation. It will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Remove footnotes that slipped into paragraph text&lt;/li&gt;
&lt;li&gt;Fix OCR errors and encoding artifacts that rule-based cleaning can't handle contextually&lt;/li&gt;
&lt;li&gt;Join words and paragraphs that were incorrectly split across page breaks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The window is bounded by the same logic already used for accumulation — section headers are hard boundaries, so the LLM never sees text across a section break. This keeps the LLM's job well-scoped and the context meaningful.&lt;/p&gt;
&lt;p&gt;Having both parsers produce output through the same &lt;code&gt;TextProcessor&lt;/code&gt; pipeline means the LLM cleaner will work identically regardless of whether the source was a PDF or an EPUB. And for books available in both formats, the cleaner EPUB output can serve as reference data for training the LLM to clean up the noisier PDF version.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Thu, 30 Apr 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-04-30T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2747</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/what-exactly-is-an-inductive-bias/</link>
      <category>System.String[]</category>
      <title>What Exactly Is an Inductive Bias?</title>
      <description>&lt;p&gt;In a &lt;a href="https://www.mindfiretechnology.com/blog/archive/induction-is-a-myth-the-futility-of-unbiased-learning/"&gt;previous post&lt;/a&gt;, I walked through Tom Mitchell's proof that a learner with no prior assumptions cannot generalize at all. The conjunctive restriction on our hypothesis space was doing all the real work -- without it, the algorithm was paralyzed. I promised a follow-up that would define that concept more precisely.&lt;/p&gt;
&lt;p&gt;Mitchell calls it the learner's &lt;em&gt;inductive bias&lt;/em&gt;. He defines this as:&lt;/p&gt;
&lt;p&gt;&amp;quot;...the inductive bias of a learner as the set of additional assumptions B sufficient to justify its inductive inferences as deductive
inferences.&amp;quot; (Mitchell, p. 44)&lt;/p&gt;
&lt;p&gt;Here is his formal definition:&lt;/p&gt;
&lt;p&gt;The inductive bias of a learner L is any minimal set of assertions B such that for any target concept c and corresponding training examples D, (B + D + x) deductively entails L(x, D) for all new instances x. (Mitchell, p. 44)&lt;/p&gt;
&lt;p&gt;Compare this to Wikipedia's definition of an 'inductive bias':&lt;/p&gt;
&lt;p&gt;The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs of given inputs that it has not encountered (&lt;a href="https://en.wikipedia.org/wiki/Inductive_bias"&gt;link&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;In plain language: the inductive bias is whatever you'd have to add to the training data so that &lt;strong&gt;the learner's predictions follow by pure deduction&lt;/strong&gt;. It is &lt;em&gt;the gap between what the data says and what the learner concludes&lt;/em&gt;. It may be explicit, or implicit.&lt;/p&gt;
&lt;p&gt;In the previous post, we saw that the Candidate-Elimination algorithm's inductive bias was the assertion &lt;strong&gt;&amp;quot;the target concept can be expressed as a conjunction of the attributes Genre, Mood, and Pacing.&amp;quot;&lt;/strong&gt; Feed that assertion to a deductive theorem prover along with the training data, and you get the same output as the so-called &amp;quot;inductive&amp;quot; learning algorithm. The &amp;quot;induction&amp;quot; was deduction plus an unstated assumption. This is why Popper was actually correct that there is no 'induction' per se. The so-called 'inductive' algorithms work just fine in real life -- but what is actually happening under the hood is always equivalent to deduction once the background knowledge is taken into consideration.&lt;/p&gt;
&lt;p&gt;This definition is not limited to one algorithm. Every learner has an inductive bias, and identifying it tells you exactly what assumptions the learner is smuggling in.&lt;/p&gt;
&lt;h2&gt;Comparing Learners by Their Bias&lt;/h2&gt;
&lt;p&gt;As Mitchell puts it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One advantage of viewing inductive inference systems in terms of their inductive bias is that it provides a nonprocedural means of characterizing their policy for generalizing beyond the observed data (Mitchell, p. 44)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To understand what he means, consider a &lt;strong&gt;&amp;quot;Rote-Learner&amp;quot;&lt;/strong&gt; algorithm. All it does is store every training example in memory, and when you ask it to classify a new instance, it looks for an exact match. If it has seen that exact instance before, it returns the stored classification. Otherwise, it refuses to classify it at all. This algorithm has no inductive bias -- and it is misleading to even call it a &amp;quot;learner&amp;quot; because it never actually learns anything. It just stores data points. It is the &amp;quot;unbiased learner&amp;quot; from the previous post, and as we saw there, it is completely useless for generalization. (Mitchell, p. 45)&lt;/p&gt;
&lt;p&gt;Now compare this to &lt;strong&gt;the Candidate-Elimination algorithm&lt;/strong&gt; from the previous post. It classifies a new instance only when every hypothesis in the version space agrees on the answer. Its inductive bias is a single assumption: &lt;strong&gt;the target concept is contained in the hypothesis space H&lt;/strong&gt;, which is the space of conjunctions between attributes. (i.e. conjunction means the attributes can be &amp;quot;AND&amp;quot;ed but not &amp;quot;OR&amp;quot;ed together.) Because it has a stronger bias than the Rote-Learner, it can classify instances the Rote-Learner cannot. But the correctness of those classifications depends entirely on whether the bias is true -- whether the target concept really is in H. If it is not, the version space may collapse entirely -- or worse, the algorithm may converge on the wrong answer. (Mitchell, p. 45).&lt;/p&gt;
&lt;p&gt;Mitchell also describes &lt;strong&gt;the Find-S algorithm&lt;/strong&gt;, which only tracks the S boundary -- the most specific hypothesis consistent with the positive examples. It ignores negative examples entirely and uses that single hypothesis to classify everything: if the hypothesis covers a new instance, it predicts positive; otherwise, it predicts negative. This gives it an &lt;em&gt;even stronger bias than the Candidate-Elimination algorithm&lt;/em&gt;. In addition to assuming the target concept is in H, it assumes that all instances are negative unless the opposite is entailed by its other knowledge (Mitchell, p. 45). Where Candidate-Elimination would say &amp;quot;I don't know&amp;quot; when the version space is split, Find-S always has an answer: negative. The advantage is that it now always gives an answer, unlike the Candidate-Elimination algorithm. But this comes at the cost of sometimes being confidently wrong. This makes it the most aggressive of the three -- and the most dependent on its assumptions being correct.&lt;/p&gt;
&lt;p&gt;The pattern is clear: stronger bias means more generalization, but also more risk. As Mitchell notes, &amp;quot;more strongly biased methods make more inductive leaps, classifying a greater proportion of unseen instances&amp;quot; (Mitchell, p. 45). But those leaps are only as good as the assumptions that enable them.&lt;/p&gt;
&lt;p&gt;The pattern here generalizes. &lt;strong&gt;All learning algorithms can be characterized in terms of their inductive bias.&lt;/strong&gt; Some biases are categorical restrictions that completely rule out certain concepts -- like the assumption that H contains the target concept (e.g. Candidate-Elimination algorithm). Others merely express preferences, ranking some hypotheses above others -- like &amp;quot;prefer simpler hypotheses over complex ones.&amp;quot; And some inductive biases are hardwired into the algorithm's design. But it is even possible to create a learner that can modify its own inductive bias. (Mitchell, p. 45).&lt;/p&gt;
&lt;h2&gt;Restriction Bias vs. Preference Bias&lt;/h2&gt;
&lt;p&gt;Not all biases work the same way. Mitchell draws an important distinction between two kinds (Mitchell, pp. 62-64).&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;restriction bias&lt;/strong&gt; limits which hypotheses the learner can consider at all. The Candidate-Elimination algorithm has a restriction bias -- it literally cannot represent hypotheses outside its hypothesis space. If the true concept is a disjunction (i.e. &amp;quot;OR&amp;quot;s) and the space only allows conjunctions (&amp;quot;AND&amp;quot;s), the algorithm will never find it. The advantage is that within its restricted space, it searches completely -- it considers every consistent hypothesis (Mitchell, p. 64).&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;preference bias&lt;/strong&gt; does not restrict the hypothesis space but instead orders it -- the learner prefers some hypotheses over others. Decision tree learners like ID3 have a preference bias. The hypothesis space of decision trees can represent any discrete-valued function -- it is univeral! But ID3 does not search that space exhaustively. It uses a greedy, top-down strategy that favors shorter trees and places the most informative attributes near the root (Mitchell, p. 62). Its inductive bias is entirely a consequence of this search strategy, not the expressiveness of its representation.&lt;/p&gt;
&lt;p&gt;A great example of a preference bias is genetic programming. Its hypothesis space is the set of all programs that can be composed from a given set of primitives. Since nature is computable (see the &lt;a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing%E2%80%93Deutsch_principle"&gt;Church-Turing-Deutsch thesis&lt;/a&gt;), this means its search space is in principle universal -- it can represent anything. There is no restriction bias at all. But it still has an inductive bias. As Mitchell notes, the performance of genetic programming depends crucially on the choice of representation and on the choice of fitness function (Mitchell, p. 266). And realistically, the space of all programs is so vast that the algorithm will only ever explore a tiny fraction of it -- mostly relatively small programs. The evolutionary search strategy -- selection, crossover, mutation -- determines which programs get explored and which get discarded. The bias is entirely in &lt;em&gt;how&lt;/em&gt; it searches, not in &lt;em&gt;what&lt;/em&gt; it can represent.&lt;/p&gt;
&lt;p&gt;Mitchell is direct about which is generally better: a preference bias is typically more desirable than a restriction bias, because it allows the learner to work within a complete hypothesis space that is guaranteed to contain the target function. A restriction bias always carries the risk that you have excluded the right answer from the start (Mitchell, p. 64).&lt;/p&gt;
&lt;h2&gt;Neural Networks: A Different Kind of Bias&lt;/h2&gt;
&lt;p&gt;When Mitchell turns to neural networks trained with backpropagation, the inductive bias becomes harder to pin down -- but it does not disappear (Mitchell, pp. 104-107).&lt;/p&gt;
&lt;p&gt;The hypothesis space is now continuous rather than discrete. As Mitchell describes it, &amp;quot;every possible assignment of network weights represents a syntactically distinct hypothesis that in principle can be considered by the learner&amp;quot; -- the hypothesis space is the n-dimensional Euclidean space of all the network's weights (Mitchell, p. 106). And the search strategy is completely different from the algorithms we have been discussing. Backpropagation is essentially a hill-climbing algorithm that uses calculus to determine which direction is downhill. It follows the slope of the error surface, taking small steps in whatever direction reduces the error most. This means it can get trapped in local minima -- valleys that are not the deepest valley -- and is only guaranteed to converge toward some local minimum, not necessarily the global one (Mitchell, p. 104).&lt;/p&gt;
&lt;p&gt;So what is the inductive bias? Mitchell is candid that &amp;quot;it is difficult to characterize precisely the inductive bias of BACKPROPAGATION learning, because it depends on the interplay between the gradient descent search and the way in which the weight space spans the space of representable functions.&amp;quot; But he offers an approximate answer: &amp;quot;one can roughly characterize it as smooth interpolation between data points.&amp;quot; Given two positive training examples with no negative examples between them, backpropagation will tend to label points in between as positive as well (Mitchell, pp. 106-107).&lt;/p&gt;
&lt;p&gt;This is still an inductive bias -- an assumption that goes beyond what the data logically entails. The data does not say anything about what lies between the training examples. The network's architecture and training procedure are &lt;em&gt;assuming&lt;/em&gt; the answer is smooth. That assumption is what makes generalization possible, and it is also what makes generalization fallible.&lt;/p&gt;
&lt;h2&gt;Why This Matters&lt;/h2&gt;
&lt;p&gt;Every time a learning algorithm says anything about an instance it has not seen, it is going beyond what the data alone can justify. It is making an assumption. The inductive bias is that assumption, named and made explicit.&lt;/p&gt;
&lt;p&gt;This is Popper's point. You cannot get from observations to general theories without bringing something to the table. The question is never &lt;em&gt;whether&lt;/em&gt; you have prior assumptions -- you always do. The question is whether they are good ones.&lt;/p&gt;
&lt;h2&gt;An Optimal Inductive Bias?&lt;/h2&gt;
&lt;p&gt;This does raise an interesting question. Is there an &amp;quot;optimal&amp;quot; inductive bias? One whose search space is universal -- every possible function -- but whose search strategy is still efficient and tractable for any problem? We have seen that genetic programming and decision trees are both universal in their hypothesis space -- they can in principle represent any function. But universality alone is not enough. Both still depend on their search strategy to find the right hypothesis, and that search strategy is where the bias lives. And we have seen that stronger biases enable more generalization but risk being wrong. Is there a sweet spot? That is a question for a future post.&lt;/p&gt;
&lt;p&gt;All page references to Mitchell are from &lt;a href="https://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf"&gt;&lt;em&gt;Machine Learning&lt;/em&gt; (McGraw-Hill, 1997)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Tue, 28 Apr 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-04-28T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2753</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/adding-epub-support-to-book2audio/</link>
      <category>System.String[]</category>
      <title>Adding EPUB Support to Book2Audio</title>
      <description>&lt;p&gt;Book2Audio started as a PDF-to-audiobook converter, but a lot of the best books come as EPUBs. This post covers how we added EPUB support by migrating code from our earlier &lt;a href="https://github.com/brucenielson/BookSearchArchive"&gt;BookSearchArchive&lt;/a&gt; project — a RAG-based book search tool discussed in &lt;a href="https://www.mindfiretechnology.com/blog/archive/our-open-source-ai-stack-the-book-search-archive/"&gt;a previous post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The code referenced in this post can be found at this specific commit in the &lt;a href="https://github.com/brucenielson/Book2Audio/tree/a4a5067a3542ff60d6aefba00b39059330355fdc"&gt;Book2Audio repository&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Where the Code Came From&lt;/h2&gt;
&lt;p&gt;BookSearchArchive included an &lt;code&gt;EPubLoader&lt;/code&gt; Haystack component that loaded EPUB files and returned raw HTML sections, and an &lt;code&gt;HTMLParser&lt;/code&gt; class that parsed those HTML sections into text chunks suitable for embedding and semantic search. Neither was designed for audio — they were optimised for RAG pipelines, returned Haystack &lt;code&gt;ByteStream&lt;/code&gt; objects, and included features like &lt;code&gt;double_notes&lt;/code&gt; that made sense for search but not for listening.&lt;/p&gt;
&lt;p&gt;The goal was to strip out the Haystack dependency, clean up the interface, and make EPUB parsing a drop-in replacement for PDF parsing in Book2Audio.&lt;/p&gt;
&lt;h2&gt;What Changed&lt;/h2&gt;
&lt;h3&gt;New File: epub_parser.py&lt;/h3&gt;
&lt;p&gt;This is the heart of the EPUB support. The &lt;code&gt;EpubParser&lt;/code&gt; class takes a path to an &lt;code&gt;.epub&lt;/code&gt; file and produces the same output as &lt;code&gt;DoclingParser&lt;/code&gt; — a list of cleaned paragraph strings and a list of metadata dicts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Single file, consistent interface.&lt;/strong&gt; Rather than a loader/parser split like BookSearchArchive, &lt;code&gt;EpubParser&lt;/code&gt; handles everything in one class:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;parser = EpubParser(&amp;quot;my_book.epub&amp;quot;, meta_data={}, min_paragraph_size=300)
docs, meta = parser.run()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;HTML parsing via BeautifulSoup.&lt;/strong&gt; EPUBs are zipped HTML files. We use &lt;code&gt;ebooklib&lt;/code&gt; to read the EPUB and extract each section's HTML, then BeautifulSoup to traverse the tag tree. The &lt;code&gt;recursive_yield_tags&lt;/code&gt; function walks the HTML and yields leaf tags containing text, skipping structural elements like divs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chapter and section titles are emitted as paragraphs.&lt;/strong&gt; In the RAG version, titles were stored only in metadata. For audio they need to be read aloud, so chapter titles and section headers are emitted as their own standalone paragraphs before the section content.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Footnote removal.&lt;/strong&gt; The &lt;code&gt;remove_footnotes&lt;/code&gt; parameter strips superscript tags from paragraphs unless they appear as the first content — which usually means they are footnote markers at the start of a footnote paragraph rather than inline citations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sections to skip.&lt;/strong&gt; Two mechanisms are supported: a &lt;code&gt;sections_to_skip.csv&lt;/code&gt; file in the same directory as the EPUB, and a &lt;code&gt;sections_to_skip&lt;/code&gt; parameter passed directly to &lt;code&gt;run()&lt;/code&gt;. Both are additive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debug output.&lt;/strong&gt; Calling &lt;code&gt;run(generate_text_file=True)&lt;/code&gt; writes two files alongside the EPUB:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;n&amp;gt;_processed_paragraphs.txt&lt;/code&gt; — the cleaned paragraph text&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;n&amp;gt;_processed_meta.txt&lt;/code&gt; — metadata alongside each paragraph, useful for verifying chapter and section attribution&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Updated: general_utils.py&lt;/h3&gt;
&lt;p&gt;The refactor revealed that a lot of text cleaning logic was duplicated or misplaced. We moved reusable utilities from &lt;code&gt;docling_utils.py&lt;/code&gt; into &lt;code&gt;general_utils.py&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;is_sentence_end&lt;/code&gt; and &lt;code&gt;is_ends_with_punctuation&lt;/code&gt; — pure string functions with no DocItem dependency&lt;/li&gt;
&lt;li&gt;&lt;code&gt;is_roman_numeral&lt;/code&gt; and &lt;code&gt;enhance_title&lt;/code&gt; — migrated from &lt;code&gt;parse_utils.py&lt;/code&gt; in BookSearchArchive&lt;/li&gt;
&lt;li&gt;&lt;code&gt;load_sections_to_skip&lt;/code&gt; — CSV loading logic, shared between EPUB and potentially other formats&lt;/li&gt;
&lt;li&gt;The full &lt;code&gt;clean_text&lt;/code&gt; pipeline — whitespace, hyphens, quotes, punctuation spacing, bracket spacing, apostrophes&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;docling_utils.py&lt;/code&gt; now focuses on what it should: DocItem inspection helpers and &lt;code&gt;clean_pdf_text&lt;/code&gt;, which extends &lt;code&gt;clean_text&lt;/code&gt; with PDF-specific steps for ligature normalisation, encoding artifact correction, and footnote number stripping.&lt;/p&gt;
&lt;h3&gt;Updated: book_converter.py&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;convert_to_audio&lt;/code&gt; now dispatches by file extension:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if suffix == '.pdf':
    # DoclingParser
elif suffix == '.epub':
    # EpubParser
elif suffix == '.txt':
    # read and convert directly
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;sections_to_skip&lt;/code&gt; parameter threads all the way from the command line through &lt;code&gt;main()&lt;/code&gt; and &lt;code&gt;convert_to_audio&lt;/code&gt; to &lt;code&gt;EpubParser.run()&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Using It&lt;/h2&gt;
&lt;h3&gt;From the Command Line&lt;/h3&gt;
&lt;p&gt;Convert an EPUB to audio:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/my_book.epub&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Dry run with debug output to inspect what the parser extracted:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/my_book.epub&amp;quot; --dry-run --generate-text-file
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Skip front matter and navigation sections:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/my_book.epub&amp;quot; --sections-to-skip cover titlepage toc
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Command Line Parameters&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;file_path&lt;/code&gt; — path to the EPUB file&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--dry-run&lt;/code&gt; — parse the document but skip audio generation&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--generate-text-file&lt;/code&gt; — save processed paragraph and metadata files alongside the source EPUB&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--sections-to-skip&lt;/code&gt; — one or more section IDs to skip, separated by spaces&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--engine&lt;/code&gt; — TTS engine to use: &lt;code&gt;kokoro&lt;/code&gt; (default) or &lt;code&gt;qwen&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--voice&lt;/code&gt; — Kokoro voice identifier (default: &lt;code&gt;af_heart&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--speaker&lt;/code&gt; — Qwen speaker name (default: &lt;code&gt;vivian&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--output-file&lt;/code&gt; — path to the output WAV file&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;From Python&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;from epub_parser import EpubParser

parser = EpubParser(
    source=&amp;quot;documents/my_book.epub&amp;quot;,
    meta_data={&amp;quot;source&amp;quot;: &amp;quot;my_book.epub&amp;quot;},
    min_paragraph_size=300,
    remove_footnotes=True
)
docs, meta = parser.run(
    generate_text_file=True,
    sections_to_skip=[&amp;quot;cover&amp;quot;, &amp;quot;titlepage&amp;quot;, &amp;quot;toc&amp;quot;]
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;--generate-text-file&lt;/code&gt; flag is especially useful for EPUBs where section IDs vary between publishers and you need to identify which ones to skip. Run with &lt;code&gt;--dry-run --generate-text-file&lt;/code&gt; first, inspect the output, then add unwanted section IDs to &lt;code&gt;--sections-to-skip&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;What We Left Behind&lt;/h2&gt;
&lt;p&gt;A few things from BookSearchArchive did not make the cut for this version:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;double_notes&lt;/code&gt;&lt;/strong&gt; — in the RAG version, sections titled &amp;quot;Notes&amp;quot; got double the minimum paragraph size to avoid dominating search results. This does not apply to audio.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;min_section_size&lt;/code&gt;&lt;/strong&gt; — the RAG pipeline skipped sections below a minimum total length. For audio we want everything.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Haystack component wrapper&lt;/strong&gt; — &lt;code&gt;EPubLoader&lt;/code&gt; and &lt;code&gt;HTMLParserComponent&lt;/code&gt; were Haystack-specific. All of that is gone.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multiple file paths&lt;/strong&gt; — &lt;code&gt;EPubLoader&lt;/code&gt; accepted a list of files. &lt;code&gt;EpubParser&lt;/code&gt; takes one file, consistent with &lt;code&gt;DoclingParser&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Files Added or Modified&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;epub_parser.py&lt;/code&gt; — new&lt;/li&gt;
&lt;li&gt;&lt;code&gt;utils/general_utils.py&lt;/code&gt; — significantly expanded&lt;/li&gt;
&lt;li&gt;&lt;code&gt;utils/docling_utils.py&lt;/code&gt; — slimmed down, PDF-specific logic only&lt;/li&gt;
&lt;li&gt;&lt;code&gt;book_converter.py&lt;/code&gt; — EPUB dispatch in &lt;code&gt;convert_to_audio&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;book_to_audio.py&lt;/code&gt; — &lt;code&gt;sections_to_skip&lt;/code&gt; parameter in &lt;code&gt;main()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tests/test_epub_parser.py&lt;/code&gt; — new unit tests&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tests/test_general_utils.py&lt;/code&gt; — new unit tests for moved utilities&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tests/test_document_output.py&lt;/code&gt; — extended to cover EPUB integration tests&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Mon, 20 Apr 2026 00:00:00 -0600</pubDate>
      <a10:updated>2026-04-20T00:00:00-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2750</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/implementing-qwen3-tts-in-my-pdf-to-audiobook-pipeline-qwen3-tts-part-1/</link>
      <category>System.String[]</category>
      <title>Implementing Qwen3-TTS in My PDF-to-Audiobook Pipeline (Qwen3-TTS, Part 1)</title>
      <description>&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/blog/archive/using-kokoro-82m-to-convert-a-pdf-to-an-audiobook/"&gt;In my last post&lt;/a&gt;, I walked through building a PDF-to-audiobook pipeline using Kokoro for text-to-speech. The pipeline worked well enough that I've been actively using it to listen to books that only exist as PDFs. But I mentioned wanting to try Alibaba's recently open-sourced &lt;a href="https://github.com/QwenLM/Qwen3-TTS"&gt;Qwen3-TTS&lt;/a&gt; as an alternative voice engine, and I've now done exactly that. (&lt;a href="https://github.com/brucenielson/Book2Audio/tree/8e7e547b8f97625c62d82f55bbe43d286daceb73"&gt;My code is found in my github repo&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;This post covers how to use Qwen3-TTS to generate speech from the command line and a discussion of how I refactored the code to support multiple TTS engines. A follow-up post will walk through the Qwen engine code itself.&lt;/p&gt;
&lt;h2&gt;Trying Qwen3-TTS&lt;/h2&gt;
&lt;p&gt;Before touching any of my existing code, I wanted to hear what Qwen3-TTS actually sounded like. The setup is straightforward. Install the package:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install -U qwen-tts
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first time you use a model, the weights download automatically from Hugging Face. The 0.6B model is roughly 1.2GB and the 1.7B model is around 3.4GB.&lt;/p&gt;
&lt;p&gt;Once installed, generating speech from Python is only a few lines:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

model = Qwen3TTSModel.from_pretrained(
    &amp;quot;Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice&amp;quot;,
    device_map=&amp;quot;cuda:0&amp;quot;,
    dtype=torch.bfloat16,
)

wavs, sr = model.generate_custom_voice(
    text=&amp;quot;The philosopher argued that all knowledge is provisional.&amp;quot;,
    language=&amp;quot;English&amp;quot;,
    speaker=&amp;quot;ryan&amp;quot;,
)

sf.write(&amp;quot;output.wav&amp;quot;, wavs[0], sr)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://github.com/brucenielson/Book2Audio/blob/8e7e547b8f97625c62d82f55bbe43d286daceb73/try_qwen3-tts.py"&gt;Code found here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You load a model with &lt;code&gt;from_pretrained&lt;/code&gt;, call &lt;code&gt;generate_custom_voice&lt;/code&gt; with your text, a language, and a speaker name, and you get back a list of waveform arrays and a sample rate. Write the first waveform to a file and you have a WAV you can play.&lt;/p&gt;
&lt;p&gt;Qwen3-TTS comes with nine built-in speakers: aiden, dylan, eric, ono&lt;em&gt;anna, ryan, serena, sohee, uncle&lt;/em&gt;fu, and vivian. They vary quite a bit in tone and accent. I'd recommend generating a short sample with each one to find what works for your use case. However, my experience is that you get a somewhat different voice each time you run the voice. This makes it less than desirable for reading an audio book.&lt;/p&gt;
&lt;p&gt;The 1.7B CustomVoice model also supports an &lt;code&gt;instruct&lt;/code&gt; parameter that lets you control the delivery style with natural language:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;wavs, sr = model.generate_custom_voice(
    text=&amp;quot;The philosopher argued that all knowledge is provisional.&amp;quot;,
    language=&amp;quot;English&amp;quot;,
    speaker=&amp;quot;ryan&amp;quot;,
    instruct=&amp;quot;Read in a calm, steady audiobook narration style&amp;quot;,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is a genuinely interesting feature, but be aware that instruction control only works on the 1.7B models. The 0.6B models silently ignore the &lt;code&gt;instruct&lt;/code&gt; parameter.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://huggingface.co/collections/Qwen/qwen3-tts"&gt;Find a list of all the available models on Hugging Face here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Refactoring for Multiple Engines&lt;/h2&gt;
&lt;p&gt;My original code for Book2Audio had the Kokoro TTS model wired directly into the &lt;code&gt;AudioGenerator&lt;/code&gt; class. To support Qwen3-TTS as an alternative, I needed to pull the model-specific logic out and make it swappable. (Apologies for naming the repo Book2Audio and the python file book&lt;em&gt;to&lt;/em&gt;audio. I need to rename the repo at some point to match.)&lt;/p&gt;
&lt;p&gt;The approach was a straightforward application of the strategy pattern. I created a &lt;code&gt;TTSEngine&lt;/code&gt; abstract base class with two methods: &lt;code&gt;generate&lt;/code&gt;, which takes text and returns a numpy audio array, and a &lt;code&gt;sample_rate&lt;/code&gt; property. Then I wrote two concrete implementations: &lt;code&gt;KokoroEngine&lt;/code&gt; wrapping the existing Kokoro pipeline, and &lt;code&gt;QwenCustomVoiceEngine&lt;/code&gt; wrapping Qwen3-TTS.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;AudioGenerator&lt;/code&gt;, which previously owned the Kokoro pipeline directly, now takes any &lt;code&gt;TTSEngine&lt;/code&gt;. It delegates audio generation to whatever engine it's given and handles only the model-agnostic work of saving WAV files. &lt;code&gt;BookToAudio&lt;/code&gt;, the class that orchestrates document parsing and paragraph-by-paragraph generation, didn't need to change at all. It still talks to &lt;code&gt;AudioGenerator&lt;/code&gt; the same way it always did.&lt;/p&gt;
&lt;p&gt;I also split the single &lt;code&gt;book_to_audio.py&lt;/code&gt; file into several files. The engines live in their own directory, &lt;code&gt;AudioGenerator&lt;/code&gt; and &lt;code&gt;BookToAudio&lt;/code&gt; each got their own module, and &lt;code&gt;book_to_audio.py&lt;/code&gt; became a thin CLI entry point. This makes it easy to add more engines later without everything piling up in one file.&lt;/p&gt;
&lt;p&gt;From the command line, switching engines is just a flag:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --engine kokoro --voice af_heart

python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --engine qwen --speaker ryan --language English
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can also convert plain text directly without a PDF:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py --text &amp;quot;Hello world&amp;quot; --engine qwen --speaker vivian
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;By default, the Qwen engine uses the 0.6B model. To use the larger 1.7B model, which supports instruction control:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --engine qwen --speaker ryan --language English --model-size 1.7b --instruct &amp;quot;Read in a calm, steady audiobook narration style&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For long documents, you can limit the page range to test on a small section before committing to a full run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --engine qwen --speaker ryan --start-page 10 --end-page 15
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To process the document without generating audio — useful for inspecting the extracted text before spending time on generation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --dry-run --generate-text-file
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And to specify an output file name instead of the default:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;documents/MyBook.pdf&amp;quot; --engine qwen --speaker ryan --output-file &amp;quot;my_audiobook.wav&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The Kokoro path works exactly as before. The Qwen path adds a few extra options for speaker, language, model size, and style instructions.&lt;/p&gt;
&lt;h2&gt;Kokoro vs. Qwen3-TTS for Audiobooks&lt;/h2&gt;
&lt;p&gt;After testing both engines on the same material, I have to be honest: I still prefer Kokoro for most audiobook listening. The Qwen3-TTS voices, while technically impressive, tend to have cadence patterns and occasional accent shifts that can be fatiguing over long listening sessions. The &lt;code&gt;ryan&lt;/code&gt; speaker comes closest to a natural audiobook narrator in English, and I may switch to it in the future as I experiment more with the &lt;code&gt;instruct&lt;/code&gt; parameter on the 1.7B model. But for now, Kokoro's more neutral delivery wins for extended listening.&lt;/p&gt;
&lt;h2&gt;Hardware Considerations&lt;/h2&gt;
&lt;p&gt;One thing that surprised me during this process was discovering that my laptop had been running Kokoro on CPU the whole time. PyTorch had been installed without CUDA support, which meant &lt;code&gt;torch.cuda.is_available()&lt;/code&gt; returned &lt;code&gt;False&lt;/code&gt; and everything silently fell through to CPU inference. It worked, just slower than it needed to be.&lt;/p&gt;
&lt;p&gt;If you're running this on a machine with an NVIDIA GPU, make sure you install the CUDA-enabled version of PyTorch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can verify it worked with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python -c &amp;quot;import torch; print(torch.cuda.is_available())&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In any case, my code will use your GPU if it's available and CUDA is properly installed.&lt;/p&gt;
&lt;p&gt;For Qwen3-TTS specifically, the 0.6B model needs roughly 1.5GB of VRAM and runs comfortably on a 6GB laptop GPU. The 1.7B model needs 4-6GB and may be tight on consumer hardware, especially if other applications are using the GPU. I'd recommend starting with the 0.6B model and only moving to the 1.7B if you want instruction control or find the quality insufficient. Theoretically the 1.7B should work, but I haven't really tested it yet on my laptop. I'll do that in a future post.&lt;/p&gt;
&lt;h2&gt;What's Next&lt;/h2&gt;
&lt;p&gt;In the next post, I'll walk through the actual Qwen engine code, explaining how it works and how to use the Qwen3-TTS API. I also plan to explore Qwen3-TTS voice cloning, which lets you train the model on a specific narrator's voice from just a short audio clip and then generate an entire audiobook in that style.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Fri, 17 Apr 2026 10:32:51 -0600</pubDate>
      <a10:updated>2026-04-17T10:32:51-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2748</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/machine-learning-101-the-key-concepts-behind-every-learning-algorithm/</link>
      <category>System.String[]</category>
      <title>Machine Learning 101: The Key Concepts Behind Every Learning Algorithm</title>
      <description>&lt;p&gt;Machine learning textbooks have their own vocabulary. But behind the jargon lies a process that would be deeply familiar to Karl Popper: conjecture and refutation. This post is a short reference guide to the key terms from Tom Mitchell's &lt;em&gt;Machine Learning&lt;/em&gt; -- a foundational textbook that I also draw on in my post on &lt;a href="https://www.mindfiretechnology.com/blog/archive/the-futility-of-unbiased-learning/"&gt;the futility of unbiased learning&lt;/a&gt;. For each term, I will give Mitchell's definition and then show how it maps onto Karl Popper's logic of scientific discovery.&lt;/p&gt;
&lt;p&gt;To make this concrete, imagine a 19th-century physician trying to figure out what causes a mysterious illness sweeping through a city. Patients come in with various combinations of age, diet, water source, neighborhood, and occupation. Some get sick, others do not. The physician is trying to discover the underlying rule from these observations.&lt;/p&gt;
&lt;p&gt;Here is the truth the physician does not yet know: the illness strikes patients who drink from the river &lt;em&gt;and&lt;/em&gt; live in the low-lying district near the tannery due to contamination from the tannery. Both conditions must be present -- river drinkers in the hills stay healthy, and tannery district residents who drink from wells stay healthy. Only the combination is deadly.&lt;/p&gt;
&lt;p&gt;This is, as Mitchell would put it, a concept learning task (Mitchell, p. 22). The physician's job is to discover that two-attribute rule from a handful of patients -- without knowing in advance how many attributes matter or which ones.&lt;/p&gt;
&lt;h2&gt;Instance Space&lt;/h2&gt;
&lt;p&gt;In machine learning, the &lt;strong&gt;instance space&lt;/strong&gt; (denoted X) is the set of all possible examples the learner could encounter (Mitchell, p. 22). In our medical example, it is the set of all possible patients -- every combination of age, diet, water source, neighborhood, occupation, and so on. A young well-drinking hillside baker. An old river-drinking tannery district laborer. Every combination, whether or not the physician has actually seen such a patient.&lt;/p&gt;
&lt;p&gt;In Popper's framework, the instance space is the set of all possible observations or experiments. It defines the scope of what the theory is &lt;em&gt;about&lt;/em&gt;. Most of these possible observations will never actually be made. But they all matter, because a good theory must make predictions about &lt;em&gt;all&lt;/em&gt; of them -- not just the ones we happen to have seen.&lt;/p&gt;
&lt;h2&gt;Target Concept&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;target concept&lt;/strong&gt; (denoted c) is the true rule the learner is trying to discover (Mitchell, p. 22). It is a function that correctly classifies every instance. In our example, the target concept is: a patient gets sick if and only if they drink from the river AND live in the tannery district. The target concept is unknown to the physician. The whole point of learning is to figure out what it is.&lt;/p&gt;
&lt;p&gt;In Popper's terms, the target concept is the law of nature we are searching for. We do not have direct access to it. We can only approach it indirectly through conjecture and refutation -- proposing theories and testing them against observations. We may never know for certain that we have found it, but we can know when we have &lt;em&gt;not&lt;/em&gt; found it, because our conjecture will be refuted by the evidence.&lt;/p&gt;
&lt;h2&gt;Hypothesis&lt;/h2&gt;
&lt;p&gt;A &lt;strong&gt;hypothesis&lt;/strong&gt; (denoted h) is a candidate theory -- one possible answer to the question &amp;quot;what is the target concept?&amp;quot; (Mitchell, p. 23). The physician might conjecture &amp;quot;patients who drink from the river get sick&amp;quot; or &amp;quot;patients who live in the tannery district get sick&amp;quot; or &amp;quot;old patients get sick.&amp;quot; Each of these is a hypothesis. Some are too broad, some are too narrow, and one -- &amp;quot;river drinkers in the tannery district get sick&amp;quot; -- happens to be correct. But the physician does not know that yet.&lt;/p&gt;
&lt;p&gt;A hypothesis is exactly what Popper calls a &lt;em&gt;conjecture&lt;/em&gt;. It is a bold guess about the structure of reality. It may be right or wrong. What matters is that it is &lt;em&gt;testable&lt;/em&gt; -- it makes predictions that can be checked against observations.&lt;/p&gt;
&lt;h2&gt;Hypothesis Space&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;hypothesis space&lt;/strong&gt; (denoted H) is the set of all hypotheses the learner is willing to consider (Mitchell, p. 23). This is not the set of all &lt;em&gt;possible&lt;/em&gt; explanations -- it is the set of explanations the learner &lt;em&gt;can express&lt;/em&gt; given its representation.&lt;/p&gt;
&lt;p&gt;It is intractable to consider every possible hypothesis, as this would be an infinite set. But this is unnecessary. We have a lot of background knowledge that lets us constrain the possible hypotheses we will consider. So typically we start with an already partially constrained hypothesis space based on our background knowledge. This is part of our own human &amp;quot;inductive bias.&amp;quot;&lt;/p&gt;
&lt;p&gt;But suppose our physician only considers single-attribute hypotheses -- &amp;quot;it is the water source&amp;quot; or &amp;quot;it is the neighborhood&amp;quot; -- then &lt;em&gt;the correct two-attribute answer is not even in the hypothesis space&lt;/em&gt;. The physician could examine every patient in the city and still never find the answer, because the target concept cannot be expressed within the hypothesis space he is considering. The choice of hypothesis space determines what the learner can and cannot discover.&lt;/p&gt;
&lt;p&gt;In Popper's framework, the hypothesis space corresponds to the theoretical framework within which the scientist operates. No scientist considers every conceivable theory. They work within a tradition, a paradigm, or a set of background assumptions that constrain which conjectures are even formulable. As I discussed in my post on [inductive bias][2], this constraint is not a flaw -- it is a prerequisite for learning anything at all. But it must be the &lt;em&gt;right&lt;/em&gt; constraint, or the truth will be invisible to you.&lt;/p&gt;
&lt;h2&gt;Training Examples&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Training examples&lt;/strong&gt; (denoted D) are the actual observations available to the learner (Mitchell, p. 23). Each training example is an instance paired with its correct classification. A &lt;strong&gt;positive example&lt;/strong&gt; is a patient who got sick. A &lt;strong&gt;negative example&lt;/strong&gt; is a patient who stayed healthy.&lt;/p&gt;
&lt;p&gt;Suppose the physician has seen five patients so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Old, river, tannery district, laborer -- &lt;strong&gt;sick&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Young, river, tannery district, baker -- &lt;strong&gt;sick&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Old, river, hillside, farmer -- &lt;strong&gt;healthy&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Young, well, tannery district, tanner -- &lt;strong&gt;healthy&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Old, well, hillside, laborer -- &lt;strong&gt;healthy&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In Popper's framework, these are the observations against which we test our conjectures. The third patient -- a river drinker who stayed healthy -- is a falsification of the hypothesis &amp;quot;river water causes the illness.&amp;quot; The fourth patient -- a tannery district resident who stayed healthy -- falsifies &amp;quot;living in the tannery district causes the illness.&amp;quot; And the fifth patient -- an old laborer who stayed healthy -- falsifies &amp;quot;being a laborer causes the illness.&amp;quot; Only the conjunction &amp;quot;river AND tannery district&amp;quot; survives all five observations.&lt;/p&gt;
&lt;p&gt;As Popper argued, falsifications are where the real action is. A thousand sick river-drinking tannery residents do not prove that the combination is the cause -- they are merely consistent with it. But each negative example eliminates entire families of wrong theories in one stroke.&lt;/p&gt;
&lt;h2&gt;Concept Learning&lt;/h2&gt;
&lt;p&gt;Mitchell defines &lt;strong&gt;concept learning&lt;/strong&gt; as &amp;quot;inferring a boolean-valued function from training examples of its input and output&amp;quot; (Mitchell, p. 21). That sounds narrow, but it is surprisingly general. Our physician is doing concept learning: he has patients (instances), he knows which ones got sick and which did not (positive and negative examples), and he is searching for the rule that separates them. But this is also what scientists do all the time. Any time we seek a causal explanation -- what causes this disease, what makes this material brittle, why do some stars explode and others do not -- we are doing concept learning. We have observations, we have outcomes, and we are searching for the rule.&lt;/p&gt;
&lt;p&gt;But here is the key reframing that Popper would appreciate: Mitchell treats concept learning as a &lt;em&gt;search problem&lt;/em&gt;. The learner is searching through a hypothesis space for a hypothesis consistent with the training data (Mitchell, p. 23). Our physician is not staring at patients and waiting for a pattern to emerge. He is starting with a space of possible theories and eliminating the ones that fail. The hypothesis &amp;quot;it is the water&amp;quot; fails when he sees a healthy river drinker in the hills. The hypothesis &amp;quot;it is the neighborhood&amp;quot; fails when he sees a healthy well drinker in the tannery district. What remains is not what the data &lt;em&gt;induced&lt;/em&gt; but what the data &lt;em&gt;failed to refute&lt;/em&gt;. This is exactly what Popper and Donald Campbell called &amp;quot;evolutionary epistemology&amp;quot; -- knowledge grows not by accumulation but by variation and selection. You generate candidate theories, test them against reality, and the ones that survive criticism are what remain. The data does not build the theory. The data &lt;em&gt;selects&lt;/em&gt; among the theories.&lt;/p&gt;
&lt;h2&gt;Version Space&lt;/h2&gt;
&lt;p&gt;The &lt;strong&gt;version space&lt;/strong&gt; is the set of all hypotheses in H that are consistent with the training data observed so far (Mitchell, p. 31). As training examples come in, hypotheses that are contradicted by the data get eliminated and the version space shrinks.&lt;/p&gt;
&lt;p&gt;After the physician's first two patients (both sick, both river-drinking tannery district residents), many hypotheses are still alive: maybe it is the river, maybe it is the district, maybe it is the combination, maybe it is being a laborer. The version space is large. But after the third patient -- a healthy river drinker in the hills -- every hypothesis that blames the river alone is eliminated. After the fourth -- a healthy well drinker in the tannery district -- every hypothesis that blames the district alone is eliminated. The version space is closing in on the truth.&lt;/p&gt;
&lt;p&gt;In Popper's terms, the version space is the set of conjectures that have survived all attempts at refutation so far. Each new observation either leaves it unchanged or shrinks it. The goal is to shrink it until only one hypothesis remains.&lt;/p&gt;
&lt;h2&gt;Consistent Hypothesis&lt;/h2&gt;
&lt;p&gt;A &lt;strong&gt;consistent hypothesis&lt;/strong&gt; is one that correctly classifies every training example seen so far (Mitchell, p. 23). It has not been falsified by any observation.&lt;/p&gt;
&lt;p&gt;After all five patients, the hypothesis &amp;quot;river AND tannery district&amp;quot; is consistent. But so are other, more specific hypotheses. Consider: we have not yet seen a river-drinking tannery district &lt;em&gt;farmer&lt;/em&gt;. So the hypothesis &amp;quot;river AND tannery district AND NOT farmer&amp;quot; is equally consistent with everything we have observed -- maybe farmers are somehow immune. Likewise, both sick patients happened to be either laborers or bakers, so &amp;quot;river AND tannery district AND (laborer OR baker)&amp;quot; also fits the data. We simply have not seen enough patients to distinguish these hypotheses from each other.&lt;/p&gt;
&lt;p&gt;So &amp;quot;consistent&amp;quot; does not mean &amp;quot;correct.&amp;quot; It means &amp;quot;not yet refuted.&amp;quot; This is precisely Popper's point about corroboration. A theory that has survived testing is &lt;em&gt;corroborated&lt;/em&gt;, not confirmed. It has proven its mettle so far, but it remains permanently open to future refutation. The physician would need to find a river-drinking tannery district farmer to tell these hypotheses apart.&lt;/p&gt;
&lt;h2&gt;The General-to-Specific Ordering&lt;/h2&gt;
&lt;p&gt;Mitchell observes that hypotheses can be naturally ordered from general to specific (Mitchell, p. 24). A more general hypothesis classifies more instances as positive. The hypothesis &amp;quot;anyone who drinks from the river gets sick&amp;quot; is more general than &amp;quot;river drinkers in the tannery district get sick&amp;quot; -- the first covers a superset of the cases covered by the second.&lt;/p&gt;
&lt;p&gt;This ordering has a direct Popperian interpretation. Popper argued that more general theories are &lt;em&gt;more falsifiable&lt;/em&gt; -- they make bolder claims about the world, and therefore there are more ways they could be wrong. &amp;quot;All river drinkers get sick&amp;quot; is easier to refute than &amp;quot;river drinkers in the tannery district get sick,&amp;quot; because the first makes predictions about a far wider range of patients. Popper considered this a virtue. Bolder theories, when they survive testing, tell us more about the world.&lt;/p&gt;
&lt;p&gt;The Candidate-Elimination algorithm exploits this ordering to efficiently search the hypothesis space -- tracking the most general and most specific surviving hypotheses and using each new observation to tighten the bounds. As I described in my post on &lt;a href="https://www.mindfiretechnology.com/blog/archive/the-futility-of-unbiased-learning/"&gt;the futility of unbiased learning&lt;/a&gt;, this is falsification implemented as a computer program.&lt;/p&gt;
&lt;h2&gt;The Takeaway&lt;/h2&gt;
&lt;p&gt;Machine learning is not induction. It is search -- a search through a space of conjectures, guided by observations, eliminating the hypotheses that fail. The vocabulary is different, but the logic is the same logic Karl Popper described: bold conjectures, tested against experience, with the wrong ones ruthlessly discarded.&lt;/p&gt;
&lt;p&gt;The key terms from Mitchell map cleanly onto this process:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;instance space&lt;/strong&gt; is the domain of possible observations&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;target concept&lt;/strong&gt; is the unknown law we seek&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;hypothesis&lt;/strong&gt; is a conjecture&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;hypothesis space&lt;/strong&gt; is the set of conjectures we are willing to consider&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Training examples&lt;/strong&gt; are the observations that test our conjectures&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Concept learning&lt;/strong&gt; is the search for surviving conjectures&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;version space&lt;/strong&gt; is the set of conjectures not yet refuted&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;consistent hypothesis&lt;/strong&gt; is a conjecture that has survived all tests so far&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;general-to-specific ordering&lt;/strong&gt; reflects Popper's insight that bolder theories are more falsifiable&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All page references to Mitchell are from &lt;a href="https://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf"&gt;&lt;em&gt;Machine Learning&lt;/em&gt; (McGraw-Hill, 1997)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Fri, 10 Apr 2026 10:53:35 -0600</pubDate>
      <a10:updated>2026-04-10T10:53:35-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2746</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/book2audio-reviving-my-pdf-to-audiobook-project-and-fighting-dependency-hell-along-the-way/</link>
      <category>System.String[]</category>
      <title>Book2Audio: Reviving My PDF-to-Audiobook Project (and Fighting Dependency Hell Along the Way)</title>
      <description>&lt;p&gt;A while back, I wrote about &lt;a href="https://www.mindfiretechnology.com/blog/archive/using-kokoro-82m-to-convert-a-pdf-to-an-audiobook/"&gt;using Kokoro to convert a PDF into an audiobook&lt;/a&gt;. The core idea was straightforward: take a PDF, strip away the noise—headers, footers, page numbers, figure captions—and feed clean text into a text-to-speech model so the result actually sounds like someone reading a book, not someone reading a formatted document aloud.&lt;/p&gt;
&lt;p&gt;To get there, I leaned on &lt;a href="https://www.mindfiretechnology.com/blog/archive/docling-for-pdf-to-markdown-conversion/"&gt;IBM's Docling for PDF-to-Markdown conversion&lt;/a&gt;, which does an impressive job of understanding page layout, reading order, and document structure. I also used &lt;a href="https://www.mindfiretechnology.com/blog/archive/using-nltk-to-improve-rag-retrieval-augmented-generation-text-quality/"&gt;NLTK to clean up hyphenated words&lt;/a&gt; that PDF extraction loves to leave behind—the kind where &amp;quot;understand-\ning&amp;quot; gets split across a line break and ends up in your audio as two separate words.&lt;/p&gt;
&lt;h2&gt;Picking It Back Up&lt;/h2&gt;
&lt;p&gt;Unfortunately, I never got back to that project. The original code was embedded inside my &amp;quot;Book Search Archive&amp;quot; app, which had grown large and unwieldy enough that even small changes felt like a chore. So I decided to start fresh with a dedicated repository focused entirely on the PDF-to-audio pipeline: &lt;a href="https://github.com/brucenielson/Book2Audio/tree/922b7af3d23b85d7ba8326b8cc7900d0f16df171"&gt;Book2Audio on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Clean slate, single purpose, no distractions. Simple, right?&lt;/p&gt;
&lt;h2&gt;The Docling Regression&lt;/h2&gt;
&lt;p&gt;It was not simple.&lt;/p&gt;
&lt;p&gt;The first thing I discovered was that the latest version of IBM Docling simply could not read my test PDF past page 17. Every time it hit that page, the conversion would die with an error like this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Stage preprocess failed for run 1, pages [17]: std::bad_alloc&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;That's a memory allocation failure deep inside the C++ PDF parsing backend. &lt;a href="https://github.com/docling-project/docling/issues/2670"&gt;It appears to be a known but unfixed error.&lt;/a&gt; After a fair amount of troubleshooting—trying different configurations, different PDFs, different environments—I came to accept that the newer version of Docling just didn't handle my test document as well as the original version I'd used months ago. Software updates aren't always upgrades, especially when the underlying parser has been rearchitected.&lt;/p&gt;
&lt;h2&gt;Python Dependency Hell&lt;/h2&gt;
&lt;p&gt;Rather than keep fighting the latest release, I decided to roll back to the older version that had worked before. This turned out to be its own adventure. Pinning &lt;code&gt;docling==2.14.0&lt;/code&gt; is one thing; getting it to coexist peacefully with Kokoro, NLTK, and all of their transitive dependencies is another.&lt;/p&gt;
&lt;p&gt;The pip resolver tried its best, but ultimately I had to hand-tune the versions myself. After a lot of trial and error, here's the incantation that finally worked:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;bash
pip install docling==2.14.0 kokoro==0.7.15 misaki==0.7.15 typer==0.12.5 numpy==1.26.4 opencv-python-headless==4.10.0.84
pip install nltk==3.9.1
pip install soundfile==0.13.1&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;There's a &lt;code&gt;requirements.txt&lt;/code&gt; in the repo as well, but I'll be honest—your mileage may vary. This is one of the perennial frustrations of the Python ecosystem. The pip installer does a reasonable job checking dependencies in isolation, but it often can't resolve conflicts across packages that each have their own strong opinions about which version of NumPy or OpenCV they need. And some of these packages really want Python 3.12 or lower, which adds yet another variable.&lt;/p&gt;
&lt;p&gt;If you've spent any time in Python-land, none of this will surprise you. But it's worth documenting, if only so future-me doesn't have to rediscover it.&lt;/p&gt;
&lt;h2&gt;But it Works!&lt;/h2&gt;
&lt;p&gt;If you follow the above instructions, it will work! I am actively using even this primitive version to convert PDFs into audio books. It's surprising how well it does given how unsophisticated it is. &lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/brucenielson/Book2Audio/tree/922b7af3d23b85d7ba8326b8cc7900d0f16df171"&gt;Go to my github repo&lt;/a&gt;  and clone it and then run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python book_to_audio.py &amp;quot;BookTitle.pdf&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There is a full readme file that explains the full details of how to use it. This is a decently working app even in this early state. And the code base has full unit tests. &lt;/p&gt;
&lt;h2&gt;What's Next&lt;/h2&gt;
&lt;p&gt;The current state of the repo is a working starting point: it can take a PDF, parse it with Docling, clean the text, and generate audio with Kokoro. But I have bigger plans.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Upgrading the voice with Qwen3-TTS.&lt;/strong&gt; Kokoro is capable, but I want to try Alibaba's recently open-sourced &lt;a href="https://github.com/QwenLM/Qwen3-TTS"&gt;Qwen3-TTS&lt;/a&gt;. What makes it especially interesting for audiobook generation is its contextual understanding—it can adapt tone, pacing, and emphasis based on the meaning of the text, not just its phonemes. It supports 10 languages, voice cloning from just 3 seconds of audio, and even voice design from natural language descriptions (&amp;quot;a warm, measured narrator voice&amp;quot;). For long-form audio like audiobooks, that kind of semantic awareness could make a real difference in listenability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Smarter text cleanup with local LLMs.&lt;/strong&gt; Right now, NLTK handles the text smoothing—fixing hyphenation, removing artifacts. But I want to experiment with running a local model through Ollama to do more intelligent cleanup: identifying and removing headers, footers, page numbers, figure references, and other extraneous text that shouldn't appear in an audiobook. An LLM can understand &lt;em&gt;context&lt;/em&gt; in a way that regex and rule-based approaches can't. Maybe the ideal approach is a combination of both—NLTK for the mechanical stuff, and a small fine-tuned model for the judgment calls. This might even be a good candidate for fine-tuning a smaller model specifically for the task of stripping non-narrative content from documents.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/brucenielson/Book2Audio/tree/346d4d7ecd37434348ff7cce3adbf22e608e3108"&gt;Book2Audio repo&lt;/a&gt; is public if you want to follow along or contribute. It's early days, but the foundation is in place.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Fri, 03 Apr 2026 12:38:46 -0600</pubDate>
      <a10:updated>2026-04-03T12:38:46-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2730</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/induction-is-a-myth-the-futility-of-unbiased-learning/</link>
      <category>System.String[]</category>
      <title>Induction is a Myth: The Futility of Unbiased Learning</title>
      <description>&lt;h2&gt;Karl Popper's Disproof of Induction&lt;/h2&gt;
&lt;p&gt;Karl Popper argued that it is logically impossible to derive a general theory from specific observations. You can stare at a million data points and no universal law will ever logically &lt;em&gt;follow&lt;/em&gt; from them. The observations are always concrete and specific; the theory is always abstract and universal. You simply cannot get from one to the other by logic alone. (See &lt;a href="https://amzn.to/4rvnaiq"&gt;&lt;em&gt;Conjectures and Refutations: The Growth of Scientific Knowledge&lt;/em&gt;&lt;/a&gt;, pp. 251-253)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Thus induction is a myth. No &amp;quot;inductive logic&amp;quot; exists. And although
there exists a &amp;quot;logical&amp;quot; interpretation of the probability calculus,
there is no good reason to assume that this &amp;quot;generalized logic&amp;quot; (as
it may be called) is a system of &amp;quot;inductive logic&amp;quot; (&lt;a href="https://amzn.to/4qKARc4"&gt;&lt;em&gt;Unended Quest&lt;/em&gt;&lt;/a&gt;, p. 171)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Despite such a startlingly powerful disproof, most people assumed it must be wrong. They sought after a new kind of &amp;quot;inductive logic&amp;quot; that would allow you to somehow start with only observations and generalize to a universal law. The reasoning went that there &lt;em&gt;must&lt;/em&gt; be such an inductive logic because we do—in practice—induce general laws all the time in science. &lt;a href="https://www.mindfiretechnology.com/blog/archive/dice-rolls-coin-flips-and-death-by-asteroid-a-probability-refresher/"&gt;Probability theory&lt;/a&gt;—particularly the Bayesian interpretation—was often advanced as the missing inductive logic that would bridge the gap. When it was discovered that probability theory could be thought of as &lt;a href="https://www.mindfiretechnology.com/blog/archive/from-certainty-to-belief-how-probability-extends-logic-part-2/"&gt;&amp;quot;extending deductive logic&amp;quot;&lt;/a&gt; (by which they really meant extending propositional logic, not first order logic), many became convinced they had found what they were looking for. The arrival of &lt;a href="https://www.mindfiretechnology.com/blog/archive/coxs-theorem-is-probability-theory-universal/"&gt;Cox's theorem&lt;/a&gt; was often interpreted as cementing this idea.&lt;/p&gt;
&lt;p&gt;And on top of all that, wasn't it just a fact that machine learning algorithms generalize from data every day? Your streaming service watches you binge a few movies and then somehow knows what to recommend next. That's induction, right? Isn't the machine deriving a general rule from specific observations? Popper must be wrong!&lt;/p&gt;
&lt;h2&gt;Tom Mitchell's &amp;quot;Futility of Bias-Free Learning&amp;quot;&lt;/h2&gt;
&lt;p&gt;Tom Mitchell—one of the foundational figures in machine learning—has a devastating answer to this question. In his textbook &lt;a href="https://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf"&gt;&lt;em&gt;Machine Learning&lt;/em&gt; (McGraw-Hill, 1997)&lt;/a&gt;, he proves something that should be far better known: a learner that makes no prior assumptions about what it's looking for &lt;em&gt;cannot generalize at all&lt;/em&gt;. Not even a little. It can memorize what it has seen, but the moment you show it something new, it has literally no rational basis for making a classification. Mitchell calls this &amp;quot;The Futility of Bias-Free Learning&amp;quot; (Mitchell, p. 42).&lt;/p&gt;
&lt;p&gt;What makes this so interesting is that Mitchell arrives at essentially the same conclusion as Popper, but from the completely opposite direction. Popper was a philosopher &lt;strong&gt;arguing against inductivism&lt;/strong&gt;. Mitchell is a computer scientist &lt;strong&gt;trying to make induction &lt;em&gt;work&lt;/em&gt;&lt;/strong&gt;. And yet they converge on the same point: you cannot generalize from observations alone. You always need something else—some set of prior assumptions—to bridge the gap.&lt;/p&gt;
&lt;p&gt;But Mitchell goes one step further. He shows that when you &lt;em&gt;do&lt;/em&gt; add those prior assumptions, the resulting &amp;quot;inductive&amp;quot; algorithm is actually &lt;em&gt;equivalent to a deductive theorem prover&lt;/em&gt;. &lt;strong&gt;The so-called induction was deduction all along,&lt;/strong&gt; just as Popper claimed. It just didn't look like it.&lt;/p&gt;
&lt;p&gt;Let me walk through how this works.&lt;/p&gt;
&lt;h2&gt;A Movie Recommendation Problem&lt;/h2&gt;
&lt;p&gt;Imagine a movie streaming service trying to figure out whether you'll enjoy a given film. To keep things tractable, suppose the system describes movies using three attributes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Genre&lt;/strong&gt;: Action, Comedy, Drama, Sci-Fi&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mood&lt;/strong&gt;: Light, Dark&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pacing&lt;/strong&gt;: Fast, Slow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each movie is some combination of these attributes, and your reaction is binary: you either enjoyed it or you didn't. The service's job is to figure out the general rule—the &lt;em&gt;target concept&lt;/em&gt;—that explains your taste.&lt;/p&gt;
&lt;p&gt;This isn't meant to be a realistic example, but suppose you subconsciously only enjoy fast-paced movies with a dark mood, regardless of genre. The service is trying to figure that out and then recommend other such movies to you.&lt;/p&gt;
&lt;p&gt;Now, there are 4 x 2 x 2 = 16 possible movie descriptions in this space. And the streaming service has watched you react to five or six of them. From those data points, it needs to learn a rule that correctly predicts your reaction to all the other movies you haven't seen yet.&lt;/p&gt;
&lt;p&gt;This is, at its core, what Mitchell calls a &amp;quot;concept learning&amp;quot; task: searching through a space of possible hypotheses for the one that fits the observed data (Mitchell, p. 23).&lt;/p&gt;
&lt;h2&gt;How Candidate-Elimination Works (a.k.a. Falsification)&lt;/h2&gt;
&lt;p&gt;The algorithm Mitchell uses to illustrate this is called the Candidate-Elimination algorithm. And despite the dry name, what it actually does is strikingly Popperian: it starts with every possible hypothesis about your taste and then systematically &lt;em&gt;eliminates&lt;/em&gt; the ones that are contradicted by the data.&lt;/p&gt;
&lt;p&gt;The algorithm restricts itself to hypotheses that take the form of conjunctions of attribute values. So a hypothesis might be something like &amp;quot;I enjoy Action movies that are Dark&amp;quot;—represented as (Action, Dark, ?), where the question mark means &amp;quot;I don't care about this attribute.&amp;quot;&lt;/p&gt;
&lt;p&gt;The algorithm also uses a null symbol to mean &amp;quot;no value is acceptable.&amp;quot; So the hypothesis (null, null, null) means &amp;quot;no movie is enjoyable&amp;quot;—the most specific possible hypothesis that rejects everything.&lt;/p&gt;
&lt;p&gt;The algorithm maintains two boundaries that define a range of surviving hypotheses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;G boundary&lt;/strong&gt; (most general hypothesis consistent with the data)&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;S boundary&lt;/strong&gt; (most specific hypothesis consistent with the data)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Everything between these two bounds is called the &amp;quot;version space&amp;quot;—in Popper's language, it is the set of hypotheses not yet refuted by the evidence (Mitchell, pp. 32-33).&lt;/p&gt;
&lt;p&gt;Before we see any data, G starts as broad as possible and S starts as narrow as possible:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step0.png" alt="Step 0: Initial state before any training data" /&gt;&lt;/p&gt;
&lt;p&gt;Now the training data starts coming in.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; You watch a dark, fast-paced action movie and enjoy it. This is a positive example, so the S boundary has to move—it generalizes from &amp;quot;no movie is enjoyable&amp;quot; to the most specific hypothesis that covers this movie. The G boundary doesn't need to change yet because nothing has been ruled out:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step1.png" alt="Step 1: First positive example generalizes S" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; You watch a light, slow comedy and don't enjoy it. This is where the real elimination begins. The current G boundary—(?, ?, ?) or &amp;quot;every movie is enjoyable&amp;quot;—is &lt;em&gt;inconsistent&lt;/em&gt; with this negative example. It predicted you'd enjoy this movie, but you didn't. So (?, ?, ?) gets falsified and must be replaced.&lt;/p&gt;
&lt;p&gt;The algorithm replaces it with the &lt;em&gt;minimal specializations&lt;/em&gt; that exclude this negative example while still being more general than S. There are exactly three ways to do this—each one changes a single attribute to rule out the movie you disliked:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step2.png" alt="Step 2: G splits into three competing hypotheses" /&gt;&lt;/p&gt;
&lt;p&gt;This is a critical moment. We now have three competing theories about your taste, and the data will eventually falsify two of them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; You watch a dark, fast-paced sci-fi movie and enjoy it. Now things get interesting on both sides. On the S side, you've liked both Action and Sci-Fi, so genre can't be the deciding factor—S generalizes to (?, Dark, Fast). On the G side, the hypothesis (Action, ?, ?) predicted you'd &lt;em&gt;only&lt;/em&gt; like Action movies, but you just liked a Sci-Fi movie. Falsified! It gets removed from G:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step3.png" alt="Step 3: S generalizes genre away, G loses the Action-only hypothesis" /&gt;&lt;/p&gt;
&lt;p&gt;Two hypotheses remain in G: maybe it's about mood (?, Dark, ?), or maybe it's about pacing (?, ?, Fast). We need more data to tell them apart.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; You watch a dark, slow-paced action movie and don't enjoy it. This movie was Dark—just like the ones you enjoyed—but you disliked it. The hypothesis (?, Dark, ?) predicted you'd enjoy it, because it says dark mood is all that matters. But you didn't. Falsified! Only (?, ?, Fast) survives in G:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step4.png" alt="Step 4: The Dark-mood-only hypothesis is falsified" /&gt;&lt;/p&gt;
&lt;p&gt;Now G says (?, ?, Fast)—pacing is all that matters—while S says (?, Dark, Fast)—both mood and pacing matter. They haven't converged yet. We need one more piece of evidence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 5:&lt;/strong&gt; You watch a light, fast-paced comedy and don't enjoy it. This is the final falsification. This movie &lt;em&gt;was&lt;/em&gt; fast-paced, matching G's only remaining requirement, but you still disliked it. The difference from the movies you enjoyed? It was Light, not Dark. G must specialize mood from &amp;quot;?&amp;quot; to &amp;quot;Dark&amp;quot;—and now S and G are identical:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step5.png" alt="Step 5: Final falsification forces convergence" /&gt;&lt;/p&gt;
&lt;p&gt;The version space has been squeezed from both sides until exactly one hypothesis remains:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/step_final_v2.png" alt="Final: The algorithm has converged" /&gt;&lt;/p&gt;
&lt;p&gt;Notice what happened here. The algorithm did not &lt;em&gt;build up&lt;/em&gt; a theory from observations. It &lt;em&gt;tore down&lt;/em&gt; the wrong theories. It started with a space of conjectures and refuted them using evidence. This is Popper's falsificationism implemented as a computer program. Mitchell himself frames concept learning as a search problem (Mitchell, p. 23), and as I've argued on &lt;a href="https://open.spotify.com/episode/5bnUoJP1D8nDLwFVK4yCaq"&gt;my podcast&lt;/a&gt;, search is really just a form of variation and selection—which is exactly how Donald Campbell and Karl Popper described the growth of knowledge.&lt;/p&gt;
&lt;p&gt;If the training data is error-free and the correct hypothesis is somewhere in the hypothesis space, the algorithm will converge on it. Every incorrect hypothesis gets falsified. The correct one survives.&lt;/p&gt;
&lt;h2&gt;The Catch: A Biased Hypothesis Space&lt;/h2&gt;
&lt;p&gt;But here's the problem. We restricted the hypothesis space to conjunctions of attributes—meaning the learned rule can only take the form &amp;quot;attribute 1 must be X &lt;em&gt;and&lt;/em&gt; attribute 2 must be Y &lt;em&gt;and&lt;/em&gt; attribute 3 must be Z&amp;quot; (where any of those can be relaxed to &amp;quot;any value is fine&amp;quot;).&lt;/p&gt;
&lt;p&gt;That works for a rule like (?, Dark, Fast)—&amp;quot;dark and fast-paced, any genre.&amp;quot; But what if your actual taste is more complicated? What if you enjoy dark action movies &lt;em&gt;and also&lt;/em&gt; light comedies—two completely different profiles with nothing in common? There is no single conjunction of attributes that captures both. Any conjunction broad enough to include dark action movies and light comedies would also include things you don't like, such as dark comedies or light action movies.&lt;/p&gt;
&lt;p&gt;Our conjunctive hypothesis space—including hypotheses with ? (any value) and null (no value)—contains only 46 hypotheses, a tiny fraction of the 65,536 possible concepts that could be defined over our 16 movie descriptions. As Mitchell puts it, &amp;quot;a very biased hypothesis space indeed!&amp;quot; (Mitchell, p. 41).&lt;/p&gt;
&lt;p&gt;If the correct hypothesis is not representable in the space, the algorithm will fail. It will eliminate every hypothesis, leaving the version space empty, and you'll know something has gone wrong.&lt;/p&gt;
&lt;p&gt;So there is an obvious temptation: why not just expand the hypothesis space to include every possible hypothesis? Remove the bias entirely. Let the learner consider any conceivable pattern in the data.&lt;/p&gt;
&lt;h2&gt;The Power Set: An Unbiased Learner&lt;/h2&gt;
&lt;p&gt;Mitchell takes this idea seriously. He proposes expanding the hypothesis space to the &lt;em&gt;power set&lt;/em&gt; of all possible instances—the set of all possible subsets (Mitchell, p. 40). This means the learner can now represent &lt;em&gt;any&lt;/em&gt; target concept whatsoever. No more restrictions. No more bias. The learner can consider disjunctions, negations, arbitrary combinations—everything. If you like dark action movies and light comedies but nothing else, there's a hypothesis for that. If you like exactly seven specific movies for no discernible reason, there's a hypothesis for that too.&lt;/p&gt;
&lt;p&gt;For our movie example with 16 possible movie descriptions, the power set contains 2^16—65,536 possible target concepts. Our biased conjunctive space could only represent a handful of those. The unbiased learner must now contend with all 65,536.&lt;/p&gt;
&lt;p&gt;Problem solved, right?&lt;/p&gt;
&lt;p&gt;Not even close. Mitchell shows that this &amp;quot;unbiased&amp;quot; learner is now &lt;em&gt;completely unable to generalize beyond the observed examples&lt;/em&gt; (Mitchell, p. 41).&lt;/p&gt;
&lt;p&gt;To see why, think about what the algorithm has to work with after seeing our five training examples. It knows you liked two specific movies and disliked three specific movies. In the biased version, the conjunctive restriction forced the algorithm's hand—there were only so many ways to draw the line, and most of them got falsified. But now? The hypothesis space contains &lt;em&gt;every possible&lt;/em&gt; way to divide the 16 movies into &amp;quot;liked&amp;quot; and &amp;quot;disliked.&amp;quot; And there are a staggering number of ways to do that which are perfectly consistent with our five data points.&lt;/p&gt;
&lt;p&gt;Consider a new movie you haven't rated—say (Drama, Dark, Fast). Should the algorithm predict you'll enjoy it? In the biased version, the answer was clear: (?, Dark, Fast) covers it, so yes. But in the unbiased version, for every hypothesis in the version space that says you'll like this movie, there exists another hypothesis that is identical in every respect—agrees on all five training examples—except that it says you &lt;em&gt;won't&lt;/em&gt; like this one (Mitchell, p. 41). Both hypotheses are equally consistent with everything the algorithm has seen.&lt;/p&gt;
&lt;p&gt;This isn't a minor inconvenience. It's total paralysis. The version space splits exactly 50/50 on every unseen movie (Mitchell, p. 41). The algorithm can only say &amp;quot;I don't know&amp;quot; to every new movie it encounters. It has become a glorified lookup table—perfectly memorizing what it has seen, but completely powerless to predict anything it hasn't.&lt;/p&gt;
&lt;p&gt;The reason is almost embarrassingly simple once you see it. In the biased version, the conjunctive restriction was doing most of the work. It told the algorithm: &amp;quot;The answer has a &lt;em&gt;structure&lt;/em&gt;—it's a rule defined by attribute values.&amp;quot; That assumption is what made it possible to look at a dark, fast-paced sci-fi movie and say &amp;quot;this is similar to the dark, fast-paced action movie you liked, so you'll probably like this too.&amp;quot; Without that structural assumption, there is no basis for calling any two movies &amp;quot;similar.&amp;quot; Each movie is just an isolated point, and knowing you liked one tells you nothing about any other.&lt;/p&gt;
&lt;p&gt;To converge on a single final hypothesis, this unbiased learner would need to see &lt;em&gt;every single one&lt;/em&gt; of the 16 possible movies as a training example (Mitchell, p. 41). At which point it hasn't learned anything—it's just stored your complete viewing history.&lt;/p&gt;
&lt;h2&gt;Mitchell's Conclusion: The Futility of Bias-Free Learning&lt;/h2&gt;
&lt;p&gt;Mitchell states his conclusion directly:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;quot;A learner that makes no a priori assumptions regarding the identity
of the target concept has no rational basis for classifying any unseen
instances.&amp;quot; (Mitchell, p. 42)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Read that again. It is a remarkable statement. It means that the only reason the original Candidate-Elimination algorithm could generalize at all was because it had a built-in assumption—the bias toward conjunctive hypotheses—that constrained the space of possibilities.&lt;/p&gt;
&lt;p&gt;Mitchell calls this assumption the algorithm's &amp;quot;inductive bias.&amp;quot; We'll discuss the formal definition of inductive bias in more detail in a future post. But a short version is that the inductive bias of a learner is the minimal set of additional assertions B such that the learner's classifications follow deductively from B combined with the training data (Mitchell, pp. 42-43).&lt;/p&gt;
&lt;p&gt;That last part is the kicker. Mitchell is saying that what looks like induction is actually deduction in disguise—just as Popper claimed. The algorithm appears to be generalizing from data, but what it is really doing is deducing conclusions from the data plus an unstated set of assumptions.&lt;/p&gt;
&lt;h2&gt;The Deductive Theorem Prover&lt;/h2&gt;
&lt;p&gt;Mitchell makes this equivalence explicit with a striking thought experiment (Mitchell, pp. 43-44). Imagine two systems side by side.&lt;/p&gt;
&lt;p&gt;On the left, the Candidate-Elimination algorithm—exactly the one we just walked through. You feed it your movie ratings and a new movie to classify. It searches the version space and outputs a prediction.&lt;/p&gt;
&lt;p&gt;On the right, a deductive theorem prover. You feed it the same movie ratings, the same new movie, and one additional input: the explicit assertion &amp;quot;the target concept can be represented as a conjunction of the attributes Genre, Mood, and Pacing.&amp;quot;&lt;/p&gt;
&lt;p&gt;Mitchell proves that these two systems will produce &lt;em&gt;identical outputs&lt;/em&gt; for every possible set of training examples and every possible new instance. They are functionally the same system. The only difference is that the inductive bias is implicit in the code of the learning algorithm, while it is explicit as an input to the theorem prover.&lt;/p&gt;
&lt;p&gt;Think about what this means. When our algorithm concluded (?, Dark, Fast), it felt like it was &lt;em&gt;inducing&lt;/em&gt; a general rule from specific examples. But Mitchell has shown that it was actually &lt;em&gt;deducing&lt;/em&gt; a conclusion from the training data &lt;em&gt;plus&lt;/em&gt; an unstated assumption about the structure of the answer. The assumption—&amp;quot;your taste can be expressed as a conjunction of attributes&amp;quot;—was baked into the algorithm's design. Make that assumption explicit, hand it to a theorem prover along with the data, and you get the same answer by pure deduction.&lt;/p&gt;
&lt;p&gt;As Mitchell puts it, the inductive bias &amp;quot;exists only in the eye of us beholders. Nevertheless, it is a perfectly well-defined set of assertions&amp;quot; (Mitchell, p. 44).&lt;/p&gt;
&lt;p&gt;The so-called induction was deduction all along.&lt;/p&gt;
&lt;h2&gt;What This Means&lt;/h2&gt;
&lt;p&gt;Let me be blunt about the implications.&lt;/p&gt;
&lt;p&gt;Mitchell has shown that every &amp;quot;inductive&amp;quot; learning algorithm is, underneath, a deductive system operating on unstated assumptions. The assumptions are doing the real work. The data merely selects among the possibilities that the assumptions have already circumscribed. Without those assumptions, we saw what happens: the unbiased learner is paralyzed, unable to classify a single new instance.&lt;/p&gt;
&lt;p&gt;This is precisely what Popper argued from the philosophy side. You cannot derive general theories from observations alone. You always need a prior theoretical framework—what Kant called imposing laws upon nature—to make sense of the data. The data doesn't speak for itself. It never has.&lt;/p&gt;
&lt;p&gt;But here is the part that both the machine learning community and many Popperians seem to miss. Popper proved that you can't generalize from observations &lt;em&gt;alone&lt;/em&gt;. He did &lt;em&gt;not&lt;/em&gt; prove that you can't generalize at all. Mitchell's work shows exactly when and how generalization becomes possible: when you bring background knowledge—an inductive bias—to the table. That bias, combined with observations, lets you deduce conclusions you couldn't have reached with either one alone.&lt;/p&gt;
&lt;p&gt;The learner doesn't start from a blank slate and induce its way to knowledge. It starts with a constrained space of possibilities and uses observations to falsify the wrong ones. That is not &amp;quot;induction&amp;quot; in the classical Baconian sense that Popper demolished. It is conjecture and refutation, running on silicon.&lt;/p&gt;
&lt;p&gt;Bias-free learning is futile. But &lt;em&gt;biased&lt;/em&gt; learning—learning with prior theoretical commitments—is not only possible, it is the &lt;em&gt;only&lt;/em&gt; kind of learning there is.&lt;/p&gt;
&lt;p&gt;This is a companion post to a &lt;a href="https://open.spotify.com/episode/5bnUoJP1D8nDLwFVK4yCaq"&gt;podcast episode&lt;/a&gt; where I discuss these ideas in more depth, including how they relate to Karl Popper's epistemology and David Deutsch's interpretation of it. All page references to Mitchell are from &lt;a href="https://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf"&gt;&lt;em&gt;Machine Learning&lt;/em&gt; (McGraw-Hill, 1997)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you need help with your &lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Thu, 12 Mar 2026 16:55:14 -0600</pubDate>
      <a10:updated>2026-03-12T16:55:14-06:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2697</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/adventures-in-langchains-quick-start-tutorial-using-ollama/</link>
      <category>System.String[]</category>
      <title>Adventures in LangChain's "Quick Start Tutorial" (Using Ollama)</title>
      <description>&lt;p&gt;Suppose you want to learn &lt;a href="https://www.langchain.com/"&gt;LangChain&lt;/a&gt;, so naturally you go to their &lt;a href="https://docs.langchain.com/oss/python/langchain/quickstart"&gt;quick start tutorial page&lt;/a&gt;. And here is what you find as their first suggested tutorial:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from langchain.agents import create_agent

def get_weather(city: str) -&amp;gt; str:
    &amp;quot;&amp;quot;&amp;quot;Get weather for a given city.&amp;quot;&amp;quot;&amp;quot;
    return f&amp;quot;It's always sunny in {city}!&amp;quot;

agent = create_agent(
    model=&amp;quot;claude-sonnet-4-5-20250929&amp;quot;,
    tools=[get_weather],
    system_prompt=&amp;quot;You are a helpful assistant&amp;quot;,
)

# Run the agent
agent.invoke(
    {&amp;quot;messages&amp;quot;: [{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;what is the weather in sf&amp;quot;}]}
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Okay, but this requires you to run &lt;a href="https://claude.ai/"&gt;Claude&lt;/a&gt;. But doesn't Claude require an api key? Well, it doesn't say anything about that. Maybe there's a free Claude api tier that requires no key? &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Could not resolve authentication method. Expected either api_key or auth_token to be set.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Oops, nope. You need a Claude api key. I guess they were assuming I'd just know that and already have it set in my environment? Seems like a weird assumption for a quick start tutorial, but okay, I guess?&lt;/p&gt;
&lt;p&gt;No worries, I'll just go get a Claud api key. &lt;/p&gt;
&lt;p&gt;But wait, doesn't Claude require me to pay for tokens?&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/claude.png" alt="Claude Free Plan!" /&gt;&lt;/p&gt;
&lt;p&gt;Oh, whew! Claude has a free tier for testing! Good! Okay, not too bad. Let's try that quick start tutorial again.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Your credit balance is too low to access the Anthropic API. Please go to Plans &amp;amp; Billing to upgrade or purchase credits.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Arg! Okay, apparently Claude is lying and there is no free tier! Even though I took that verbiage directly off the API key page for Claude, I guess they meant I only had access to the web interface? &lt;/p&gt;
&lt;p&gt;I tried asking the &lt;a href="https://chat.langchain.com/"&gt;LangChain AI&lt;/a&gt; why this didn't work and it recommended I try out the OpenAI free tier instead and even helpfully gave me a rewrite to the 'quick start tutorial' that would do this. &lt;/p&gt;
&lt;p&gt;Wait, doesn't OpenAI have no free tier? Didn't they sunset that like two seconds after they went live and ended up with so many users? But surely the LangChain AI knows what it's talking about, right? &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Nope, the LangChain AI doesn't have a clue. &lt;/p&gt;
&lt;p&gt;Okay, no worries, I'll just rewrite this 'quick start tutorial' to use ollama. Here is what I try:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# pip pip install -U langchain langchain-ollama
# Requires Python 3.10+
# pip install -U langchain langchain-ollama
# Requires Ollama installed + &amp;quot;ollama pull llama3.2&amp;quot;
from langchain_ollama import ChatOllama  # Changed from OpenAI
from langchain.agents import create_agent
from langchain_core.tools import tool

@tool
def get_weather(city: str) -&amp;gt; str:
    &amp;quot;&amp;quot;&amp;quot;Get weather for a given city.&amp;quot;&amp;quot;&amp;quot;
    return f&amp;quot;It's always sunny in {city}!&amp;quot;

# No secrets needed!
llm = ChatOllama(model=&amp;quot;llama3.2:1b&amp;quot;)  # Free local model
agent = create_agent(
    model=llm,
    tools=[get_weather],
    system_prompt=&amp;quot;You are a helpful assistant.&amp;quot;,
)

# Run agent
# Our message
message = {
    &amp;quot;messages&amp;quot;: [{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;what is the weather in sf&amp;quot;}]
}

result = agent.invoke(message)

# Print full result (shows all messages + tool calls)
print(&amp;quot;Full result:&amp;quot;, result)

# Print JUST the model's final response
print(&amp;quot;Model response:&amp;quot;, result[&amp;quot;messages&amp;quot;][-1].content)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Surely, I'm finally ready to use this 'quick start tutorial', right?&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Model response: There is no weather data for that city.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What!? Okay, I was using Llama 1b. That sad little model probably can't figure out how to use the tool calling in this 'quick start' tutorial. &lt;/p&gt;
&lt;p&gt;Let's try Lama3.2:3b instead. I'll also improve the system prompt and get a lot more specific.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Current weather conditions in San Francisco are not available.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It must not be using the provided tool. Let me just set a breakpoint. &lt;/p&gt;
&lt;p&gt;Nope, it's using the tool...&lt;/p&gt;
&lt;p&gt;Let's just run it again, just to see what happens:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Model response: I made a mistake! The actual response from the tool is:

&amp;quot;Currently Sunny
Temperature: 62°F
Conditions: Clear
Wind: Light (5 mph)
Sky Conditions: Partly Cloudy&amp;quot;

So, to correct my previous response: The weather in SF is currently sunny with a temperature of 62°F.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Wow, it got the sunny right, finally, but it basically just made the rest of that answer up entirely. Sigh. &lt;/p&gt;
&lt;p&gt;Well, there you go. Here's my final revised 'quick start' code to use LangChain with oLlama. Good luck! (You'll need it.) &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Requires Python 3.10+
# pip install -U langchain langchain-ollama
# Requires Ollama installed + &amp;quot;ollama pull llama3.2&amp;quot;
from langchain_ollama import ChatOllama  # Changed from OpenAI
from langchain.agents import create_agent
from langchain_core.tools import tool

@tool
def get_weather(city: str) -&amp;gt; str:
    &amp;quot;&amp;quot;&amp;quot;Get weather for a given city.&amp;quot;&amp;quot;&amp;quot;
    return f&amp;quot;It's always sunny in {city}!&amp;quot;

# No secrets needed!
llm = ChatOllama(model=&amp;quot;llama3.2:3b&amp;quot;)  # Free local model
agent = create_agent(
    model=llm,
    tools=[get_weather],
    system_prompt=&amp;quot;&amp;quot;&amp;quot;
    You are a weather assistant.
    1. ALWAYS use get_weather tool for weather questions
    2. REPORT EXACTLY what the tool returns - do not make up data
    3. Tool result = actual weather data
    4. Base your answer ONLY on tool output
    &amp;quot;&amp;quot;&amp;quot;,
)

# Run agent
# Our message
message = {
    &amp;quot;messages&amp;quot;: [{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;what is the weather in sf&amp;quot;}]
}

result = agent.invoke(message)

# Print full result (shows all messages + tool calls)
print(&amp;quot;Full result:&amp;quot;, result)

# Print JUST the model's final response
print(&amp;quot;Model response:&amp;quot;, result[&amp;quot;messages&amp;quot;][-1].content)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://github.com/brucenielson/LangChainTutorials/blob/51aa88f535ee48160061d44280375ca1cbf5f0fd/simple_example.py"&gt;Github repo found here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/services/artificial-intelligence/"&gt;If you need help with your Artificial Intelligence solutions, we're here to help&lt;/a&gt;.&lt;/p&gt;
</description>
      <pubDate>Fri, 20 Feb 2026 12:00:00 -0700</pubDate>
      <a10:updated>2026-02-20T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2694</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/weekend-warrior-project-ai-patent-checking/</link>
      <category>System.String[]</category>
      <title>Weekend Warrior Project: AI Patent Checking</title>
      <description>&lt;p&gt;Ever had an idea and wondered, &lt;em&gt;“Wait… has someone already patented this?”&lt;/em&gt; At Mindfire Tech, we recently faced that exact problem. We were working on an AI patent, and before we could go any further, we needed to know if the concept was already patented.&lt;/p&gt;
&lt;p&gt;Sure, we could have spent hours — maybe days — digging through USPTO records, manually reading abstracts, and trying to make sense of hundreds of documents. Or… we could let AI do it for us.  
&lt;/p&gt;
&lt;p&gt;One weekend later, we had a fully functioning AI Patent Search agent. In a few hours, we built a tool where you just paste in your patent abstract (or the full patent!), and it will:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Generate intelligent search queries.&lt;/li&gt;
&lt;li&gt;Search the USPTO patent database automatically. &lt;/li&gt;
&lt;li&gt;Refine queries if there are too many or too few results.  
&lt;/li&gt;
&lt;li&gt;Retrieve abstracts and documents.  
&lt;/li&gt;
&lt;li&gt;Rank the results by relevance.  
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All you have to do is review the final top results — the agent does the rest. The funniest part? Writing the AI agent was &lt;em&gt;far easier&lt;/em&gt; than trying to do this work manually. Instead of getting lost in endless PDFs and XML files, we spent the weekend building a tool that could save weeks of research for anyone working on patents.  
&lt;/p&gt;
&lt;p&gt;It’s the kind of “weekend warrior” project that feels like magic: a fully functional AI agent built in a single weekend, solving a problem that used to take days. At Mindfire, that’s exactly the kind of creative, high-impact AI work we love doing for our clients.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/screenshot-2026-01-14-204551-1.png" alt="Mindfire's Patent Search" /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;How It Works: The Tech Behind the Magic&lt;/h2&gt;
&lt;p&gt;While the AI agent feels seamless, there’s a lot going on under the hood. Here’s a breakdown of the technologies that make this possible:&lt;/p&gt;
&lt;h3&gt;1. Gradio for a Clean, Instant UI&lt;/h3&gt;
&lt;p&gt;We used &lt;strong&gt;Gradio&lt;/strong&gt;, a Python library, to create a web interface in minutes.  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Textbox for your patent idea  
&lt;/li&gt;
&lt;li&gt;Dropdown to select an AI model  
&lt;/li&gt;
&lt;li&gt;Button to trigger the search  
&lt;/li&gt;
&lt;li&gt;Expandable previews of retrieved patents  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All interactive, all local — no frontend frameworks, no headaches.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;2. Local LLMs with Ollama&lt;/h3&gt;
&lt;p&gt;All AI reasoning happens locally using &lt;strong&gt;Ollama&lt;/strong&gt;.  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Models like Gemma, Llama, and Mistral run right on your machine  
&lt;/li&gt;
&lt;li&gt;No data leaves your system — perfect for confidential IP  
&lt;/li&gt;
&lt;li&gt;Fast enough to generate search queries and semantic rankings in seconds  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We used Ollama as the backbone for our AI agent, orchestrated with DSPy.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;3. DSPy: Structured AI Orchestration&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;DSPy&lt;/strong&gt; provides the “brains” of the agent:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Generates structured queries from your patent idea  
&lt;/li&gt;
&lt;li&gt;Refines queries if the first pass fails  
&lt;/li&gt;
&lt;li&gt;Scores patents for relevance using their abstracts  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;DSPy enforces strict input/output formats, which keeps the AI’s predictions consistent and reliable. It’s what turns a raw LLM into a real research assistant.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;4. Multi-Stage Patent Retrieval&lt;/h3&gt;
&lt;p&gt;The agent searches the USPTO using our &lt;strong&gt;Python wrapper for their API&lt;/strong&gt;:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Submits structured keyword queries  
&lt;/li&gt;
&lt;li&gt;Filters only granted patents  
&lt;/li&gt;
&lt;li&gt;Fetches up to hundreds of results per query  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If a query returns zero or too many results, the agent automatically refines it.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;5. FlashRank Reranking&lt;/h3&gt;
&lt;p&gt;Once we have raw results, we use &lt;strong&gt;FlashRank&lt;/strong&gt;, a fast CPU-based reranker:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scores patents by relevance to your idea  
&lt;/li&gt;
&lt;li&gt;Works without GPUs or PyTorch  
&lt;/li&gt;
&lt;li&gt;Reduces hundreds of results to the most relevant top candidates  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This ensures you only see what really matters.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;6. Full Patent Document Retrieval&lt;/h3&gt;
&lt;p&gt;For the top patents, the agent fetches the &lt;strong&gt;full XML from the USPTO&lt;/strong&gt;:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Extracts abstracts with &lt;code&gt;lxml&lt;/code&gt;  
&lt;/li&gt;
&lt;li&gt;Prepares server-side previews for your review  
&lt;/li&gt;
&lt;li&gt;Keeps everything local and ready for the next step  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No more opening PDFs one by one — the agent handles the tedious part.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;7. Semantic Reranking with LLMs&lt;/h3&gt;
&lt;p&gt;Finally, the agent uses the LLM again to read and rank abstracts semantically:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scores each abstract 0–1 for relevance  
&lt;/li&gt;
&lt;li&gt;Sorts patents so the top few are the ones you actually care about  
&lt;/li&gt;
&lt;li&gt;Provides a concise, actionable list of results, including the actual abstracts. &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This makes the final output feel like a real research assistant — not just a search engine.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;The Result&lt;/h3&gt;
&lt;p&gt;In a single weekend, we created an agent that:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understands your patent idea  
&lt;/li&gt;
&lt;li&gt;Queries official government databases  
&lt;/li&gt;
&lt;li&gt;Retrieves, parses, and ranks hundreds of patents  
&lt;/li&gt;
&lt;li&gt;Presents a clean, interactive review interface  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s fast, reliable, and ready to save anyone hours of manual research.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;AI Patent Search Agent Pipeline&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/pipeline.png" alt="Patent AI Agent Pipeline" /&gt;&lt;/p&gt;
&lt;p&gt;The diagram above shows how the agent flows from idea input to semantic scoring, highlighting all the key stages of retrieval, refinement, and ranking.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Why Mindfire Can Help&lt;/h2&gt;
&lt;p&gt;Building an AI prototype in a weekend is fun — but imagine what our team can do with more time and your unique business challenges.  
&lt;/p&gt;
&lt;p&gt;At Mindfire Tech, we specialize in:  
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Custom AI agents for research and analysis  
&lt;/li&gt;
&lt;li&gt;Automation that saves weeks of manual work  
&lt;/li&gt;
&lt;li&gt;Secure, locally hosted AI solutions for sensitive projects  
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you’re curious about how AI can transform your workflows — patent research or beyond — we’d love to talk.  
&lt;/p&gt;
&lt;p&gt;After all, if we can build a full patent search agent in a weekend, imagine what we could build for your business.&lt;/p&gt;
</description>
      <pubDate>Fri, 06 Feb 2026 12:00:00 -0700</pubDate>
      <a10:updated>2026-02-06T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2699</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/from-prompts-to-applications-a-beginner-s-introduction-to-langchain/</link>
      <category>System.String[]</category>
      <title>From Prompts to Applications: A Beginner’s Introduction to LangChain</title>
      <description>&lt;p&gt;If you’ve spent any time experimenting with large language models, you’ve probably had this experience: the first few prompts feel magical — and then things get messy and difficult. You want your model to look things up, remember context, call tools, or follow a multi-step process. Suddenly, a single prompt isn’t enough. And those dang Large Language Models (LLMs) seem to have a mind of their own. &lt;/p&gt;
&lt;p&gt;That’s the gap &lt;a href="https://www.langchain.com/"&gt;LangChain&lt;/a&gt; was created to fill.&lt;/p&gt;
&lt;p&gt;LangChain is an open-source framework designed to help developers move from isolated LLM calls to structured AI applications. Instead of treating a language model as a black box that simply returns text, LangChain encourages you to think in terms of workflows — sequences of steps that combine models, tools, and data sources into something more reliable and reusable. This philosophy is central to how the project describes itself and why it exists (&lt;a href="https://docs.langchain.com/oss/python/langchain/philosophy"&gt;LangChain Philosophy&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;What LangChain Actually Is&lt;/h2&gt;
&lt;p&gt;At a high level, LangChain provides abstractions for working with language models in a consistent way. It standardizes how you connect to different model providers and how models interact with external systems. The core idea is simple but powerful: Models should be used for more than just text generation - they should also be used to orchestrate more complex flows that interact with other data. (See &lt;a href="https://docs.langchain.com/oss/python/langchain/philosophy"&gt;LangChain Philosophy&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;This shows up in LangChain’s building blocks. You’ll often hear terms like &lt;em&gt;chains&lt;/em&gt;, &lt;em&gt;tools&lt;/em&gt;, and &lt;em&gt;agents&lt;/em&gt;. Chains represent ordered steps of computation, tools allow models to interact with external data or APIs, and agents enable models to decide dynamically which tools to use. Together, these concepts make it easier to build things like retrieval-augmented generation (RAG), multi-step reasoning pipelines, and AI assistants that can interact with real-world data.&lt;/p&gt;
&lt;h2&gt;Installing LangChain&lt;/h2&gt;
&lt;p&gt;LangChain is intentionally easy to get started with, especially for Python developers. Installation typically starts with the core package, followed by optional integrations for the model providers you plan to use:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install -U langchain
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Or do a provider specific version like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install -U langchain-openai
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Or (better yet for us low cost AI solution types) this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install -U langchain langchain-ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This modular approach lets developers start small and only pull in what they need, while still leaving room to grow into more advanced use cases later (see the official &lt;a href="https://docs.langchain.com/oss/python/langchain/install"&gt;LangChain Installation Docs&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;A Quick Start Example&lt;/h2&gt;
&lt;p&gt;This short quick start tutorial should get you started:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage

# Initialize the Ollama model
llm = ChatOllama(model=&amp;quot;llama3.2:1b&amp;quot;)

# Create a human message
message = HumanMessage(content=&amp;quot;Write a short introduction about LangChain&amp;quot;)

# Generate a response
response = llm.generate([[message]])

# The response object contains generations
print(response.generations[0][0].text)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You will have to have &lt;a href="https://www.mindfiretechnology.com/blog/archive/installing-ollama-for-large-language-models-llm-in-windows/"&gt;&lt;strong&gt;ollama already installed&lt;/strong&gt;&lt;/a&gt; and llama3.2:1b downloaded. &lt;/p&gt;
&lt;h2&gt;When LangChain Shines&lt;/h2&gt;
&lt;p&gt;LangChain works best when your application needs more than a single prompt and response. If you’re building workflows that involve retrieving information from documents, calling external APIs, coordinating multiple LLM calls, or guiding a model through a multi-step reasoning process, LangChain provides helpful structure without forcing a rigid architecture.&lt;/p&gt;
&lt;p&gt;It is especially popular as a prototyping and experimentation tool. Many developers use LangChain to explore ideas like agent-based workflows or retrieval-augmented generation (RAG) systems before deciding how much structure they want to carry forward into production. Examples like agentic GraphRAG systems built with LangChain show how it can connect models, tools, and data sources in flexible ways (&lt;a href="https://ai.gopubby.com/agentic-graphrag-with-neo4j-and-langchain-ce03b344149c"&gt;Agentic GraphRAG with LangChain&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;When to Be Careful&lt;/h2&gt;
&lt;p&gt;Despite its popularity, LangChain is not universally considered production-ready out of the box. Some developers argue that its abstractions can introduce unnecessary complexity, make debugging harder, or obscure performance characteristics in real-world systems. This concern comes up frequently in discussions about production RAG pipelines, where fine-grained control over retrieval logic, latency, and observability is often critical (&lt;a href="https://medium.com/@aldendorosario/langchain-is-not-for-production-use-here-is-why-9f1eca6cce80"&gt;LangChain Is Not for Production Use — Here Is Why&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;This doesn’t mean LangChain shouldn’t be used at all — but it does mean it should be treated as a toolkit rather than a complete solution. Strong engineering practices are still required to turn prototypes into reliable, maintainable applications.&lt;/p&gt;
&lt;h2&gt;LangChain in the Broader Ecosystem&lt;/h2&gt;
&lt;p&gt;LangChain exists alongside several other frameworks with overlapping goals. Tools like Haystack focus more heavily on search and retrieval performance, while newer projects like LangGraph emphasize lower-level control over agent workflows. Which framework makes sense depends largely on your use case, performance requirements, and tolerance for abstraction (&lt;a href="https://medium.com/@amit25173/langchain-vs-haystack-7fa0faa901cd"&gt;LangChain vs Haystack&lt;/a&gt;, &lt;a href="https://medium.com/data-science/ai-agent-workflows-a-complete-guide-on-whether-to-build-with-langgraph-or-langchain-117025509fa0"&gt;LangGraph vs LangChain&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;Final Thoughts&lt;/h2&gt;
&lt;p&gt;LangChain is best understood as a bridge. It connects the early excitement of prompt-based experimentation with the practical reality of building real AI applications. For beginners, it offers a way to think beyond prompts and start designing systems composed of models, tools, and workflows. Used thoughtfully, LangChain can be an effective stepping stone from experimentation to production — as long as its tradeoffs are clearly understood.&lt;/p&gt;
</description>
      <pubDate>Fri, 30 Jan 2026 12:00:00 -0700</pubDate>
      <a10:updated>2026-01-30T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2692</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/ai-bubble-or-breakthrough-a-look-at-the-risks-and-the-rewards/</link>
      <category>System.String[]</category>
      <title>AI: Bubble or Breakthrough? A Look at the Risks and the Rewards</title>
      <description>&lt;p&gt;Artificial intelligence is arguably &lt;em&gt;the most hyped technology in decades&lt;/em&gt;. But hype isn’t the same as reality. In the debate over whether AI is in a bubble, there is both a pessimistic and a more optimistic view. &lt;strong&gt;Cory Doctorow&lt;/strong&gt; warns that the economic foundations of AI are shaky and potentially disastrous, and &lt;strong&gt;Jeff Bezos&lt;/strong&gt;, who agrees there’s a bubble — but insists it’s the &lt;em&gt;good kind&lt;/em&gt; and heralds long-term benefits.&lt;/p&gt;
&lt;p&gt;Let’s explore both sides with concrete examples.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Pessimistic View: AI as “Funny Money” and Fragile Economics&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://doctorow.medium.com/https-pluralistic-net-2025-09-27-econopocalypse-subprime-intelligence-e9a06136d109"&gt;Cory Doctorow’s article &lt;em&gt;“The real (economic) AI apocalypse is nigh”&lt;/em&gt;&lt;/a&gt; argues that &lt;strong&gt;today’s AI bubble isn’t built on sound economics&lt;/strong&gt; but on massive investor excitement and weak financial fundamentals. He paints a picture of an industry where &lt;em&gt;capital flows in without sustainable revenue models or profit paths&lt;/em&gt; — classic signs of a bubble that could end badly.&lt;/p&gt;
&lt;p&gt;Doctorow argues many AI firms “&lt;em&gt;keep the lights on by soaking up hundreds of billions of dollars in other people’s money and then lighting it on fire.&lt;/em&gt;” He argues it’s a dependency on the continual infusion of new capital, hardly a sustainable business model.&lt;/p&gt;
&lt;p&gt;Doctorow points out that, unlike past emerging technologies, &lt;strong&gt;each generation of AI technology costs more to build and operate than the last&lt;/strong&gt;, with little evidence that those costs decline or that scaling will improve profitability.&lt;/p&gt;
&lt;p&gt;He also points out that some AI data center companies are reportedly &lt;em&gt;collateralizing loans with tens of thousands of GPUs&lt;/em&gt;, despite these chips losing value quickly — a bizarre financial setup that signals desperation rather than strength.&lt;/p&gt;
&lt;p&gt;Even scarier is where Doctorow highlights practices where &lt;strong&gt;the same money is booked as an investment by one company and revenue by another&lt;/strong&gt; — such as Microsoft “investing” in OpenAI by providing server access, then counting that as revenue. That’s &lt;em&gt;not&lt;/em&gt; real earnings; it’s accounting sleight-of-hand.&lt;/p&gt;
&lt;p&gt;Finally, he quotes a venture capitalist suggesting AI firms would need to &lt;em&gt;sell hundreds of billions worth of services just to break even&lt;/em&gt;, a goal so large it’s hard to take seriously given current revenue levels. Is there really this much demand for AI in real life?&lt;/p&gt;
&lt;p&gt;Doctorow's view is that this whole thing is just hype run out of control, leading to bad decisions—such as firing people over AI that can't do the job. In Doctorow's opinion, this is just the market gone mad.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Optimistic Take: Jeff Bezos on the “Industrial Bubble”&lt;/h2&gt;
&lt;p&gt;But not everyone sees this as &lt;em&gt;pure&lt;/em&gt; madness. Jeff Bezos — longtime tech leader and founder of Amazon — also acknowledges we’re in &lt;em&gt;bubble territory&lt;/em&gt;, but his perspective is more nuanced.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=4Vf8pljp1FY"&gt;Bezos said at &lt;em&gt;Italian Tech Week 2025&lt;/em&gt;&lt;/a&gt; that &lt;strong&gt;AI’s rapid growth and inflated valuations fit the classic pattern of a bubble&lt;/strong&gt;, where “every experiment and every company gets funded — the good ideas and the bad ones alike.” &lt;/p&gt;
&lt;p&gt;Bezos compared the AI bubble to past tech booms particularly the dot-com bubble. During the dot-com bust, Amazon’s stock famously plummeted from roughly &lt;em&gt;$113 to around $6 a share&lt;/em&gt; during the early 2000s bust, even as its business fundamentals were improving. This shows how markets can wildly misprice companies in a bubble. &lt;/p&gt;
&lt;p&gt;Bezos also points out that during the dot-com bubble everyone was chasing the Internet. Fiber-optic companies went bankrupt laying fiber, but the fiber they laid was bought out by other companies and is still around. The overall benefit to society was real. &lt;/p&gt;
&lt;p&gt;This actually makes sense. Everyone can sense that AI is a transformative technology just like the Internet was. Back then everyone was trying to figure out how to use (and preferably dominate) the Internet because they knew there was massive profits on the line. So, from an investor's perspective, it made sense to chase these profits even if you risked losing everything. And for the companies that won this mad dash—Amazon, Google, Netflix, Facebook, etc—the rewards were well worth the risk. &lt;/p&gt;
&lt;p&gt;Notice how this doesn't deny the problems Doctorow highlights, but it explains it differently. Instead of trying to explain it as market madness, it explains it as a correct understanding that huge profits are on the line. The problem is that no one knows at this point who the big winners are going to be. &lt;/p&gt;
&lt;p&gt;This is what Bezos calls an &lt;em&gt;industrial bubble&lt;/em&gt;: &lt;strong&gt;capital may flow to losers, but the underlying technology creates lasting value&lt;/strong&gt;. Compare this to the bad kind of bubble, such as the 2008 housing bubble. That bubble was all bad because it really was just market madness with on actual paradigm-changing technology at its center.&lt;/p&gt;
&lt;p&gt;Bezos points out that during hype cycles, &lt;strong&gt;investors often can’t distinguish good ideas from bad ones&lt;/strong&gt; — meaning capital pours into projects that may never have viable products. But in the process, &lt;em&gt;real innovation also gets funded&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Importantly, Bezos argues that &lt;em&gt;just because valuations are frothy doesn’t mean the technology itself isn’t transformative&lt;/em&gt;. The dot-com bubble was a bubble, but the Internet really did transform everything. So, Bezos maintains that &lt;strong&gt;AI will change every industry&lt;/strong&gt; and that the benefits to society will be “gigantic.” &lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;So What Should We Make of It?&lt;/h2&gt;
&lt;h3&gt;🚨 The Bubble Side&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;AI has &lt;strong&gt;massive investment without clear returns&lt;/strong&gt;. &lt;/li&gt;
&lt;li&gt;Financial engineering and accounting quirks mask true economics.  
&lt;/li&gt;
&lt;li&gt;If capital stops flowing, many companies might fail spectacularly.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;🌟 The Breakthrough Side&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Innovation doesn’t happen without experimentation and funding&lt;/strong&gt;, even if it’s wasteful at times.&lt;/li&gt;
&lt;li&gt;Past &lt;em&gt;industrial bubbles&lt;/em&gt; (Internet) left real infrastructure and products behind. &lt;/li&gt;
&lt;li&gt;AI’s underlying &lt;em&gt;technical progress&lt;/em&gt; isn’t a mirage — it’s real and pushing boundaries.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;So I Should Be Afraid of the AI Bubble?&lt;/h2&gt;
&lt;p&gt;Surprisingly, no — though, you probably &lt;em&gt;should&lt;/em&gt; be concerned. Take a look at this chart of the Shiller PE ratio and our three bubbles:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/shiller-pe.png" alt="Three Bubbles" /&gt;&lt;/p&gt;
&lt;p&gt;A Shiller PE is a price-to-earnings ratio, except smoothed out over 10 years. It's a great way to fundamentally value the market historically that smooths out all the local ups and downs. During the dot-com bubble the Shiller PE got up to 44 against a historical average of closer to 20. We are currently at 40. Even the 2008 housing bubble only got up to 28. So there is a lot of potential downside in this AI bubble. The stock market is clearly overvalued in large part due to the AI bubble.&lt;/p&gt;
&lt;h2&gt;Conclusion: Bubble or Breakthrough?&lt;/h2&gt;
&lt;p&gt;The answer might be &lt;strong&gt;both&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;AI today shows many signs of a bubble: unsustainable valuations, weak unit economics, and overfunded startups with no path to profit. That’s worth worry — especially for investors and workers whose livelihoods depend on sound business fundamentals.&lt;/p&gt;
&lt;p&gt;But if we look at history, &lt;em&gt;bubbles aren’t uniformly bad&lt;/em&gt;. As Jeff Bezos emphasizes, when the dust settles, the infrastructure and breakthroughs left behind can help reshape industries for decades.&lt;/p&gt;
&lt;p&gt;Whether you come down on the &lt;em&gt;doom&lt;/em&gt; side or the &lt;em&gt;long-term value&lt;/em&gt; side, one thing is clear: &lt;strong&gt;AI’s story is just beginning, and its economic impact will be debated for years to come&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In the meantime, one value that can be certain from AI is its use in projects and software that have an immediate purpose and measurable benefit or profitability. Mindfire Tech is committed to providing state-of-the-art software solutions and we believe AI can be a great boon to those who know how to use it properly. If you are interested in learning more about how AI could be used in your company or in your future projects, &lt;a href="https://www.mindfiretechnology.com/contact-us"&gt;please don't hesitate to reach out for a free consultation&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;Sources:&lt;/em&gt;&lt;br /&gt;
- Cory Doctorow, &lt;em&gt;The real (economic) AI apocalypse is nigh&lt;/em&gt; — Medium. :contentReference[oaicite:18]{index=18}&lt;br /&gt;
- Jeff Bezos on the AI bubble, Italian Tech Week 2025 — various reporting. :contentReference[oaicite:19]{index=19}&lt;/p&gt;
</description>
      <pubDate>Tue, 20 Jan 2026 12:00:00 -0700</pubDate>
      <a10:updated>2026-01-20T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2688</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/building-a-local-deepseek-r1-chatbot-with-chainlit/</link>
      <category>System.String[]</category>
      <title>Building a Local DeepSeek R1 Chatbot with Chainlit</title>
      <description>&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/blog/archive/building-a-local-deepseek-r1-chatbot-with-streamlit-and-ollama/"&gt;In the previous post&lt;/a&gt; we used &lt;a href="https://streamlit.io/"&gt;Streamlit&lt;/a&gt; and &lt;a href="https://ollama.com/"&gt;Ollama&lt;/a&gt; to build a local Deepseek R1 chatbot. Let's now do the same thing using &lt;a href="https://docs.chainlit.io/get-started/overview"&gt;Chainlit&lt;/a&gt; as our UI to try out Chainlit. Chainlit provides an elegant, real-time chat interface out of the box, and it works beautifully with models running through &lt;strong&gt;Ollama&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Just like before, DeepSeek R1 will stream both its &lt;em&gt;thinking process&lt;/em&gt; and its final answer. Chainlit is dedicated specifically to building chatbots and doesn't seem quite as flexible as Streamlit. This led to at least one problem I'll explain below.&lt;/p&gt;
&lt;p&gt;The full working code is included in this post and &lt;a href="https://github.com/brucenielson/PythonUIs/blob/2f8eb60bb97350ce3e5a2b22d05520f624aaa692/chainlit_example.py"&gt;in my GitHub repo&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;What You'll Need&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Python 3.11+ installed  
&lt;/li&gt;
&lt;li&gt;Ollama installed (&lt;a href="https://www.mindfiretechnology.com/blog/archive/installing-ollama-for-large-language-models-llm-in-windows/"&gt;see this post&lt;/a&gt;)  
&lt;/li&gt;
&lt;li&gt;DeepSeek R1 pulled locally:&lt;br /&gt;
  &lt;code&gt;ollama pull deepseek-r1:1.5b&lt;/code&gt;  
&lt;/li&gt;
&lt;li&gt;Chainlit:&lt;br /&gt;
  &lt;code&gt;pip install chainlit&lt;/code&gt;  
&lt;/li&gt;
&lt;li&gt;Ollama Python library:&lt;br /&gt;
  &lt;code&gt;pip install ollama&lt;/code&gt;  
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;The Complete Code&lt;/h2&gt;
&lt;p&gt;Below is the full &lt;code&gt;app.py&lt;/code&gt; file we'll walk through:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import chainlit as cl
import ollama


def convert_latex_delimiters(text):
    &amp;quot;&amp;quot;&amp;quot;Convert LaTeX delimiters from backslash-bracket to dollar signs&amp;quot;&amp;quot;&amp;quot;
    if not text:
        return text
    # Replace display math delimiters
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    # Replace inline math delimiters
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
    return text


@cl.on_message
async def on_message(message: cl.Message):
    &amp;quot;&amp;quot;&amp;quot;
    Handles incoming messages from the user, sends them to the LLM,
    and streams the response back to the Chainlit interface with thinking process.
    &amp;quot;&amp;quot;&amp;quot;
    # System prompt for the AI
    system_message = {
        &amp;quot;role&amp;quot;: &amp;quot;system&amp;quot;,
        &amp;quot;content&amp;quot;: &amp;quot;&amp;quot;&amp;quot;You are an advanced AI assistant powered by the deepseek-r1 model.

Guidelines:
- If you're uncertain about something, acknowledge it rather than making up information
- Format your responses with markdown when it improves readability
&amp;quot;&amp;quot;&amp;quot;
    }

    # Create a step for thinking and messages for streaming
    thinking_step = cl.Step(name=&amp;quot;💭 Thinking&amp;quot;, type=&amp;quot;tool&amp;quot;)
    final_answer = cl.Message(content=&amp;quot;&amp;quot;)

    accumulated_thinking = &amp;quot;&amp;quot;
    accumulated_answer = &amp;quot;&amp;quot;

    try:
        # Request completion from the model with streaming and thinking enabled
        stream = ollama.chat(
            model=&amp;quot;deepseek-r1:1.5b&amp;quot;,
            messages=[
                system_message,
                {&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: message.content}
            ],
            stream=True,
            think=True,  # This is the critical parameter for Ollama native API
        )

        thinking_started = False
        answer_started = False
        answer_buffer = &amp;quot;&amp;quot;

        # Stream the response to the UI
        for chunk in stream:
            chunk_msg = chunk.get(&amp;quot;message&amp;quot;, {})

            # Handle thinking content
            if chunk_msg.get(&amp;quot;thinking&amp;quot;):
                if not thinking_started:
                    thinking_started = True
                    await thinking_step.send()

                thinking_text = chunk_msg[&amp;quot;thinking&amp;quot;]
                accumulated_thinking += thinking_text
                thinking_step.output = convert_latex_delimiters(accumulated_thinking)
                await thinking_step.update()

            # Handle answer content
            if chunk_msg.get(&amp;quot;content&amp;quot;):
                if not answer_started:
                    answer_started = True
                    if thinking_started:
                        # Finalize the thinking step
                        await thinking_step.update()
                    await final_answer.send()

                answer_text = chunk_msg[&amp;quot;content&amp;quot;]
                accumulated_answer += answer_text
                answer_buffer += answer_text

                # Only update every 10 characters or so to avoid overwhelming the socket
                if len(answer_buffer) &amp;gt;= 10:
                    await final_answer.stream_token(answer_buffer)
                    answer_buffer = &amp;quot;&amp;quot;

        # Send any remaining buffered content
        if answer_buffer:
            await final_answer.stream_token(answer_buffer)

        # Update final answer with LaTeX conversion
        final_answer.content = convert_latex_delimiters(accumulated_answer)
        await final_answer.update()

    except Exception as e:
        error_msg = cl.Message(content=f&amp;quot;❌ Error generating response: {str(e)}&amp;quot;)
        await error_msg.send()


@cl.on_chat_start
async def start():
    &amp;quot;&amp;quot;&amp;quot;
    Sends a welcome message when the chat starts.
    &amp;quot;&amp;quot;&amp;quot;
    await cl.Message(
        content=&amp;quot;👋 Hello! I'm powered by **DeepSeek R1**. I'll show you my thinking process before answering.\n\n&amp;quot;
                &amp;quot;Try asking me a math problem or reasoning question!&amp;quot;
    ).send()
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Breaking Down the Code&lt;/h2&gt;
&lt;h3&gt;Chainlit Event Hooks&lt;/h3&gt;
&lt;p&gt;Chainlit uses decorators such as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;@cl.on_message
@cl.on_chat_start
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These act as event listeners.&lt;br /&gt;
- &lt;code&gt;on_chat_start&lt;/code&gt; fires once when the UI loads.&lt;br /&gt;
- &lt;code&gt;on_message&lt;/code&gt; fires every time the user sends a message.&lt;/p&gt;
&lt;h3&gt;LaTeX Conversion Helper&lt;/h3&gt;
&lt;p&gt;DeepSeek R1 frequently uses LaTeX but with delimiters that Chainlit doesn't render by default. We fix that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def convert_latex_delimiters(text):
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Same logic as the Streamlit version, just applied before displaying anything.&lt;/p&gt;
&lt;h3&gt;Thinking Step&lt;/h3&gt;
&lt;p&gt;Chainlit allows you to create &amp;quot;steps&amp;quot; that appear in the UI:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;thinking_step = cl.Step(name=&amp;quot;💭 Thinking&amp;quot;, type=&amp;quot;tool&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As the model streams reasoning content, we update this step live. The user can watch DeepSeek R1 deliberate. &lt;/p&gt;
&lt;p&gt;Unfortunately, I couldn't figure out (in time for this post) how to get the thinking to sit on top of the answer. So if you are watching it think the spot where the final answer goes scrolls off the top of the screen. I tried several ideas on how to fix this and none worked. Good job Chainlit getting your app that has only one job to not do it right! (At least not easily.) &lt;/p&gt;
&lt;h3&gt;Streaming with Ollama&lt;/h3&gt;
&lt;p&gt;The heart of the system:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;stream = ollama.chat(
    model=&amp;quot;deepseek-r1:1.5b&amp;quot;,
    messages=[...],
    stream=True,
    think=True,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;stream=True&lt;/code&gt; → tokens arrive as a generator  
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;think=True&lt;/code&gt; → reasoning is delivered separately from the final answer  
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Streaming Back to Chainlit&lt;/h3&gt;
&lt;p&gt;We listen for both thinking and final answer tokens:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if chunk_msg.get(&amp;quot;thinking&amp;quot;):
    ...
if chunk_msg.get(&amp;quot;content&amp;quot;):
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Thinking updates go to &lt;code&gt;thinking_step.update()&lt;/code&gt;, while answer tokens are streamed via:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;await final_answer.stream_token(...)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This gives a smooth, real-time chat experience.&lt;/p&gt;
&lt;h3&gt;Welcome Message&lt;/h3&gt;
&lt;p&gt;When the chat starts, we show a friendly introduction:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;await cl.Message(
    content=&amp;quot;👋 Hello! I'm powered by **DeepSeek R1**...&amp;quot;
).send()
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Error Handling&lt;/h3&gt;
&lt;p&gt;Any unexpected issues are passed back to the UI:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;await cl.Message(content=f&amp;quot;❌ Error generating response: {str(e)}&amp;quot;).send()
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Running the Application&lt;/h2&gt;
&lt;p&gt;Save the file as &lt;code&gt;app.py&lt;/code&gt; and run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;chainlit run chainlit_example.py -w
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Your browser will open automatically with the Chainlit interface. Try a math or logic problem, and you'll see DeepSeek R1 stream its thoughts step-by-step before answering.&lt;/p&gt;
&lt;p&gt;Final Result:
&lt;img src="https://www.mindfiretechnology.com/blog/media/screenshot-2025-12-04-174722.png" alt="enter image description here" /&gt;&lt;/p&gt;
&lt;h2&gt;Key Features&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Runs entirely locally—no external API calls  
&lt;/li&gt;
&lt;li&gt;Chainlit displays the reasoning process as a dedicated live-updating step  
&lt;/li&gt;
&lt;li&gt;Smooth streaming output  
&lt;/li&gt;
&lt;li&gt;Proper math rendering  
&lt;/li&gt;
&lt;li&gt;Clean and modern chat UI with minimal setup  
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Concerns&lt;/h2&gt;
&lt;p&gt;I never did get Chainlit to work as well as Streamlit. It tended to error out due to 'too many tokens' errors and the Latex support was hit and miss at best. I'll have to circle back on this and see if I can improve it.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Using Chainlit with DeepSeek R1 makes it incredibly simple to build a polished chatbot interface. With around 100 lines of code, you get live thinking visualization, Markdown rendering, history management, and a production-ready UI—all running locally via Ollama.&lt;/p&gt;
&lt;p&gt;Feel free to adapt this example and extend it for your own experiments!&lt;/p&gt;
</description>
      <pubDate>Tue, 06 Jan 2026 12:00:00 -0700</pubDate>
      <a10:updated>2026-01-06T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2685</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/building-a-local-deepseek-r1-chatbot-with-streamlit-and-ollama/</link>
      <category>System.String[]</category>
      <title>Building a Local DeepSeek R1 Chatbot with Streamlit and Ollama</title>
      <description>&lt;p&gt;With all the -- admittedly already out of date -- hype around &lt;a href="https://www.mindfiretechnology.com/blog/archive/explaining-deepseek-r1-and-how-to-use-it/"&gt;Deepseek R1&lt;/a&gt; from China, I thought it would be fun to build a simple chatbot interface for DeepSeek R1 that runs entirely on your local machine. DeepSeek R1 is a reasoning model that shows its &amp;quot;thinking process&amp;quot; before giving answers, similar to OpenAI's o1 model. We'll use &lt;a href="https://streamlit.io/"&gt;Streamlit for the interface&lt;/a&gt; and &lt;a href="https://www.mindfiretechnology.com/blog/archive/installing-ollama-for-large-language-models-llm-in-windows/"&gt;Ollama to run the model locally&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The full code available both in this post and in &lt;a href="https://github.com/brucenielson/PythonUIs/blob/f19932dfe243dc803eb0ae2de9f3b693d133164d/streamlit_example.py"&gt;my Github Repo found here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;What You'll Need&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Python 3.11+ installed on your computer&lt;/li&gt;
&lt;li&gt;Ollama installed (&lt;a href="https://www.mindfiretechnology.com/blog/archive/installing-ollama-for-large-language-models-llm-in-windows/"&gt;see this post for details&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;The DeepSeek R1 model pulled in Ollama: &lt;code&gt;ollama pull deepseek-r1:1.5b&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Streamlit: &lt;code&gt;pip install streamlit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Ollama Python library: &lt;code&gt;pip install ollama&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;The Complete Code&lt;/h2&gt;
&lt;p&gt;Here's the full code we'll be breaking down:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Run using: streamlit run streamlit_example.py
import streamlit as st
import ollama

st.set_page_config(page_title=&amp;quot;DeepSeek R1 Chat&amp;quot;, page_icon=&amp;quot;🤖&amp;quot;)


def convert_latex_delimiters(text):
    &amp;quot;&amp;quot;&amp;quot;Convert LaTeX delimiters from backslash-bracket to dollar signs&amp;quot;&amp;quot;&amp;quot;
    # Replace display math delimiters
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    # Replace inline math delimiters
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
    return text


# Initialize chat history
if &amp;quot;messages&amp;quot; not in st.session_state:
    st.session_state.messages = []

# Display chat history
for msg in st.session_state.messages:
    with st.chat_message(msg[&amp;quot;role&amp;quot;]):
        if msg[&amp;quot;role&amp;quot;] == &amp;quot;assistant&amp;quot; and msg.get(&amp;quot;thinking&amp;quot;):
            with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=False):
                st.markdown(convert_latex_delimiters(msg[&amp;quot;thinking&amp;quot;]))
        st.markdown(convert_latex_delimiters(msg[&amp;quot;content&amp;quot;]))

# Chat input
user_input = st.chat_input(&amp;quot;Ask DeepSeek R1&amp;quot;)

if user_input:
    # Add and display user message
    st.session_state.messages.append({&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: user_input})
    with st.chat_message(&amp;quot;user&amp;quot;):
        st.markdown(user_input)

    # Generate and display assistant response
    with st.chat_message(&amp;quot;assistant&amp;quot;):
        thinking_display = st.empty()
        answer_display = st.empty()
        accumulated_thinking = &amp;quot;&amp;quot;
        accumulated_answer = &amp;quot;&amp;quot;

        with st.spinner(&amp;quot;Thinking...&amp;quot;, show_time=True):
            stream = ollama.chat(
                model=&amp;quot;deepseek-r1:1.5b&amp;quot;,
                messages=[{&amp;quot;role&amp;quot;: m[&amp;quot;role&amp;quot;], &amp;quot;content&amp;quot;: m[&amp;quot;content&amp;quot;]}
                          for m in st.session_state.messages],
                stream=True,
                think=True,
            )

            for chunk in stream:
                chunk_msg = chunk.get(&amp;quot;message&amp;quot;, {})

                # Stream thinking content
                if chunk_msg.get(&amp;quot;thinking&amp;quot;):
                    accumulated_thinking += chunk_msg[&amp;quot;thinking&amp;quot;]
                    with thinking_display.container():
                        with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=True):
                            st.markdown(convert_latex_delimiters(accumulated_thinking) + &amp;quot;▌&amp;quot;)

                # Stream answer content
                if chunk_msg.get(&amp;quot;content&amp;quot;):
                    accumulated_answer += chunk_msg[&amp;quot;content&amp;quot;]
                    answer_display.markdown(convert_latex_delimiters(accumulated_answer) + &amp;quot;▌&amp;quot;)

        # Final display without cursor
        if accumulated_thinking:
            with thinking_display.container():
                with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=False):
                    st.markdown(convert_latex_delimiters(accumulated_thinking))
        answer_display.markdown(convert_latex_delimiters(accumulated_answer))

        # Save to chat history
        st.session_state.messages.append({
            &amp;quot;role&amp;quot;: &amp;quot;assistant&amp;quot;,
            &amp;quot;content&amp;quot;: accumulated_answer,
            &amp;quot;thinking&amp;quot;: accumulated_thinking or None
        })
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Breaking Down the Code&lt;/h2&gt;
&lt;h3&gt;Page Configuration&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;st.set_page_config(page_title=&amp;quot;DeepSeek R1 Chat&amp;quot;, page_icon=&amp;quot;🤖&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This sets up our Streamlit page with a custom title and icon that appears in the browser tab.&lt;/p&gt;
&lt;h3&gt;LaTeX Conversion Function&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;def convert_latex_delimiters(text):
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
    return text
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;DeepSeek R1 often outputs mathematical notation using LaTeX syntax. However, it uses &lt;code&gt;\[&lt;/code&gt; and &lt;code&gt;\]&lt;/code&gt; for display math and &lt;code&gt;\(&lt;/code&gt; and &lt;code&gt;\)&lt;/code&gt; for inline math. Streamlit's markdown renderer expects &lt;code&gt;$$&lt;/code&gt; and &lt;code&gt;$&lt;/code&gt; instead, so this function converts between the two formats.&lt;/p&gt;
&lt;h3&gt;Initializing Chat History&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;if &amp;quot;messages&amp;quot; not in st.session_state:
    st.session_state.messages = []
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Streamlit reruns your entire script on every interaction. To preserve chat history between reruns, we store messages in &lt;code&gt;st.session_state&lt;/code&gt;, which persists across reruns.&lt;/p&gt;
&lt;h3&gt;Displaying Chat History&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;for msg in st.session_state.messages:
    with st.chat_message(msg[&amp;quot;role&amp;quot;]):
        if msg[&amp;quot;role&amp;quot;] == &amp;quot;assistant&amp;quot; and msg.get(&amp;quot;thinking&amp;quot;):
            with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=False):
                st.markdown(convert_latex_delimiters(msg[&amp;quot;thinking&amp;quot;]))
        st.markdown(convert_latex_delimiters(msg[&amp;quot;content&amp;quot;]))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This loop displays all previous messages. For assistant messages, if there's thinking content, we show it in a collapsible expander. This keeps the interface clean while still making the reasoning process available.&lt;/p&gt;
&lt;h3&gt;Handling User Input&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;user_input = st.chat_input(&amp;quot;Ask DeepSeek R1&amp;quot;)

if user_input:
    st.session_state.messages.append({&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: user_input})
    with st.chat_message(&amp;quot;user&amp;quot;):
        st.markdown(user_input)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When a user types a message, we add it to our message history and display it immediately in the chat interface.&lt;/p&gt;
&lt;h3&gt;Streaming the Response&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;with st.spinner(&amp;quot;Thinking...&amp;quot;, show_time=True):
    stream = ollama.chat(
        model=&amp;quot;deepseek-r1:1.5b&amp;quot;,
        messages=[{&amp;quot;role&amp;quot;: m[&amp;quot;role&amp;quot;], &amp;quot;content&amp;quot;: m[&amp;quot;content&amp;quot;]}
                  for m in st.session_state.messages],
        stream=True,
        think=True,
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The critical part here is &lt;code&gt;think=True&lt;/code&gt;. This tells Ollama to return DeepSeek R1's reasoning process separately from its final answer. The &lt;code&gt;stream=True&lt;/code&gt; parameter makes the response appear word-by-word rather than all at once.&lt;/p&gt;
&lt;h3&gt;Processing the Stream&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;for chunk in stream:
    chunk_msg = chunk.get(&amp;quot;message&amp;quot;, {})

    # Stream thinking content
    if chunk_msg.get(&amp;quot;thinking&amp;quot;):
        accumulated_thinking += chunk_msg[&amp;quot;thinking&amp;quot;]
        with thinking_display.container():
            with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=True):
                st.markdown(convert_latex_delimiters(accumulated_thinking) + &amp;quot;▌&amp;quot;)

    # Stream answer content
    if chunk_msg.get(&amp;quot;content&amp;quot;):
        accumulated_answer += chunk_msg[&amp;quot;content&amp;quot;]
        answer_display.markdown(convert_latex_delimiters(accumulated_answer) + &amp;quot;▌&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As chunks arrive from Ollama, they contain either &amp;quot;thinking&amp;quot; or &amp;quot;content&amp;quot; fields. We accumulate both separately and display them in real-time. The &amp;quot;▌&amp;quot; character creates a blinking cursor effect showing that content is still streaming.&lt;/p&gt;
&lt;h3&gt;Finalizing the Display&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;if accumulated_thinking:
    with thinking_display.container():
        with st.expander(&amp;quot;🧠 View Thinking Process&amp;quot;, expanded=False):
            st.markdown(convert_latex_delimiters(accumulated_thinking))
answer_display.markdown(convert_latex_delimiters(accumulated_answer))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Once streaming completes, we remove the cursor and collapse the thinking expander by default. This gives users a clean final view with the option to expand and see the reasoning.&lt;/p&gt;
&lt;h3&gt;Saving to History&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;st.session_state.messages.append({
    &amp;quot;role&amp;quot;: &amp;quot;assistant&amp;quot;,
    &amp;quot;content&amp;quot;: accumulated_answer,
    &amp;quot;thinking&amp;quot;: accumulated_thinking or None
})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Finally, we save the complete response to our message history so it persists when the page reruns.&lt;/p&gt;
&lt;h2&gt;Running the Application&lt;/h2&gt;
&lt;p&gt;Save the code as &lt;code&gt;streamlit_example.py&lt;/code&gt; and run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;streamlit run streamlit_example.py
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Your browser will open with the chatbot interface. Try asking it a math question like &amp;quot;What is 15 times 23?&amp;quot; and you'll see it show its reasoning process before giving the final answer.&lt;/p&gt;
&lt;p&gt;Here was my result:
&lt;img src="https://www.mindfiretechnology.com/blog/media/screenshot-2025-12-04-141355.png" alt="Deepseek R1 Running in Streamlit" /&gt;&lt;/p&gt;
&lt;p&gt;One annoying aspect of Deepseek R1 is that I've seen it got stuck in a 'thinking loop'. I once asked it to give me the first 12 digits of PI and it couldn't decide if that included the '3.' or not. It kept going back and forth trying to decide for 10 minutes before I gave up. (Rather than just giving me both answers.) &lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/screenshot-2025-12-04-132204.png" alt="enter image description here" /&gt;&lt;/p&gt;
&lt;h2&gt;Key Features&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Runs completely locally with no API calls&lt;/li&gt;
&lt;li&gt;Shows the model's reasoning process in expandable sections&lt;/li&gt;
&lt;li&gt;Streams responses in real-time for a better user experience&lt;/li&gt;
&lt;li&gt;Properly renders mathematical notation&lt;/li&gt;
&lt;li&gt;Maintains conversation history&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;With just 90 lines of Python, we've created a functional chatbot interface for DeepSeek R1 that showcases one of its most interesting features: transparent reasoning. The combination of Streamlit's simple API and Ollama's local model hosting makes it easy to experiment with AI models without worrying about API costs or privacy concerns.&lt;/p&gt;
&lt;p&gt;The complete source code is available above. Feel free to modify and extend it for your own projects!&lt;/p&gt;
</description>
      <pubDate>Tue, 30 Dec 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-12-30T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2683</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/how-dspy-optimizes-prompts/</link>
      <category>System.String[]</category>
      <title>How DSPy Optimizes Prompts</title>
      <description>&lt;p&gt;In our previous posts (&lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-prompt-your-llm-like-its-code/"&gt;here&lt;/a&gt;, &lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-a-powerful-but-sometimes-dangerous-prompting-tool/"&gt;here&lt;/a&gt;, and &lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-how-it-works/"&gt;here&lt;/a&gt;), we looked at &lt;a href="https://dspy.ai/"&gt;DSPy&lt;/a&gt; and how it turns Large Language Models (LLMs) into structured, type-safe Python functions. You define a class with input and output fields, and DSPy builds the prompt for you automatically.  
&lt;/p&gt;
&lt;p&gt;But here’s the thing — sometimes your initial prompt isn’t perfect. The model might get the instructions a little wrong, or your few-shot examples might not be ideal. That’s where DSPy optimizers like &lt;strong&gt;MIPROv2&lt;/strong&gt; comes in. It’s a tool in DSPy that automatically tweaks your prompts and few-shot examples to make the model behave better. You can &lt;a href="https://dspy.ai/api/optimizers/MIPROv2/"&gt;read more about it on the here&lt;/a&gt;.  
&lt;/p&gt;
&lt;p&gt;In this post, we’ll show you how DSPy can &lt;strong&gt;optimize prompts&lt;/strong&gt; using MIPROv2. We’ll go from an instruction that gives poor results to one that gives consistently correct results — and we’ll show you exactly what changes along the way. You’ll see how the model’s instructions and examples evolve, and why this makes your LLM programs more reliable.&lt;/p&gt;
&lt;p&gt;You can find &lt;a href="https://github.com/brucenielson/DSPyTutorials/blob/403b2f4a11b0c03aec5b684dcd60984887f8b737/dspy_optimizing.py"&gt;the full code for this blog post on my github repo here.&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;A Poorly Defined Prompt&lt;/h2&gt;
&lt;p&gt;The fact is that DSPy is so good at building prompts that I actually struggled to come up with a good simple example of how it can automatically improve prompts for you. It often got 100% accuracy on sentiment analysis right out of the box. So to give it a real challenge I rewrote my 'Classify' function like this. Originally we had:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Classify(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Classify sentiment of a given sentence.&amp;quot;&amp;quot;&amp;quot;
    sentence: str = dspy.InputField()
    sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;] = dspy.OutputField()
    confidence: float = dspy.OutputField()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And now we have the much vaguer: &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class AnalyzeText(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Process the input.&amp;quot;&amp;quot;&amp;quot;  # ← Useless instruction
    text: str = dspy.InputField(desc=&amp;quot;input data&amp;quot;)
    label: Literal[&amp;quot;Bingo!&amp;quot;, &amp;quot;Hmmm...&amp;quot;, &amp;quot;Oink!&amp;quot;] = dspy.OutputField(desc=&amp;quot;output&amp;quot;)  # Shuffled order!
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Instead of being called &amp;quot;Classify&amp;quot; (which is a dead give away we're doing sentiment analysis) we now call it &amp;quot;AnalyzeText&amp;quot;. And &amp;quot;sentence&amp;quot; and &amp;quot;sentiment&amp;quot; are now &amp;quot;text&amp;quot; and &amp;quot;label&amp;quot;. Plus the labels are 'Bingo!', 'Oink!' and 'Hmmm...' &lt;/p&gt;
&lt;p&gt;The first time I tried this, Gemini figured out on its own that 'Bingo!' was positive, 'Oink!' was negative, and 'Hmmm...' was neutral. So I had to swap orders around. And now 'Bingo!' is neutral, 'Oink!' is positive, and 'Hmmm...' is negative. So now there is no way Gemini can figure out what the correct labels are!&lt;/p&gt;
&lt;p&gt;And sure enough, it does a terrible job labeling the data out of the box:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;============================================================
BEFORE OPTIMIZATION
============================================================

📋 Initial Instruction:
&amp;quot;Process the input.&amp;quot;

🎭 INVERTED Nonsense Categories (counterintuitive!):
   'Oink!'   = positive sentiment (opposite of what you'd expect!)
   'Bingo!'  = neutral/mixed sentiment (not positive!)
   'Hmmm...' = negative sentiment (not thoughtful!)

The model will naturally guess WRONG without training!

🧪 Testing on dev set (without optimization):
  ✗ 'The first half was amazing but then it fell a...' → Hmmm...  (expected: Bingo!)
  ✗ 'Not terrible but nothing special....' → Hmmm...  (expected: Bingo!)
  ✗ 'An absolute masterpiece in every way!...' → Bingo!   (expected: Oink!)
  ✓ 'I wanted to like it but it was just awful....' → Hmmm...  (expected: Hmmm...)
  ✗ 'Has its moments but overall just average....' → Hmmm...  (expected: Bingo!)

📊 Initial accuracy: 1/5 (20%)
    ↑ Should be near random chance (~33%) with no instruction!
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I'm sort of surprised it even got 20% right. &lt;/p&gt;
&lt;h2&gt;Running the Optimizer&lt;/h2&gt;
&lt;p&gt;Now that we’ve seen how DSPy builds prompts automatically, let’s see how it can improve them using the MIPROv2 optimizer.&lt;/p&gt;
&lt;p&gt;We’ll take our deliberately confusing example above to make the optimizer necessary. We also define a small &lt;strong&gt;training set&lt;/strong&gt; and &lt;strong&gt;validation set&lt;/strong&gt; with tricky examples to test the model. The metric simply checks whether the model predicts the expected label:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def metric(example, pred, trace=None):
    return example.label == pred.label
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As we saw, this initially only gets 20% correct. The full initial prompt can also be inspected using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;lm.inspect_history(n=1)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This shows exactly how DSPy was instructing the LLM before optimization. Here is the prompt that is built initially:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;System message:

Your input fields are:
1. `text` (str): input data
Your output fields are:
1. `label` (Literal['Bingo!', 'Hmmm...', 'Oink!']): output
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## text ## ]]
{text}

[[ ## label ## ]]
{label}        # note: the value you produce must exactly match (no extra characters) one of: Bingo!; Hmmm...; Oink!

[[ ## completed ## ]]
In adhering to this structure, your objective is:
        Process the input.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let's run the MIPROv2 optimizer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;optimizer = MIPROv2(metric=metric, auto=&amp;quot;light&amp;quot;)
optimized = optimizer.compile(
    student=program,
    trainset=trainset,
    valset=devset,
)

...

if hasattr(optimized, 'predictor'):
    optimized_instruction = optimized.predictor.signature.__doc__
    print(f'&amp;quot;{optimized_instruction}&amp;quot;')

    original_instruction = program.predictor.signature.__doc__
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After the MIPROv2 optimizer runs we get much better results:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;============================================================
AFTER OPTIMIZATION
============================================================

📋 Optimized Instruction:
&amp;quot;You are a sentiment analysis model. Classify the given text into one of the following categories: &amp;quot;Oink!&amp;quot; for strongly positive, &amp;quot;Hmmm...&amp;quot; for negative, and &amp;quot;Bingo!&amp;quot; for mixed or neutral sentiments.&amp;quot;

✨ INSTRUCTION CHANGED! ✨

  Before: 'Process the input.'

  After:  'You are a sentiment analysis model. Classify the given text into one of the following categories: &amp;quot;Oink!&amp;quot; for strongly positive, &amp;quot;Hmmm...&amp;quot; for negative, and &amp;quot;Bingo!&amp;quot; for mixed or neutral sentiments.'

  The optimizer learned the nonsense mapping!

🧪 Testing optimized version on dev set:
  ✓ 'The first half was amazing but then it fell a...' → Bingo!   (expected: Bingo!)
  ✓ 'Not terrible but nothing special....' → Bingo!   (expected: Bingo!)
  ✓ 'An absolute masterpiece in every way!...' → Oink!    (expected: Oink!)
  ✓ 'I wanted to like it but it was just awful....' → Hmmm...  (expected: Hmmm...)
  ✓ 'Has its moments but overall just average....' → Bingo!   (expected: Bingo!)

📊 Optimized accuracy: 5/5 (100%)
    ↑ Should be much better now!
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Wow! 100% success now despite the misleading labels I'm expecting! Okay, how does it do it? What does the final prompt now look like?&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;System message:

Your input fields are:
1. `text` (str): input data
Your output fields are:
1. `label` (Literal['Bingo!', 'Hmmm...', 'Oink!']): output
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## text ## ]]
{text}

[[ ## label ## ]]
{label}        # note: the value you produce must exactly match (no extra characters) one of: Bingo!; Hmmm...; Oink!

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        You are a sentiment analysis model. Classify the given text into one of the following categories: &amp;quot;Oink!&amp;quot; for strongly positive, &amp;quot;Hmmm...&amp;quot; for negative, and &amp;quot;Bingo!&amp;quot; for mixed or neutral sentiments.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Whoa! It changed the instructions from &amp;quot;Process the input.&amp;quot; (My default) to &amp;quot;You are a sentiment analysis model. Classify the given text into one of the following categories: 'Oink!' for strongly positive, 'Hmmm...' for negative, and 'Bingo!' for mixed or neutral sentiments.&amp;quot;&lt;/p&gt;
&lt;p&gt;No wonder it's now getting 100% correct! &lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;What this example shows is the real power of DSPy’s optimizer workflow. With just a few lines of Python, we went from a deliberately confusing instruction that produced almost random results to a fully optimized prompt that got 100% accuracy on our dev set.&lt;/p&gt;
&lt;p&gt;The key takeaway is that DSPy doesn’t just wrap an LLM in a function — it gives you control over how the model interprets your instructions and examples, and it can automatically improve them. That means:&lt;/p&gt;
&lt;p&gt;You can use arbitrary input/output field names and even nonsense labels, and the optimizer will learn the correct mapping.&lt;/p&gt;
&lt;p&gt;You can systematically improve your prompts without manually guessing how to rewrite them using empirical evidence from a test set.&lt;/p&gt;
&lt;p&gt;The process is reproducible and type-safe, so you can confidently switch models or update instructions without breaking your code. If you do switch models, just re-run your optimizer and it will build prompts appropriate for the new model!&lt;/p&gt;
&lt;p&gt;In short, DSPy + MIPROv2 turns what used to be trial-and-error prompt engineering into a structured, programmatic, and testable process. For anyone building reliable LLM-powered applications, this is a huge productivity and quality-of-results win.&lt;/p&gt;
</description>
      <pubDate>Tue, 23 Dec 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-12-23T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2682</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/dspy-how-it-works/</link>
      <category>System.String[]</category>
      <title>How DSPy Builds Prompts</title>
      <description>&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-prompt-your-llm-like-its-code/"&gt;In previous posts&lt;/a&gt;, we talked about what &lt;a href="https://dspy.ai/"&gt;DSPy&lt;/a&gt;, a tool you can use to treat Large Language Model (LLM) interactions as functions written in code rather than via prompting. &lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-a-powerful-but-sometimes-dangerous-prompting-tool/"&gt;In a follow-on post&lt;/a&gt;, we saw examples of how to use it out of the box. &lt;/p&gt;
&lt;p&gt;Okay, that is all nice and good, but how exactly does it work? I mean ultimately you interact with an LLM via prompts, right? So what is all this code doing?&lt;/p&gt;
&lt;h2&gt;How Classify Works&lt;/h2&gt;
&lt;p&gt;In this post, we're going to answer that question. You can &lt;a href="https://github.com/brucenielson/DSPyTutorials/blob/994f1720dbd775ba91bbf26aa36eb1833ba0bb20/dspy_prompt_building.py"&gt;find my code for this post in my github repo&lt;/a&gt;. Let's start with the same Classify signature as from our previous post:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Classify(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Classify sentiment of a given sentence.&amp;quot;&amp;quot;&amp;quot;
    sentence: str = dspy.InputField()
    sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;] = dspy.OutputField()
    confidence: float = dspy.OutputField()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We then turn this into a real Classify function:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;model = setup_model()
classify = dspy.Predict(Classify)

result = classify(
    sentence=&amp;quot;This book was super fun to read, though not the last chapter.&amp;quot;
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Recall that the LLM produces a structured output like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Prediction(
    sentiment='neutral',
    confidence=0.8
)
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;How it Works&lt;/h2&gt;
&lt;p&gt;But how does it do it? Obviously it is building a prompt behind the scenes. Can we see what it is doing? Yes, by inspecting the history:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;print(&amp;quot;\n--- Prompt Built by DSPy ---&amp;quot;)
print(model.inspect_history(n=1))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;.inspect_history() is a method that keeps a record of every request DSPy has sent to the model in order.&lt;/li&gt;
&lt;li&gt;The parameter n=1 means “show me the last 1 interaction.”&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here is the (somewhat long) result:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Classification Result:
Prediction(
    sentiment='neutral',
    confidence=0.8
)

--- Prompt Built by DSPy ---

[2025-11-20T10:39:12.956671]

System message:

Your input fields are:
1. `sentence` (str):
Your output fields are:
1. `sentiment` (Literal['positive', 'negative', 'neutral']): 
2. `confidence` (float):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## sentence ## ]]
{sentence}

[[ ## sentiment ## ]]
{sentiment}        # note: the value you produce must exactly match (no extra characters) one of: positive; negative; neutral

[[ ## confidence ## ]]
{confidence}        # note: the value you produce must be a single float value

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Classify sentiment of a given sentence.

User message:

[[ ## sentence ## ]]
This book was super fun to read, though not the last chapter.

Respond with the corresponding output fields, starting with the field `[[ ## sentiment ## ]]` (must be formatted as a valid Python Literal['positive', 'negative', 'neutral']), then `[[ ## confidence ## ]]` (must be formatted as a valid Python float), and then ending with the marker for `[[ ## completed ## ]]`.

Response:

[[ ## sentiment ## ]]
neutral

[[ ## confidence ## ]]
0.8

[[ ## completed ## ]]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What we see here illustrates how DSPy builds structured prompts:&lt;/p&gt;
&lt;p&gt;System message: DSPy tells the model exactly what the input and output fields are, and how the response should be formatted.&lt;/p&gt;
&lt;p&gt;Field markers ([[ ## sentence ## ]], etc.): These placeholders map your Python class fields to text in the prompt. The LLM fills them in with actual values (sentence input, sentiment and confidence output).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[[ ## sentence ## ]]
{sentence}

[[ ## sentiment ## ]]
{sentiment}        # note: the value you produce must exactly match (no extra characters) one of: positive; negative; neutral
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Docstring usage: The docstring from Classify (“Classify sentiment of a given sentence.”) becomes part of the instructions, guiding the model on the task.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Classify sentiment of a given sentence.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This structure ensures the LLM returns a predictable, typed response — which is exactly why we get Prediction(sentiment='neutral', confidence=0.8) instead of an unstructured string.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this post, we saw how DSPy automatically converts a Python class into a fully structured prompt for Gemini. By inspecting the history, we can see exactly how the class name, fields, and docstring are used to generate instructions that guide the model’s response.&lt;/p&gt;
&lt;p&gt;This approach gives you predictable, type-safe outputs without manually writing prompts. You can experiment by tweaking input/output fields or the docstring and immediately see how the prompt — and the model’s behavior — changes.&lt;/p&gt;
&lt;p&gt;Ultimately, DSPy lets you treat LLMs like functions in your code, while still giving full transparency into the underlying prompts. It’s a powerful way to combine the flexibility of LLMs with the reliability of typed, structured programming.&lt;/p&gt;
</description>
      <pubDate>Tue, 16 Dec 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-12-16T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2658</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/dspy-a-powerful-but-sometimes-dangerous-prompting-tool/</link>
      <category>System.String[]</category>
      <title>DSPy: A Powerful (But Sometimes Dangerous) Prompting Tool</title>
      <description>&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/blog/archive/dspy-prompt-your-llm-like-its-code/"&gt;In our last post&lt;/a&gt;, we introduced &lt;a href="https://dspy.ai/"&gt;DSPy&lt;/a&gt;, a tool to treat prompt building for your Large Language Model (LLM) like it is a Python function. This time we're going to use another modified example off &lt;a href="https://dspy.ai/"&gt;their 'getting started' page&lt;/a&gt; and then play with it a little and get a feel for how DSPy works.&lt;/p&gt;
&lt;h2&gt;Writing a &amp;quot;Classify&amp;quot; Function&lt;/h2&gt;
&lt;p&gt;Let's use DSPy to write a sentiment classification function. Here is the code (&lt;a href="https://github.com/brucenielson/DSPyTutorials/tree/ac7077d57071135965a1b88710405832ed07ef90"&gt;found in this github repo&lt;/a&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Classify(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Classify sentiment of a given sentence.&amp;quot;&amp;quot;&amp;quot;
    sentence: str = dspy.InputField()
    sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;] = dspy.OutputField()
    confidence: float = dspy.OutputField()
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Explanation of the &lt;code&gt;Classify&lt;/code&gt; function, line by line:&lt;/h3&gt;
&lt;p&gt;Let's analyze what is going on here. &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;class Classify(dspy.Signature):&lt;/code&gt;&lt;br /&gt;
  This defines a new DSPy &lt;strong&gt;Signature&lt;/strong&gt; class named &lt;code&gt;Classify&lt;/code&gt;. By subclassing &lt;code&gt;dspy.Signature&lt;/code&gt;, you declare that this class specifies the input/output contract for a DSPy module. (&lt;a href="https://dspy.ai/learn/programming/signatures/"&gt;DSPy signatures docs&lt;/a&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;quot;&amp;quot;&amp;quot;Classify sentiment of a given sentence.&amp;quot;&amp;quot;&amp;quot;&lt;/code&gt;&lt;br /&gt;
  A docstring that gives a human-readable description of the task: sentiment classification of a sentence.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sentence: str = dspy.InputField()&lt;/code&gt;&lt;br /&gt;
  Declares an input field named &lt;code&gt;sentence&lt;/code&gt; of type &lt;code&gt;str&lt;/code&gt;. This tells the module it will receive a sentence as input.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;] = dspy.OutputField()&lt;/code&gt;&lt;br /&gt;
  Declares an output field called &lt;code&gt;sentiment&lt;/code&gt; whose type is a &lt;code&gt;Literal&lt;/code&gt; limited to the values &lt;code&gt;&amp;quot;positive&amp;quot;&lt;/code&gt;, &lt;code&gt;&amp;quot;negative&amp;quot;&lt;/code&gt;, or &lt;code&gt;&amp;quot;neutral&amp;quot;&lt;/code&gt;. This instructs DSPy to parse and return one of those labels.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;confidence: float = dspy.OutputField()&lt;/code&gt;&lt;br /&gt;
  Declares another output field named &lt;code&gt;confidence&lt;/code&gt; of type &lt;code&gt;float&lt;/code&gt;. This tells DSPy to return a numeric confidence score alongside the label.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;What this signature does overall:&lt;/strong&gt;&lt;br /&gt;
Together, the fields form a function-like contract: given a &lt;code&gt;sentence&lt;/code&gt; string, the module will return a &lt;code&gt;sentiment&lt;/code&gt; label and a &lt;code&gt;confidence&lt;/code&gt; float. DSPy uses this signature to (1) build the underlying prompt, (2) call the LLM, and (3) parse &amp;amp; cast the model output into the declared typed fields — returning a structured result instead of a raw string.&lt;/p&gt;
&lt;h3&gt;Running the Code&lt;/h3&gt;
&lt;p&gt;Now we'll use this function:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if __name__ == &amp;quot;__main__&amp;quot;:
    set_model()
    dspy.configure()

    classify = dspy.Predict(Classify)
    result = classify(sentence=&amp;quot;This book was super fun to read, though not the last chapter.&amp;quot;)
    print(&amp;quot;\nClassification Example:&amp;quot;)
    print(result)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We first use the DSPy Predict module and pass in our Classify class to it and we get back a new classify function:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;classify = dspy.Predict(Classify)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now we can use the classify method by calling it and passing in a &lt;code&gt;sentence&lt;/code&gt; parameter, just like we specified in our class.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;result = classify(sentence=&amp;quot;This book was super fun to read, though not the last chapter.&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The sentence we're trying to classify is &amp;quot;This book was super fun to read, though not the last chapter.&amp;quot; This is positive for the first half and negative for the second half. So, unsurprisingly, when I run this code I get back a result like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Classification Example:
Prediction(
    sentiment='neutral',
    confidence=0.8
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Gemini is 80% confident that this is a neutral sentence. &lt;/p&gt;
&lt;p&gt;However, let's play around with this a bit and we'll see the dangers of not fully controlling your prompts. Let's intentionally try to screw things up a bit by redoing &lt;code&gt;Classify&lt;/code&gt; like this:&lt;/p&gt;
&lt;p&gt;class Classify(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Classify sentiment of a given sentence.&amp;quot;&amp;quot;&amp;quot;
    sentence: str = dspy.InputField()
    sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;, &amp;quot;boo!&amp;quot;] = dspy.OutputField()
    confidence: float = dspy.OutputField()&lt;/p&gt;
&lt;p&gt;I added &amp;quot;boo!&amp;quot; as a possible sentiment, which makes no real sense. Now run the code again and we get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Classification Example:
Prediction(
    sentiment='positive',
    confidence=0.75
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;By adding &amp;quot;boo!&amp;quot; as an option, we went from 80% confident this was a neutral statement to 75% confident it is positive. Why? I have no idea and I doubt the LLM does either.&lt;/p&gt;
&lt;h2&gt;How Prompts Are Built&lt;/h2&gt;
&lt;p&gt;You might, at this point, be wondering how this function is turned into a prompt that is sent to Gemini. The answer is that it uses the name of the class, the class properties, and even the docstring to come up with a prompt. To prove this, let's change the &lt;code&gt;Classify&lt;/code&gt; method to instead be:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Classify(dspy.Signature):
    &amp;quot;&amp;quot;&amp;quot;Return &amp;quot;boo!&amp;quot; as sentiment every time&amp;quot;&amp;quot;&amp;quot;
    sentence: str = dspy.InputField()
    sentiment: Literal[&amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;neutral&amp;quot;, &amp;quot;boo!&amp;quot;] = dspy.OutputField()
    confidence: float = dspy.OutputField()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notice that all I did was change the docstring to tell it to return &amp;quot;boo!&amp;quot; every time. And we now get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Classification Example:
Prediction(
    sentiment='boo!',
    confidence=1.0
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Nice, eh? The docstring isn't just a comment any more, it's part of how you code the function! &lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Hopefully this short demo will help you understand how DSPy turns its classes/functions into prompts. There is clearly a lot of power here, but also some danger. The best reason to do this is if you plan to use DSPy's optimizers to let your software come up with the testably best prompts. Imagine how you might simply unplug one model, plug in another, then rerun the optimizer. You could easily move from one LLM to another using the same code base!&lt;/p&gt;
</description>
      <pubDate>Tue, 09 Dec 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-12-09T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2657</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/dspy-prompt-your-llm-like-its-code/</link>
      <category>System.String[]</category>
      <title>DSPy: Prompt your LLM Like It's Code</title>
      <description>&lt;p&gt;&lt;a href="https://dspy.ai/"&gt;DSPy (pronounced &amp;quot;dee-s-pie&amp;quot;) is an open-source Python framework&lt;/a&gt; that lets you exchange prompt building for code. Sounds impossible, right? What it really does is cleverly turn Python code into prompts behind the scenes. You get to specify what you want using code. For example, you might define a &lt;code&gt;Classify&lt;/code&gt; function with specific inputs and expected outputs. Behind the scenes, DSPy converts that into a prompt, sends it to your Large Language Model (LLM), and returns a result formatted according to your specifications. The end result feels like you wrote code instead of building natural language prompts.&lt;/p&gt;
&lt;h2&gt;Why DSPy Instead of Prompt Building?&lt;/h2&gt;
&lt;p&gt;Why might you want to do this? Well, first of all, maybe you don’t. I often like to have full control over everything, so when I don’t know what’s being sent to the LLM, I’m like a nervous public speaker in front of a large auditorium—without being told what my topic is.&lt;/p&gt;
&lt;p&gt;However, think about this differently. Using DSPy forces you to treat your interactions with the LLM as if they were functions with defined inputs and outputs. It formalizes these interactions into Python functions that are independent of the underlying LLM. That alone makes it valuable—but the real power comes when you use DSPy to optimize those interactions. &lt;/p&gt;
&lt;p&gt;Prompt optimization in DSPy works by running your program on a set of developer-provided training examples and collecting traces of input/output behavior. It then systematically &lt;strong&gt;proposes and evaluates new prompt instructions&lt;/strong&gt; (and optionally fine-tunes the model itself), as well as few-shot examples, to maximize your chosen metric. Yes, you heard that right! Instead of writing each prompt and relying on your gut (or ad hoc testing), DSPy will take training examples and try out different prompts until it finds the ones that perform best.&lt;/p&gt;
&lt;p&gt;Even if you don’t use DSPy’s optimizers, there’s still value in the framework thanks to its pre-built modules. In this post, I’ll show you how to build a basic chain-of-thought model &lt;strong&gt;without having to write your own chain-of-thought prompts&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;Getting Started with DSPy&lt;/h2&gt;
&lt;p&gt;First install DSPy:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install -U dspy
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now here is how to use the DSPy Chain of Thought module. &amp;quot;Chain of thought&amp;quot; is a prompting technique for an LLM that tells it to think through the problem step-by-step and reason when giving an answer. It was invented by Google in a famous 2022 paper called &amp;quot;&lt;a href="https://arxiv.org/abs/2201.11903"&gt;Chain-of-Thought Prompting Elicits Reasoning in Large Language Models&lt;/a&gt;&amp;quot;. It allows an LLM to reason better. This example is a modified version of the &lt;a href="https://dspy.ai/"&gt;the 'getting started' tutorial from the DSPy home page&lt;/a&gt;. I have modified it to use Google's Gemini as a model, which was basically just plug and play. I also added my standard get_secret function, &lt;a href="https://github.com/brucenielson/DSPyTutorials/blob/ac7077d57071135965a1b88710405832ed07ef90/general_utils.py"&gt;as found here&lt;/a&gt;. It allows you to load a password out of a file of your choosing to avoid accidentally checking in an API key or password into your code base.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import dspy
from general_utils import get_secret

def set_model():
    # Load Gemini secret and configure LM
    gemini_secret: str = get_secret(r'D:\Documents\Secrets\gemini_secret.txt')
    lm = dspy.LM(&amp;quot;gemini/gemini-2.5-flash&amp;quot;, api_key=gemini_secret)
    dspy.configure(lm=lm)
    return lm

def chain_of_thought():
    math = dspy.ChainOfThought(&amp;quot;question -&amp;gt; answer: float&amp;quot;)
    result = math(question=&amp;quot;Two dice are tossed. What is the probability that the sum equals two?&amp;quot;)
    print(result)

if __name__ == &amp;quot;__main__&amp;quot;:
    set_model()

    print(&amp;quot;Chain of Thought Example:&amp;quot;)
    chain_of_thought()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can find &lt;a href="https://github.com/brucenielson/DSPyTutorials/tree/ac7077d57071135965a1b88710405832ed07ef90"&gt;my full code here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Understanding the ChainOfThought Signature&lt;/h2&gt;
&lt;p&gt;Here we're using the ChainOfThought module:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;math = dspy.ChainOfThought(&amp;quot;question -&amp;gt; answer: float&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This signature string (&lt;code&gt;&amp;quot;question -&amp;gt; answer: float&amp;quot;&lt;/code&gt;) is how DSPy describes the &lt;strong&gt;inputs&lt;/strong&gt; and &lt;strong&gt;outputs&lt;/strong&gt; of a module in a compact, human-readable way.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The part before the arrow (&lt;code&gt;question&lt;/code&gt;) names the input parameter the module expects.&lt;/li&gt;
&lt;li&gt;The part after the arrow (&lt;code&gt;answer: float&lt;/code&gt;) names the output and declares its type (&lt;code&gt;float&lt;/code&gt; in this case).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Why this matters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Prompt generation&lt;/strong&gt;: DSPy uses that signature to automatically build the underlying prompt. Instead of you writing the prompt text,
  DSPy knows you expect a &lt;code&gt;question&lt;/code&gt; and that the final &lt;code&gt;answer&lt;/code&gt; should
  be a &lt;code&gt;float&lt;/code&gt;, so it steers the model to produce a suitably formatted
  response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Parsing &amp;amp; validation&lt;/strong&gt;: After the LLM replies, DSPy parses and casts the model output into the declared type. In the example above, the returned &lt;code&gt;answer&lt;/code&gt; is converted to a Python &lt;code&gt;float&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Structured traces&lt;/strong&gt;: For chain-of-thought modules in particular, DSPy usually returns both a &lt;code&gt;reasoning&lt;/code&gt; trace (the step-by-step explanation the model produced) and the typed &lt;code&gt;answer&lt;/code&gt;. That’s why your output looked like:&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Chain of Thought Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Prediction(
        reasoning='When two dice are tossed, each die can show a number from 1 to 6.\nThe total number of possible outcomes is 6 (for the first die) * 6 (for the second die) = 36.\nTo find the sum that equals two, we need to list the combinations:\nThe only combination that results in a sum of two is (1, 1).\nThere is only 1 favorable outcome.\nThe probability is the number of favorable outcomes divided by the total number of possible outcomes.\nProbability = 1/36.\n\nTo express this as a float: 1 / 36 = 0.027777777777777776\nRounding to a reasonable number of decimal places, or just providing the direct float value.',
        answer=0.027777777777777776
)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notice that the reasoning trace is how the LLM reasoned (via the chain-of-thought technique) to come up with an answer.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;We have introduced DSPy and explained why you might want to treat your prompt building as code. We've installed it and used its out-of-the-box chain-of-thought module to show what you can do with minimal programming. Next time we'll look deeper at how DSPy works.&lt;/p&gt;
</description>
      <pubDate>Tue, 02 Dec 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-12-02T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2684</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/google-gemini-3-makes-a-huge-leap-on-the-arc-agi-benchmark/</link>
      <category>System.String[]</category>
      <title>Google Gemini 3 Makes a Huge Leap on the ARC‑AGI Benchmark</title>
      <description>&lt;p&gt;Google’s &lt;strong&gt;Gemini 3&lt;/strong&gt; has posted a standout result on the &lt;a href="https://arcprize.org/leaderboard"&gt;ARC Prize Leaderboard – ARC‑AGI‑1&lt;/a&gt;, scoring &lt;strong&gt;about 87.5%&lt;/strong&gt; in its Deep Think preview — a very strong showing on a benchmark focused on abstract reasoning. &lt;strong&gt;ARC‑AGI‑1&lt;/strong&gt; (Abstraction &amp;amp; Reasoning Corpus for AGI) is designed to test &lt;em&gt;fluid intelligence&lt;/em&gt;: each task provides a few input/output examples, and models must infer underlying rules rather than rely on memorization. The benchmark emphasizes &lt;strong&gt;skill-acquisition efficiency&lt;/strong&gt;, rewarding reasoning and generalization over brute-force performance. (&lt;a href="https://arcprize.org/media/arc-prize-2024-technical-report.pdf"&gt;ARC Prize Technical Report&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;This 87.5% score suggests Gemini 3 Deep Think is effectively reasoning with abstract rules, not just pattern matching, and positions it as a model capable of structured, AGI‑style problem solving. Beyond ARC‑AGI‑1, Google has publicly confirmed that Gemini 3 Deep Think achieves &lt;strong&gt;45.1% on ARC‑AGI‑2&lt;/strong&gt;, a more challenging follow-up benchmark that also tests reasoning and code execution skills. (&lt;a href="https://blog.google/intl/en-africa/company-news/outreach-and-initiatives/a-new-era-of-intelligence-with-gemini-3/"&gt;Google Blog&lt;/a&gt;, &lt;a href="https://venturebeat.com/ai/google-unveils-gemini-3-claiming-the-lead-in-math-science-multimodal-and"&gt;VentureBeat&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;If these results hold, they represent a meaningful step forward: Gemini 3 is not just a larger LLM but a reasoning-first model capable of abstract problem solving. High performance on ARC‑AGI‑1 signals efficient learning and generalization — core aspects of intelligence that many benchmarks don’t test — and marks a clear signal that AI systems are beginning to handle tasks previously out of reach for conventional models.&lt;/p&gt;
</description>
      <pubDate>Tue, 25 Nov 2025 09:00:00 -0700</pubDate>
      <a10:updated>2025-11-25T09:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2659</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/ai-tutorial-build-a-free-gemini-ai-chat-agent-with-n8n/</link>
      <category>System.String[]</category>
      <title>AI Tutorial: Build a Free Gemini AI Chat Agent with n8n</title>
      <description>&lt;p&gt;n8n is a great tool for building useful AI Agents with minimal coding. And it even has a free version available if you want to self-host.&lt;/p&gt;
&lt;p&gt;One of &lt;a href="https://docs.n8n.io/advanced-ai/intro-tutorial/"&gt;the first tutorials n8n offers is building an n8n chat agent&lt;/a&gt;. Let’s go through that tutorial but with a twist: we’ll connect to Google’s Gemini instead of OpenAI. This is consistent with Mindfire’s goal of finding the best low-cost AI resources for smaller clients that can’t afford a huge AI bill but want to add AI to their applications. Plus, my version of the tutorial will walk you through step-by-step visually to make it as easy as possible.&lt;/p&gt;
&lt;p&gt;If necessary, &lt;a href="https://www.mindfiretechnology.com/blog/archive/ai-tutorial-installing-n8n-self-hosted-community-edition/"&gt;install n8n as discussed in the previous post&lt;/a&gt;. That post got you running a free working version of n8n running locally. Once that is done, you’re ready to create your first chat agent.&lt;/p&gt;
&lt;h2&gt;Creating a New Workflow&lt;/h2&gt;
&lt;p&gt;After running n8n locally (for me that is http://localhost:5678/) and registering or signing in (as covered in the previous post) you’ll see the overview screen.&lt;/p&gt;
&lt;p&gt;Click on “Create Workflow”:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture1.jpg" alt="Image 1. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Adding a Chat Trigger&lt;/p&gt;
&lt;p&gt;You’ll be taken to the workflow building screen where you’ll click on “Add first step…”:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture2.jpg" alt="Image 2. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Next you should search for ‘chat trigger’ and select it:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture3.jpg" alt="Image 3. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;From here you just want to go “Back to canvas”:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture4.jpg" alt="Image 4. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;h2&gt;Adding an AI Agent&lt;/h2&gt;
&lt;p&gt;You’ll go back to the workflow screen where you will want to add another node:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture5.jpg" alt="Image 5. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Now search for “AI Agent”:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture6.jpg" alt="Image 6. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;This will take you to the Edit AI Agent View. We need to add a chat model to the chat agent, so select the “+” under the ‘chat model’:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture7.jpg" alt="Image 7. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;This will automatically take you back to the search screen but filtered on language models (notice the yellow box). You will search for the model of your choice. For this free demo, we’ll use Google Gemini so that it doesn’t cost anything:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture8.jpg" alt="Image 8. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;The end result should look something like this the below. Notice that for me it defaulted to the right Gemini credentials because I’ve set them up before. It’s smart enough to reuse them. Also note that control to edit the credentials:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture9.jpg" alt="Image 9. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;However, if you are doing this the first time, you’ll need to setup your credentials:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture10.jpg" alt="Image 10. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;This will take you to a screen to setup credentials:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture11.jpg" alt="Image 11. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Host for Gemini is &lt;a href="https://generativelanguage.googleapis.com/"&gt;https://generativelanguage.googleapis.com&lt;/a&gt; and you’ll need an API Key which you can get from &lt;a href="https://aistudio.google.com/"&gt;https://aistudio.google.com/&lt;/a&gt;. (See my tutorial for &lt;a href="https://www.mindfiretechnology.com/blog/archive/ai-tutorial-what-is-google-ai-studio/"&gt;Google’s AI Studio here&lt;/a&gt;.) &lt;/p&gt;
&lt;p&gt;There is a link to get an API key:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture12.jpg" alt="Image 12. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Now fill it into the credentials and it should work. I’ve had some problems with it not initially recognizing a valid Google model. Use models/gemini-2.5-flash (which should be the default). If it doesn’t like that and claims its invalid, I found it worked if I just went back to canvas and started over.&lt;/p&gt;
&lt;h2&gt;Connecting the Chat Trigger&lt;/h2&gt;
&lt;p&gt;It should automatically connect to the chat trigger by default, but I found I sometimes have to connect it manually like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture13.jpg" alt="Image 13. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;It should now look something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture14.jpg" alt="Image 14. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;h2&gt;Adding Persistence&lt;/h2&gt;
&lt;p&gt;Large Language Models, by default, have no persistence. Each time you type to them they have no idea what you previously were talking about. To fix this problem, we’ll need to add ‘memory’ by clicking the “+” under ‘memory’ and then select ‘simple memory:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture15.jpg" alt="Image 15. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;The default context window of 5 messages is fine for now, so go back to canvas:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture16.jpg" alt="Image 16. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;You should have a completed workflow that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture17.jpg" alt="Image 17. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;One mistake I’ve made in the past is accidentally pinning the AI Agent so that it doesn’t execute when run. You can bang your head trying to figure out why your workflow isn’t working when really, it’s in test mode always returning default data. Be sure the pin icon is unselected here:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture18.jpg" alt="Image 18. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Let’s add a ‘system message’ which is just instructions for the chatbot to follow that allows you to lock in a specific style of chatbot:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture19.jpg" alt="Image 19. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;In this case let’s have the chatbot act like a politician that no matter what you ask it finds a way to change the subject. &lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture20.jpg" alt="Image 20. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;h2&gt;Save the final workflow&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture21.jpg" alt="Image 21. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;h2&gt;Chatting with Your AI Agent&lt;/h2&gt;
&lt;p&gt;Click the “Open Chat” button to test it out.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/build-a-free-gemini-ai-chat-agent-with-n8n_picture22.jpg" alt="Image 22. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Try asking your chatbot something and see how it responds. For me, it started off with:&lt;/p&gt;
&lt;p&gt;“Ah, the weather! A truly fascinating subject, wouldn't you agree? But you know, what's even more fascinating is the incredible progress we're making on the new community initiative. …”&lt;/p&gt;
&lt;p&gt;And then it went on to change the subject to talk about its own political initiatives.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;And that’s it!&lt;/strong&gt; You’ve just built your first n8n workflow — for free. But this is only the beginning.&lt;/p&gt;
&lt;p&gt;n8n is a powerful tool for automating almost anything. Imagine all those repetitive tasks you dread every day—gone. Need to schedule meetings automatically? Connect an AI agent to your calendar and email, and let it handle the back-and-forth for you. Overwhelmed by your inbox? Set up an AI-powered workflow to read messages, reply to urgent ones, and even text you when something truly needs your attention.&lt;/p&gt;
&lt;p&gt;With n8n and AI, you can turn busywork into background work—so you can focus on what really matters.&lt;/p&gt;
&lt;h2&gt;Ready to see what else is possible?&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://www.mindfiretechnology.com/contact-us"&gt;Contact us at Mindfire Tech&lt;/a&gt;, and we’ll help you unlock the full potential of automation before your competitors do.&lt;/p&gt;
</description>
      <pubDate>Tue, 18 Nov 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-11-18T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2649</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/ai-tutorial-installing-n8n-self-hosted-community-edition/</link>
      <category>System.String[]</category>
      <title>AI Tutorial: Installing n8n Self Hosted Community Edition</title>
      <description>&lt;p&gt;I’m going to do some blog posts about using &lt;a href="https://n8n.io/"&gt;n8n&lt;/a&gt; to build an AI workflow. Consistent with our ‘open sourced’ / low-cost approach to Artificial Intelligence, I’m going to use the self-hosted community edition of n8n. This blog post will walk you through how to do the installation, in my case, for Windows. (Though it should work more or less the same for other operating systems.)&lt;/p&gt;
&lt;h2&gt;What is n8n?&lt;/h2&gt;
&lt;p&gt;n8n is a workflow-automation platform: it lets you connect apps, services, databases and APIs together, and automate sequences of tasks. It has a visual interface that lets you connect nodes together in a workflow graph:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture1.jpg" alt="Image 1. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;This may sound somewhat similar to &lt;a href="https://www.mindfiretechnology.com/blog/archive/haystack-google-and-gemma-a-tutorial/"&gt;Deepset’s Haystack&lt;/a&gt; and there is some overlap of functionality. For example, Haystack also has a graphical interface available to create its pipelines and it can also chain together nodes that call services. However, n8n can be used for any kind of workflow because n8n offers extensive API integrations to services (like Slack or Google Sheets) whereas Haystack is more oriented specifically to &lt;a href="https://www.mindfiretechnology.com/blog/archive/installing-pgvector-in-preparation-for-retrieval-augmented-generation/"&gt;integration with tools like PostgreSQL and pgvector&lt;/a&gt; or &lt;a href="https://d.docs.live.net/20bdfd902ae25866/Documents/Mindfire%20Tech%20Documents/Website%20Articles/Bruce%20Nielson%20Articles/Installing%20n8n%20Self%20Hosted%20Community%20Edition/haystack.deepset.ai/integrations/elasticsearch-document-store"&gt;elastic search&lt;/a&gt; as well as various Large Language Models. In other words, n8n is more general purpose and Haystack is more oriented towards AI.&lt;/p&gt;
&lt;p&gt;Consistent with Mindfire’s goal of finding low-cost AI solutions, I am going to go over how to install &lt;a href="https://docs.n8n.io/choose-n8n/"&gt;the self-hosted version of n8n&lt;/a&gt; using the free &lt;a href="https://docs.n8n.io/hosting/community-edition-features/"&gt;community edition&lt;/a&gt;. (&lt;a href="https://github.com/n8n-io"&gt;Github repo found here&lt;/a&gt;.) Though &lt;a href="https://docs.n8n.io/choose-n8n/"&gt;other versions are available&lt;/a&gt; including a paid plan hosted by n8n. This edition uses a &lt;a href="https://faircode.io/"&gt;fair-code license&lt;/a&gt;, so it is free to use.&lt;/p&gt;
&lt;p&gt;If you want to learn more, this &lt;a href="https://docs.n8n.io/try-it-out/"&gt;quick start guide&lt;/a&gt; is a great place to start. The n8n website includes a number of &lt;a href="https://docs.n8n.io/courses/"&gt;text&lt;/a&gt; and &lt;a href="https://docs.n8n.io/video-courses/"&gt;video&lt;/a&gt; courses to bring you up to speed.&lt;/p&gt;
&lt;h2&gt;Installing n8n Locally&lt;/h2&gt;
&lt;p&gt;First, if you don’t already have it installed, you’ll need Node.js. &lt;a href="https://www.mindfiretechnology.com/blog/archive/how-to-install-nodejs-for-windows/"&gt;See this blog post for details on how to install node.js&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Next, we need to install n8n itself. &lt;a href="https://docs.n8n.io/hosting/installation/npm/#try-n8n-with-npx"&gt;Go to this page for details&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’ll walk you through my own install of n8n onto a Windows machine.&lt;/p&gt;
&lt;p&gt;For Windows, open “Terminal” to get a Powershell command prompt.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture2.png" alt="Image 2. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;For our purposes we want to do the global install, so at the command prompt type:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npm install n8n -g
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture3.png" alt="Image 3. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;You’ll see an install take place:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture4.png" alt="Image 4. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Notice at the end that you have a localhost url:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;http://localhost:5678
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Run that in a browser and you’ll get the n8n web interface running locally and you’ll get a registration screen:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture5.png" alt="Image 5. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Go ahead and register and create a password. There will be a few extra screens you’ll have to pass through the first time such as answering a survey: &lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture6.png" alt="Image 6. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Finally, you’ll get to the actual n8n screen:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/installing-n8n-self-hosted-community-edition_picture7.png" alt="Image 7. Will add description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;You have now installed n8n and you’re ready to go with the community edition.&lt;/p&gt;
</description>
      <pubDate>Tue, 11 Nov 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-11-11T12:00:00-07:00</a10:updated>
    </item>
    <item>
      <guid isPermaLink="false">2643</guid>
      <link>https://www.mindfiretechnology.com/blog/archive/how-to-install-nodejs-for-windows/</link>
      <category>System.String[]</category>
      <title>How to Install Node.js (for Windows)</title>
      <description>&lt;p&gt;I’m going to explore &lt;a href="https://n8n.io/"&gt;n8n&lt;/a&gt; in future blog posts, so, go over how to set up an AI workflow. To keep this consistent with our ‘open sourced’ approach to AI, we’re going to do a self-hosted version of n8n using the free community edition. To install n8n you need to have Node.js already installed so that you can use the npm command. So, for this blog post we’re going to briefly go over how to install Node.js. This one is particularly easy to do and likely you already have it installed for some other purpose. But for completeness, let’s quickly cover it:&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Update:
It's actually recommended that you use tools like &lt;a href="https://github.com/jasongin/nvs/releases/tag/v1.7.1"&gt;nvs&lt;/a&gt; (Windows) and &lt;a href="https://github.com/nvm-sh/nvm/releases"&gt;nvm&lt;/a&gt; (Mac &amp;amp; Linux)&lt;/p&gt;
&lt;p&gt;Additionally, there are very few (if any) packages that should be installed globally. Installing those as local dependencies can make a huge difference in your ability to move between projects that have different node versions and requirements. You can do this with the command parameter &lt;code&gt;--save-dev&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Example: &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;nvs use 20.15
&amp;gt;node --version
v20.15.0
&amp;gt;npm install typescript --save-dev
&lt;/code&gt;&lt;/pre&gt;

&lt;hr /&gt;
&lt;p&gt;First, go to the node.js home page:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://nodejs.org/"&gt;https://nodejs.org/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You’ll see this page below. Click on the “Get Node.js” button:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/how-to-install-nodejs-for-windows_picture1.jpg" alt="image 1. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;That will take you to the download page (or you can just click the link below and go directly there):&lt;/p&gt;
&lt;p&gt;&lt;a href="https://nodejs.org/en/download"&gt;https://nodejs.org/en/download&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The download page offers you several different ways to download Node.js. Since we’re a Windows shop (and since Windows needs more AI love) I’m going to show you how to download the Windows installer, though feel free to change this to whatever operating system you prefer. First you need to select the operating system of your choice, which for me is Windows 64-bit:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/how-to-install-nodejs-for-windows_picture2.jpg" alt="image 2. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Then you click the “Windows Installer.msi” button to do the download:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/how-to-install-nodejs-for-windows_picture3.jpg" alt="image 3. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;After the download, click the browser download icon and select what you just downloaded and run it:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/how-to-install-nodejs-for-windows_picture4.jpg" alt="image 4. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;This starts the installer:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.mindfiretechnology.com/blog/media/how-to-install-nodejs-for-windows_picture5.jpg" alt="image 5. Will add more detailed description at a later date." /&gt;&lt;/p&gt;
&lt;p&gt;Now just take the defaults in the installer and you’re ready for the next step for installing n8n.&lt;/p&gt;
</description>
      <pubDate>Tue, 04 Nov 2025 12:00:00 -0700</pubDate>
      <a10:updated>2025-11-04T12:00:00-07:00</a10:updated>
    </item>
  </channel>
</rss>