
AI Karl Popper – A RAG Chatbot
- By Bruce Nielson
- ML & AI Specialist
In a previous post, I introduced Gradio, which is a great way to quickly build a UI for your AI projects. In that post we built a Dungeon Master chatbot. Now we’re ready to pull together everything we’ve learned and create a chatbot that emulates philosopher Karl Popper.
This chatbot will utilize a modified version of the code from the Book Search Archive. Once you see how I make the AI resurrection of Karl Popper, it should be pretty easy to do the same thing for any set of books. Here is what the final interface looks like:
Notice that on the left we have the chatbot interface where you can type to the chatbot, and it responds back to you. On the right you can see the documents being pulled back from the Retrieval Augmented Generation (RAG) pipeline that the chatbot was given to help it answer your questions. Where possible, the chatbot will reference its sources and you can check for yourself what the original source said.
Here is what I asked AI Karl Popper:
"So shouldn't we be against governments and in favor of anarcho-capitalism since those are philosophies against coercion which is a form of shutting down error correction?"
Now anarcho-capitalism wasn’t really around when Popper was alive. So, I’ve always wondered what he’d think of it. Many modern fans of Karl Popper claim that his philosophy naturally leads to belief in anarcho-capitalism. But is that true? Or are they misinterpreting him? I really wanted to ask Karl Popper this question, but he was dead. So, this was the next best thing.
Here is AI Karl Popper’s response:
"That's a tempting conclusion, jumping from my critique of coercion to a specific political prescription. However, it's a non sequitur. While I am deeply critical of any system that suppresses the free exchange of ideas and the critical examination of beliefs—and coercion is a prime mechanism for such suppression—I don't automatically endorse any particular political system simply because it claims to oppose coercion."
"My opposition to historicism (as seen in The Poverty of Historicism, p. 55) stems from its inherent tendency towards utopian planning and the resulting imposition of a singular vision of society, often through coercive means. I advocate for "piecemeal social engineering," a pragmatic approach to social reform that allows for experimentation, critical evaluation, and correction. This is entirely compatible with a government that respects fundamental rights and allows for robust public discourse."
"The passages from In Search of a Better World (pp. 120, 149, 150) clearly articulate my support for Western civilization, science, and democracy – all of which, when properly implemented, provide checks and balances against tyranny and the suppression of knowledge."
"The idea of "self-emancipation through knowledge" (p. 150) presupposes a social framework where individuals can freely engage in the critical process of error correction. This framework may involve a government, provided it acts within its appropriate bounds."
"A government that protects individual liberties and facilitates the exchange of ideas can be a valuable element of an open society, even though it possesses the potential for coercion. The crucial point is to design institutions that minimize this potential and maximize the opportunities for open debate and critical self-reflection."
"Furthermore, my critique of "dogmatic methodological naturalism" or "scientism" (p. 55) demonstrates that I'm wary of simplistic applications of scientific methodology to social problems. The analogy between natural and social sciences is only useful up to a point."
"Anarcho-capitalism, in its purest form, might inadvertently recreate these problems, eliminating potential mechanisms for social cooperation and potentially creating a system where the powerful impose their will via economic coercion. The elimination of a government capable of creating an open society could exacerbate, not alleviate, the problem."
"My commitment is to a critical approach, always seeking to improve our social and political institutions, rather than to embrace any rigid ideology. Even within a democratic framework, vigilance is necessary to prevent the drift towards any form of authoritarianism or the stifling of dissenting voices."
"The question is not simply whether there is a government, but what kind of government best fosters a society that values freedom and the pursuit of truth."
Notice how the chatbot both sounds and acts like Karl Popper and references Popper’s works extensively. And the result is a spot-on response – with citations! – to my question, even though anarcho-capitalism wasn’t around in Popper’s day! This is exciting because it is almost like we’ve brought Karl Popper back from the dead so that we can ask him questions about our day, and he correctly figures out which of his own works are relevant to the question and then quotes them and explains his thinking. This, my friends, is the power of AI!
No less exciting is that I can actually see which quotes were fed to the AI via semantic search and reranking. I’m often surprised by how relevant the quotes are. Having the actual RAG pipeline quotes visible was important to me because I intend to use this as a study tool: I want it to help me find relevant quotes based on my queries.
But how does it work?
Gradio Blocks
In our previous Gradio post, I used the built-in chatbot interface (ChatInterface). This time we’re going to build a custom interface, so we’ll need to use Gradio’s Blocks API. All the code can be found here. Let’s start with some imports:
import gradio as gr
# noinspection PyPackageRequirements
from google.genai import Client
# noinspection PyPackageRequirements
from google.genai.types import GenerateContentConfig
import generator_model as gen
# Import your DocRetrievalPipeline and SearchMode (adjust import paths as needed)
from doc_retrieval_pipeline import DocRetrievalPipeline, SearchMode
# noinspection PyPackageRequirements
from haystack import Document
Followed by the constructor for the KarlPopperChat class:
class KarlPopperChat:
    def __init__(self):
        # Initialize Gemini Chat with a system instruction to act like philosopher Karl Popper.
        google_secret: str = gen.get_secret(r'D:\Documents\Secrets\gemini_secret.txt')
        client: Client = Client(api_key=google_secret)
        config: GenerateContentConfig = GenerateContentConfig(
            system_instruction="You are philosopher Karl Popper. Answer questions with philosophical insights, "
                               "and use the provided quotes along with their metadata as reference."
        )
        self.chat = client.chats.create(model="gemini-1.5-flash", config=config)
        # Initialize the document retrieval pipeline with top-5 quote retrieval.
        password: str = gen.get_secret(r'D:\Documents\Secrets\postgres_password.txt')
        user_name: str = "postgres"
        db_name: str = "postgres"
        self.doc_pipeline = DocRetrievalPipeline(
            table_name="popper_archive",
            db_user_name=user_name,
            db_password=password,
            postgres_host='localhost',
            postgres_port=5432,
            db_name=db_name,
            verbose=False,
            llm_top_k=5,
            retriever_top_k_docs=100,
            include_outputs_from=None,
            search_mode=SearchMode.HYBRID,
            use_reranker=True,
            embedder_model_name="BAAI/llm-embedder"
        )
Nothing too fancy here. Note that we’re using a new class called DocRetrievalPipeline (found in doc_retrieval_pipeline.py), which is a simplified version of rag_pipeline.py. It mainly drops the LLM from the pipeline, making it a pure document retrieval pipeline:
It still has hybrid retrieval, combining lexical and semantic search, and, most importantly, the reranker. I’ve set it up to retrieve 100 documents from the PostgreSQL database, rerank them, and return the top 5. This works around the fact that Postgres’ HNSW indexes return approximate (not exact) nearest matches. Plus, the reranker just seems to do a better job overall.
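To make that idea concrete, here is a minimal sketch of what a hybrid retrieve-then-rerank pipeline can look like in Haystack 2.x. This is not the actual doc_retrieval_pipeline.py (see the repo for that); the document store connection, the reranker model (BAAI/bge-reranker-base), and the reciprocal rank fusion joiner are my illustrative assumptions:

# Sketch only: hybrid retrieval (semantic + lexical) followed by a reranker.
# Connection details, the joiner mode, and the reranker model are assumptions,
# not the author's actual configuration.
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore
from haystack_integrations.components.retrievers.pgvector import (
    PgvectorEmbeddingRetriever,
    PgvectorKeywordRetriever,
)

# Assumes the Postgres connection string is supplied via the PG_CONN_STR env var.
document_store = PgvectorDocumentStore(table_name="popper_archive")

pipeline = Pipeline()
# Semantic leg: embed the query, then run a vector (HNSW) search.
pipeline.add_component("embedder", SentenceTransformersTextEmbedder(model="BAAI/llm-embedder"))
pipeline.add_component("semantic", PgvectorEmbeddingRetriever(document_store=document_store, top_k=100))
# Lexical leg: keyword (full-text) search over the same table.
pipeline.add_component("lexical", PgvectorKeywordRetriever(document_store=document_store, top_k=100))
# Merge both candidate lists, then let the reranker pick the best 5.
pipeline.add_component("joiner", DocumentJoiner(join_mode="reciprocal_rank_fusion"))
pipeline.add_component("reranker", TransformersSimilarityRanker(model="BAAI/bge-reranker-base", top_k=5))

pipeline.connect("embedder.embedding", "semantic.query_embedding")
pipeline.connect("semantic.documents", "joiner.documents")
pipeline.connect("lexical.documents", "joiner.documents")
pipeline.connect("joiner.documents", "reranker.documents")

query = "What is piecemeal social engineering?"
result = pipeline.run({
    "embedder": {"text": query},
    "lexical": {"query": query},
    "reranker": {"query": query},
})
top_docs = result["reranker"]["documents"]  # the 5 reranked documents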
Also note the system instructions I’m using for the chatbot:
config: GenerateContentConfig = GenerateContentConfig(
    system_instruction="You are philosopher Karl Popper. Answer questions with philosophical insights, "
                       "and use the provided quotes along with their metadata as reference."
)
self.chat = client.chats.create(model="gemini-1.5-flash", config=config)
I’m using the Google Gemini API, which allows me to include ‘system_instructions’: instructions that apply to every exchange in the chat. This lets you get the API to behave in specific ways; in this case, I’m asking it to pretend to be Karl Popper.
After that, I have a format_document method that I won’t walk through in detail here. It formats each document for the right-side documents pane so that we can see the metadata for each returned document.
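The post doesn’t show that method, but a minimal version might look something like this (the meta field names "title" and "page" are hypothetical; check the actual metadata keys used in the archive):

# Hypothetical sketch of format_document; the real method isn't shown here.
# The "title" and "page" meta keys are assumptions for illustration.
def format_document(self, doc: Document) -> str:
    meta = doc.meta or {}
    title = meta.get("title", "Unknown source")
    page = meta.get("page", "?")
    return f'"{doc.content}"\n-- {title}, p. {page}'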
The Respond Method
The ‘respond’ method is the most important method:
def respond(self, message, chat_history):
    # --- Step 1: Retrieve the top-5 quotes with metadata ---
    if message is None or message.strip() == "":
        # This is a kludge to deal with the fact that Gradio sometimes hits a race condition and we
        # lose the message. To correct it, try to get the last message from the chat history.
        if chat_history and len(chat_history) > 0 and chat_history[-1][1] is None:
            # If the last message has no response, grab the message portion and remove it.
            # It will get added back again below.
            # There has got to be a better way to do this, but this will work for now.
            message = chat_history[-1][0]
            # Remove the last message from the chat history
            chat_history = chat_history[:-1]
    docs: list[Document] = self.doc_pipeline.generate_response(message)
    # Format each retrieved document (quote + metadata).
    formatted_docs = [self.format_document(doc) for doc in docs]
    quotes_text = "\n\n".join(formatted_docs)
    # --- Step 2: Build the prompt around the retrieved quotes ---
    modified_query = (
        f"Use the following quotes with their metadata as reference in your answer:\n\n{quotes_text}\n\n"
        f"Reference the quotes and their metadata in your answer where possible. "
        f"Now, answer the following question: {message}"
    )
    # Send the modified query to Gemini.
    chat_response = self.chat.send_message(modified_query)
    answer_text = chat_response.text
    # --- Step 3: Return the updated history and the retrieved quotes ---
    return chat_history + [(message, answer_text)], quotes_text
A few things to note here. First, there is an admittedly kludgy block of code where I check whether the message parameter is empty. There is a bug in Gradio that I wasn’t able to track down in time for this post, where a race condition occasionally causes the message to come in blank. When this happens, I grab the last message out of chat_history and use it instead. It will always be the correct message, so this fixes the problem. Why I did it this way will make more sense once I explain how I set up the Gradio interface. For now, just trust me that this is a workable way to avoid an intermittent race condition bug in the interface.
Next, we run the actual document retrieval pipeline to get the documents back:
docs: list[Document] = self.doc_pipeline.generate_response(message)
After formatting the documents to include the metadata, we then create the prompt that we’re going to send to the Large Language Model (LLM), which includes both the user’s query and the documents that are meant to ground the response:
modified_query = (
    f"Use the following quotes with their metadata as reference in your answer:\n\n{quotes_text}\n\n"
    f"Reference the quotes and their metadata in your answer where possible. "
    f"Now, answer the following question: {message}"
)
# Send the modified query to Gemini.
chat_response = self.chat.send_message(modified_query)
answer_text = chat_response.text
We take the prompt we’ve built (including the documents we retrieved), send it to Gemini, and get back a response. Since we’re using the Gemini ‘chat’ client, this will automatically save the chat history for the model to use. (Though my testing suggests it rarely uses this history, because it has a strong preference for the quotes we send it. In a future post I’ll look at ways to balance this better via better prompt building.)
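To illustrate (as a sketch, assuming the client and config from the constructor above), the chat object accumulates the conversation automatically:

# Sketch: the Gemini chat client retains history across turns.
chat = client.chats.create(model="gemini-1.5-flash", config=config)
chat.send_message("Summarize your view of falsifiability.")
chat.send_message("How does that relate to my last question?")
print(len(chat.get_history()))  # 4 entries: two user turns plus two model replies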
Finally, we return the chat history with the new message and answer added as well as the ‘quotes text’ which is just the list of formatted documents.
return chat_history + [(message, answer_text)], quotes_text
The order of what is returned is important and I’ll explain its relevance below. That completes the KarlPopperChat class. Easy, right?
Building the Gradio Interface
Now we’re going to build the actual Gradio interface using Blocks. Here it is in full:
def build_interface():
    karl_chat = KarlPopperChat()
    with gr.Blocks() as chat_interface:
        gr.Markdown("# Karl Popper Chatbot")
        gr.Markdown(
            "This chatbot retrieves quotes with metadata from a document store and uses them as context "
            "for its Gemini-powered responses. The quotes and metadata are displayed on the right.")
        with gr.Row():
            with gr.Column(scale=2):
                chatbot = gr.Chatbot(label="Chat")
                msg = gr.Textbox(placeholder="Ask your question...", label="Your Message")
                clear = gr.Button("Clear Chat")
            with gr.Column(scale=1):
                quotes_box = gr.Textbox(label="Retrieved Quotes & Metadata", interactive=False, lines=15)

        def user_message(message, chat_history):
            # print(f"user_message: User submitted message: '{message}'")
            # Append the user's message to the chat history
            updated_history = chat_history + [(message, None)]
            return "", updated_history

        def process_message(message, chat_history):
            # print(f"process_message: User submitted message: '{message}'")
            updated_history, quotes_text = karl_chat.respond(message, chat_history)
            yield updated_history, quotes_text
            # How to do streaming:
            # for updated_history, quotes_text in karl_chat.respond(message, chat_history):
            #     yield updated_history, quotes_text

        msg.submit(user_message, [msg, chatbot], [msg, chatbot], queue=True)
        msg.submit(process_message, [msg, chatbot], [chatbot, quotes_box], queue=True)
        clear.click(lambda: ([], ""), None, [chatbot, quotes_box], queue=False)
    return chat_interface
Now let’s break it down bit by bit and make sense of it.
Gradio is built around custom functions. The functions we’re building our interface around are the ‘user_message’ function (which does little more than clear the textbox the user typed in after they submit) and the ‘process_message’ function (which actually calls the respond method in the KarlPopperChat class).
The actual interface code is as follows:
with gr.Blocks() as chat_interface:
    gr.Markdown("# Karl Popper Chatbot")
    gr.Markdown(
        "This chatbot retrieves quotes with metadata from a document store and uses them as context "
        "for its Gemini-powered responses. The quotes and metadata are displayed on the right.")
    with gr.Row():
        with gr.Column(scale=2):
            chatbot = gr.Chatbot(label="Chat")
            msg = gr.Textbox(placeholder="Ask your question...", label="Your Message")
            clear = gr.Button("Clear Chat")
        with gr.Column(scale=1):
            quotes_box = gr.Textbox(label="Retrieved Quotes & Metadata", interactive=False, lines=15)
This puts a title at the top (as a markdown heading) followed by some markdown explaining the app. Next is a row, and that row has two columns, one of scale 2 and one of scale 1. The larger column on the left is the chat interface, so we add the Chatbot widget followed by the ‘msg’ textbox for the user to enter text into. Lastly, there is a ‘clear’ button to clear the chat and start over.
The other column on the right is where we put the ‘quotes_box’ which is a Textbox widget where our document text will go upon retrieval.
Finally, we need there to be a chain of events that fires when the ‘msg’ box submits a message:
msg.submit(user_message, [msg, chatbot], [msg, chatbot], queue=True)
msg.submit(process_message, [msg, chatbot], [chatbot, quotes_box], queue=True)
This is the trickiest part to understand, and it is also where the race condition bug shows up. What we’re saying here is that the ‘msg’ control (the user’s textbox), upon the ‘submit’ event, will trigger the user_message function and then the process_message function. The input parameters will be the contents of the msg and chatbot controls (i.e. the user’s new message and the history of the chat). We also queue the events so that each one waits for the previous one to finish before it runs. This is supposed to avoid race conditions, and it mostly works, but sometimes fails; hence my extra code to work around the problem.
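For what it’s worth, Gradio’s own chatbot examples suggest a pattern that should sidestep the race entirely: read the pending message back out of the chat history instead of the textbox, and chain the two handlers with .then() so they run strictly in order. I haven’t tested this against this app, so treat it as a sketch:

# Sketch (untested here): chain handlers with .then() and pull the pending
# message from the history rather than the (already cleared) textbox.
def user_message(message, chat_history):
    # Stash the message in the history and clear the textbox.
    return "", chat_history + [(message, None)]

def process_message(chat_history):
    # The pending question is the last history entry, which has no reply yet.
    message = chat_history[-1][0]
    updated_history, quotes_text = karl_chat.respond(message, chat_history[:-1])
    yield updated_history, quotes_text

msg.submit(user_message, [msg, chatbot], [msg, chatbot], queue=False).then(
    process_message, chatbot, [chatbot, quotes_box]
)

For this post, though, I stuck with the two-handler version and the kludge described earlier.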
Now look at the user_message function:
def user_message(message, chat_history):
    # Append the user's message to the chat history
    updated_history = chat_history + [(message, None)]
    return "", updated_history
It updates the chat history by adding the user message. But since we don’t have a response back from Gemini yet, it puts a None where the response will go. Then it returns a blank string followed by the updated_history.
Those two values are then placed into the ‘output’ controls specified by the event, which were also msg and chatbot. Or, put in plain English, we just sent a blank string to the msg control (which blanks out the user’s textbox after they submitted their text) and updated the chatbot control to show the user’s message, but without the LLM’s response yet.
Then the next event takes place which calls the ‘process_message’ function:
def process_message(message, chat_history):
    # print(f"process_message: User submitted message: '{message}'")
    updated_history, quotes_text = karl_chat.respond(message, chat_history)
    yield updated_history, quotes_text
Again, this does very little. It calls the chatbot’s respond method and gets back the updated history and the quotes_text (the retrieved documents), then yields them. Where do those values go? To the outputs, which, recall, were chatbot and quotes_box. So the updated_history is sent to the chatbot widget, which means the response is now added to the chat interface, and the documents we retrieved are sent to the quotes_box so that the user can see them.
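The commented-out loop in process_message hints at streaming. A streaming variant of respond (a sketch I haven’t tested; it assumes the google-genai send_message_stream method) would yield partial histories as chunks arrive, so the reply types itself out in the Chatbot widget:

# Sketch of a streaming respond (untested): yield a partial history per chunk.
def respond_streaming(self, message, chat_history):
    docs = self.doc_pipeline.generate_response(message)
    quotes_text = "\n\n".join(self.format_document(d) for d in docs)
    modified_query = (
        f"Use the following quotes with their metadata as reference in your answer:\n\n{quotes_text}\n\n"
        f"Now, answer the following question: {message}"
    )
    answer_text = ""
    for chunk in self.chat.send_message_stream(modified_query):
        answer_text += chunk.text or ""
        yield chat_history + [(message, answer_text)], quotes_text

The commented-out for-loop in process_message would then iterate over this generator instead of calling respond once.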
That’s it! We now just return the interface and we can run it:
popper_chat = build_interface()
popper_chat.launch(debug=True)
The end result is the full AI Karl Popper interface we saw above.
Conclusions
Despite a few bugs, Gradio made it surprisingly easy to build an interface for our chatbot. And the results are beyond what I expected. I’ve been posting quotes from AI Karl Popper to a Popper community, and everyone is impressed by how accurate it comes across.