AI Tutorial: A ReAct Agent Using Gemini and Haystack
- By Bruce Nielson
- ML & AI Specialist
Building a ReAct Research Agent with Gemini and Haystack
In a previous post, I walked through how function calling can give large language models the ability to reason and act in a loop, using the ReAct pattern pioneered by this famous paper. In that post, I showed a simple ReAct agent that could chain together "thought → action → observation" steps to work through a question, invoking tools that were functions I had built. Because I built that agent from the ground up, I had to parse the model's output myself to determine which function to call. Then, in a follow-up post ("Using Gemini Function Calling to Build a Research Agent"), I showed how Google's Gemini has that kind of function calling built in. Even that isn't the best way to do it! I'll show you an easier way in a future post.
In any case, I have released a new version of the Book Search Archive, aka "AI Karl Popper" (Mindfire's open-source AI software, which is really our toy project for working out our open-source stack). It now includes a ReAct-based 'research agent' (similar to what we developed in this post: "Using Gemini Function Calling to Build a Research Agent"). I'll explain in a future post why I wanted to do this. In this post, I just want to make people aware of this new version of the code and what has been added.
The ReAct Research Agent is designed specifically to let "AI Karl Popper" (really just the Book Search Archive with a Karl Popper embedded text database attached) dig in and come up with its own queries to answer a user's question, using a combination of document retrieval, function calling, and a last-resort fallback to Wikipedia. Yes, that's right: AI Karl Popper can now go to Wikipedia to answer questions, allowing our pseudo-Karl Popper to update his knowledge beyond the books the real Karl Popper wrote in his lifetime. Not bad, eh?
You can view the full code here on GitHub.
What's New in This Agent?
This version of the agent is designed to dig into a custom document store (in my case, a PostgreSQL-backed archive powered by the Haystack framework) and synthesize answers from the most relevant quotes it finds. Only when that fails — and only after six reasoning steps — does it escalate to a fallback query against Wikipedia. (I say after 6 steps, but honestly, AI Karl tends to 'disobey' and call Wikipedia early. Though he's very apologetic about it. No, I'm not making this up. Try him out!)
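The step-gating described above (datastore first, Wikipedia only after six steps) can be sketched as a simple gate on which tools the model is offered each turn. This is my own minimal illustration, not the repo's actual code; MAX_LOCAL_STEPS and allowed_tools are hypothetical names.

```python
# Sketch of the step-gating idea: the agent tracks its iteration count
# and withholds the Wikipedia tool until it has spent several steps on
# the local document store. Names here are illustrative, not from the repo.

MAX_LOCAL_STEPS = 6  # datastore-only reasoning steps before escalation

def allowed_tools(iteration: int) -> list[str]:
    """Tools the agent may call on a given (1-based) iteration."""
    tools = ["search_datastore", "answer"]
    if iteration > MAX_LOCAL_STEPS:
        tools.append("search_wikipedia")  # last-resort fallback only
    return tools
```

One practical advantage of gating the tool list itself, rather than merely asking the model to wait: a model that tends to 'disobey' instructions simply never sees the Wikipedia tool until it is allowed to use it.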
Key improvements include:
- Tool declarations for document and Wikipedia search: Each function is explicitly registered with the Gemini model using Google's function-calling tools. These include search_datastore, search_wikipedia, and answer.
- Score-aware document formatting: The quotes pulled from the document store include not just the content, but also metadata such as confidence scores, sources, and retrieval methods. If AI Karl looks something up on Wikipedia, a 'document' is created as a reference, and the user can see which question led to that Wikipedia lookup.
- Conversation state and flow control: The agent tracks its iteration count, discourages premature Wikipedia access, and gracefully ends when it hits the answer function.
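To make the tool-declaration point concrete, here is a hypothetical sketch of the three tools in the JSON-schema shape that Gemini's function-calling API accepts. The descriptions and parameter names are my own illustrations, not copied from the Book Search Archive code.

```python
# Illustrative function declarations for Gemini tool calling. Each follows
# the documented name / description / parameters (JSON schema) shape.
# Parameter names ("query", "topic", "text") are assumptions for this sketch.

search_datastore_decl = {
    "name": "search_datastore",
    "description": "Retrieve top-ranked quotes from the Haystack document store.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query text."},
        },
        "required": ["query"],
    },
}

search_wikipedia_decl = {
    "name": "search_wikipedia",
    "description": "Last-resort lookup of a topic on Wikipedia.",
    "parameters": {
        "type": "object",
        "properties": {
            "topic": {"type": "string", "description": "Page or topic to look up."},
        },
        "required": ["topic"],
    },
}

answer_decl = {
    "name": "answer",
    "description": "Return the final answer with sources and end the loop.",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "Final answer text."},
        },
        "required": ["text"],
    },
}

tool_declarations = [search_datastore_decl, search_wikipedia_decl, answer_decl]
```

These declarations are what get passed to the model at setup time; when the model wants to act, it responds with one of these names plus arguments matching the schema, rather than free-form text you have to parse yourself.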
How It Works at a High Level
Here’s a quick summary of the core idea:
- Step-by-step reasoning: The agent starts by “thinking aloud”, planning how to approach the question.
- Search the datastore first: Using search_datastore, it retrieves top-ranked results from the Haystack-powered database and tries to synthesize an answer.
- Fallback to Wikipedia only if needed: If the model still can’t answer the question after several attempts, it uses search_wikipedia to retrieve page content and try again. Try asking AI Karl when Bozo the Clown was born and get ready for him to figure out some way to reasonably answer that question, even though there are no references to Bozo the Clown in Karl Popper's writings and even though what the user intends is ambiguous. (When the character was created? How old the actor would be today?)
- Final response with sources: Once it’s ready, the agent calls answer() to return its final result, listing all information sources used.
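The loop above can be sketched in a few lines of library-free Python. Here call_model stands in for a real Gemini call (it would return the tool name and arguments the model chose), and all names are my own illustrations rather than the repo's actual code.

```python
# Minimal sketch of the thought → action → observation loop. The model
# repeatedly picks a tool; each tool's result is fed back as an observation;
# the terminal "answer" tool ends the loop. `call_model` is a stand-in for
# an actual Gemini function-calling request.

def run_agent(question, call_model, tools, max_steps=10):
    """Repeat: ask the model for an action, run it, feed the observation back."""
    history = [("user", question)]
    for _ in range(max_steps):
        action, args = call_model(history)      # model picks a tool and arguments
        if action == "answer":                  # terminal tool ends the loop
            return args["text"]
        observation = tools[action](**args)     # execute the chosen tool
        history.append(("observation", observation))
    return "No answer found within the step budget."
```

The max_steps cap matters in practice: without it, a model that never calls answer() would search forever.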
Here’s an example question you might ask it:
"Is induction valid in some cases, particularly when doing statistics?"
Or...
"When was Bozo the Clown born and how old is he today?"
The agent might begin by planning its approach, then search the document store for any references to, for example, induction, statistical inference, and related concepts. If it finds quotes supporting the idea that induction is a valid form of reasoning in certain empirical domains, it builds an answer from them. Otherwise, it checks Wikipedia for broader philosophical discussion.
Lessons from Gemini's Shifting Tiers
As I noted in this other post, using Gemini effectively now requires a paid API key. Google quietly downgraded the free-tier access earlier this year, making many of these advanced features — especially tool calling — available only via the paid tier. If you're trying to run this agent yourself, make sure to use an authenticated Gemini model (e.g. gemini-1.5-pro or gemini-1.5-flash) with API key access.
What’s Next?
There are several ways I’m looking to improve this:
-
Better fallback reasoning: Right now, the Wikipedia fallback is a blunt tool. I’d like to add more nuanced decision-making about when to escalate to external sources. And frankly, it's too slow.
-
Chain-of-thought memory and reasoning: Having the agent remember its past thoughts and decisions more explicitly could help avoid redundant searches.
-
Cross-query synthesis: Right now, each search is evaluated in isolation. I'd like to experiment with combining results from multiple queries to build richer answers.
If you're interested in trying this out, or building your own research agent, the GitHub repo is a great place to start.
Got questions? Want to contribute? Leave a comment or reach out — I’m always happy to talk shop about AI tooling and research agents.