Building a Local DeepSeek R1 Chatbot with Streamlit and Ollama
- By Bruce Nielson
- ML & AI Specialist
With all the (admittedly already dated) hype around DeepSeek R1 from China, I thought it would be fun to build a simple chatbot interface for DeepSeek R1 that runs entirely on your local machine. DeepSeek R1 is a reasoning model that shows its "thinking process" before giving answers, similar to OpenAI's o1 model. We'll use Streamlit for the interface and Ollama to run the model locally.
The full code is available both in this post and in my GitHub repo, found here.
What You'll Need
- Python 3.11+ installed on your computer
- Ollama installed (see this post for details)
- The DeepSeek R1 model pulled in Ollama: ollama pull deepseek-r1:1.5b
- Streamlit: pip install streamlit
- Ollama Python library: pip install ollama
The Complete Code
Here's the full code we'll be breaking down:
# Run using: streamlit run streamlit_example.py
import streamlit as st
import ollama

st.set_page_config(page_title="DeepSeek R1 Chat", page_icon="🤖")

def convert_latex_delimiters(text):
    """Convert LaTeX delimiters from backslash-bracket to dollar signs"""
    # Replace display math delimiters
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    # Replace inline math delimiters
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
    return text

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat history
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        if msg["role"] == "assistant" and msg.get("thinking"):
            with st.expander("🧠 View Thinking Process", expanded=False):
                st.markdown(convert_latex_delimiters(msg["thinking"]))
        st.markdown(convert_latex_delimiters(msg["content"]))

# Chat input
user_input = st.chat_input("Ask DeepSeek R1")

if user_input:
    # Add and display user message
    st.session_state.messages.append({"role": "user", "content": user_input})
    with st.chat_message("user"):
        st.markdown(user_input)

    # Generate and display assistant response
    with st.chat_message("assistant"):
        thinking_display = st.empty()
        answer_display = st.empty()
        accumulated_thinking = ""
        accumulated_answer = ""

        with st.spinner("Thinking...", show_time=True):
            stream = ollama.chat(
                model="deepseek-r1:1.5b",
                messages=[{"role": m["role"], "content": m["content"]}
                          for m in st.session_state.messages],
                stream=True,
                think=True,
            )

            for chunk in stream:
                chunk_msg = chunk.get("message", {})

                # Stream thinking content
                if chunk_msg.get("thinking"):
                    accumulated_thinking += chunk_msg["thinking"]
                    with thinking_display.container():
                        with st.expander("🧠 View Thinking Process", expanded=True):
                            st.markdown(convert_latex_delimiters(accumulated_thinking) + "▌")

                # Stream answer content
                if chunk_msg.get("content"):
                    accumulated_answer += chunk_msg["content"]
                    answer_display.markdown(convert_latex_delimiters(accumulated_answer) + "▌")

        # Final display without cursor
        if accumulated_thinking:
            with thinking_display.container():
                with st.expander("🧠 View Thinking Process", expanded=False):
                    st.markdown(convert_latex_delimiters(accumulated_thinking))
        answer_display.markdown(convert_latex_delimiters(accumulated_answer))

    # Save to chat history
    st.session_state.messages.append({
        "role": "assistant",
        "content": accumulated_answer,
        "thinking": accumulated_thinking or None
    })
Breaking Down the Code
Page Configuration
st.set_page_config(page_title="DeepSeek R1 Chat", page_icon="🤖")
This sets up our Streamlit page with a custom title and icon that appears in the browser tab.
LaTeX Conversion Function
def convert_latex_delimiters(text):
    text = text.replace(r'\[', '$$')
    text = text.replace(r'\]', '$$')
    text = text.replace(r'\(', '$')
    text = text.replace(r'\)', '$')
    return text
DeepSeek R1 often outputs mathematical notation using LaTeX syntax. However, it uses \[ and \] for display math and \( and \) for inline math. Streamlit's markdown renderer expects $$ and $ instead, so this function converts between the two formats.
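For instance, the conversion behaves like this (the function is redefined here so the snippet runs standalone, and the sample string is made up for illustration):

```python
def convert_latex_delimiters(text):
    """Convert LaTeX math delimiters to the dollar-sign style Streamlit renders."""
    text = text.replace(r'\[', '$$')  # display math open
    text = text.replace(r'\]', '$$')  # display math close
    text = text.replace(r'\(', '$')   # inline math open
    text = text.replace(r'\)', '$')   # inline math close
    return text

sample = r"Solve \(x^2 = 2\): \[x = \pm\sqrt{2}\]"
print(convert_latex_delimiters(sample))
# → Solve $x^2 = 2$: $$x = \pm\sqrt{2}$$
```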
Initializing Chat History
if "messages" not in st.session_state:
    st.session_state.messages = []
Streamlit reruns your entire script on every interaction. To preserve chat history between reruns, we store messages in st.session_state, which persists across reruns.
Displaying Chat History
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        if msg["role"] == "assistant" and msg.get("thinking"):
            with st.expander("🧠 View Thinking Process", expanded=False):
                st.markdown(convert_latex_delimiters(msg["thinking"]))
        st.markdown(convert_latex_delimiters(msg["content"]))
This loop displays all previous messages. For assistant messages, if there's thinking content, we show it in a collapsible expander. This keeps the interface clean while still making the reasoning process available.
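To make the branch concrete, here is the message shape the loop expects, using a hypothetical two-turn history:

```python
# Hypothetical history illustrating the dict shape the display loop expects.
messages = [
    {"role": "user", "content": "What is 15 times 23?"},
    {"role": "assistant", "content": "345", "thinking": "15 * 23 = 345."},
]

for msg in messages:
    # msg.get("thinking") is None for user messages, so the expander branch
    # only runs for assistant messages that actually carry thinking text.
    shows_expander = msg["role"] == "assistant" and bool(msg.get("thinking"))
    print(msg["role"], shows_expander)
# → user False
# → assistant True
```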
Handling User Input
user_input = st.chat_input("Ask DeepSeek R1")

if user_input:
    st.session_state.messages.append({"role": "user", "content": user_input})
    with st.chat_message("user"):
        st.markdown(user_input)
When a user types a message, we add it to our message history and display it immediately in the chat interface.
Streaming the Response
with st.spinner("Thinking...", show_time=True):
    stream = ollama.chat(
        model="deepseek-r1:1.5b",
        messages=[{"role": m["role"], "content": m["content"]}
                  for m in st.session_state.messages],
        stream=True,
        think=True,
    )
The critical part here is think=True. This tells Ollama to return DeepSeek R1's reasoning process separately from its final answer. The stream=True parameter makes the response appear word-by-word rather than all at once.
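Also note that the list comprehension rebuilds each message with only its role and content, so any stored thinking text is not sent back to the model. A quick illustration with a made-up history:

```python
history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!", "thinking": "A simple greeting."},
]

# Rebuild each message with only the keys the API call needs;
# the "thinking" key is dropped in the process.
payload = [{"role": m["role"], "content": m["content"]} for m in history]
print(payload[1])  # → {'role': 'assistant', 'content': 'Hello!'}
```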
Processing the Stream
for chunk in stream:
    chunk_msg = chunk.get("message", {})

    # Stream thinking content
    if chunk_msg.get("thinking"):
        accumulated_thinking += chunk_msg["thinking"]
        with thinking_display.container():
            with st.expander("🧠 View Thinking Process", expanded=True):
                st.markdown(convert_latex_delimiters(accumulated_thinking) + "▌")

    # Stream answer content
    if chunk_msg.get("content"):
        accumulated_answer += chunk_msg["content"]
        answer_display.markdown(convert_latex_delimiters(accumulated_answer) + "▌")
As chunks arrive from Ollama, they contain either "thinking" or "content" fields. We accumulate both separately and display them in real-time. The "▌" character creates a blinking cursor effect showing that content is still streaming.
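You can exercise the accumulation logic without a running model by faking a few chunks in the shape described above (the chunk contents here are invented for illustration):

```python
# Invented chunks mimicking the streamed shape: a "message" dict whose
# "thinking" and "content" fields arrive in separate pieces.
fake_stream = [
    {"message": {"thinking": "15 * 23 = 15 * 20 + 15 * 3 "}},
    {"message": {"thinking": "= 300 + 45 = 345."}},
    {"message": {"content": "15 times 23 is "}},
    {"message": {"content": "345."}},
]

accumulated_thinking = ""
accumulated_answer = ""
for chunk in fake_stream:
    chunk_msg = chunk.get("message", {})
    if chunk_msg.get("thinking"):
        accumulated_thinking += chunk_msg["thinking"]
    if chunk_msg.get("content"):
        accumulated_answer += chunk_msg["content"]

print(accumulated_answer)  # → 15 times 23 is 345.
```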
Finalizing the Display
if accumulated_thinking:
    with thinking_display.container():
        with st.expander("🧠 View Thinking Process", expanded=False):
            st.markdown(convert_latex_delimiters(accumulated_thinking))
answer_display.markdown(convert_latex_delimiters(accumulated_answer))
Once streaming completes, we remove the cursor and collapse the thinking expander by default. This gives users a clean final view with the option to expand and see the reasoning.
Saving to History
st.session_state.messages.append({
    "role": "assistant",
    "content": accumulated_answer,
    "thinking": accumulated_thinking or None
})
Finally, we save the complete response to our message history so it persists when the page reruns.
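The "accumulated_thinking or None" idiom normalizes an empty string to None, so the history-display check msg.get("thinking") is falsy either way and no empty expander is shown:

```python
accumulated_thinking = ""  # e.g. the model returned no thinking text

entry = {
    "role": "assistant",
    "content": "Hello!",
    # An empty string is stored as None rather than "".
    "thinking": accumulated_thinking or None,
}
print(entry["thinking"])  # → None
```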
Running the Application
Save the code as streamlit_example.py and run:
streamlit run streamlit_example.py
Your browser will open with the chatbot interface. Try asking it a math question like "What is 15 times 23?" and you'll see it show its reasoning process before giving the final answer.
Here was my result:

One annoying aspect of DeepSeek R1 is that I've seen it get stuck in a 'thinking loop'. I once asked it for the first 12 digits of pi, and it couldn't decide whether that included the '3.' or not (rather than just giving me both answers). It kept going back and forth for 10 minutes before I gave up.

Key Features
- Runs completely locally with no API calls
- Shows the model's reasoning process in expandable sections
- Streams responses in real-time for a better user experience
- Properly renders mathematical notation
- Maintains conversation history
Conclusion
With just 90 lines of Python, we've created a functional chatbot interface for DeepSeek R1 that showcases one of its most interesting features: transparent reasoning. The combination of Streamlit's simple API and Ollama's local model hosting makes it easy to experiment with AI models without worrying about API costs or privacy concerns.
The complete source code is available above. Feel free to modify and extend it for your own projects!