DSPy: Prompt your LLM Like It's Code
- By Bruce Nielson
- ML & AI Specialist
DSPy (pronounced "dee-s-pie") is an open-source Python framework that lets you replace prompt building with code. Sounds impossible, right? What it really does is cleverly turn Python code into prompts behind the scenes. You get to specify what you want using code. For example, you might define a Classify module with specific inputs and expected outputs. Behind the scenes, DSPy converts that into a prompt, sends it to your Large Language Model (LLM), and returns a result formatted according to your specifications. The end result feels like you wrote code instead of building natural language prompts.
Why DSPy Instead of Prompt Building?
Why might you want to do this? Well, first of all, maybe you don’t. I often like to have full control over everything, so when I don’t know what’s being sent to the LLM, I feel like a nervous public speaker pushed in front of a large auditorium without being told the topic.
However, think about this differently. Using DSPy forces you to treat your interactions with the LLM as if they were functions with defined inputs and outputs. It formalizes these interactions into Python functions that are independent of the underlying LLM. That alone makes it valuable—but the real power comes when you use DSPy to optimize those interactions.
Prompt optimization in DSPy works by running your program on a set of developer-provided training examples and collecting traces of input/output behavior. It then systematically proposes and evaluates new prompt instructions and few-shot examples (and can optionally fine-tune the model itself) to maximize your chosen metric. Yes, you heard that right! Instead of writing each prompt and relying on your gut (or ad hoc testing), DSPy will take training examples and try out different prompts until it finds the ones that perform best.
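DSPy's real optimizers are far more sophisticated, but the core loop can be sketched in plain Python: propose candidate instructions, score each one against the training examples with a metric, and keep the winner. Everything here is a hypothetical stand-in for illustration: mock_llm is a stub standing in for a real model call, and the candidate instructions and training set are invented.

```python
# Toy sketch of prompt search: score candidate instructions on training
# examples with a metric and keep the best one. Not DSPy's actual
# algorithm; mock_llm is a stub standing in for a real model call.

ANSWERS = {"2+2": "4", "3*3": "9"}

def mock_llm(instruction: str, question: str) -> str:
    # Stand-in for an LLM: it answers with a bare number only when the
    # instruction explicitly asks for one, otherwise it adds chatter.
    if "only the number" in instruction:
        return ANSWERS[question]
    return f"The answer is {ANSWERS[question]}."

candidates = [
    "Answer the question.",
    "Answer with only the number, no extra words.",
]
trainset = [("2+2", "4"), ("3*3", "9")]

def exact_match(prediction: str, gold: str) -> bool:
    # The metric we are maximizing: exact string match after stripping.
    return prediction.strip() == gold

def best_instruction(candidates, trainset):
    # Score every candidate instruction on the full training set and
    # return the one with the highest metric score.
    scored = [
        (sum(exact_match(mock_llm(instr, q), gold) for q, gold in trainset), instr)
        for instr in candidates
    ]
    return max(scored)[1]

print(best_instruction(candidates, trainset))
# → Answer with only the number, no extra words.
```

The real optimizers also bootstrap few-shot demonstrations from successful traces rather than only rewriting instructions, but the propose-score-select shape is the same.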
Even if you don’t use DSPy’s optimizers, there’s still value in the framework thanks to its pre-built modules. In this post, I’ll show you how to build a basic chain-of-thought model without having to write your own chain-of-thought prompts.
Getting Started with DSPy
First install DSPy:
pip install -U dspy
Now here is how to use the DSPy Chain of Thought module. "Chain of thought" is a prompting technique that tells the LLM to think through the problem step-by-step and show its reasoning before giving an answer, which improves how well it reasons. It was introduced by Google researchers in a famous 2022 paper called "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". This example is a modified version of the 'getting started' tutorial from the DSPy home page. I have modified it to use Google's Gemini as a model, which was basically just plug and play. I also added my standard get_secret function, as found here. It allows you to load a password out of a file of your choosing to avoid accidentally checking an API key or password into your code base.
import dspy
from general_utils import get_secret

def set_model():
    # Load Gemini secret and configure LM
    gemini_secret: str = get_secret(r'D:\Documents\Secrets\gemini_secret.txt')
    lm = dspy.LM("gemini/gemini-2.5-flash", api_key=gemini_secret)
    dspy.configure(lm=lm)
    return lm

def chain_of_thought():
    math = dspy.ChainOfThought("question -> answer: float")
    result = math(question="Two dice are tossed. What is the probability that the sum equals two?")
    print(result)

if __name__ == "__main__":
    set_model()
    print("Chain of Thought Example:")
    chain_of_thought()
You can find my full code here.
Understanding the ChainOfThought Signature
Here we're using the ChainOfThought module:
math = dspy.ChainOfThought("question -> answer: float")
This signature string ("question -> answer: float") is how DSPy describes the inputs and outputs of a module in a compact, human-readable way.
- The part before the arrow (question) names the input parameter the module expects.
- The part after the arrow (answer: float) names the output and declares its type (float in this case).
Why this matters:
- Prompt generation: DSPy uses that signature to automatically build the underlying prompt. Instead of you writing the prompt text, DSPy knows you expect a question and that the final answer should be a float, so it steers the model to produce a suitably formatted response.
- Parsing & validation: After the LLM replies, DSPy parses and casts the model output into the declared type. In the example above, the returned answer is converted to a Python float.
- Structured traces: For chain-of-thought modules in particular, DSPy usually returns both a reasoning trace (the step-by-step explanation the model produced) and the typed answer. That’s why your output looked like:
Chain of Thought Example:
Prediction(
reasoning='When two dice are tossed, each die can show a number from 1 to 6.\nThe total number of possible outcomes is 6 (for the first die) * 6 (for the second die) = 36.\nTo find the sum that equals two, we need to list the combinations:\nThe only combination that results in a sum of two is (1, 1).\nThere is only 1 favorable outcome.\nThe probability is the number of favorable outcomes divided by the total number of possible outcomes.\nProbability = 1/36.\n\nTo express this as a float: 1 / 36 = 0.027777777777777776\nRounding to a reasonable number of decimal places, or just providing the direct float value.',
answer=0.027777777777777776
)
Notice that the reasoning trace shows how the LLM reasoned its way (via the chain-of-thought technique) to the final answer.
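To make the signature idea concrete, here is a rough illustration (not DSPy's actual parser) of how a string like "question -> answer: float" could be split into named, typed input and output fields:

```python
# Rough illustration of parsing a DSPy-style signature string into
# input/output field names and types. Not DSPy's real implementation.

def parse_signature(sig: str):
    inputs_part, outputs_part = (s.strip() for s in sig.split("->"))

    def parse_fields(part: str):
        fields = {}
        for field in part.split(","):
            name, _, type_name = (s.strip() for s in field.partition(":"))
            fields[name] = type_name or "str"  # untyped fields default to str
        return fields

    return parse_fields(inputs_part), parse_fields(outputs_part)

inputs, outputs = parse_signature("question -> answer: float")
print(inputs)   # {'question': 'str'}
print(outputs)  # {'answer': 'float'}
```

Once the fields and types are known, the framework has everything it needs to both generate the prompt and cast the reply, which is exactly the round trip we saw in the output above.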
Conclusion
We have introduced DSPy and explained why you might want to treat your prompt building as code. We've installed it and used its out-of-the-box chain-of-thought module to show what you can do with minimal programming. Next time we'll look deeper at how DSPy works.