Build durable AI agents with Pydantic AI and Temporal

AUTHORS
David Montague
DATE
Nov 06, 2025
DURATION
8 MIN

At Pydantic, we pride ourselves on building production-grade developer tooling that Python developers love. Pydantic itself became the foundation for countless libraries, including the OpenAI and Anthropic SDKs, because it brought type safety, speed, and reliability to everyday development.

We built Pydantic Logfire, our developer-first observability platform, because we believed observability tools could be both production-ready and pleasant to use. When you make essential infrastructure more ergonomic, developers actually use it. In AI development, observability is borderline essential for understanding LLM behavior, and working with AI developers using Logfire exposed us to how they were building agents. But when we evaluated the available agent frameworks ourselves, none met our bar for production use.

So we built Pydantic AI, an agent framework focused on production-grade AI applications. That means:

  • Type safety everywhere – structured inputs and outputs validated at runtime
  • Built-in observability – semantic convention-compliant OpenTelemetry instrumentation
  • 100% test coverage – boring but essential for reliability
  • Unit-tested documentation – examples that actually work

But we also know that some problems are better solved by specialized tools than by building everything ourselves. Durable execution is one of those problems. That's why we've built native Temporal support in Pydantic AI, combining our type-safe agent framework with Temporal's durable execution to give you agents that:

  • Survive API failures, exceptions, app restarts, and deploys
  • Handle long-running, async, and human-in-the-loop interactions
  • Preserve progress and state with replay-based fault tolerance

Why Durable Execution matters for production AI

Building reliable AI agents means handling the messy reality of production:

Your agent is halfway through a multi-step research task when the OpenAI API times out. With Temporal, the failed call is retried automatically and your agent picks up exactly where it left off, even after a system crash. No lost progress, no manual checkpointing.

Your agent needs human approval before executing an action. The workflow can sleep for hours or days for user input, then resume exactly where it stopped. The entire conversation history and agent state is preserved through Temporal's deterministic replay without any manual synchronization with a database.

Your deployment restarts while an agent is coordinating multiple tool calls. Temporal restores the agent's exact state and continues from where it left off, without re-running completed tasks.

This is what Temporal's durable execution model gives you. Your agent coordination logic runs deterministically and can be replayed. The non-deterministic work (LLM calls, tool calls, external API calls) runs once as Activities, and the results are recorded in workflow history.
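
Concretely, in Temporal's Python SDK that split looks something like the sketch below. The names are hypothetical (they're not taken from the example later in this post): the Activity does the external call, and the Workflow only coordinates.

from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def fetch_sources(topic: str) -> list[str]:
    # Non-deterministic work (an external API or LLM call) belongs in an
    # Activity: it runs once, and its result is recorded in workflow history.
    return [f"https://example.com/search?q={topic}"]  # stand-in for a real call


@workflow.defn
class ResearchWorkflow:
    @workflow.run
    async def run(self, topic: str) -> list[str]:
        # The coordination logic is deterministic and replayable. If the worker
        # crashes after the Activity completes, replay reads the recorded result
        # from history instead of calling the API again.
        return await workflow.execute_activity(
            fetch_sources,
            topic,
            start_to_close_timeout=timedelta(minutes=2),
        )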

Do I really need Temporal?

If you’re new to Temporal, you might think: “Why introduce a whole framework? I could just save state to the database, add some exception handling, make sure everything is idempotent…” (Famous last words.)

And yes, in principle, you could do all that for each feature you add — just like you could rewrite your app in C to make it faster. But now you’re spending engineering time on fragile retry logic, tangled reliability code, and brittle assumptions about failure modes.

Temporal's model handles all the retry logic, state persistence, and fault tolerance for you, allowing you to focus on your agent's core business logic.

In fact, Temporal enables patterns that usually feel wrong — like “sleep for a week” or “pause until a user clicks approve” — because it virtualizes execution. Your workflow looks like it’s blocking, but it’s not. When the worker resumes, it replays from history deterministically.
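
As a rough sketch of what that "pause until a user clicks approve" pattern looks like in the Temporal Python SDK (hypothetical names, not taken from the agent example below):

import asyncio
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class ApprovalWorkflow:
    def __init__(self) -> None:
        self.approved = False

    @workflow.signal
    def approve(self) -> None:
        # Sent from outside the workflow, e.g. when a user clicks "approve".
        self.approved = True

    @workflow.run
    async def run(self) -> bool:
        # This looks like blocking code, but it's a durable wait: the worker
        # holds no resources while waiting, and after a restart the workflow
        # is replayed from history and resumes right here.
        try:
            # Wait up to a week for the approve signal to arrive.
            await workflow.wait_condition(lambda: self.approved, timeout=timedelta(days=7))
        except asyncio.TimeoutError:
            pass  # nobody approved within a week
        return self.approved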

Getting started

Here's a regular Pydantic AI agent:

from pydantic_ai import Agent

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful research assistant"
)

# You can now make use of the agent inside an `async def` function via:
...
result = await agent.run("What are the best Python web frameworks?")
...

Nice and simple. The only catch is that agent calls aren’t deterministic — they depend on external APIs — so Temporal requires them to run as activities. The Pydantic AI Temporal integration takes care of that for you, wrapping all the I/O so you can keep using your agent normally inside workflows.

Here's that same agent, now ready for durable execution running on Temporal:

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import TemporalAgent

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful research assistant",
    # Give the agent a unique name; TemporalAgent uses it to identify
    # the agent's activities
    name="research_assistant",
)

# Wrap it for Temporal execution
temporal_agent = TemporalAgent(agent)

# Now it survives failures and can be part of long-running workflows
# Inside a temporal workflow, you can just call:
...
result = await temporal_agent.run("What are the best Python web frameworks?")
...

That's it. One wrapper. When you run an agent with TemporalAgent, all the non-deterministic work (model calls, tool executions, external API calls) gets automatically offloaded to Temporal Activities with built-in retry policies. Your coordination logic runs deterministically as a Workflow. (For more details, see the Pydantic AI Temporal documentation.)
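
To actually execute that call you still need the usual Temporal plumbing: a workflow that invokes the agent, a worker, and a client that starts the workflow. Here's a rough sketch using the core Temporal SDK, continuing from the temporal_agent above. The workflow name, task queue, and IDs are made up, and the worker additionally needs the agent's Temporal activities (and Pydantic-aware serialization) registered, which the Pydantic AI Temporal documentation covers.

import asyncio

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker


@workflow.defn
class ResearchWorkflow:
    @workflow.run
    async def run(self, question: str) -> str:
        # From the workflow's point of view this is deterministic; the agent's
        # model and tool calls run as Activities under the hood.
        result = await temporal_agent.run(question)
        return result.output


async def main() -> None:
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="research-agent",
        workflows=[ResearchWorkflow],
        # ...plus the agent's Temporal activities; see the Pydantic AI docs.
    )
    async with worker:
        answer = await client.execute_workflow(
            ResearchWorkflow.run,
            "What are the best Python web frameworks?",
            id="research-1",
            task_queue="research-agent",
        )
        print(answer)


if __name__ == "__main__":
    asyncio.run(main())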

Want to see this in action? Join our live coding session, where we'll build a production AI agent system with the Pydantic team.

A real-world example: Multi-agent systems

Let's look at something more interesting than a single agent. As a more fully-fledged demonstration of how to use Pydantic AI and Temporal together, I built a Slack bot that helps you decide what to order for dinner. It's deliberately low-stakes, but it demonstrates many patterns you'd use in production.

The bot uses a two-agent system:

  • Dispatcher agent (fast, cheap): Figures out user intent and gathers information
  • Research agent (thorough, expensive): Searches for restaurants and creates recommendations

Here's what it looks like to have a Temporal Workflow orchestrate both agents:

from temporalio import workflow
from pydantic_ai.durable_exec.temporal import TemporalAgent

temporal_dispatcher = TemporalAgent(dispatch_agent)
temporal_researcher = TemporalAgent(research_agent)

@workflow.defn
class DinnerBotWorkflow:
    @workflow.run
    async def run(self, user_message: str):
        # Run dispatcher to determine intent
        dispatch_result = await temporal_dispatcher.run(user_message)

        if isinstance(dispatch_result.output, NoResponse):
            return None

        if isinstance(dispatch_result.output, AskForMoreInfo):
            return dispatch_result.output.response

        # We have enough info, run research agent
        research_result = await temporal_researcher.run(
            dispatch_result.output
        )
        return research_result.output.recommendations

(Note: this code snippet is purposely not self-contained — it’s simplified from the SlackThreadWorkflow in the full example repository, but still demonstrates the core pattern of using Pydantic AI agents within a Temporal Workflow.)

Each agent call is durable. If OpenAI times out during the dispatcher call, Temporal retries it. If your worker crashes after the dispatcher succeeds but before the researcher starts, Temporal replays the workflow, but it doesn't re-run the dispatcher. It uses the recorded result from history and picks up at the research agent call.

The full example on GitHub includes additional production patterns: maintaining conversation state per Slack thread, handling asynchronous messages with Signals, coordinating Slack API calls as Activities with retries, managing concurrent conversations, and implementing human-in-the-loop approval flows where agents wait for user confirmation before executing actions. This is what production AI systems actually look like, and Temporal handles the orchestration complexity so you can focus on your agent logic.
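
As one concrete illustration of the "Slack API calls as Activities with retries" part, a hypothetical activity in that style might look like this (simplified, not the actual code from the repository):

from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy


@activity.defn
async def post_slack_message(channel: str, thread_ts: str, text: str) -> None:
    # The real implementation would call the Slack Web API here.
    ...


@workflow.defn
class ReplyWorkflow:
    @workflow.run
    async def run(self, channel: str, thread_ts: str, text: str) -> None:
        # The Slack call runs as an Activity with a timeout and retry policy,
        # so a transient API failure is retried without re-running anything
        # else in the workflow.
        await workflow.execute_activity(
            post_slack_message,
            args=[channel, thread_ts, text],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )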

What you can build with this

The patterns here apply to way more than dinner recommendations:

Customer support agents that handle long-running issues, coordinate with human agents, and maintain context across days of conversation.

Research assistants that gather information from multiple sources, wait for API rate limits to reset, and produce structured reports.

Approval workflows where an AI agent does preliminary work, waits for human approval, then executes actions, all with full audit history.

Multi-step pipelines that coordinate multiple specialized agents, handle partial failures gracefully, and retry only what's needed.

Any time you need more than a basic request-response interaction, Temporal's workflow model shines. And with Pydantic AI's integration, you get durable execution plus the production-grade features that set Pydantic AI apart: type safety throughout, built-in OpenTelemetry observability (works great with Pydantic Logfire), structured outputs with validation, first-class MCP support, and more.

Try it yourself: Join the live coding session

I'm hosting a live coding session on Nov 20th with other members of the Pydantic and Temporal teams where we'll build a real AI agent system from scratch using Pydantic AI and Temporal. We'll go deeper than this blog post: handling failures, debugging workflows, scaling patterns, and production considerations.

Register for the live coding session here.

Come with questions. Bring your AI agent horror stories. Let's build something that actually works in production.

Get started now

Want to dive in before the session?

Install Pydantic AI with Temporal support:

pip install "pydantic-ai[temporal]"

Start a local Temporal server:

brew install temporal
temporal server start-dev

Check out the docs and examples: Pydantic AI Temporal documentation

Check out the Temporal Python getting started docs: Python Temporal Getting Started

Explore the full Slack bot example: GitHub: pydantic-ai-temporal-example

Join the conversation:


We're constantly working to make Pydantic AI better, and your feedback is invaluable. Whether you're building production systems or just experimenting, we'd love to hear about your experience.

Join our public Slack, open issues or discussions on GitHub, or come to the live coding session with questions.

See you at the live coding session!
