AI Agents

Build intelligent, autonomous AI agents that can interact with external systems, make decisions, and orchestrate complex workflows using durable execution patterns and pluggable infrastructure.

From Development to Production

While Dapr Agents provides a powerful framework for building intelligent agents, Catalyst takes you from code to production with an enterprise-grade platform for running agents, giving you full observability and security out of the box. Learn more about Running Agents on Catalyst.

Durable Agents

The Dapr Agents framework enables durable agents that persist their execution state across restarts and failures. Unlike traditional agents that lose context when they crash, durable agents can resume exactly where they left off.

[Diagram: durable agent overview]

Below is an example of an order processing agent that uses tools to process orders and makes decisions on approval workflows:

from dapr_agents import DurableAgent, DaprChatClient
# Import path for memory may vary by dapr_agents version
from dapr_agents.memory import ConversationDaprStateMemory

async def main():
    order_processor = DurableAgent(
        name="OrderProcessingAgent",
        role="Order Processing Specialist",
        goal="Process customer orders with appropriate approval workflows",
        instructions=[
            "You are an order processing specialist that handles customer orders.",
            "For orders under $1000, automatically approve them.",
            "For orders $1000 or more, escalate them for manual approval.",
            "Use the process_order tool to handle order processing.",
            "Provide clear status updates to customers."
        ],
        tools=[process_order],

        # Conversation history persistence
        memory=ConversationDaprStateMemory(
            store_name="statestore",
            session_id="customer-session-123"
        ),

        # LLM provider via Dapr Conversation API
        llm=DaprChatClient(
            component_name="openai-gpt-4o",
        ),

        # Execution state for durability and recovery
        state_store_name="statestore",
        state_key="execution-orders",
    )

Key capabilities:

  • Durable executions - Every step in the agent's reasoning and execution is automatically saved, allowing recovery from failures without losing progress or repeating expensive LLM calls
  • Built-in resiliency - Automatically retry failed operations and recover from transient failures in external systems or LLM APIs

Why Durability Matters

Traditional agents lose all context when they crash. If an agent fails after 10 LLM calls and 5 API interactions, you have to start over—repeating expensive operations and potentially getting different results. Durable agents checkpoint their state automatically, so they resume exactly where they left off.
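The checkpointing idea can be illustrated with a short sketch. This is not the dapr_agents implementation, just the underlying pattern: each step's result is persisted as it completes, so a replay after a crash reuses saved results instead of repeating expensive calls.

```python
# Illustrative sketch of checkpointed execution (not the actual
# dapr_agents implementation): step results are persisted as they
# complete, so a replay after a crash skips finished steps instead
# of repeating expensive LLM or API calls.
checkpoints: dict = {}  # in production this would live in a Dapr state store

def run_step(step_id: str, fn, *args):
    if step_id in checkpoints:
        return checkpoints[step_id]   # already completed: reuse saved result
    result = fn(*args)                # do the expensive work exactly once
    checkpoints[step_id] = result     # checkpoint before moving on
    return result

llm_calls = []

def fake_llm(prompt: str) -> str:
    llm_calls.append(prompt)
    return f"response:{prompt}"

first = run_step("classify", fake_llm, "classify the order")
# ... simulated crash and restart: the same step is replayed ...
second = run_step("classify", fake_llm, "classify the order")
```

On replay, the second `run_step` call returns the checkpointed result without invoking the LLM again, which is why a durable agent resumes where it left off rather than starting over.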

Pluggable Memory and Context

AI agents need to retain context across interactions to provide coherent and adaptive responses. Dapr Agents provides a pluggable memory architecture that allows agents to store conversation history in any Dapr state store.

memory = ConversationDaprStateMemory(
    store_name="statestore",  # Maps to a configured Dapr state store component
    session_id="customer-session-123"
)

Rather than being limited to volatile in-memory storage, agents can use any of the 28+ Dapr state store components as their persistent memory implementation, from Redis and PostgreSQL to AWS DynamoDB and Azure Cosmos DB.
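The `store_name` above refers to a Dapr component definition. As an illustration, a standard Redis-backed state store component named `statestore` looks like this (host and credentials are placeholders):

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore        # referenced by store_name above
spec:
  type: state.redis
  version: v1
  metadata:
  - name: redisHost
    value: localhost:6379
  - name: redisPassword
    value: ""
```

Swapping to another backend means changing only `spec.type` and its metadata; the agent code is unchanged.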

Swappable LLM Providers

The Dapr Conversation API provides an abstraction layer for LLMs, enabling agents to switch between model providers without code changes. Conversation components handle the integration with different LLM providers and add capabilities like response caching, PII protection, and resilience.

from dapr_agents import DaprChatClient

llm = DaprChatClient(
    component_name="openai-gpt-4o",  # Maps to a configured Dapr conversation component
)

Key benefits:

  • Provider agnostic - Swap between OpenAI, Azure OpenAI, AWS Bedrock, Google Vertex AI, and more without changing agent code
  • Prompt caching - Reduce latency and costs for repeated calls
  • Security and PII obfuscation - Protect sensitive data automatically
  • Built-in resiliency - Retries, timeouts, and circuit breakers
  • Observability - OpenTelemetry tracing and Prometheus metrics out of the box
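For illustration, the `openai-gpt-4o` component referenced above could be defined as follows. The metadata keys (`key`, `model`, `cacheTTL`) follow the Dapr `conversation.openai` component schema; verify the exact fields against the docs for your Dapr version, and prefer a secret store reference over an inline API key:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: openai-gpt-4o     # referenced by component_name above
spec:
  type: conversation.openai
  version: v1
  metadata:
  - name: key
    value: "<OPENAI_API_KEY>"
  - name: model
    value: gpt-4o
  - name: cacheTTL
    value: 10m
```

Switching providers is then a component change (e.g. a different `spec.type`), not a code change.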

Multi-Agent Workflows

For complex business processes, you can orchestrate multiple agents within deterministic workflows. Dapr Agents integrates seamlessly with Dapr Workflow, letting you combine the intelligence of LLMs with the reliability and predictability of workflow orchestration.

[Diagram: multi-agent workflow]

This is ideal for scenarios like customer support triage, multi-stage data processing, or any flow where you need conditional logic, parallelization, or human-in-the-loop between AI reasoning steps.

Here's an example of a customer support workflow that routes through two specialized agents based on business logic:

# --------- AGENTS ---------
triage_agent = Agent(
    name="Triage Agent",
    role="Customer Support Triage Assistant",
    goal="Assess entitlement and urgency.",
    instructions=[
        "Determine whether the customer has entitlement.",
        "Classify urgency as URGENT or NORMAL.",
        "Return JSON with: entitlement, urgency.",
    ]
)

expert_agent = Agent(
    name="Expert Agent",
    role="Technical Troubleshooting Specialist",
    goal="Diagnose issue and propose a resolution.",
    instructions=[
        "Use the provided customer context and issue description.",
        "Summarize the resolution in a customer-friendly message.",
        "Return JSON with: resolution, customer_message.",
    ]
)

# --------- WORKFLOW ---------
@workflow(name="customer_support_workflow")
def customer_support_workflow(ctx: DaprWorkflowContext, input_data: dict):
    triage = yield ctx.call_activity(triage_activity, input=input_data)
    if not triage.get("entitlement"):
        return {"status": "rejected", "reason": "No entitlement"}
    expert = yield ctx.call_activity(expert_activity, input=input_data)
    return {"status": "completed", "result": expert}

@activity(name="triage_activity")
@agent_activity(agent=triage_agent)
def triage_activity(ctx) -> dict:
    """Customer: {name}. Issue: {issue}."""
    pass

@activity(name="expert_activity")
@agent_activity(agent=expert_agent)
def expert_activity(ctx) -> dict:
    """Customer: {name}. Issue: {issue}."""
    pass
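The workflow branches on the triage agent's JSON output. A hypothetical helper (not part of dapr_agents) shows how that contract could be validated before the entitlement check, so malformed model output fails fast instead of silently routing the wrong way:

```python
import json

# Hypothetical helper (not part of dapr_agents): validate the JSON
# contract the Triage Agent is instructed to return, so the workflow's
# entitlement check operates on well-formed data.
def parse_triage(raw: str) -> dict:
    data = json.loads(raw)
    if not isinstance(data.get("entitlement"), bool):
        raise ValueError("triage output missing boolean 'entitlement'")
    if data.get("urgency") not in ("URGENT", "NORMAL"):
        raise ValueError("triage output missing 'urgency' classification")
    return data
```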

This architectural pattern is ideal for business-critical applications where you need the intelligence of LLMs combined with the reliability and observability of deterministic workflows.

Common Orchestration Patterns

Common scenarios for orchestrating AI agents within workflows include sequential agent chaining, fan-out/fan-in parallelization, and human-in-the-loop approval between reasoning steps.

Running AI Agents with Catalyst

Catalyst provides an enterprise-grade platform for running agents in production, giving you full observability and security out of the box.

Key capabilities:

  • Complete observability – Gain full visibility into each step of the agent's execution, including timing information, inputs, outputs, and the decision-making process at every stage
  • Agent identity – Catalyst assigns each agent a unique identity backed by an X.509 certificate, which controls its access rights to resources and establishes a security boundary around what it can accomplish
  • Secure app-to-app (or agent-to-agent) communication – Short-lived mTLS certificates ensure encrypted traffic and automatic mutual authentication between applications, with continuous rotation reducing exposure from compromised keys
  • Authorization & access control – Identity-based policies define which agents may call one another, what operations each can perform, and what infrastructure each can access, preventing unauthorized access
  • Cross-cloud identity federation – Integrates with AWS, Azure and GCP identity providers so agents can authenticate to cloud services without storing or distributing long-lived secrets
  • Centralized credentials – When required, API keys (such as LLM credentials) are stored and managed at the platform layer rather than in application code, preventing credential sprawl and enabling secure, unified access to multiple LLM providers
  • Auditing & traceability – All operations are tied to workload identity, enabling clear end-to-end audit trails for debugging, compliance, and forensic analysis

Getting Started