Durable Functions Patterns Guide
Build stateful workflows with Durable Functions: chaining, fan-out, human interaction, and eternal orchestrations.
Prerequisites
- Experience with Azure Functions basics
- Understanding of async programming and workflow patterns
What Are Durable Functions?
Durable Functions is an extension of Azure Functions that lets you write stateful workflows in a serverless environment. Instead of managing queues, state stores, and retry logic yourself, Durable Functions handles all of this through an orchestrator function that coordinates the execution of activity functions. The orchestrator's state is automatically checkpointed and replayed, so workflows can run for seconds or months without losing progress.
Durable Functions solves the problem of complex multi-step processes in serverless: order fulfillment pipelines, approval workflows, data processing with fan-out/fan-in, human interaction patterns, and long-running monitoring scenarios. Without Durable Functions, you would need to manually manage state in a database, handle retries with dead-letter queues, and coordinate between multiple functions using custom logic.
This guide covers the core patterns (chaining, fan-out/fan-in, async HTTP APIs, monitoring, human interaction, eternal orchestrations), implementation in C# and Python, error handling, and production best practices.
Runtime and Languages
Durable Functions supports C#, JavaScript/TypeScript, Python, Java, and PowerShell. The orchestration patterns are the same across languages, but the SDK syntax differs. This guide shows examples in both C# and Python. The underlying durable task framework uses Azure Storage (queues, tables, and blobs) or the newer Netherite and MSSQL storage providers for state management.
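For example, the storage provider and task hub are configured in host.json. The sketch below is illustrative, not a complete configuration: the hub name is a placeholder, and the Netherite provider additionally requires an Event Hubs connection setting that is omitted here; check the provider documentation for the full schema.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyTaskHub",
      "storageProvider": {
        "type": "Netherite"
      }
    }
  }
}
```

Leaving out the storageProvider section entirely falls back to the default Azure Storage provider.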
Pattern 1: Function Chaining
Function chaining executes a sequence of activity functions in a specific order, where the output of one function becomes the input to the next. This is the simplest Durable Functions pattern and replaces complex queue-based choreography.
```python
import azure.functions as func
import azure.durable_functions as df

app = func.FunctionApp()
bp = df.Blueprint()

# Orchestrator function
@bp.orchestration_trigger(context_name="context")
def order_processing_orchestrator(context: df.DurableOrchestrationContext):
    """Process an order through a series of steps."""
    order = context.get_input()

    # Step 1: Validate the order
    validated_order = yield context.call_activity("validate_order", order)

    # Step 2: Process payment
    payment_result = yield context.call_activity("process_payment", validated_order)

    # Step 3: Reserve inventory
    inventory_result = yield context.call_activity("reserve_inventory", {
        "order": validated_order,
        "payment": payment_result
    })

    # Step 4: Send confirmation
    yield context.call_activity("send_confirmation", {
        "order": validated_order,
        "payment": payment_result,
        "inventory": inventory_result
    })

    return {"status": "completed", "orderId": order["orderId"]}

# Activity functions
@bp.activity_trigger(input_name="order")
def validate_order(order: dict) -> dict:
    # Validate order items, customer, shipping address
    if not order.get("items"):
        raise ValueError("Order must contain items")
    order["validated"] = True
    return order

@bp.activity_trigger(input_name="data")
def process_payment(data: dict) -> dict:
    # Call payment gateway
    return {"transactionId": "txn-abc123", "status": "charged"}

@bp.activity_trigger(input_name="data")
def reserve_inventory(data: dict) -> dict:
    # Reserve inventory for each item
    return {"reserved": True, "warehouse": "us-east-1"}

@bp.activity_trigger(input_name="data")
def send_confirmation(data: dict) -> None:
    # Send email/SMS confirmation
    pass

# HTTP starter
@bp.route(route="start-order/{orderId}")
@bp.durable_client_input(client_name="client")
async def start_order(req: func.HttpRequest, client) -> func.HttpResponse:
    order_data = req.get_json()
    instance_id = await client.start_new(
        "order_processing_orchestrator", None, order_data
    )
    return client.create_check_status_response(req, instance_id)

app.register_functions(bp)
```

Pattern 2: Fan-Out / Fan-In
Fan-out/fan-in executes multiple activity functions in parallel, waits for all of them to complete, and then aggregates the results. This pattern is ideal for batch processing, parallel API calls, and map-reduce workloads.
```python
@bp.orchestration_trigger(context_name="context")
def parallel_processing_orchestrator(context: df.DurableOrchestrationContext):
    """Process multiple items in parallel and aggregate results."""
    items = context.get_input()  # List of items to process

    # Fan-out: launch parallel tasks
    parallel_tasks = []
    for item in items:
        task = context.call_activity("process_item", item)
        parallel_tasks.append(task)

    # Fan-in: wait for all tasks to complete
    results = yield context.task_all(parallel_tasks)

    # Aggregate results
    summary = yield context.call_activity("aggregate_results", results)
    return summary

@bp.activity_trigger(input_name="item")
def process_item(item: dict) -> dict:
    """Process a single item (runs in parallel with other instances)."""
    # Simulate processing (image resize, data transformation, etc.)
    return {
        "itemId": item["id"],
        "status": "processed",
        "size": len(str(item))
    }

@bp.activity_trigger(input_name="results")
def aggregate_results(results: list) -> dict:
    """Aggregate results from parallel processing."""
    return {
        "totalProcessed": len(results),
        "successCount": sum(1 for r in results if r["status"] == "processed"),
        "totalSize": sum(r["size"] for r in results)
    }
```

Use task_any for Racing
Use context.task_any() instead of task_all() to wait for the first task to complete (racing pattern). This is useful for implementing timeouts, redundant API calls where you take the fastest response, or competitive processing where you want the first result.
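As a minimal sketch of the racing shape (the activity names fetch_primary and fetch_backup are hypothetical, and the decorator is omitted so the generator body is visible on its own):

```python
# Hypothetical orchestrator body for the racing pattern. In a real app this
# would be decorated with @bp.orchestration_trigger, and `context` would be
# the df.DurableOrchestrationContext supplied by the runtime.
def fastest_response_orchestrator(context):
    # Start both calls without yielding, so they run concurrently
    primary = context.call_activity("fetch_primary", None)
    backup = context.call_activity("fetch_backup", None)

    # task_any resumes as soon as the first task completes
    winner = yield context.task_any([primary, backup])

    # Use the fastest result; the slower task's result is simply ignored
    return winner.result
```

The losing task still runs to completion in the background; racing trades some wasted work for lower tail latency.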
Pattern 3: Async HTTP APIs
The async HTTP API pattern provides a standard way to start long-running operations via HTTP and poll for their status. Durable Functions provides built-in HTTP endpoints for checking status, sending events to running orchestrations, and terminating instances.
```bash
# Start an orchestration
curl -X POST "https://myfuncapp.azurewebsites.net/api/start-order/ord-123" \
  -H "Content-Type: application/json" \
  -d '{"orderId": "ord-123", "items": [{"sku": "WIDGET-1", "qty": 5}]}'

# Response includes status query URLs:
# {
#   "id": "abc123",
#   "statusQueryGetUri": "https://myfuncapp.azurewebsites.net/runtime/webhooks/durabletask/instances/abc123",
#   "sendEventPostUri": "https://myfuncapp.azurewebsites.net/runtime/webhooks/durabletask/instances/abc123/raiseEvent/{eventName}",
#   "terminatePostUri": "https://myfuncapp.azurewebsites.net/runtime/webhooks/durabletask/instances/abc123/terminate?reason={text}"
# }

# Poll for status
curl "https://myfuncapp.azurewebsites.net/runtime/webhooks/durabletask/instances/abc123"

# Response when running:
# {"runtimeStatus": "Running", "input": {...}, "output": null}

# Response when completed:
# {"runtimeStatus": "Completed", "input": {...}, "output": {"status": "completed"}}
```

Pattern 4: Human Interaction
The human interaction pattern implements approval workflows where an orchestration pauses and waits for a human to take action (approve, reject, provide input). The orchestrator uses wait_for_external_event to pause execution until an external event is raised, with an optional timeout.
```python
import datetime

@bp.orchestration_trigger(context_name="context")
def approval_workflow(context: df.DurableOrchestrationContext):
    """Approval workflow with timeout and escalation."""
    request = context.get_input()

    # Send approval request
    yield context.call_activity("send_approval_request", {
        "approver": request["manager"],
        "details": request["details"],
        "instanceId": context.instance_id
    })

    # Wait for approval with 72-hour timeout
    timeout = context.current_utc_datetime + datetime.timedelta(hours=72)
    approval_event = context.wait_for_external_event("ApprovalResponse")
    timeout_event = context.create_timer(timeout)

    winner = yield context.task_any([approval_event, timeout_event])
    if winner == approval_event:
        # Cancel the timer so the instance doesn't stay active until it fires
        timeout_event.cancel()
        approval = approval_event.result
        if approval["decision"] == "approved":
            yield context.call_activity("process_approved_request", request)
            return {"status": "approved", "approver": approval["approvedBy"]}
        else:
            yield context.call_activity("notify_rejection", request)
            return {"status": "rejected", "reason": approval.get("reason")}
    else:
        # Timeout: escalate to VP
        yield context.call_activity("escalate_to_vp", request)
        return {"status": "escalated", "reason": "timeout"}

# Raise an event from outside (e.g., from an approval API)
@bp.route(route="approve/{instanceId}")
@bp.durable_client_input(client_name="client")
async def approve_request(req: func.HttpRequest, client) -> func.HttpResponse:
    instance_id = req.route_params["instanceId"]
    body = req.get_json()
    await client.raise_event(
        instance_id,
        "ApprovalResponse",
        {"decision": body["decision"], "approvedBy": body["approvedBy"]}
    )
    return func.HttpResponse("Event raised", status_code=202)
```

Pattern 5: Eternal Orchestrations
Eternal orchestrations are long-running (potentially infinite) workflows that periodically perform work. Unlike a timer-triggered function, eternal orchestrations maintain state between iterations, making them suitable for monitoring, polling external systems, and periodic data synchronization.
```python
@bp.orchestration_trigger(context_name="context")
def monitoring_orchestrator(context: df.DurableOrchestrationContext):
    """Monitor a resource and alert on anomalies (runs forever)."""
    config = context.get_input()
    endpoint = config["endpoint"]
    threshold = config.get("threshold", 95)

    # Check the resource
    status = yield context.call_activity("check_health", endpoint)
    if status["cpu_percent"] > threshold:
        yield context.call_activity("send_alert", {
            "endpoint": endpoint,
            "metric": "cpu",
            "value": status["cpu_percent"],
            "threshold": threshold
        })

    # Wait 5 minutes before next check
    next_check = context.current_utc_datetime + datetime.timedelta(minutes=5)
    yield context.create_timer(next_check)

    # Continue as new (restart orchestration with fresh history)
    # This prevents history from growing unbounded
    context.continue_as_new(config)
```

Continue As New Is Essential
Always use continue_as_new in eternal orchestrations. Without it, the orchestration history grows with every iteration, eventually causing performance degradation and memory issues. continue_as_new resets the history while preserving the orchestration instance ID.
Error Handling and Retries
Durable Functions provides built-in retry policies for activity functions. You can configure the number of retries, initial retry interval, backoff coefficient, and maximum retry interval. Failed activities throw exceptions in the orchestrator, which you can catch and handle with custom logic.
```python
from azure.durable_functions import RetryOptions

@bp.orchestration_trigger(context_name="context")
def resilient_orchestrator(context: df.DurableOrchestrationContext):
    """Orchestrator with retry policies and error handling."""
    order = context.get_input()

    # Configure retry policy
    retry_options = RetryOptions(
        first_retry_interval_in_milliseconds=5000,  # 5 seconds
        max_number_of_attempts=3,
        backoff_coefficient=2.0,  # Exponential backoff
        max_retry_interval_in_milliseconds=60000,  # Max 60 seconds
    )

    try:
        # Call with retry policy
        result = yield context.call_activity_with_retry(
            "call_external_api",
            retry_options,
            order
        )
    except Exception as e:
        # All retries failed - run compensation logic
        yield context.call_activity("compensate_order", {
            "order": order,
            "error": str(e)
        })
        return {"status": "failed", "error": str(e)}

    return {"status": "success", "result": result}
```

Terraform Deployment
```hcl
resource "azurerm_linux_function_app" "durable" {
  name                       = "myapp-durable-func"
  resource_group_name        = azurerm_resource_group.main.name
  location                   = azurerm_resource_group.main.location
  storage_account_name       = azurerm_storage_account.func.name
  storage_account_access_key = azurerm_storage_account.func.primary_access_key
  service_plan_id            = azurerm_service_plan.main.id

  site_config {
    application_stack {
      python_version = "3.11"
    }
  }

  app_settings = {
    "AzureWebJobsFeatureFlags" = "EnableWorkerIndexing"
    "FUNCTIONS_WORKER_RUNTIME" = "python"
    "AzureWebJobsStorage"      = azurerm_storage_account.func.primary_connection_string
    "WEBSITE_RUN_FROM_PACKAGE" = "1"
  }
}

resource "azurerm_service_plan" "main" {
  name                = "myapp-plan"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  os_type             = "Linux"
  sku_name            = "EP1" # Elastic Premium for Durable Functions
}
```

Production Best Practices
| Practice | Recommendation | Why |
|---|---|---|
| Hosting plan | Elastic Premium (EP1+) or Dedicated | Consumption plan has cold start and timeout limits |
| Orchestrator rules | No I/O, non-deterministic code, or blocking calls | Orchestrators replay; side effects cause bugs |
| Instance management | Purge completed instances regularly | Storage costs grow with instance history |
| Versioning | Use side-by-side versioning for breaking changes | Running instances cannot handle schema changes |
| Monitoring | Use Durable Functions Monitor extension | Visibility into orchestration state and history |
| Testing | Unit test orchestrators with mocked context | Verify workflow logic without Azure dependencies |
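The testing row deserves a concrete sketch. Python orchestrators are generators, so a unit test can drive one by hand with a stubbed context: each yield hands an activity call to the "runtime", and send() injects the result you want to simulate. The orchestrator and activity names below are illustrative, not taken from the samples above.

```python
from unittest.mock import MagicMock

# Minimal chaining orchestrator under test (a stand-in; in production it
# would be decorated with @bp.orchestration_trigger).
def greeting_orchestrator(context):
    name = yield context.call_activity("get_name", None)
    message = yield context.call_activity("say_hello", name)
    return message

def test_greeting_orchestrator():
    context = MagicMock()
    gen = greeting_orchestrator(context)

    gen.send(None)             # run to the first yield (get_name)
    gen.send("Ada")            # pretend get_name returned "Ada"
    try:
        gen.send("Hello Ada")  # pretend say_hello returned "Hello Ada"
    except StopIteration as stop:
        assert stop.value == "Hello Ada"
        return
    raise AssertionError("orchestrator should have completed")

test_greeting_orchestrator()
```

This verifies the workflow logic (ordering, data flow, return value) with no Azure dependencies at all.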
```bash
# Purge completed orchestration instances (older than 30 days)
az functionapp durable purge-history \
  --app-name myapp-durable-func \
  --resource-group myapp-rg \
  --created-before "2026-02-14T00:00:00Z" \
  --runtime-status Completed Terminated Failed

# Query orchestration instances
az functionapp durable get-instances \
  --app-name myapp-durable-func \
  --resource-group myapp-rg \
  --runtime-status Running
```

Orchestrator Code Constraints
Orchestrator functions must be deterministic because they replay from history. Never use datetime.now() (use context.current_utc_datetime instead), random numbers, environment variables that can change between replays, or direct I/O (HTTP calls, database queries). All side effects must happen in activity functions, not the orchestrator.
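To see why, here is a toy illustration in plain Python (not the Durable SDK): replay re-executes the orchestrator body from the top, so every branching decision must come out the same on every pass.

```python
import datetime

# Toy model of replay: the orchestrator body is re-executed from the top
# on every checkpoint, so branching must depend only on replayed values.

def bad_decision():
    # datetime.now() returns a different value on each replay, so the
    # branch taken during replay can differ from the original execution.
    return "escalate" if datetime.datetime.now().microsecond % 2 else "wait"

def good_decision(context_time):
    # context.current_utc_datetime (modeled here by context_time) is read
    # back from the orchestration history, so every replay sees the same
    # value and takes the same branch.
    return "escalate" if context_time.hour >= 17 else "wait"

# Replaying the deterministic version always yields the same decision.
checkpoint_time = datetime.datetime(2026, 1, 1, 9, 0)
first_execution = good_decision(checkpoint_time)
replay = good_decision(checkpoint_time)
```

The real runtime enforces this the same way: values like timestamps and activity results are recorded in history on first execution and fed back unchanged during replay.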
Key Takeaways
1. Durable Functions provides five core patterns: chaining, fan-out/fan-in, async HTTP APIs, human interaction, and eternal orchestrations.
2. Orchestrator functions must be deterministic because they replay from history.
3. Built-in retry policies with exponential backoff handle transient failures automatically.
4. Use continue_as_new in eternal orchestrations to prevent unbounded history growth.
Frequently Asked Questions
What hosting plan should I use for Durable Functions?
For production, use Elastic Premium (EP1 or higher) or a Dedicated plan. The Consumption plan works for lightweight workloads but has cold starts and tighter timeout limits.
Can Durable Functions run for days or weeks?
Yes. Orchestrations checkpoint their state to storage, so they can run for days, weeks, or even months. Individual activity functions are still bound by the normal function timeout, so keep each activity short and let the orchestrator do the waiting.
Written by CloudToolStack Team
Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.
Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. AWS, Azure, and GCP are trademarks of their respective owners.