Demo: Mock LLM and MCP

This is the first hands-on demo in getting started — no API keys required. You deploy two bundled manifests, start one saga with two steps, and watch the worker run a short reasoning loop on each step. When both steps finish, the saga is COMPLETED in Postgres.

You need a running stack from Installation: warden ping should already succeed. No .env settings are needed beyond what Installation set up.

This demo focuses on the core orchestration loop. Policy rules and human-in-the-loop review appear in the later GitHub MCP demo.

What you'll use

| Artifact | Path |
| --- | --- |
| Worker | config/worker.mock-mcp.yaml (mock-mcp-worker, mock LLM provider) |
| Saga | config/saga.mock-mcp.yaml (greet, then summarize, which reads the prior step's output) |
| Prompts | config/prompts/mock-greet.j2, config/prompts/mock-summarize.j2 |
| Mock runtime | workers/fixtures/mock_mcp_server.py, workers/llm/mock.py |

Before you deploy, open the files in the table — worker provider, step order, prompts, and tool allowlists make more sense once you've read the YAML.

Before you start

  1. Finish Installation: warden ping returns healthy.
  2. No OPENAI_API_KEY or other cloud credentials needed.

Walkthrough

Run every command from the repo root with ENGINE_URL set. Either activate the virtualenv first (source .venv/bin/activate) or prefix each command with uv run; see Installation.

1. Deploy the manifests

Deploy the worker first, then the saga. Warden rejects the saga deploy if the worker is not registered yet.

warden deploy -f config/worker.mock-mcp.yaml
warden deploy -f config/saga.mock-mcp.yaml
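
If you script the deploy, the ordering rule above can be encoded directly. The following is a minimal Python sketch, not part of Warden: deploy_commands and deploy_in_order are hypothetical helpers that wrap the same two warden deploy -f commands. It assumes the CLI exits non-zero when a deploy is rejected, which this page does not state.

```python
import subprocess
from typing import Callable, Sequence

# Manifest paths from this demo; the worker must come before the saga.
MANIFESTS = [
    "config/worker.mock-mcp.yaml",
    "config/saga.mock-mcp.yaml",
]


def deploy_commands(manifests: Sequence[str] = MANIFESTS) -> list[list[str]]:
    """Build one `warden deploy -f <path>` argv per manifest, preserving order."""
    return [["warden", "deploy", "-f", path] for path in manifests]


def _run(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def deploy_in_order(run: Callable[[Sequence[str]], int] = _run) -> bool:
    """Run each deploy in order; stop at the first non-zero exit code.

    ASSUMPTION: a rejected deploy produces a non-zero exit code.
    """
    for argv in deploy_commands():
        if run(argv) != 0:
            return False
    return True
```

Stopping at the first failure means the saga manifest is never submitted when the worker deploy did not go through.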

2. Start the saga

warden start saga -n mock-mcp-saga -v 0.1.0 --input '{"name":"Ada"}'

Copy the trace_id from the response. You will need it for the commands below and for the next demo. The input name is passed into the first step's prompt; any value works for this mock run.
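
When you start the saga from a script, hand-quoting the --input JSON is the easiest thing to get wrong. The sketch below builds the same command with json.dumps and pulls the trace_id back out. Both helpers are hypothetical, and extract_trace_id assumes the response is a JSON object with a top-level trace_id field. This page only says the response contains a trace_id, so check the real output first.

```python
import json


def start_saga_argv(saga: str, version: str, saga_input: dict) -> list[str]:
    """Build the `warden start saga` argv; json.dumps produces the --input value."""
    return [
        "warden", "start", "saga",
        "-n", saga,
        "-v", version,
        "--input", json.dumps(saga_input, separators=(",", ":")),
    ]


def extract_trace_id(response_text: str) -> str:
    """ASSUMPTION: the response is a JSON object with a top-level "trace_id"."""
    payload = json.loads(response_text)
    return payload["trace_id"]
```

Passing the argv as a list to subprocess.run avoids shell quoting altogether, so names containing quotes or spaces survive intact.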

3. Check saga status

warden list sagas --trace-id <YOUR_TRACE_ID> --namespace default

The saga moves PENDING → RUNNING → COMPLETED. On the mock stack that transition usually finishes before you can type the next command, so seeing COMPLETED on your first list sagas is expected, not a sign that anything was skipped.
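
If you would rather poll for the final state than re-run the command by hand, the lifecycle above reduces to a small loop. This is a sketch under assumptions: wait_for_terminal is a hypothetical helper, fetch_status stands in for however you read the saga status, and treating FAILED as a saga-level terminal state is a guess, since this page documents FAILED only for steps.

```python
import time
from typing import Callable

# COMPLETED is documented on this page. FAILED at saga level is an ASSUMPTION.
TERMINAL = {"COMPLETED", "FAILED"}


def wait_for_terminal(
    fetch_status: Callable[[], str],
    interval_s: float = 0.5,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll until a terminal status appears; return the distinct statuses seen in order."""
    seen: list[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        if not seen or seen[-1] != status:
            seen.append(status)
        if status in TERMINAL:
            return seen
        sleep(interval_s)
    last = seen[-1] if seen else "unknown"
    raise TimeoutError(f"saga still {last} after {max_polls} polls")
```

On the mock stack the returned list will often be just ["COMPLETED"], because the earlier states have already passed by the first poll.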

4. Check step progress

warden list steps --trace-id <YOUR_TRACE_ID> --namespace default

You should see two rows: greet, then summarize. On the happy path, each step moves PENDING → IN_PROGRESS → COMPLETED. If a step errors, Warden marks it FAILED; inspect it with show step (step 5).

list steps is a status index: order, lifecycle state, timing buckets. It does not include resolved inputs, outputs, or prompt references. Add --json when you want machine-readable timing or error_details on a failed step. To poll instead of listing once, add --watch (see Start and monitor).
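
As an illustration of what --json makes possible, the sketch below collects error_details for every failed step. The JSON shape is an assumption: it treats the output as an array of objects with step_id and status keys, plus error_details on failure. Only error_details is named on this page, so adjust the other keys to match what the CLI actually prints. failed_steps is a hypothetical helper.

```python
import json


def failed_steps(list_steps_json: str) -> dict[str, object]:
    """Map step id -> error_details for every step whose status is FAILED.

    ASSUMPTION: the input is a JSON array of objects with "step_id",
    "status", and (on failure) "error_details".
    """
    rows = json.loads(list_steps_json)
    return {
        row["step_id"]: row.get("error_details")
        for row in rows
        if row.get("status") == "FAILED"
    }
```

An empty result means no step failed, which is what the happy path of this demo should produce.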

5. Inspect step data

warden show step <YOUR_TRACE_ID> --step-id greet --namespace default
warden show step <YOUR_TRACE_ID> --step-id summarize --namespace default

show step answers what actually ran: resolved_arguments, output_payload, and prompt_ref.

On greet, expect resolved_arguments.name from your saga input and output_payload.data.greeting from the mock worker. On summarize, expect resolved_arguments.greeting chained from the first step and output_payload.data.summary.
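
The expectations above can be written down as a check over the two show step --json payloads. This is a hedged sketch: get_path and check_chain are hypothetical helpers, and it assumes that each payload is a JSON object with resolved_arguments and output_payload at the top level, and that summarize receives greet's greeting unchanged.

```python
def get_path(doc: dict, dotted: str):
    """Follow a dotted path such as "output_payload.data.greeting"; None if absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def check_chain(greet: dict, summarize: dict, name: str) -> list[str]:
    """Return human-readable problems; an empty list means the chain looks right."""
    problems = []
    if get_path(greet, "resolved_arguments.name") != name:
        problems.append("greet did not receive the saga input name")
    greeting = get_path(greet, "output_payload.data.greeting")
    if greeting is None:
        problems.append("greet produced no greeting")
    # ASSUMPTION: the greeting is passed to summarize verbatim.
    elif get_path(summarize, "resolved_arguments.greeting") != greeting:
        problems.append("summarize did not receive greet's greeting")
    if get_path(summarize, "output_payload.data.summary") is None:
        problems.append("summarize produced no summary")
    return problems
```

Feed it the parsed output of the two commands above together with the name you passed in --input.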

Use --json for scripting. If human output looks truncated (large ReAct payloads on other demos), add --raw or --json for the full blob.

Mock timing often omits llm_ms and tool_ms because those buckets are sub-millisecond and zero values are dropped. dispatch_to_ingest_ms usually dominates on the mock stack — see the outbox polling note on the next demo page.
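
Because zero-valued buckets are dropped, code that compares runs should treat a missing bucket as 0 rather than as an error. The sketch below does that for the three buckets named on this page (real output may contain others). normalize_timing and dominant_bucket are hypothetical helpers, and the flat bucket-to-milliseconds dict is an assumed shape.

```python
# Buckets named on this page; extend the tuple if your output has more.
BUCKETS = ("dispatch_to_ingest_ms", "llm_ms", "tool_ms")


def normalize_timing(timing: dict) -> dict:
    """Restore dropped buckets as 0 so every run has the same keys."""
    return {bucket: timing.get(bucket, 0) for bucket in BUCKETS}


def dominant_bucket(timing: dict) -> str:
    """Name of the largest bucket after normalization."""
    full = normalize_timing(timing)
    return max(full, key=full.get)
```

For a mock run that reports only dispatch_to_ingest_ms, normalize_timing fills in llm_ms and tool_ms as 0 and dominant_bucket returns dispatch_to_ingest_ms.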

What just happened

You ran the core loop end to end across two steps: Warden queued work on the outbox, the worker claimed each step, started a local MCP subprocess, and the mock LLM finished a ReAct turn per step. Step summarize consumed output from step greet before submitting its own result. The saga landed in COMPLETED in Postgres.
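
To make the ordering and chaining concrete, here is a toy in-memory model of that loop. It is not Warden's implementation: the outbox is a plain deque, the worker is a function call, there is no MCP subprocess or Postgres, and the greet and summarize handlers are hypothetical stand-ins for the mock LLM. The saga-level FAILED result is likewise an assumption.

```python
from collections import deque


def run_saga(steps, saga_input, handlers):
    """Toy model: queue steps, claim one at a time, chain outputs forward."""
    outbox = deque(steps)      # stands in for the outbox
    outputs = {}               # step name -> output of completed steps
    status = {name: "PENDING" for name in steps}
    while outbox:
        name = outbox.popleft()            # the worker claims the next step
        status[name] = "IN_PROGRESS"
        try:
            outputs[name] = handlers[name](saga_input, outputs)
            status[name] = "COMPLETED"
        except Exception:
            status[name] = "FAILED"
            return "FAILED", status, outputs   # ASSUMPTION: saga-level FAILED
    return "COMPLETED", status, outputs


def greet(saga_input, outputs):
    """Hypothetical stand-in for the first step."""
    return {"greeting": f"Hello, {saga_input['name']}!"}


def summarize(saga_input, outputs):
    """Hypothetical stand-in for the second step; reads greet's output."""
    return {"summary": f"Greeted with: {outputs['greet']['greeting']}"}
```

Calling run_saga(["greet", "summarize"], {"name": "Ada"}, {"greet": greet, "summarize": summarize}) yields a COMPLETED result whose summarize output embeds the greeting produced by greet.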

Warden uses the same transactional loop to run live production workflows. To see how Warden maps non-deterministic LLM outputs to deterministic Postgres state across reason and commit steps, read Durable execution boundaries. The GitHub MCP demo, which comes after the next two demos, layers policy and human-in-the-loop (HITL) review on top.

If something goes wrong

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| ENGINE_URL / health check fail | Engine not reachable | Troubleshooting → Stack and CLI |
| Saga RUNNING but a step stuck IN_PROGRESS | Worker down or slow claim | make doctor; worker logs |
| Tool not in allowlist | Tool name mismatch in saga | Allow echo in the step's tools.allow list |
| MCP subprocess spawn fail | Worker image missing the fixture | Rebuild: docker compose build worker; see Local stack diagnostics |
| summarize fails after greet completes | Stale saga definition (one-step version) | Redeploy config/saga.mock-mcp.yaml |
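
For the allowlist row, a quick pre-deploy check can catch a tool name mismatch before the worker does. The sketch assumes a step loaded from the saga YAML looks like a dict with a tools.allow list, mirroring the tools.allow name used in the table. The rest of the manifest layout is not shown on this page, and missing_tools is a hypothetical helper.

```python
def missing_tools(step: dict, required: list[str]) -> list[str]:
    """Tools the step needs that are absent from its tools.allow list.

    ASSUMPTION: the step is shaped like {"tools": {"allow": ["echo", ...]}}.
    """
    tools = step.get("tools") or {}
    allowed = set(tools.get("allow") or [])
    return [tool for tool in required if tool not in allowed]
```

An empty list means every required tool is allowed; for this demo the tool to look for is echo.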

For make doctor, log dumps, and reset workflows, see Troubleshooting.

What's next

Continue with Demo: Observe Execution Timing using the trace_id from this run.