AI-assisted test development via MCP #

Litmus ships a Model Context Protocol (MCP) server whose tools give AI assistants run, query, and authoring actions. The platform does not call LLMs itself; it only exposes tools that an AI agent drives.

This page is the operational how-to: registering Litmus with each supported AI client. For motivation see concepts/why-ai-integration; for the end-to-end workflow walkthrough see datasheet-to-test; for the full inventory of shipped skills, sub-agents, and slash commands see reference/skills. Per-tool MCP reference: api.md → MCP tools.

CLI as a peer surface. Any agent with a terminal — Claude Code with Bash, Cursor with terminal, the GitHub Copilot CLI — can drive Litmus through litmus … commands instead of (or alongside) MCP. The CLI surface mirrors most of the MCP tools (litmus runs, litmus show, litmus discover, litmus metrics, litmus schema, litmus validate, …). See reference/cli. This page is for AI clients that speak MCP natively.

Prerequisites:

litmus installed and on $PATH (pip install litmus-test; the distribution is litmus-test, the import name is litmus).
One of the supported AI clients listed below: Claude Code, Claude Desktop, GitHub Copilot, Cursor, or Cline.
A working project directory (litmus init to scaffold one).
For litmus_run, a station configured in stations/. Note that litmus_run always executes in mock mode (see below).

Setup #

litmus setup <client> writes the right MCP config file for each supported client. All litmus setup <client> commands accept --print-only to show the config that would be written without modifying anything on disk.

Client	Command	What gets written
Claude Code (CLI)	`litmus setup claude-code`	Registers the MCP server via `claude mcp add litmus`, copies skill stubs to `.claude/commands/`, and creates / updates `./CLAUDE.md`
Claude Desktop	`litmus setup claude-desktop`	Builds a `litmus.mcpb` Desktop Extension bundle on the user's Desktop (zip) for double-click install. Use `--legacy` to write `~/.config/Claude/claude_desktop_config.json` directly instead.
GitHub Copilot Chat	`litmus setup copilot`	Project-local `.vscode/mcp.json` plus `.github/copilot-instructions.md`
Cursor	`litmus setup cursor`	Project-local `.cursor/mcp.json`
Cline (VS Code)	`litmus setup cline`	`cline_mcp_settings.json` in VS Code User settings (`~/.config/Code/User/` on Linux, `~/Library/Application Support/Code/User/` on macOS, `~/AppData/Roaming/Code/User/` on Windows)
Anything else	`litmus mcp serve` directly	You configure your AI client manually

After running any setup command, restart the client to pick up the new MCP server. To confirm it registered, ask the assistant to list its tools (or open the client's MCP panel) — the litmus_* tools should appear. If they don't, the client didn't load the server: re-run the setup command, restart the client again, and confirm the config file from the table above was written.
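The last check above (confirming the config file was written) can be scripted for the project-local clients. This is a minimal sketch, not part of Litmus: the helper name `missing_config_files` and the `EXPECTED_FILES` mapping are hypothetical, and the paths are copied from the setup table for Copilot and Cursor only.

```python
from pathlib import Path

# Project-local files listed in the setup table, keyed by client.
# Hypothetical mapping for illustration; Litmus does not ship this.
EXPECTED_FILES = {
    "copilot": [".vscode/mcp.json", ".github/copilot-instructions.md"],
    "cursor": [".cursor/mcp.json"],
}


def missing_config_files(client, project_dir="."):
    """Return the expected config files that are absent under project_dir."""
    root = Path(project_dir)
    return [rel for rel in EXPECTED_FILES[client] if not (root / rel).is_file()]


if __name__ == "__main__":
    for client in EXPECTED_FILES:
        missing = missing_config_files(client)
        print(client, "OK" if not missing else f"missing: {missing}")
```

An empty list means the files exist; it does not prove the client loaded the server, so the tool-listing check remains the real confirmation.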

If the claude CLI isn't on $PATH, litmus setup claude-code prints the manual claude mcp add … command for you to run instead of registering automatically. For the VS Code clients, run with --print-only first to preview the exact .vscode/mcp.json / .cursor/mcp.json it will write.

To print the exact command Litmus registers (for a manual setup):

litmus setup show

For the manual path (any client that doesn't have a litmus setup subcommand), start the server with:

litmus mcp serve
# command: litmus
# args: ["mcp", "serve"]
# transport: stdio

Add a server entry to your AI client's MCP config pointing at the above. See litmus setup claude-desktop --print-only for a working example you can adapt.
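As a rough illustration of what such an entry can look like, the sketch below builds a JSON fragment from the command and args shown above. The top-level `mcpServers` key is an assumption: several MCP clients (Claude Desktop among them) use it for stdio servers, but your client may expect a different key or file layout, so compare against the --print-only output before copying anything.

```python
import json


def build_server_entry(name="litmus"):
    """Build a stdio MCP server entry for `litmus mcp serve`.

    The "mcpServers" wrapper is an assumed layout, not something Litmus defines.
    """
    return {
        "mcpServers": {
            name: {
                "command": "litmus",       # from the manual-path block above
                "args": ["mcp", "serve"],  # from the manual-path block above
            }
        }
    }


config = build_server_entry()
print(json.dumps(config, indent=2))
```

Merge the printed fragment into the client's existing config rather than overwriting the file, since other servers may already be registered there.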

The MCP tools #

Tool	Purpose	Detail
`litmus_project`	Unified CRUD: init, list, get, save, read	details below
`litmus_discover`	Scan for connected instruments across all registered protocols (VISA, NI, serial, …)	Returns the list of resources reachable on this host
`litmus_match`	Find compatible instruments and stations	Two modes: requirements (catalog recommendation) and station (compatibility check)
`litmus_run`	Execute a test file via pytest, return exit summary	details below
`litmus_open`	Get a browser URL for the operator UI	Allowed `type`: `part`, `station`, `run`, `fixture`
`litmus_schema`	Get the JSON Schema for a YAML type	For AI clients that want to validate before saving
`litmus_events`	Query the event store	Filter by session / event type
`litmus_sessions`	List sessions with metadata	Each session = one `connect()` lifetime or pytest run
`litmus_channels`	Query channel data from the streaming store	For waveform / time-series readouts referenced by events
`litmus_files`	List FileStore artifacts (blobs, waveforms, streaming captures)	Each row carries its `file://` URI, name, format, session / run id, created_at
`litmus_metrics`	Compute yield / Pareto / Ppk / retest / time-loss	Aggregations over a date range
`litmus_runs`	Query the runs view (filtered, paginated)	Same data the operator-UI runs list reads
`litmus_steps`	Query the steps view (one row per step execution)	Step-level rollup with outcome and timing

For each tool's full parameter list and return shape, see api.md.

`litmus_project` #

The CRUD entry point: a single tool with an action= argument that selects the operation. The rest of the workflow goes through it.

# Initialize a project (call this first)
result = litmus_project(action="init", path="~/my-project")
project = result["project_root"]
 
# List entities of a type
litmus_project(action="list", type="part", project=project)
 
# Get one entity
litmus_project(action="get", type="part", id="tps54302", project=project)
 
# Save an entity (validated against schema)
litmus_project(action="save", type="part", id="tps54302",
               content={...}, project=project)
 
# Read a file or a template
litmus_project(action="read", path="parts/tps54302.yaml", project=project)
litmus_project(action="read", path="template:test", project=project)

Entity types depend on the action:

list / get accept: station, part, fixture, catalog, instrument_asset, run
save accepts: station, part, fixture, catalog, instrument_asset, test

test is save-only; run is read-only. project is not a type — it's the path argument every other call passes.
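The type rules above can be captured in a small pre-flight check that an agent wrapper might run before calling the tool. This is a sketch under stated assumptions: the function `type_allowed` and the two set names are hypothetical and only restate the lists on this page; the server does its own validation.

```python
# Types accepted by list / get, as listed above.
READ_TYPES = {"station", "part", "fixture", "catalog", "instrument_asset", "run"}
# Types accepted by save, as listed above.
SAVE_TYPES = {"station", "part", "fixture", "catalog", "instrument_asset", "test"}


def type_allowed(action, entity_type):
    """Return True if entity_type is valid for the given litmus_project action."""
    if action in ("list", "get"):
        return entity_type in READ_TYPES
    if action == "save":
        return entity_type in SAVE_TYPES
    raise ValueError(f"action {action!r} does not take an entity type")


print(type_allowed("save", "test"))  # test is save-only
print(type_allowed("get", "test"))
print(type_allowed("list", "run"))   # run is read-only
print(type_allowed("save", "run"))
```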

Saving test code with `action="save", type="test"` #

When type="test", the tool writes a Python file under <project_root>/tests/. The id is treated as the path, and if it doesn't end in .py the tool appends .py. So this tool cannot write the colocated sidecar (tests/test_<module>.yaml) — it would force a .py extension. Write the sidecar YAML directly to disk with your AI client's filesystem tool, not via litmus_project.
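The extension rule described above can be sketched as follows. The helper names `resolved_test_path` and `sidecar_path` are hypothetical and only mirror the behavior this page describes; the tool's actual implementation may differ in details such as path sanitization.

```python
from pathlib import PurePosixPath


def resolved_test_path(project_root, entity_id):
    """Mirror the described rule: id is the path under tests/, with .py appended if missing."""
    name = entity_id if entity_id.endswith(".py") else entity_id + ".py"
    return PurePosixPath(project_root) / "tests" / name


def sidecar_path(project_root, module):
    """Location of the colocated sidecar, which must be written outside litmus_project."""
    return PurePosixPath(project_root) / "tests" / f"test_{module}.yaml"


print(resolved_test_path("/proj", "test_tps54302"))
print(resolved_test_path("/proj", "test_tps54302.yaml"))  # shows why the sidecar can't go through save
print(sidecar_path("/proj", "tps54302"))
```

The second call illustrates the problem: an id ending in .yaml would gain a trailing .py, so the sidecar has to be written with the client's filesystem tool.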

`litmus_run` #

Runs the test file with pytest in mock mode and returns the pass/fail summary. It does not return structured measurement results — those land in the parquet store and are queried separately via litmus_runs / litmus_metrics / litmus_steps.

litmus_run always runs with --mock-instruments — station= selects which station's mock_config to use, but no real hardware is touched. To run against a real bench, drive pytest directly: pytest --station=<bench> --uut-serial=<sn> (see writing tests).

result = litmus_run(
    test="tests/test_tps54302.py",
    station="bench_1",
    serial="SN001",
    project=project,
)

Return shape:

{
    "run_id": "abc12345...",                # UUID of the run (or "unknown")
    "status": "passed",                     # one of: "passed" | "failed" | "error"
    "summary": "1 passed in 0.42s",         # pytest's bottom-line summary
    "test": "tests/test_tps54302.py",
    "station": "bench_1",
    "serial": "SN001",
    "started_at": "2026-05-17T...",
    "output": "<last 2000 chars of pytest stdout>",
}

status is a coarse passed/failed/error result from the pytest run, not the runtime's full Outcome. For the full Outcome value (passed/failed/errored/skipped/done/terminated/aborted), fetch the run's stored row after it finishes:

run = litmus_runs(action="get", run_id=result["run_id"], project=project)["run"]
print(run["outcome"])               # one of the Outcome values
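Because run_id can come back as "unknown" (see the return shape), a caller may want to guard the lookup. The sketch below is an assumption about how an agent-side wrapper could do that: `fetch_outcome` is a hypothetical helper, and `fake_litmus_runs` is a stand-in for the real MCP tool so the example runs on its own.

```python
def fetch_outcome(result, litmus_runs, project):
    """Return the stored Outcome for a litmus_run result, or None if no run id was recorded."""
    run_id = result.get("run_id", "unknown")
    if run_id == "unknown":
        return None  # nothing to look up; fall back to result["status"] and result["output"]
    row = litmus_runs(action="get", run_id=run_id, project=project)["run"]
    return row["outcome"]


def fake_litmus_runs(action, run_id, project):
    """Stand-in for the litmus_runs tool; returns a canned row for illustration."""
    return {"run": {"run_id": run_id, "outcome": "passed"}}


print(fetch_outcome({"run_id": "unknown", "status": "error"}, fake_litmus_runs, "/tmp/proj"))
print(fetch_outcome({"run_id": "abc12345", "status": "passed"}, fake_litmus_runs, "/tmp/proj"))
```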

See outcomes for what each value means.
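For the real-bench path mentioned earlier in this section, the pytest invocation can be assembled programmatically. This is a sketch only: `bench_command` is a hypothetical helper, the two flags are the ones quoted on this page, and passing the test file as a positional argument is ordinary pytest usage assumed here rather than something this page specifies. The command is printed, not executed.

```python
import shlex


def bench_command(test, station, serial):
    """Build the argv for a real-hardware run (no --mock-instruments)."""
    return ["pytest", f"--station={station}", f"--uut-serial={serial}", test]


cmd = bench_command("tests/test_tps54302.py", "bench_1", "SN001")
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to subprocess.run without shell quoting concerns.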

What the agent does next #

Once the server is registered, the agent drives the datasheet → test workflow through these tools: initialize a project, create a part spec from the datasheet, set up the station, generate tests, run, and inspect results.

Start the conversation by having the agent call:

result = litmus_project(action="init", path="~/my-hardware-tests")
project = result["project_root"]

Then hand the agent the datasheet. See Datasheet → tests for the full walkthrough.
