# asiai mcp
Start the MCP (Model Context Protocol) server, enabling AI agents to monitor and benchmark your inference infrastructure.
## Usage

```shell
asiai mcp                               # stdio transport (Claude Code)
asiai mcp --transport sse               # SSE transport (network agents)
asiai mcp --transport sse --port 9000   # SSE on a custom port
```
## Options

| Option | Description |
|---|---|
| `--transport` | Transport protocol: `stdio` (default), `sse`, `streamable-http` |
| `--host` | Bind address (default: `127.0.0.1`) |
| `--port` | Port for the SSE/HTTP transports (default: `8900`) |
| `--register` | Opt-in registration with the asiai agent network (anonymous) |
## Tools (11)

| Tool | Description | Read-only |
|---|---|---|
| `check_inference_health` | Quick health check: engines up/down, memory pressure, thermal, GPU | Yes |
| `get_inference_snapshot` | Full system snapshot with all metrics | Yes |
| `list_models` | List all loaded models across engines | Yes |
| `detect_engines` | Re-scan for inference engines | Yes |
| `run_benchmark` | Run a benchmark or cross-model comparison (rate-limited to 1/min) | No |
| `get_recommendations` | Hardware-aware engine/model recommendations | Yes |
| `diagnose` | Run diagnostic checks (like `asiai doctor`) | Yes |
| `get_metrics_history` | Query historical metrics (1-168 hours) | Yes |
| `get_benchmark_history` | Query past benchmark results with filters | Yes |
| `compare_engines` | Compare engine performance for a model with a verdict; supports multi-model comparison from history | Yes |
| `refresh_engines` | Re-detect engines without restarting the server | Yes |
## Resources (3)

| Resource | URI | Description |
|---|---|---|
| System Status | `asiai://status` | Current system health (memory, thermal, GPU) |
| Models | `asiai://models` | All loaded models across engines |
| System Info | `asiai://system` | Hardware info (chip, RAM, cores, OS, uptime) |
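Resources are fetched with the MCP `resources/read` method rather than `tools/call`. A sketch of the request for the status resource (URI from the table above; the `read_resource` helper is illustrative):

```python
import json

def read_resource(request_id: int, uri: str) -> dict:
    """Build an MCP resources/read request (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }

print(json.dumps(read_resource(1, "asiai://status")))
```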
## Claude Code integration

Add the server to your Claude Code MCP config (`~/.claude/claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "asiai": {
      "command": "asiai",
      "args": ["mcp"]
    }
  }
}
```
Then ask Claude: "Check my inference health" or "Compare Ollama vs LM Studio for qwen3.5".
## Benchmark cards

The `run_benchmark` tool supports card generation via the `card` parameter. When `card` is `true`, a 1200x630 SVG benchmark card is generated and `card_path` is returned in the response.

```json
{"tool": "run_benchmark", "arguments": {"model": "qwen3.5", "card": true}}
```
Cross-model comparison (mutually exclusive with `model`, max 8 slots):

```json
{"tool": "run_benchmark", "arguments": {"compare": ["qwen3.5:4b", "deepseek-r1:7b"], "card": true}}
```
CLI equivalent for PNG + sharing:

```shell
asiai bench --quick --card --share   # quick bench + card + share (~15s)
```
See the Benchmark Card page for details.
## Agent registration

Join the asiai agent network to get community features (leaderboard, comparisons, percentile stats):

```shell
asiai mcp --register   # register on first run, heartbeat on subsequent runs
asiai unregister       # remove local credentials
```

Registration is opt-in and anonymous: only hardware info (chip, RAM) and engine names are sent. No IP address, hostname, or personal data is stored. Credentials are saved in `~/.local/share/asiai/agent.json` (`chmod 600`).

On subsequent `asiai mcp --register` calls, a heartbeat is sent instead of re-registering. If the API is unreachable, the MCP server starts normally without registration.

Check your registration status with `asiai version`.
## Network agents

For agents on other machines (e.g., monitoring a headless Mac Mini):

```shell
asiai mcp --transport sse --host 0.0.0.0 --port 8900
```

See the Agent Integration guide for detailed setup instructions.