asiai detect
Auto-detect running inference engines using a 3-layer cascade.
Usage
asiai detect # Auto-detect (3-layer cascade)
asiai detect --url http://host:port # Scan specific URL(s) only
Output
Detected engines:
● ollama 0.17.4
URL: http://localhost:11434
● lmstudio 0.4.5
URL: http://localhost:1234
Running: 1 model(s)
- qwen3.5-35b-a3b MLX
● omlx 0.9.2
URL: http://localhost:8800
How it works: 3-layer detection
asiai uses a cascade of three detection layers, from fastest to most thorough:
Layer 1: Config (fastest, ~100ms)
Reads ~/.config/asiai/engines.json — engines discovered in previous runs. This catches engines on non-standard ports (e.g., oMLX on 8800) without rescanning.
Layer 2: Port scan (~200ms)
Scans default ports plus an extended range:
| Port | Engine |
|---|---|
| 11434 | Ollama |
| 1234 | LM Studio |
| 8080 | mlx-lm or llama.cpp |
| 8000-8009 | oMLX or vllm-mlx |
| 52415 | Exo |
Layer 3: Process detection (fallback)
Uses ps and lsof to find engine processes listening on any port. Catches engines running on completely unexpected ports.
Auto-persist
Any engine discovered in Layer 2 or 3 is automatically saved to the config file (Layer 1) for faster detection next time. Auto-discovered entries are pruned after 7 days of inactivity.
When multiple engines share a port (e.g., mlx-lm and llama.cpp on 8080), asiai uses API endpoint probing to identify the correct engine.
Explicit URLs
When using --url, only the specified URLs are scanned. No config is read or written — useful for one-off checks.
asiai detect --url http://192.168.0.16:11434,http://localhost:8800
See also
- config — Manage persistent engine configuration