CADUCEUS is a physical robot built for the NousResearch Hermes Agent Creative Hackathon. It demonstrates end-to-end agentic control of hardware through MCP tool calls, with no command parser in the loop.
## Architecture
Voice → Whisper STT → Hermes Agent (hermes -z, Gemini Flash Lite) → caduceus_mcp.py MCP server (Wi-Fi) → ESP32 microcontroller → PCA9685 PWM driver → MG90S servo motors → physical movement → Edge TTS → spoken response
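As an illustration of the last hardware hop (PCA9685 → MG90S), the sketch below converts a servo angle into a PCA9685 tick count. The PCA9685 has a 12-bit counter (4096 ticks per PWM period); the 50 Hz frame rate and the 500–2500 µs pulse range are assumptions, not values taken from the project, and real servos need per-unit calibration. The function name `angle_to_ticks` is hypothetical.

```python
# Assumed calibration values; adjust them for the actual servos.
PWM_FREQ_HZ = 50            # common hobby-servo frame rate (assumption)
PCA9685_RESOLUTION = 4096   # 12-bit counter: 4096 ticks per PWM period
MIN_PULSE_US = 500          # assumed pulse width at 0 degrees
MAX_PULSE_US = 2500         # assumed pulse width at 180 degrees


def angle_to_ticks(angle_deg: float) -> int:
    """Convert a servo angle in degrees to a PCA9685 pulse length in ticks."""
    # Clamp to the mechanical range so a bad request cannot over-drive the servo.
    angle = max(0.0, min(180.0, angle_deg))
    # Linearly interpolate the pulse width between the two calibration points.
    pulse_us = MIN_PULSE_US + (MAX_PULSE_US - MIN_PULSE_US) * angle / 180.0
    # One PWM period in microseconds (20,000 us at 50 Hz).
    period_us = 1_000_000 / PWM_FREQ_HZ
    return round(pulse_us * PCA9685_RESOLUTION / period_us)


if __name__ == "__main__":
    for angle in (0, 90, 180):
        print(angle, angle_to_ticks(angle))
```

With these assumed values, a centered servo (90°) corresponds to a 1500 µs pulse, which is 1500 × 4096 / 20000 ≈ 307 ticks.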
## Key Design Decision
Each servo motor or motion primitive is exposed as an MCP tool with a behavioral description. Hermes selects tools from natural conversation, so no explicit command syntax is required. Greet it in any language; it waves first, talks second.
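The idea of pairing each motion primitive with a behavioral description can be sketched with a small standard-library registry. This is not the code in `caduceus_mcp.py`: the tool names, descriptions, command strings, and the `send_to_robot` placeholder (which records commands instead of sending them over Wi-Fi) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str            # behavioral text the agent reads when choosing a tool
    handler: Callable[[], str]


REGISTRY: Dict[str, Tool] = {}
SENT: List[str] = []            # stands in for commands sent to the ESP32


def send_to_robot(command: str) -> str:
    """Placeholder transport: record the command instead of using Wi-Fi."""
    SENT.append(command)
    return f"ok:{command}"


def tool(name: str, description: str):
    """Decorator that registers a motion primitive under a tool name."""
    def register(fn: Callable[[], str]) -> Callable[[], str]:
        REGISTRY[name] = Tool(name, description, fn)
        return fn
    return register


@tool("wave", "Perform a waving gesture. Use when greeted, in any language.")
def wave() -> str:
    return send_to_robot("WAVE")


@tool("walk_forward", "Take a few steps forward. Use when asked to come closer.")
def walk_forward() -> str:
    return send_to_robot("WALK_FWD")


def list_tools() -> List[dict]:
    """What the agent sees: names and descriptions, no hardware details."""
    return [{"name": t.name, "description": t.description} for t in REGISTRY.values()]


def call_tool(name: str) -> str:
    """Dispatch a tool call chosen by the agent."""
    return REGISTRY[name].handler()
```

The description says when to use the tool, not how the servos move, which is what lets the agent map a greeting in any language onto `wave` without a command grammar.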
## Constraints (Documented)
- End-to-end latency: roughly 25–30 seconds per command, because `hermes -z` reloads the agent on every call; a persistent agent process would avoid this reload
- Chassis tipping: the chassis has a single row of legs and is prone to tipping while walking
- Audio I/O: an INMP441 mic and a PAM8403 amplifier are wired to the ESP32, but Whisper accuracy was too low on device-captured audio, so audio was routed through the host PC mic for the submission
## Significance
The project demonstrates that the MCP tool abstraction is hardware-agnostic: the same protocol used for software tool calls (filesystem, web search) works for physical actuator control. The agent does not need to know it is controlling a robot; it just calls tools.
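To make the hardware-agnostic point concrete, the sketch below builds two MCP `tools/call` requests, which are JSON-RPC 2.0 messages carrying a tool name and an arguments object. The tool names `read_file` and `set_servo_angle` and their arguments are hypothetical and are not taken from the project; only the envelope shape comes from the protocol.

```python
import json


def tools_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# A software tool and a hardware tool (both names are illustrative).
software = json.loads(tools_call_request(1, "read_file", {"path": "notes.txt"}))
hardware = json.loads(
    tools_call_request(2, "set_servo_angle", {"channel": 0, "angle": 90})
)

if __name__ == "__main__":
    print(json.dumps(software, indent=2))
    print(json.dumps(hardware, indent=2))
```

Both messages share the same method and structure; only the tool name and arguments differ, so nothing in the request tells the agent that one of them ends in a moving servo.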
Source: https://github.com/devorun/caduceus-robot