Aximo — a local Rust STT API for CPU-only inference
I recently built Aximo, a self-hosted speech-to-text microservice designed to run locally on CPU, with no dependency on cloud APIs or external SaaS.

The idea was straightforward: I wanted an STT service that could be deployed like any other backend, stay fully local, and still be architecturally clean enough to evolve beyond a quick experiment.

Aximo is written in Rust, uses Parakeet v3 for local inference, exposes an HTTP API for transcription, and includes a WebSocket layer for realtime use cases. I also added Docker support, an OpenAPI spec, and a multi-crate workspace layout to keep the codebase modular from the start.

One detail I particularly like: I extended Swagger UI so I can record audio directly from the microphone and send it to the API for testing. It's a small feature, but it makes the developer experience much nicer when iterating on the service.

At this point, I'd call it a solid MVP rather than a production-ready system, but it already works well for local experimentation and as a foundation for a self-hosted STT stack.

Repo: github.com/aximo
