oMLX — LLM inference, optimized for your Mac

oMLX — LLM inference, optimized for your Machttps://omlx.ai/

A native macOS inference server built on MLX. Paged SSD KV caching drops agent TTFT from 30-90s to under 5s. OpenAI & Anthropic compatible API for Apple Silicon.