Show HN: ZeroGate – API gateway to scale cloud GPUs to zero when idle
Category: infrastructure
Tags: gpu-orchestration, scale-to-zero, inference-cost-optimization
Score: 6.5/10 (Innovation: 6, Technical: 7, Documentation: 7, Utility: 6)
ZeroGate is an open-source, event-driven GPU orchestration fabric that automatically scales cloud GPU infrastructure to zero when idle, targeting multi-tenant vLLM inference pipelines. It combines scale-to-zero daemons, Redis-based distributed locks, and dynamic market arbitrage across providers, and includes a local mock mode for testing. The project addresses a real cost-saving need in AI inference but is early-stage, with a single GitHub star and limited production maturity.
Target audience: mlops, backend-devs, devops
Repository: https://github.com/noah-garner/zerogate · Python · Apache-2.0 · 1 star
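
For readers unfamiliar with the scale-to-zero pattern named in the summary, here is a minimal sketch of the idle-timeout logic such a gateway might use. It is not taken from the ZeroGate codebase: the class `IdleScaler`, its methods, and the `idle_timeout_s` parameter are hypothetical names chosen for illustration, and a local `threading.Lock` stands in for the Redis-based distributed lock the project describes.

```python
import threading
import time
from typing import Callable


class IdleScaler:
    """Hypothetical idle tracker: decides when a GPU backend may scale to zero."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.idle_timeout_s = idle_timeout_s
        self._clock = clock              # injectable so tests can fake time
        self._lock = threading.Lock()    # assumption: stands in for a distributed lock
        self._last_activity = clock()
        self._in_flight = 0
        self.running = False             # True while a GPU backend is provisioned

    def begin_request(self) -> bool:
        """Record a request start. Returns True if it triggered a cold start."""
        with self._lock:
            cold_start = not self.running
            self.running = True          # a real gateway would provision a GPU here
            self._in_flight += 1
            self._last_activity = self._clock()
            return cold_start

    def end_request(self) -> None:
        """Record a request finishing and refresh the idle timer."""
        with self._lock:
            self._in_flight -= 1
            self._last_activity = self._clock()

    def tick(self) -> bool:
        """Periodic check. Returns True if this call scaled the backend to zero."""
        with self._lock:
            idle_for = self._clock() - self._last_activity
            if (self.running and self._in_flight == 0
                    and idle_for >= self.idle_timeout_s):
                self.running = False     # a real gateway would release the GPU here
                return True
            return False
```

In this sketch a daemon would call `tick()` on a fixed interval, while the gateway wraps each proxied inference call in `begin_request()` and `end_request()`. Refusing to scale down while requests are in flight avoids tearing down a GPU in the middle of a long generation.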