Show HN: Ludion – routing AI inference by observed WebGPU behavior
Category: infrastructure
Tags: ai-inference, load-balancer, webgpu
Score: 7.3/10 (Innovation: 7, Technical: 8, Documentation: 7, Utility: 7)
Ludion is a TypeScript load balancer that routes each AI inference request either to the user's local GPU via WebGPU or to a cloud server, based on the device's observed capability. It combines on-device execution and cloud fallback behind a single OpenAI-compatible, drop-in API that decides per request where inference runs. The project stands out for its pragmatic approach to hybrid edge/cloud inference and for its detailed documentation of setup and runtime configuration.
Target audience: backend devs, frontend devs, devops
Repository: https://ludion.ai/ · TypeScript · MIT · 2 stars
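
To make the per-request routing idea in the description concrete, here is a minimal sketch of how a local-versus-cloud decision could be derived from observed device behavior. It is not Ludion's actual API: the names DeviceObservation, RoutingPolicy, and chooseBackend, and the thresholds used, are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical types and names; none of these come from Ludion itself.
type Backend = "local" | "cloud";

interface DeviceObservation {
  // Whether the browser exposes WebGPU, e.g. the result of `"gpu" in navigator`.
  webgpuAvailable: boolean;
  // Throughput (tokens per second) measured on earlier local inference runs.
  recentTokensPerSec: number[];
}

interface RoutingPolicy {
  // Below this mean throughput, the request is sent to the cloud.
  minTokensPerSec: number;
  // Number of local measurements required before trusting the device.
  minSamples: number;
}

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Decide where one request should run, using only what has been observed so far.
function chooseBackend(obs: DeviceObservation, policy: RoutingPolicy): Backend {
  if (!obs.webgpuAvailable) return "cloud";
  if (obs.recentTokensPerSec.length < policy.minSamples) return "cloud";
  return mean(obs.recentTokensPerSec) >= policy.minTokensPerSec ? "local" : "cloud";
}

// Usage: a device that has averaged 20 tokens/s over three runs stays local.
const policy: RoutingPolicy = { minTokensPerSec: 15, minSamples: 3 };
const decision = chooseBackend(
  { webgpuAvailable: true, recentTokensPerSec: [22, 18, 20] },
  policy,
);
console.log(decision); // "local"
```

Routing on measured throughput rather than on reported hardware specifications is one plausible reading of "observed behavior". A real balancer would presumably also weigh factors such as model size, memory pressure, and cloud latency, which this sketch leaves out.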