Show HN: EdgeRunner – run GGUF models with Swift and Metal
Category: library
Tags: llm-inference, apple-silicon, metal-gpu
Score: 7.0/10 (Innovation: 7, Technical: 7, Documentation: 8, Utility: 6)
EdgeRunner is a Swift/Metal library for running GGUF large language models locally on Apple Silicon. It targets fast, private inference using Metal 4 optimizations, with a reported throughput of over 230 tokens per second. It combines GPU kernel fusion with memory-mapped model loading for near-instant startup, which helps make on-device AI apps practical on iOS and macOS. The project is notable for its low time-to-first-token and its exclusive reliance on Apple's latest APIs.
Target audience: iOS and macOS developers, AI engineers, Swift developers
Repository: https://github.com/christopherkarani/EdgeRunner · Swift · MIT · 38 stars
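To illustrate why memory-mapped loading gives fast startup, here is a minimal sketch in plain Foundation. It is not EdgeRunner's API: the names `GGUFHeader`, `parseGGUFHeader`, and `loadGGUFHeader` are hypothetical, and the header layout (4-byte magic "GGUF", little-endian UInt32 version, then two UInt64 counts) is assumed from the public GGUF format for versions 2 and later. Mapping the file means only the pages actually touched are read from disk, so inspecting the header of a multi-gigabyte model costs almost nothing.

```swift
import Foundation

// Hypothetical types for illustration; not part of EdgeRunner.
struct GGUFHeader {
    let version: UInt32
    let tensorCount: UInt64
    let metadataCount: UInt64
}

enum GGUFError: Error {
    case tooShort
    case badMagic
}

// Assemble a little-endian integer byte by byte, so the read does not
// depend on the alignment of the underlying buffer.
func readLE<T: FixedWidthInteger>(_ type: T.Type, from data: Data, at offset: Int) -> T {
    var value: T = 0
    for i in 0..<MemoryLayout<T>.size {
        value |= T(data[data.startIndex + offset + i]) << (8 * i)
    }
    return value
}

// Parse the fixed 24-byte header (assumed layout for GGUF v2 and later).
func parseGGUFHeader(_ data: Data) throws -> GGUFHeader {
    guard data.count >= 24 else { throw GGUFError.tooShort }
    // ASCII "GGUF"
    guard Array(data.prefix(4)) == [0x47, 0x47, 0x55, 0x46] else {
        throw GGUFError.badMagic
    }
    return GGUFHeader(
        version: readLE(UInt32.self, from: data, at: 4),
        tensorCount: readLE(UInt64.self, from: data, at: 8),
        metadataCount: readLE(UInt64.self, from: data, at: 16)
    )
}

// Map the file instead of reading it into memory; the OS pages in
// only the bytes that are accessed.
func loadGGUFHeader(at url: URL) throws -> GGUFHeader {
    let mapped = try Data(contentsOf: url, options: .alwaysMapped)
    return try parseGGUFHeader(mapped)
}
```

Calling `loadGGUFHeader(at:)` with the file URL of a local `.gguf` model would return the header without loading tensor data. A real loader would go on to parse the metadata key-value pairs and hand the mapped tensor regions to Metal buffers, which this sketch does not attempt.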