Show HN: Gemma 3 inference in pure C++ with Metal acceleration
Category: ai-ml
Tags: llm-inference, metal, apple-silicon, c-plus-plus, local-ai, gpl-3.0
Score: 5.5/10 (Innovation: 5, Technical: 6, Documentation: 6, Utility: 5)
MetalChat is a C++ framework and CLI for running LLM inference on Apple Silicon with Metal acceleration, supporting Llama and Gemma models. Its appeal lies in combining local, hardware-accelerated inference with a portable build system and Homebrew installation. The project is at an early stage but fills a niche for Apple-centric LLM deployment.
Target audience: backend devs, data engineers, AI/ML researchers
Repository: https://github.com/ybubnov/metalchat · C++ · GPL-3.0 · 20 stars