Show HN: Gemma 3 inference in pure C++ with Metal acceleration

Category: ai-ml

Tags: llm-inference, metal, apple-silicon, c-plus-plus, local-ai, gpl-3.0

Score: 5.5/10 (Innovation: 5, Technical: 6, Documentation: 6, Utility: 5)

MetalChat is a C++ framework and CLI for running LLM inference on Apple Silicon with Metal acceleration; it supports Llama and Gemma models. Its appeal lies in pairing local, hardware-accelerated inference with a portable build system and Homebrew installation. The project is at an early stage but fills a niche for Apple-centric LLM deployment.

Target audience: backend devs, data engineers, AI/ML researchers

Repository: https://github.com/ybubnov/metalchat · C++ · GPL-3.0 · 20 stars
