Show HN: Alloy – a Torch backend and inference engine for Apple Silicon
Category: infrastructure
Tags: gpu-compute, apple-silicon, llm-inference, torch-compile, metal
Score: 7.8/10 (Innovation: 8, Technical: 9, Documentation: 7, Utility: 7)
Alloy is a compiler and runtime for GPU compute kernels on Apple Silicon that provides a `torch.compile` backend and LLM serving. GPU kernels are written in Python and compiled to Metal through a tile IR pipeline, which enables features such as cooperative tiled GEMM and automatic operator fusion. It also supports inference with several model formats, including GGUF and MLX.
Target audience: backend developers, machine learning engineers, DevOps engineers
Repository: https://github.com/rayanht/alloy · Python · Apache-2.0 · 1 star
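For readers unfamiliar with the term, the tiled GEMM mentioned in the summary can be illustrated with a minimal pure-Python sketch. This is not Alloy's API or its kernel code: the function name `tiled_matmul` and the `tile` parameter are hypothetical, and the loops run sequentially on the CPU. It only shows the blocking idea, where the output is computed one small tile at a time so that each block of inputs is reused while it is "hot". On a GPU, a group of threads would typically cooperate on each tile.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), visiting the work tile by tile.

    Hypothetical illustration of GEMM tiling; not taken from Alloy.
    """
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("inner dimensions must match and inputs must be non-empty")

    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]

    # Outer loops walk over tiles of the output (i0, j0) and of the
    # shared inner dimension (k0).
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # Inner loops accumulate one tile's partial products.
                # min(...) handles edge tiles when sizes are not multiples of `tile`.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(tiled_matmul(a, b))
```

The result does not depend on the tile size; only the order in which partial products are accumulated changes, which is why a compiler is free to pick tile shapes that suit the hardware.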