Show HN: Ported Cerebras REAP to MLX – Prune MoE Experts on a MacBook
Category: ai-ml
Tags: mlx, model-pruning, mixture-of-experts
Score: 6.3/10 (Innovation: 5, Technical: 6, Documentation: 8, Utility: 6)
REAP MLX ports the Cerebras REAP method for pruning Mixture-of-Experts (MoE) models to Apple Silicon. It runs entirely on MLX-LM, so MoE compression experiments can be done locally without a CUDA stack. The project brings expert pruning to consumer hardware, uses an adapter-based architecture to support different model families, and records detailed telemetry for each run so that pruning runs can be compared.
Target audience: machine learning engineers, researchers interested in model compression, Apple Silicon developers
Repository: https://github.com/egesabanci/reap-mlx · Python · MIT
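To make the expert pruning mentioned in the summary more concrete, here is a minimal sketch in plain Python. It assumes a REAP-style criterion in which each expert is scored by its router gate weight times the norm of its output, averaged over the tokens routed to it, and the lowest-scoring experts are dropped. The function names, the toy numbers, and the list-of-lists layout are hypothetical and chosen for illustration; this is not the reap-mlx API and does not use MLX.

```python
def expert_saliency(gate_weights, expert_output_norms, top_k):
    """Score experts by mean(gate * output norm) over the tokens routed to them.

    gate_weights[t][j]        : router weight of expert j for token t (hypothetical layout)
    expert_output_norms[t][j] : L2 norm of expert j's output for token t
    top_k                     : number of experts the router activates per token
    """
    num_experts = len(gate_weights[0])
    totals = [0.0] * num_experts
    counts = [0] * num_experts
    for gates, norms in zip(gate_weights, expert_output_norms):
        # Only the top_k experts by gate weight are active for this token.
        active = sorted(range(num_experts), key=lambda j: gates[j], reverse=True)[:top_k]
        for j in active:
            totals[j] += gates[j] * norms[j]
            counts[j] += 1
    # Experts that were never routed to get a score of 0.0 (an assumption of this sketch).
    return [totals[j] / counts[j] if counts[j] else 0.0 for j in range(num_experts)]


def experts_to_prune(saliency, prune_ratio):
    """Return the indices of the lowest-saliency experts, sorted ascending."""
    n_prune = int(len(saliency) * prune_ratio)
    order = sorted(range(len(saliency)), key=lambda j: saliency[j])
    return sorted(order[:n_prune])


# Toy calibration data: 3 tokens, 4 experts, router picks the top 2 per token.
gate_weights = [
    [0.6, 0.3, 0.1, 0.0],
    [0.5, 0.1, 0.4, 0.0],
    [0.1, 0.2, 0.3, 0.4],
]
expert_output_norms = [
    [2.0, 1.0, 5.0, 1.0],
    [2.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 1.0, 3.0],
]

saliency = expert_saliency(gate_weights, expert_output_norms, top_k=2)
pruned = experts_to_prune(saliency, prune_ratio=0.5)
print(pruned)  # [1, 2]
```

In a real run the gate weights and output norms would come from forward passes over a calibration set, and only the activated experts would be evaluated for each token; the dense toy arrays here just keep the example self-contained.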