Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model
Category: ai-ml
Tags: tiny-ai, tool-calling, on-device-ml
Score: 7.8/10 (Innovation: 8, Technical: 9, Documentation: 7, Utility: 7)
Needle is a 26M-parameter model that distills Gemini's tool-calling capability into a compact architecture built on Simple Attention Networks, small enough for on-device inference and fine-tuning. Its encoder-decoder design uses cross-attention and tied embeddings and drops FFN layers entirely, and the project reports strong results on single-shot function calls despite the model's size.
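To make the architectural terms above concrete, here is a minimal pure-Python sketch of a residual cross-attention block with no feed-forward sublayer, followed by an output projection that reuses the input embedding table. It is an illustration only, not Needle's code: the toy sizes, random weights, single attention head, and the names `attention_only_block` and `tied_logits` are all assumptions, and details such as normalization, masking, and decoder self-attention are left out.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(dec, enc, wq, wk, wv):
    """Decoder states query encoder states (single head, no masking)."""
    q, k, v = matmul(dec, wq), matmul(enc, wk), matmul(enc, wv)
    scale = math.sqrt(len(wq[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

def attention_only_block(dec, enc, wq, wk, wv):
    """Residual cross-attention with no feed-forward sublayer after it."""
    attended, weights = cross_attention(dec, enc, wq, wk, wv)
    out = [[d + a for d, a in zip(drow, arow)] for drow, arow in zip(dec, attended)]
    return out, weights

def tied_logits(hidden, embedding):
    """Reuse the input embedding table as the output projection."""
    return matmul(hidden, transpose(embedding))

# Hypothetical toy sizes and random weights, chosen only for illustration.
rng = random.Random(0)
VOCAB, DIM = 10, 4
embedding = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
wq, wk, wv = ([[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
              for _ in range(3))

enc = [embedding[t] for t in [1, 2, 3]]  # encoder side: e.g. query and tool-schema tokens
dec = [embedding[t] for t in [4, 5]]     # decoder side: e.g. the call generated so far
hidden, weights = attention_only_block(dec, enc, wq, wk, wv)
logits = tied_logits(hidden, embedding)  # one score per vocabulary entry, per decoder position
```

Tying the output projection to the embedding table means the vocabulary matrix is stored once rather than twice, and skipping the feed-forward sublayer removes what is usually the largest share of per-layer weights in a standard transformer block. Both choices are common ways to shrink a model's parameter count.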
Target audience: backend devs, data engineers, AI researchers
Repository: https://github.com/cactus-compute/needle · Python · MIT · 2 stars