Show HN: Sipp – Run small local LLMs in browser 3x faster

Category: library

Tags: llm-inference, webgpu, edge-ai

Score: 7.3/10 (Innovation: 7, Technical: 8, Documentation: 7, Utility: 7)

Sipp is an AI framework for running small LLMs locally in the browser or on edge devices. It offers a unified SDK in which the local and cloud inference APIs are symmetric. The project claims faster time-to-first-token and decode speeds than alternatives such as WebLLM and Transformers.js (the post title cites 3x), which makes it relevant to performance-critical and privacy-sensitive AI applications.
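The two metrics named above, time-to-first-token and decode speed, can be computed from per-token arrival timestamps. The following is a minimal sketch of how a reader might compare runtimes on their own hardware. It does not use Sipp's API, and `RunMetrics` and `summarizeRun` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper, not part of Sipp, WebLLM, or Transformers.js.
// Summarizes one generation run from timestamps in milliseconds.
interface RunMetrics {
  ttftMs: number;             // time from request start to the first token
  decodeTokensPerSec: number; // rate for the tokens that follow the first one
}

function summarizeRun(requestStartMs: number, tokenTimesMs: number[]): RunMetrics {
  if (tokenTimesMs.length === 0) {
    throw new Error("no tokens were produced");
  }
  const first = tokenTimesMs[0];
  const last = tokenTimesMs[tokenTimesMs.length - 1];

  // Time-to-first-token covers model load (if any) and prompt processing.
  const ttftMs = first - requestStartMs;

  // Decode speed counts only tokens after the first, over the time they took.
  const decodedTokens = tokenTimesMs.length - 1;
  const decodeSpanMs = last - first;
  const decodeTokensPerSec =
    decodeSpanMs > 0 ? (decodedTokens * 1000) / decodeSpanMs : 0;

  return { ttftMs, decodeTokensPerSec };
}
```

In practice the timestamps would come from calling `performance.now()` once before the request and once per streamed token, with the same prompt and model size used for every runtime being compared.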

Target audience: backend devs, frontend devs, data engineers

Repository: https://www.sipp.sh · Rust · Apache-2.0 · 2 stars
