Show HN: Memory for LLM apps that cuts input tokens up to 80% (avg 68%)

Category: library

Tags: llm-memory, token-savings, embeddings

Score: 6.8/10 (Innovation: 7, Technical: 7, Documentation: 7, Utility: 6)

Street AI provides a memory layer for LLM applications that combines signal-based retrieval, chunking, embedding, and automatic decay to reduce input token usage (68% on average, up to 80%). Self-organizing stacks, outcome-based boost/demote, and drop-in adapters for major LLM providers are aimed at lowering latency and cost in conversational AI.
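
To make the retrieval, decay, and boost/demote ideas concrete, here is a minimal sketch in Python. It is not Street AI's API: `ToyMemoryStore`, `embed`, `tick`, and `feedback` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and the decay and boost factors are arbitrary. The idea it illustrates is that only the top-ranked memories, rather than the full history, would be placed in the prompt.

```python
import math
from dataclasses import dataclass


def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    counts: dict[str, float] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0.0) + 1.0
    return counts


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Memory:
    text: str
    vector: dict[str, float]
    weight: float = 1.0  # shrinks with decay, changes with feedback


class ToyMemoryStore:
    """Hypothetical store: similarity-ranked retrieval with decaying weights."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay  # assumed per-tick multiplier, not a library default
        self.items: list[Memory] = []

    def add(self, text: str) -> None:
        self.items.append(Memory(text, embed(text)))

    def tick(self) -> None:
        """Apply one step of automatic decay to every memory."""
        for memory in self.items:
            memory.weight *= self.decay

    def retrieve(self, query: str, k: int = 2) -> list[Memory]:
        """Return the k memories with the highest similarity * weight."""
        q = embed(query)
        ranked = sorted(
            self.items,
            key=lambda m: cosine(q, m.vector) * m.weight,
            reverse=True,
        )
        return ranked[:k]

    def feedback(self, memory: Memory, helpful: bool) -> None:
        """Boost a memory that led to a good outcome, demote one that did not."""
        memory.weight *= 1.5 if helpful else 0.5


store = ToyMemoryStore()
store.add("user prefers dark mode in the editor")
store.add("user lives in Berlin")
store.add("user asked about python packaging last week")

# Only the best match goes into the prompt instead of all stored text.
top = store.retrieve("which editor mode does the user prefer", k=1)
print(top[0].text)
```

In this sketch a memory that is never boosted loses weight on every `tick()` and gradually falls out of the top-k results, which is one simple way to read "automatic decay"; the actual library may work differently.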

Target audience: backend developers, AI engineers

Repository: https://github.com/Tem-Degu/streetai-memory · Python · NOASSERTION · 1 star
