Show HN: Memory for LLM apps that cuts input tokens up to 80% (avg 68%)
Category: library
Tags: llm-memory, token-savings, embeddings
Score: 6.8/10 (Innovation: 7, Technical: 7, Documentation: 7, Utility: 6)
Street AI provides a memory layer for LLM applications that combines signal-based retrieval, chunking, embedding, and automatic decay to reduce input token usage, with a reported average saving of 68% and up to 80%. It also offers self-organizing stacks, outcome-based boost/demote, and drop-in adapters for major LLM providers, aimed at lowering latency and cost in conversational AI.
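To make the retrieval idea concrete, here is a minimal sketch of a memory store that ranks entries by similarity, time decay, and outcome-based boost/demote, so that only the top-ranked entries are sent to the model instead of the full history. This is not Street AI's API: `MemoryStore`, `Memory`, `embed`, `feedback`, and the half-life parameter are all hypothetical names chosen for illustration, and the bag-of-words `embed` is a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Memory:
    text: str
    created_at: float              # seconds, on any monotonic clock
    weight: float = 1.0            # raised or lowered by outcome feedback
    vector: Counter = field(init=False, repr=False)

    def __post_init__(self) -> None:
        self.vector = embed(self.text)


class MemoryStore:
    """Hypothetical store: score = similarity * exponential decay * feedback weight."""

    def __init__(self, half_life: float = 3600.0) -> None:
        self.half_life = half_life
        self.items = []

    def add(self, text: str, now: float) -> Memory:
        memory = Memory(text, now)
        self.items.append(memory)
        return memory

    def feedback(self, memory: Memory, helpful: bool, factor: float = 1.5) -> None:
        # Boost memories that led to a good outcome, demote the others.
        memory.weight = memory.weight * factor if helpful else memory.weight / factor

    def score(self, memory: Memory, query_vec: Counter, now: float) -> float:
        age = max(0.0, now - memory.created_at)
        decay = 0.5 ** (age / self.half_life)   # halves every half_life seconds
        return cosine(memory.vector, query_vec) * decay * memory.weight

    def retrieve(self, query: str, now: float, k: int = 2) -> list:
        query_vec = embed(query)
        scored = [(self.score(m, query_vec, now), m) for m in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Drop entries with no overlap so irrelevant text never reaches the prompt.
        return [m for s, m in scored[:k] if s > 0]


store = MemoryStore(half_life=3600.0)
old = store.add("User prefers dark mode in the editor", now=0.0)
new = store.add("User prefers light mode in the editor", now=7200.0)
store.add("User's favorite language is Rust", now=7200.0)

query = "which editor mode does the user prefer"
top = store.retrieve(query, now=7200.0, k=1)
```

In this sketch both editor preferences match the query equally well, so the two-hour-old entry loses to the fresh one purely through decay. Calling `store.feedback(old, helpful=True)` often enough would let the older entry win again, which is one plausible reading of "outcome-based boost/demote".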
Target audience: backend developers, AI engineers
Repository: https://github.com/Tem-Degu/streetai-memory · Python · NOASSERTION · 1 star