Show HN: Tiny long-memory benchmark with Harbor running across Islo sandboxes
Category: ai-ml
Tags: long-memory, benchmark, ai-agents
Score: 4.8/10 (Innovation: 4, Technical: 5, Documentation: 6, Utility: 4)
This project provides a tiny, inspectable benchmark for evaluating long-term memory in AI agents, focusing on retrieval, updates, and abstention. It uses Islo sandboxes and Harbor task wrapping to run parallel evaluations, making failure modes obvious in a compact demo. While small in scope, it offers a clear, practical starting point for testing memory systems beyond simple RAG.
Target audience: AI/ML researchers and engineers working on agent memory systems
Repository: https://zozo123.github.io/longmem-mini-on-islo/ · MIT
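To make the three evaluation axes named in the description concrete (retrieval, updates, abstention), here is a minimal, hypothetical sketch of how such a benchmark could score a memory system per category. It is not taken from the project: `Case`, `DictMemory`, `evaluate`, and the `ABSTAIN` string are illustrative names, and the sketch leaves out Harbor task wrapping and Islo sandboxes entirely.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical sentinel the agent returns when it has no stored answer.
ABSTAIN = "I don't know"


@dataclass
class Case:
    kind: str        # "retrieval", "update", or "abstention"
    history: list    # (key, value) facts shown to the memory, in order
    question: str    # key being asked about
    expected: str    # gold answer, or ABSTAIN


class DictMemory:
    """Toy memory system: last write wins, unknown keys lead to abstention."""

    def __init__(self):
        self.facts = {}

    def observe(self, key, value):
        self.facts[key] = value

    def answer(self, key):
        return self.facts.get(key, ABSTAIN)


def evaluate(memory_factory, cases):
    """Return accuracy per case kind, using a fresh memory for every case."""
    hits, totals = defaultdict(int), defaultdict(int)
    for case in cases:
        memory = memory_factory()
        for key, value in case.history:
            memory.observe(key, value)
        totals[case.kind] += 1
        hits[case.kind] += memory.answer(case.question) == case.expected
    return {kind: hits[kind] / totals[kind] for kind in totals}


# Illustrative cases only; the real benchmark's data is not reproduced here.
CASES = [
    Case("retrieval", [("city", "Lisbon")], "city", "Lisbon"),
    Case("update", [("city", "Lisbon"), ("city", "Porto")], "city", "Porto"),
    Case("abstention", [("city", "Lisbon")], "pet", ABSTAIN),
]

scores = evaluate(DictMemory, CASES)
print(scores)
```

Reporting accuracy per category rather than as one aggregate number keeps failure modes visible: a memory that never overwrites stale facts would still pass the retrieval and abstention cases here but fail the update case.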