Show HN: Tiny long-memory benchmark with Harbor running across Islo sandboxes

Category: ai-ml

Tags: long-memory, benchmark, ai-agents

Score: 4.8/10 (Innovation: 4, Technical: 5, Documentation: 6, Utility: 4)

This project provides a tiny, inspectable benchmark for evaluating long-term memory in AI agents, focusing on three behaviors: retrieval, updates, and abstention. It wraps each task with Harbor and runs the evaluations in parallel across Islo sandboxes, so the demo stays compact and its failure modes are easy to see. While small in scope, it offers a clear, practical starting point for testing memory systems beyond simple RAG.
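
To make the three behaviors concrete, here is a minimal sketch of how such cases might be graded. It is not taken from the project and does not use Harbor or Islo. Every name in it (MemoryCase, grade, run, first_statement_agent, the ABSTAIN string) is a hypothetical chosen for illustration, and the substring-based grading is an assumption rather than the benchmark's actual scoring rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical abstention phrase; the real benchmark may use a different convention.
ABSTAIN = "I don't know"


@dataclass
class MemoryCase:
    kind: str                # "retrieval", "update", or "abstention"
    history: List[str]       # earlier statements the agent has seen
    question: str
    expected: Optional[str]  # None means the correct behavior is to abstain


def grade(case: MemoryCase, answer: str) -> bool:
    """Assumed scoring rule: substring match, or an exact abstention phrase."""
    if case.expected is None:
        return answer.strip().lower() == ABSTAIN.lower()
    return case.expected.lower() in answer.lower()


def run(cases: List[MemoryCase],
        agent: Callable[[List[str], str], str]) -> Dict[str, Tuple[int, int]]:
    """Return (passed, total) per case kind."""
    totals: Dict[str, Tuple[int, int]] = {}
    for case in cases:
        passed, seen = totals.get(case.kind, (0, 0))
        ok = grade(case, agent(case.history, case.question))
        totals[case.kind] = (passed + int(ok), seen + 1)
    return totals


def first_statement_agent(history: List[str], question: str) -> str:
    """Deliberately naive baseline: always echo the oldest statement."""
    return history[0] if history else ABSTAIN


CASES = [
    MemoryCase("retrieval", ["My dog is named Rex."],
               "What is my dog's name?", "Rex"),
    MemoryCase("update", ["I live in Paris.", "I moved to Berlin."],
               "Where do I live now?", "Berlin"),
    MemoryCase("abstention", ["My dog is named Rex."],
               "What is my cat's name?", None),
]

results = run(CASES, first_statement_agent)
for kind, (passed, total) in results.items():
    print(f"{kind}: {passed}/{total}")
```

The naive baseline passes the retrieval case but returns a stale fact on the update case and answers when it should abstain, which is the kind of per-category failure a benchmark like this is meant to surface.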

Target audience: AI/ML researchers and engineers working on agent memory systems

Repository: https://zozo123.github.io/longmem-mini-on-islo/ · MIT
