memlocal
your AI's memory belongs on your device.
memlocal is an open-source, local-first memory layer that gives AI agents persistent, structured recall — designed to run entirely on a smartphone.
it draws on cognitive psychology's multi-store model, stores memories in an embedded CozoDB database, and retrieves context through a multi-channel pipeline combining semantic search, full-text search, graph traversal, and cross-encoder reranking.
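to make the multi-channel idea concrete, here is a minimal std-only rust sketch that merges ranked candidate lists from several channels with reciprocal rank fusion before a reranking step. the function name, the fusion formula, and the constant `k = 60.0` are illustrative assumptions — memlocal's actual fusion and cross-encoder logic may differ.

```rust
use std::collections::HashMap;

/// hypothetical sketch: fuse ranked memory ids from the semantic,
/// full-text, and graph channels with reciprocal rank fusion (RRF).
/// each channel contributes 1 / (k + rank) per hit, so items surfaced
/// by multiple channels float to the top before reranking.
fn rrf_fuse(channels: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranked in channels {
        for (rank, id) in ranked.iter().enumerate() {
            // ranks are 0-based here, so shift by 1 inside the formula
            *scores.entry((*id).to_string()).or_insert(0.0) +=
                1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    let semantic = vec!["m1", "m2", "m3"];
    let fulltext = vec!["m2", "m4"];
    let graph = vec!["m2", "m1"];
    // fused order: m2, m1, m4, m3 — m2 appears in all three channels
    let fused = rrf_fuse(&[semantic, fulltext, graph], 60.0);
    for (id, score) in &fused {
        println!("{id}: {score:.4}");
    }
}
```

in a real pipeline the fused shortlist would then be passed to the cross-encoder reranker, which scores each (query, memory) pair jointly rather than from precomputed embeddings.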
locomo pass rate
80.0%
published benchmark result
avg llm score
4.21 / 5
on locomo
deployment
local-first
embedded, on-device
why memlocal
llms have no persistent memory. when a conversation ends, everything is lost. most memory products solve that by moving recall into cloud infrastructure — external vector databases, graph stores, and vendor-controlled APIs.
memlocal takes the opposite path: memory stays inside the app, inside the device, and inside a single embedded engine.
three constraints
privacy
personal memories stay on the device. no cloud sync, no server-side retention, no third-party memory store.
latency
local retrieval avoids network round-trips, so context assembly happens fast enough to feel native instead of bolted on.
offline
the database, vector index, full-text search, and graph queries all run in-process, so memory keeps working without a connection.
what ships
the core engine is written in rust and compiles to a shared library of roughly 4 MB. a typical store for 1,000 memories with 1536-dimensional embeddings is about 25 MB on disk.
that means the database, vector index, bm25 index, and knowledge graph all run inside one embedded process instead of three or four separate services.
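the ~25 MB figure above is plausible against the raw embedding payload alone, which a quick back-of-envelope check shows. this sketch assumes f32 embeddings stored densely — an assumption; the actual on-disk encoding in CozoDB may differ.

```rust
/// raw bytes needed to store `memories` dense f32 embedding vectors
/// of `dims` dimensions each, before any index or database overhead
fn raw_vector_bytes(memories: usize, dims: usize) -> usize {
    memories * dims * std::mem::size_of::<f32>()
}

fn main() {
    // 1,000 memories × 1,536 dims × 4 bytes ≈ 5.9 MB of raw vectors;
    // the quoted ~25 MB store adds memory text, the bm25 index,
    // graph edges, and storage-engine overhead on top of that
    let bytes = raw_vector_bytes(1_000, 1_536);
    println!("{:.1} MB", bytes as f64 / (1024.0 * 1024.0));
}
```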
a flutter sdk is available first, with native platform sdks planned next.