memlocal
your AI's memory belongs on your device.
memlocal is an open-source, local-first memory layer that gives AI agents persistent, structured recall — designed to run entirely on a smartphone.
it draws on cognitive psychology's multi-store model, stores memories in an embedded CozoDB database, and retrieves context through a multi-channel pipeline combining semantic search, full-text search, graph traversal, and cross-encoder reranking.
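to make the multi-channel idea concrete, here is a minimal std-only rust sketch that merges ranked candidate lists from several channels with reciprocal rank fusion before a reranking step. the function name, the fusion formula, and the constant `k = 60.0` are illustrative assumptions — memlocal's actual fusion and cross-encoder logic may differ.

```rust
use std::collections::HashMap;

/// hypothetical sketch: fuse ranked memory ids from the semantic,
/// full-text, and graph channels with reciprocal rank fusion (RRF).
/// each channel contributes 1 / (k + rank) per hit, so items surfaced
/// by multiple channels float to the top before reranking.
fn rrf_fuse(channels: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranked in channels {
        for (rank, id) in ranked.iter().enumerate() {
            // ranks are 0-based here, so shift by 1 inside the formula
            *scores.entry((*id).to_string()).or_insert(0.0) +=
                1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    let semantic = vec!["m1", "m2", "m3"];
    let fulltext = vec!["m2", "m4"];
    let graph = vec!["m2", "m1"];
    // fused order: m2, m1, m4, m3 — m2 appears in all three channels
    let fused = rrf_fuse(&[semantic, fulltext, graph], 60.0);
    for (id, score) in &fused {
        println!("{id}: {score:.4}");
    }
}
```

in a real pipeline the fused shortlist would then be passed to the cross-encoder reranker, which scores each (query, memory) pair jointly rather than from precomputed embeddings.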
locomo pass rate
80.0%
published benchmark result
avg llm score
4.21 / 5
on locomo
deployment
local-first
embedded, on-device
why memlocal
llms have no persistent memory. when a conversation ends, everything is lost. most memory products solve that by moving recall into cloud infrastructure — external vector databases, graph stores, and vendor-controlled APIs.
memlocal takes the opposite path: memory stays inside the app, inside the device, and inside a single embedded engine.
three constraints
privacy
personal memories stay on the device. no cloud sync, no server-side retention, no third-party memory store.
latency
local retrieval avoids network round-trips, so context assembly happens fast enough to feel native instead of bolted on.
offline
the database, vector index, full-text search, and graph queries all run in-process, so memory keeps working without a connection.
what ships
the core engine is written in rust and compiles to a shared library of roughly 4 MB. a typical store for 1,000 memories with 1536-dimensional embeddings is about 25 MB on disk.
that means the database, vector index, bm25 index, and knowledge graph all run inside one embedded process instead of three or four separate services.
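the ~25 MB figure above is plausible against the raw embedding payload alone, which a quick back-of-envelope check shows. this sketch assumes f32 embeddings stored densely — an assumption; the actual on-disk encoding in CozoDB may differ.

```rust
/// raw bytes needed to store `memories` dense f32 embedding vectors
/// of `dims` dimensions each, before any index or database overhead
fn raw_vector_bytes(memories: usize, dims: usize) -> usize {
    memories * dims * std::mem::size_of::<f32>()
}

fn main() {
    // 1,000 memories × 1,536 dims × 4 bytes ≈ 5.9 MB of raw vectors;
    // the quoted ~25 MB store adds memory text, the bm25 index,
    // graph edges, and storage-engine overhead on top of that
    let bytes = raw_vector_bytes(1_000, 1_536);
    println!("{:.1} MB", bytes as f64 / (1024.0 * 1024.0));
}
```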
a flutter sdk is available first, with native platform sdks planned next.