memlocal
your ai's memory belongs on your device.
your application
agent, assistant, or app runtime
llm extraction
sensory buffer
ring buffer · ttl 5s · capacity 100 · noise filter
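a minimal sketch of that buffer policy in rust, assuming a hypothetical `SensoryBuffer` type (the name and methods are illustrative, not memlocal's api): expired entries are dropped on each push, and the oldest entry is evicted once capacity is reached.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// illustrative sensory buffer: a bounded ring with a per-entry ttl.
/// names and policy details are assumptions, not memlocal's actual api.
struct SensoryBuffer {
    entries: VecDeque<(Instant, String)>,
    capacity: usize,
    ttl: Duration,
}

impl SensoryBuffer {
    fn new(capacity: usize, ttl: Duration) -> Self {
        Self { entries: VecDeque::new(), capacity, ttl }
    }

    /// drop expired entries, then append; evict the oldest if full.
    fn push(&mut self, text: &str) {
        let now = Instant::now();
        self.entries.retain(|(t, _)| now.duration_since(*t) < self.ttl);
        if self.entries.len() == self.capacity {
            self.entries.pop_front(); // ring-buffer eviction
        }
        self.entries.push_back((now, text.to_string()));
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

fn main() {
    let mut buf = SensoryBuffer::new(3, Duration::from_secs(5));
    for msg in ["a", "b", "c", "d"] {
        buf.push(msg);
    }
    // capacity 3: "a" was evicted when "d" arrived
    assert_eq!(buf.len(), 3);
    println!("buffered entries: {}", buf.len());
}
```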
context assembly
working memory
flat single-hop or sectioned multi-hop context
key facts → top evidence → raw excerpts → session context
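the sectioned layout above can be sketched as a function that concatenates the four sections in that fixed order; `assemble_context` and the section headers are assumptions for illustration, and empty sections are skipped (a flat single-hop context keeps only what it needs).

```rust
/// illustrative context assembly: concatenates sections in a fixed order.
/// section names follow the ordering above; the function is an assumption,
/// not memlocal's actual api.
fn assemble_context(
    key_facts: &[&str],
    top_evidence: &[&str],
    raw_excerpts: &[&str],
    session: &[&str],
) -> String {
    let sections = [
        ("key facts", key_facts),
        ("top evidence", top_evidence),
        ("raw excerpts", raw_excerpts),
        ("session context", session),
    ];
    let mut out = String::new();
    for (title, items) in sections {
        if items.is_empty() {
            continue; // a flat single-hop context may skip sections
        }
        out.push_str(&format!("## {title}\n"));
        for item in items {
            out.push_str(&format!("- {item}\n"));
        }
    }
    out
}

fn main() {
    let ctx = assemble_context(
        &["user prefers rust"],
        &["stated directly in conversation"],
        &[],
        &["current topic: memory systems"],
    );
    print!("{ctx}");
}
```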
store / retrieve
long-term memory
on a smartphone, you cannot justify running pinecone, elasticsearch, and neo4j just to give an agent memory. memlocal instead builds on CozoDB, which collapses graph, vector, full-text, and relational storage into a single embedded engine.
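to illustrate the "one engine" idea, here is a toy in-memory store whose single triple table serves relational lookups, graph traversal, and keyword matching at once. this is a deliberately tiny stand-in for the concept, not CozoDB's actual api.

```rust
/// toy stand-in for a single embedded engine: one dataset of
/// (subject, predicate, object) triples answers relational, graph,
/// and keyword queries. not cozodb's api, just the unification idea.
struct TinyStore {
    triples: Vec<(String, String, String)>,
}

impl TinyStore {
    fn new() -> Self {
        Self { triples: Vec::new() }
    }

    fn insert(&mut self, s: &str, p: &str, o: &str) {
        self.triples.push((s.into(), p.into(), o.into()));
    }

    /// relational lookup: all rows with a matching subject
    fn rows_for(&self, subject: &str) -> Vec<&(String, String, String)> {
        self.triples.iter().filter(|(s, _, _)| s == subject).collect()
    }

    /// one-hop graph traversal: objects reachable from a subject
    fn neighbors(&self, subject: &str) -> Vec<&str> {
        self.triples
            .iter()
            .filter(|(s, _, _)| s == subject)
            .map(|(_, _, o)| o.as_str())
            .collect()
    }

    /// substring match over all fields (stand-in for full-text search)
    fn keyword(&self, term: &str) -> Vec<&(String, String, String)> {
        self.triples
            .iter()
            .filter(|(s, p, o)| s.contains(term) || p.contains(term) || o.contains(term))
            .collect()
    }
}

fn main() {
    let mut db = TinyStore::new();
    db.insert("alice", "likes", "rust");
    db.insert("alice", "knows", "bob");
    assert_eq!(db.neighbors("alice").len(), 2);
    assert_eq!(db.keyword("rust").len(), 1);
    println!("rows for alice: {}", db.rows_for("alice").len());
}
```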
how a query is answered
- the raw query enters the retrieval pipeline.
- a query classifier determines whether the question is single-hop, multi-hop, temporal, or open-ended.
- six retrieval channels run in parallel: per-type semantic search, bm25 keyword matching, recursive graph traversal, triple fts, session-window expansion, and speaker-filtered search.
- all candidates are pooled, deduplicated, and reranked with a cross-encoder.
- the top-ranked items are assembled into a context block that adapts to query complexity.
- the context is injected into the llm prompt so most questions can be answered in a single call.
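the pooling, deduplication, and reranking steps above can be sketched as follows. the channel functions and the word-overlap scorer are stand-ins (a real cross-encoder scores query/candidate pairs jointly), and the channels run sequentially here rather than in parallel.

```rust
use std::collections::HashSet;

/// illustrative retrieval pipeline: run several channels, pool and
/// deduplicate candidates, then rerank by a scoring function.
/// channel contents and the scorer are stand-ins, not memlocal's internals.
fn retrieve(query: &str, channels: &[fn(&str) -> Vec<String>], top_k: usize) -> Vec<String> {
    // 1. run all channels and pool their candidates
    let mut pooled: Vec<String> = channels.iter().flat_map(|c| c(query)).collect();

    // 2. deduplicate while preserving first-seen order
    let mut seen = HashSet::new();
    pooled.retain(|item| seen.insert(item.clone()));

    // 3. rerank: stand-in scorer counts query words found in the candidate
    let score = |item: &String| -> usize {
        query.split_whitespace().filter(|w| item.contains(*w)).count()
    };
    pooled.sort_by_key(|item| std::cmp::Reverse(score(item)));
    pooled.truncate(top_k);
    pooled
}

// fake channels standing in for semantic search and bm25 keyword matching
fn semantic(_q: &str) -> Vec<String> {
    vec!["alice likes rust".into(), "bob likes go".into()]
}

fn keyword(_q: &str) -> Vec<String> {
    vec!["alice likes rust".into(), "rust is a language".into()]
}

fn main() {
    let channels: [fn(&str) -> Vec<String>; 2] = [semantic, keyword];
    let results = retrieve("who likes rust", &channels, 2);
    // the duplicate candidate appears once; the best match ranks first
    assert_eq!(results.len(), 2);
    println!("{results:?}");
}
```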
status
memlocal is under active development. the rust core compiles to shared libraries, with the flutter sdk available now and more native bindings planned.
open source under the apache license 2.0.
if you care about local-first ai, private-by-default memory, or just think agents should remember things without phoning home, follow along: the project is being built in the open.