Memory Architecture
DuckLM currently has two memory layers:
- SQLite memory in duck_core.memory.store.MemoryStore for durable structured records.
- Vector memory in duck_core.memory.vector_memory.VectorMemory for semantic search through Qdrant.
SQLite Memory
SQLite is the primary durable store. The runtime writes memory records after
memory_policy decides that a completed task contains reusable information.
Manual memory records can also be added through /v1/memory and the WebChat
memory drawer.
SQLite memory remains available even when Qdrant is down.
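The relationship between the two layers can be illustrated with a minimal sketch. It is not DuckLM's code: the table layout, the remember function, and the VectorStoreDown error are hypothetical names chosen for the example, and treating the vector write as best-effort is an assumption about how the durable store stays usable while Qdrant is down.

```python
import sqlite3


class VectorStoreDown(Exception):
    """Hypothetical error for an unreachable vector layer."""


def remember(conn, vector_add, summary):
    """Write the durable SQLite record first, then try the vector layer."""
    cur = conn.execute("INSERT INTO memory (summary) VALUES (?)", (summary,))
    conn.commit()
    try:
        vector_add(cur.lastrowid, summary)
        indexed = True
    except VectorStoreDown:
        # The SQLite row is already committed, so the record is not lost.
        indexed = False
    return cur.lastrowid, indexed


def offline_vector_add(record_id, text):
    # Simulates Qdrant being down.
    raise VectorStoreDown("Qdrant unreachable")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (id INTEGER PRIMARY KEY, summary TEXT)")

record_id, indexed = remember(conn, offline_vector_add, "User prefers metric units")
rows = conn.execute("SELECT summary FROM memory").fetchall()
```

In this sketch the record is readable from SQLite even though the vector write failed; the indexed flag only reports whether semantic search can find it yet.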
Vector Memory
Vector memory stores the same useful memory summaries in Qdrant when vector storage is configured and reachable. Qdrant is managed by the local service scripts:
bash scripts/duck.sh start
bash scripts/duck.sh status --probe
bash scripts/duck.sh stop
The MTP stack uses the same memory lifecycle through scripts/duck-mtp.sh.
Embeddings
The default embedding source is a local sentence-transformers model:
./models/all-MiniLM-L6-v2
VectorMemory lazy-loads that model only when it needs to write or search
vectors. Health checks do not load the embedding model; they only probe Qdrant.
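The lazy-loading behavior described above can be sketched as follows. LazyEmbedder, its methods, and fake_loader are hypothetical names for illustration only; the real VectorMemory class may be structured differently. The fake loader stands in for loading the local sentence-transformers model so the example runs without model files.

```python
class LazyEmbedder:
    """Hypothetical wrapper that defers model loading until first use."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self._model = None
        self.load_count = 0

    def _get_model(self):
        # Load on the first write or search, then reuse the cached model.
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model

    def embed(self, text):
        return self._get_model()(text)

    def health(self, probe):
        # Only the backend probe runs here; the model is left untouched.
        return probe()


def fake_loader():
    # Stand-in for loading ./models/all-MiniLM-L6-v2 with sentence-transformers.
    return lambda text: [float(len(text))]


embedder = LazyEmbedder(fake_loader)
healthy = embedder.health(lambda: True)
loads_after_health = embedder.load_count
vector = embedder.embed("duck")
embedder.embed("pond")
```

The health call leaves load_count at zero, and repeated embed calls load the model only once.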
A remote OpenAI-compatible embeddings endpoint can be used by setting
embeddings_base_url, but the normal local stack does not rely on
llama-server embeddings.
If no embedding source is configured, VectorMemory raises
EmbeddingsUnavailableError. It does not silently invent fallback embeddings.
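A rough sketch of the fail-loudly behavior follows. The function pick_embedding_source and its parameters are hypothetical, the error class is a local stand-in for the one named above, and the precedence (remote URL first, then local model directory) is an assumption rather than documented behavior.

```python
from pathlib import Path


class EmbeddingsUnavailableError(RuntimeError):
    """Local stand-in for the error that VectorMemory raises."""


def pick_embedding_source(local_model_path=None, embeddings_base_url=None):
    """Return a (kind, location) pair, or raise if nothing is configured."""
    if embeddings_base_url:
        # Assumed precedence: an explicit remote endpoint wins.
        return ("remote", embeddings_base_url)
    if local_model_path and Path(local_model_path).is_dir():
        return ("local", str(local_model_path))
    # No fallback vectors are invented; the caller has to handle the error.
    raise EmbeddingsUnavailableError("no embedding source configured")


try:
    pick_embedding_source()
    raised = False
except EmbeddingsUnavailableError:
    raised = True
```

Raising instead of returning placeholder vectors keeps meaningless embeddings out of Qdrant, where they would quietly degrade search results.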
Status And Verification
Runtime status is available through:
curl --noproxy '*' 'http://127.0.0.1:8000/v1/status?probe=true'
scripts/duck.sh status --probe prints the same backend result plus Docker
Compose state for Qdrant. WebChat also shows model and vector memory state in
the Runtime panel.
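The same status request can be made from Python with only the standard library. This is a sketch: status_url and fetch_status are hypothetical helper names, and it assumes the endpoint returns a JSON body, which the text above does not specify.

```python
import json
import urllib.request


def status_url(base="http://127.0.0.1:8000", probe=True):
    """Build the status URL; the probe flag mirrors ?probe=true."""
    url = f"{base}/v1/status"
    return f"{url}?probe=true" if probe else url


def fetch_status(base="http://127.0.0.1:8000", probe=True, timeout=5.0):
    """Fetch runtime status, bypassing any configured proxy."""
    # An empty ProxyHandler ignores proxy environment variables,
    # which matches curl's --noproxy '*'.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(status_url(base, probe), timeout=timeout) as resp:
        # Assumption: the response body is JSON.
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_status() requires the local stack to be running; status_url() can be used on its own to check the request target.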
The live smoke test for Qdrant write/search is:
.venv/bin/python -m pytest tests/smoke/test_vector_memory_live.py -q
The test skips when Qdrant is not reachable, and runs a real add/search cycle when the local stack is up.
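One common way to get this skip-or-run behavior is a cheap TCP reachability probe evaluated before the test body. The sketch below is not the contents of test_vector_memory_live.py: qdrant_reachable is a hypothetical helper, and the host and port are assumptions (6333 is Qdrant's default HTTP port, but the local stack may map a different one).

```python
import socket


def qdrant_reachable(host="127.0.0.1", port=6333, timeout=0.5):
    """Return True if a TCP connection to the given host and port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unroutable: treat the service as down.
        return False


# In a pytest module the probe could guard every test in the file:
#
#   import pytest
#   pytestmark = pytest.mark.skipif(
#       not qdrant_reachable(), reason="Qdrant is not reachable"
#   )
```

A short timeout keeps the suite fast when the stack is stopped, since the probe fails quickly instead of waiting on a full client connection.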