Theory · 6 min read · July 1, 2026

What Is the Living Context Engine?

The Living Context Engine is Feather DB's model of AI memory: a system that actively manages what it knows through continuous decay, recall stickiness, and graph relationships — so the working memory evolves with usage rather than accumulating indiscriminately.

Feather DB
Engineering

The Living Context Engine is the model that defines how Feather DB manages AI memory. "Living" means the memory system is not static — it decays, strengthens, and reorganizes based on how information is actually used. Facts recalled frequently resist decay and stay accessible. Facts never recalled fade automatically. The result is a working memory layer that reflects what's been recently relevant, not a growing accumulation of everything ever stored.

Why "living" is the key distinction

Standard memory systems for AI agents are passive. You store embeddings. You retrieve by similarity. The store grows indefinitely. Nothing is ever forgotten unless you explicitly delete it. Every stored fact has equal retrieval eligibility in perpetuity.

This creates two failure modes as agents run over time. First, retrieval quality degrades — stale facts from months ago compete equally with fresh facts from yesterday, so the model receives a mix of current and outdated information. Second, storage costs grow without bound — every session adds to the store, nothing leaves, and the index grows until performance degrades.

A living context engine solves both. It applies time-based decay so old, unused facts lose retrieval priority. It applies recall stickiness so frequently used facts resist decay. It manages the working set size automatically through the scoring mechanics, with no manual curation, no explicit deletion rules, and no per-item TTL configuration.

The five layers of Feather DB's Living Context Engine

Layer 1: Adaptive Memory. Each stored memory carries a recall count and an importance weight. The scoring formula combines these with a configurable half-life:

stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)
final_score   = ((1 - time_weight) × similarity
                 + time_weight × recency) × importance

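A memory with a recall count of 10 has a stickiness of about 3.4 (taking log as the natural logarithm, 1 + ln 11 ≈ 3.40), so it ages at roughly 29% of the base rate. A memory that is never recalled has a stickiness of 1 and decays at full speed. Importance weights (set at ingest time, updatable at runtime) scale the final score. Because recency contributes only the time_weight share of the score, a memory's score never falls below (1 - time_weight) × similarity × importance, which is how core brand guidelines stored at importance 1.0 can stay above the retrieval threshold.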
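
As a rough illustration, the formula above can be written as a few Python functions. This is a sketch, not Feather DB's implementation: the function names are hypothetical, log is assumed to be the natural logarithm (which reproduces the ~3.4 stickiness figure), and the defaults of a 30-day half-life and a time_weight of 0.3 are borrowed from the example values in the FAQ.

```python
import math


def stickiness(recall_count: int) -> float:
    # Assumption: natural log, so recall_count=10 gives roughly 3.4.
    return 1.0 + math.log(1.0 + recall_count)


def recency(age_in_days: float, recall_count: int, half_life_days: float) -> float:
    # Frequently recalled memories age more slowly than the calendar does.
    effective_age = age_in_days / stickiness(recall_count)
    return 0.5 ** (effective_age / half_life_days)


def final_score(
    similarity: float,
    age_in_days: float,
    recall_count: int,
    importance: float = 1.0,
    half_life_days: float = 30.0,  # example value, not a documented default
    time_weight: float = 0.3,      # example value, not a documented default
) -> float:
    r = recency(age_in_days, recall_count, half_life_days)
    return ((1.0 - time_weight) * similarity + time_weight * r) * importance


# Two 60-day-old memories with the same similarity to the query.
never_recalled = final_score(similarity=0.8, age_in_days=60, recall_count=0)
recalled_often = final_score(similarity=0.8, age_in_days=60, recall_count=10)
```

Under these assumptions the never-recalled memory scores 0.635 and the memory recalled 10 times scores about 0.76, so the frequently used fact ranks first even though both are equally old and equally similar.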

Layer 2: Context Graph. Memories connect to each other via typed, weighted edges. There are nine relationship types: supports, contradicts, supersedes, same_session, caused_by, derived_from, related_to, parent, child. BFS traversal from a retrieved memory surfaces connected context up to a configurable hop depth. A single query can return not just the most similar fact but the chain of related facts that give it meaning.
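
The hop-limited traversal can be sketched in plain Python. Everything here is illustrative: the edge dictionary, the memory IDs, and the expand_context function are hypothetical stand-ins, not Feather DB's storage format or API. Only the relationship type names come from the list above.

```python
from collections import deque

# Hypothetical edge list: source id -> [(target id, relation type, weight)]
EDGES = {
    "pricing_v2": [
        ("pricing_v1", "supersedes", 1.0),
        ("launch_notes", "related_to", 0.6),
    ],
    "launch_notes": [("q3_campaign", "caused_by", 0.8)],
    "q3_campaign": [("brand_guide", "derived_from", 0.9)],
}


def expand_context(start, edges, max_hops=2, allowed=None):
    """Breadth-first walk from `start`, stopping after `max_hops` edges.

    Returns (target, relation, weight, hop) tuples in discovery order.
    `allowed` optionally restricts traversal to a set of relation types.
    """
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for target, relation, weight in edges.get(node, []):
            if allowed is not None and relation not in allowed:
                continue
            if target in seen:
                continue  # avoid revisiting memories and looping on cycles
            seen.add(target)
            found.append((target, relation, weight, depth + 1))
            queue.append((target, depth + 1))
    return found


two_hops = [t for t, _, _, _ in expand_context("pricing_v2", EDGES, max_hops=2)]
```

With max_hops=2 the walk reaches pricing_v1, launch_notes, and q3_campaign, and stops before brand_guide, which sits three hops away. Passing allowed={"supersedes"} would restrict the walk to version history only.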

Layer 3: Semantic Search. HNSW (Hierarchical Navigable Small World) indexing with AVX2/AVX512 SIMD acceleration. 0.19ms p50 approximate nearest-neighbor search on 500K vectors. 97.2% recall@10. Hybrid BM25+dense search via Reciprocal Rank Fusion for queries that benefit from both keyword and semantic matching.
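
Reciprocal Rank Fusion itself is simple enough to sketch. The snippet below is a generic version of the technique, not Feather DB's code: the function name and memory IDs are made up, and k=60 is the constant commonly used for RRF in the literature, assumed here because the article does not state which value Feather DB uses.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists into one ranking.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    Ties are broken by ID so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword (BM25) pass and a dense vector pass.
bm25_hits = ["m3", "m1", "m7"]
dense_hits = ["m1", "m4", "m3"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Here m1 and m3 rise to the top because both retrievers returned them, while m4 and m7 each appear in only one list. RRF needs only ranks, so the BM25 and cosine scores never have to be put on a common scale.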

Layer 4: Metadata Intelligence. Rich per-memory attributes, namespace isolation for multi-tenant deployments, and filtered search. Metadata can include session IDs, user IDs, timestamps, source type, confidence level — all queryable at retrieval time without affecting vector search performance.
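
To make namespace isolation and filtered search concrete, here is a minimal brute-force sketch. It assumes a simple pre-filter followed by cosine ranking; the Memory class, the filtered_search function, and the field names are hypothetical, and a real engine would apply the filter inside its index rather than scanning a list.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: str
    vector: list
    namespace: str
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filtered_search(memories, query, namespace, where=None, top_k=3):
    """Keep only memories in `namespace` whose metadata matches every
    key/value pair in `where`, then rank the survivors by cosine similarity."""
    where = where or {}
    candidates = [
        m for m in memories
        if m.namespace == namespace
        and all(m.metadata.get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda m: cosine(query, m.vector), reverse=True)
    return [m.id for m in ranked[:top_k]]


store = [
    Memory("a", [1.0, 0.0], "tenant_1", {"user_id": "u1", "source": "chat"}),
    Memory("b", [0.9, 0.1], "tenant_1", {"user_id": "u2", "source": "chat"}),
    Memory("c", [1.0, 0.0], "tenant_2", {"user_id": "u1", "source": "chat"}),
    Memory("d", [0.0, 1.0], "tenant_1", {"user_id": "u1", "source": "email"}),
]
hits = filtered_search(store, [1.0, 0.0], "tenant_1", where={"user_id": "u1"})
```

Memory c matches the query vector exactly but never appears, because it belongs to a different namespace. Memory b is excluded by the user_id filter, leaving a and d.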

Layer 5: Deploy Anywhere. Single .feather file, no server, embedded operation. pip install feather-db. Self-hosted Docker available. The same API runs in a laptop process, a cloud VM, or a serverless function.

The Read-Reason-Update-Decay loop

The "living" property comes from a continuous loop that runs on every interaction:

Read: Retrieve the top-k most relevant memories using adaptive scoring — not flat cosine similarity, but similarity weighted by recency, stickiness, and importance.

Reason: The AI agent uses retrieved memories as context for its response. Graph traversal surfaces additional context connected to the retrieved facts. The model output is grounded in the current working memory state.

Update: After retrieval, recall counts increment on every returned memory. Importance scores can be updated programmatically based on outcome quality. New memories from the session are ingested with fresh timestamps.

Decay: On the next retrieval pass, the scoring formula re-applies. Time has passed. Memories not recalled have lower effective scores. The working set has shifted slightly toward what's been recently relevant.

No manual curation happens in this loop. No explicit deletion. No TTL management. The system curates itself through the mechanics of the scoring formula.
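
The loop can be simulated in a few lines to show how the working set shifts without any deletion. This is a toy model under stated assumptions: the Memory class, the read function, the fixed similarity values, and the day-based clock are all hypothetical, the scoring uses natural log with a 30-day half-life and a time_weight of 0.3, and the Reason step is left as a comment.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    similarity: float      # stand-in for a real query/vector similarity
    created_day: float
    recall_count: int = 0
    importance: float = 1.0


def score(m, today, half_life_days=30.0, time_weight=0.3):
    stickiness = 1.0 + math.log(1.0 + m.recall_count)
    effective_age = (today - m.created_day) / stickiness
    recency = 0.5 ** (effective_age / half_life_days)
    return ((1.0 - time_weight) * m.similarity + time_weight * recency) * m.importance


def read(store, today, top_k=1):
    # Read: rank by adaptive score. Python's sort is stable, so ties keep store order.
    hits = sorted(store, key=lambda m: score(m, today), reverse=True)[:top_k]
    for m in hits:
        m.recall_count += 1  # Update: every returned memory gets stickier
    return hits


# Two memories stored on day 0 with identical similarity to the query.
store = [Memory("A", 0.8, created_day=0), Memory("B", 0.8, created_day=0)]
for day in (0, 30):
    hits = read(store, today=day)
    # Reason: an agent would use `hits` as context for its response here.

# Decay: by day 60 the recalled memory outranks the one that was never returned.
ranking_day_60 = [m.text for m in sorted(store, key=lambda m: score(m, 60), reverse=True)]
```

A and B start out tied, but A is returned on day 0 and again on day 30, so by day 60 its effective age is less than half of B's. Nothing was deleted, yet B has drifted down the ranking.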

What the Living Context Engine achieves on benchmarks

LongMemEval is the standard benchmark for AI long-term memory accuracy. It tests recall, temporal reasoning, preference tracking, and knowledge updates across simulated long-running agent sessions.

System                      LongMemEval score   Relative cost per query
GPT-4o full context         0.640               40× (baseline)
Feather DB + GPT-4o         0.693               1× (40× cheaper)
Feather DB + Gemini Flash   0.657               ~$2.40 full benchmark run

Feather DB with GPT-4o scores 0.693 against 0.640 for full-context GPT-4o, an 8.3% relative improvement (5.3 points absolute) at 40× lower per-query cost. The accuracy improvement comes from retrieval quality: the Living Context Engine surfaces fresher, higher-signal memories than a flat store or full-context window that treats all information equally.

FAQ

What does "living" mean in the Living Context Engine?

"Living" means the memory system actively manages what it knows — facts that are recalled frequently stay accessible and resist decay, while facts never recalled automatically fade. The working memory evolves based on usage patterns rather than accumulating indiscriminately.

How is the Living Context Engine different from RAG?

RAG retrieves documents by similarity without tracking recency or usage history. The Living Context Engine adds decay, stickiness, and graph relationships so retrieval reflects not just semantic similarity but temporal relevance and connection to other known facts.

Does the Living Context Engine require manual configuration to manage memory size?

No. The decay mechanics manage working set size automatically. Set the half-life parameter (e.g., 30 days) and time_weight (e.g., 0.3), and memories naturally fade below retrieval threshold without manual deletion or TTL rules per item.

What benchmark does Feather DB use to measure memory quality?

Feather DB reports scores on LongMemEval, the standard benchmark for AI long-term memory. Feather DB with GPT-4o scores 0.693 vs 0.640 for GPT-4o full-context, which is 8.3% higher in relative terms at 40× lower per-query cost.

Is the Living Context Engine available as a hosted service?

Feather DB currently runs as an embedded library (single .feather file, pip install feather-db) and self-hosted Docker. A Cloud offering is planned for Q3 2026. The embedded model is production-ready today.