Back to Theory
Theory8 min read · June 16, 2026

What Is Feather DB? The Embedded AI Context Engine Explained

Feather DB is an embedded C++17 vector database with HNSW indexing, adaptive memory decay, and hybrid BM25+dense search — installed with one pip command, no server required. This definitive guide covers everything: architecture, benchmarks, comparisons, and quickstart code.

F
Feather DB
Engineering

What Is Feather DB? The Embedded AI Context Engine Explained

Definitive Reference · Feather DB · June 2026


The Short Answer

Feather DB is an embedded vector database written in C++17 that runs inside your Python process — no server, no Docker, no infrastructure. It uses HNSW (Hierarchical Navigable Small World) indexing for millisecond-speed approximate nearest-neighbor search, and layers a proprietary adaptive decay system on top to give AI agents memory that behaves like human working memory: important things stay sharp, unused things fade, frequently recalled things become stickier over time.

Install it in one command:

pip install feather-db

License: MIT. Zero dependencies beyond NumPy. Your data stays in a single .feather file on your local disk — offline-capable, privacy-preserving, portable.


What Makes Feather DB Different From Other Vector Databases?

Most vector databases are infrastructure products. Pinecone is a hosted SaaS. ChromaDB requires running a server process. Weaviate ships as a Docker container. Qdrant, Milvus, pgvector — all designed around the assumption that your vector index lives on a machine you can curl against.

Feather DB takes the opposite bet: the best place for your AI's memory is inside the application itself.

But the deeper differentiator is not the embedded model. It is the Living Context Engine — Feather DB's adaptive scoring layer that transforms a static vector index into memory that actively reflects what your AI system has found useful. Three capabilities no other embedded vector database offers:

1. Adaptive Decay — Memory That Forgets Intelligently

Every vector in Feather DB has a decay score that decreases over time. Memories grow stale if they are never recalled. But — and this is the critical innovation — recall frequency slows the decay. A memory retrieved 20 times ages at 24% of the normal rate. A memory never retrieved after ingestion fades toward zero weight within a configurable half-life window.

The formula, implemented in include/scoring.h:

stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)
final_score   = ((1 - time_weight) × similarity
                 + time_weight × recency) × importance

Default parameters: half_life = 30 days, time_weight = 0.3. Every parameter is configurable per-database.

2. Graph Traversal — Context as a Network, Not a List

Feather DB's context_chain API combines HNSW vector search with BFS graph traversal. Seed the graph with the top-k semantically similar nodes, then follow typed relationship edges (e.g. same_ad, contradicts, supports) up to n hops deep. The result is a context subgraph — not a flat list of chunks, but a connected map of everything your agent needs to know.

3. Hybrid BM25 + Dense Search

Sparse keyword matching (BM25) and dense semantic similarity run in parallel and are fused at query time. Precise keyword matches (product names, error codes, proper nouns) are not lost to pure vector similarity, and semantic matches are not blocked by vocabulary mismatches. The best of both retrieval paradigms in a single call.


Feather DB Architecture: How It Works

HNSW Indexing

The core index is an HNSW graph (M=16, ef_construction=200 by default, up to 1M elements per index). HNSW enables approximate nearest-neighbor search in O(log N) time with controllable recall. At 500,000 vectors, Feather DB returns results in 0.19 ms with recall@10 = 97.2%.

Persisted HNSW Graph (v0.16.0)

Starting in v0.16.0, the HNSW graph structure is persisted directly to the .feather file alongside vectors and metadata. On reload, the graph is restored without rebuilding — delivering 5–25x faster cold load times for large indexes. A 500K-vector index that previously took ~40 seconds to reload now loads in under 2 seconds.

Single-File Storage

Everything — vectors, metadata, graph topology, BM25 term statistics, typed edges — is serialized to a single binary .feather file (format v5). No WAL, no separate config, no sidecar processes. Copy the file to move your memory. Delete it to wipe it. Git-track it if you want version control on your AI's memory state.

Metadata and Importance Scoring

Every vector node carries a Metadata struct:

struct Metadata {
    float    importance;        // [0.0–1.0], user-set
    uint32_t recall_count;      // auto-incremented on every search hit
    int64_t  last_recalled_at;  // Unix timestamp of last retrieval
    std::vector<Edge> edges;    // typed outgoing graph edges
    std::unordered_map<std::string,std::string> attributes; // KV store
};

importance is a static signal you set at ingestion — spend data, confidence score, user rating. recall_count is a dynamic signal Feather DB maintains automatically. The decay formula fuses both into final_score at query time.


Quickstart: Up and Running in 5 Minutes

pip install feather-db
import feather_db
import numpy as np

# Open (or create) a database — one file, no server
db = feather_db.DB.open("my_agent.feather", dim=1536)

# Build your embedding (any model: OpenAI, Gemini, local)
vec = np.array(your_embedding_function("Some important context"), dtype=np.float32)

# Add a node with metadata
meta = feather_db.Metadata()
meta.importance = 0.9
meta.set_attribute("source", "campaign_result")
meta.set_attribute("created_at", "2026-06-16")

db.add(id=1, vec=vec, meta=meta)

# Search — returns decay-scored results
results = db.search(query_vec, k=5)
for r in results:
    print(r.id, r.score, r.metadata.get_attribute("source"))

# Link nodes into a graph
db.link(from_id=1, to_id=2, rel_type="supports", weight=0.9)

# Context chain: semantic search + graph traversal
chain = db.context_chain(query_vec, k=5, hops=2)
for node in chain.nodes:
    print(f"hop={node.hop}  score={node.score:.4f}")

# Persist to disk
db.save()

Hybrid BM25 + Dense Search

# Hybrid search fuses keyword and semantic signals automatically
results = db.hybrid_search(
    query_vec=query_vec,
    query_text="feather db HNSW recall",
    k=10,
    alpha=0.7  # 0=pure BM25, 1=pure dense, 0.7=default blend
)

MCP Integration (Claude Desktop / Claude Code)

// ~/.config/claude/claude_desktop_config.json
{
  "mcpServers": {
    "feather-memory": {
      "command": "python",
      "args": ["-m", "feather_db.mcp_server", "--db", "~/agent_memory.feather"]
    }
  }
}

Claude Desktop and Claude Code connect to Feather DB as an MCP memory server. The agent's memory persists across sessions, decays intelligently between conversations, and is always available offline.


Performance Benchmarks

Metric Result Conditions
Retrieval latency (500K vectors) 0.19 ms HNSW, M=16, ef=200, Apple M2
Recall@10 97.2% 500K vectors, cosine similarity
Cold load speed (v0.16.0) 5–25× faster Persisted HNSW graph vs. rebuild
LongMemEval QA accuracy 0.693 vs. GPT-4o full-context 0.640
Cost vs. GPT-4o full-context 38× cheaper Same LongMemEval benchmark task

LongMemEval Result Explained

LongMemEval is the standard benchmark for long-horizon conversational AI memory. On the QA accuracy task, Feather DB's adaptive decay retrieval scores 0.693 — beating GPT-4o's full-context approach (0.640) while consuming 38× fewer tokens per query. The win comes from the decay system surfacing the right memories rather than flooding the context window with everything.


Feather DB vs. Alternatives: The Honest Comparison

Feature Feather DB Pinecone ChromaDB Mem0 pgvector
Embedded (no server) Yes No (hosted) No (server) No (hosted) No (Postgres)
Adaptive memory decay Yes No No Partial No
Graph traversal (context_chain) Yes No No No No
BM25 + dense hybrid search Yes Sparse+dense No No Partial
MCP server built-in Yes No No Yes No
Offline / air-gapped Yes No Yes No Yes
Single-file persistence Yes No No No No
License MIT Proprietary Apache 2.0 Proprietary PostgreSQL
Cost Free $70+/mo Free $99+/mo Free

When to use Feather DB: AI agents that need persistent memory, coding assistants, offline applications, apps where data cannot leave the device, performance marketing stacks, enterprise AI with privacy requirements, MCP integrations with Claude.

When to use Pinecone: Multi-region distributed search with SLA guarantees at very large scale (>100M vectors), no Python requirement.

When to use ChromaDB: Lightweight RAG prototype where decay and graph traversal are not needed and you want a familiar Python-first API without the C++ layer.


The Stickiness Progression: How Recall Slows Decay

recall_count stickiness factor Effective aging rate What this means
0 1.00 1.0× (normal) Fresh memory, ages at full speed
5 2.79 0.36× Well-used memory ages 3× slower
10 3.40 0.29× Core memory ages very slowly
20 4.09 0.24× Institutional knowledge stays sharp
50 5.02 0.20× Foundational context barely ages

No manual curation. No "update the documentation" tickets. The access pattern is the memory signal. What your agent actually uses stays. What it never touches fades.


Use Cases

AI Agents and Autonomous Systems

Agents built on LLMs suffer from stateless context windows. Every new conversation starts blank. Feather DB gives agents a persistent episodic memory that persists across sessions, accumulates domain knowledge over time, and surfaces the right context without stuffing the full history into every prompt.

Performance Marketing (Hawky.ai)

Hawky.ai uses Feather DB as its core context layer for AI-driven creative strategy. Campaign results, audience insights, competitive signals, and creative performance data are ingested as nodes. When the AI briefs a new creative, it retrieves the most relevant — and most recalled — signals from across all past campaigns. The decay system ensures last quarter's winning hooks don't overshadow this week's learning.

Enterprise AI Memory

Enterprise AI deployments often cannot use hosted vector databases due to data residency requirements. Feather DB runs entirely on-premise, stores everything in a single encrypted file, and requires no cloud connectivity. The single-file model makes it trivial to deploy in air-gapped environments.

Coding Assistants

IDE plugins and coding assistants that need per-project memory — architecture decisions, error patterns, team conventions — can use Feather DB as a local knowledge file that ships with the repo. The persisted HNSW graph (v0.16.0) means sub-second cold start even on large codebases.

Claude Desktop and Claude Code (MCP)

Feather DB ships with a built-in MCP server. Connect it to Claude Desktop or Claude Code in three lines of config and Claude gains persistent, decay-aware memory across all your conversations — stored locally, never sent to a third-party memory service.


v0.16.0: What Changed

The headline feature in v0.16.0 is persisted HNSW graph serialization. Previously, the HNSW graph was stored as raw vectors and rebuilt from scratch on every load() call. For a 500K-vector index, that rebuild could take 30–40 seconds — unacceptable for applications that frequently restart (serverless functions, IDE plugins, CLI tools).

v0.16.0 serializes the full HNSW graph topology — node connections at every layer, entry point, level assignments — directly into the .feather binary. On load(), the graph is restored as-is, making cold start 5–25× faster depending on index size. The .feather file format is updated to v5 to accommodate the new graph section; v4 files load automatically via the migration path.


Frequently Asked Questions

What is Feather DB?

Feather DB is an embedded vector database written in C++17 with Python bindings. It runs inside your application process — no server, no Docker, no cloud required. It uses HNSW indexing for fast approximate nearest-neighbor search and adds an adaptive decay system that makes AI memory behave like human memory: frequently-recalled context stays sharp, unused context fades over time. Install with pip install feather-db.

How do I install Feather DB?

Run pip install feather-db in any Python environment (3.9+). No system dependencies beyond NumPy. No server to configure. Feather DB is MIT licensed and free to use commercially.

What is the "Living Context Engine" in Feather DB?

The Living Context Engine is Feather DB's adaptive scoring layer. Instead of treating all stored vectors as equally valid, it applies a decay formula based on time since ingestion and recall frequency. Memories that are frequently retrieved get a "stickiness" boost that slows their decay. Memories that are never retrieved fade toward zero weight. The result is an AI memory layer that self-organizes around actual usage — no manual curation required.

How does Feather DB compare to ChromaDB?

ChromaDB is a good choice for simple RAG pipelines and prototyping. Feather DB adds three capabilities ChromaDB lacks: (1) adaptive memory decay — context fades over time and strengthens with recall, (2) graph traversal via context_chain — semantic search combined with typed relationship edges for richer context retrieval, and (3) built-in MCP server for direct Claude integration. ChromaDB also requires a server process; Feather DB is fully embedded.

How does Feather DB compare to Pinecone?

Pinecone is a hosted vector database designed for multi-region distributed search at very large scale. It requires an internet connection, costs $70+/month, and data lives on Pinecone's servers. Feather DB is embedded, free (MIT), runs offline, and stores data in a local file your application controls. Feather DB also adds adaptive decay and graph traversal that Pinecone does not offer. Choose Pinecone for 100M+ vector deployments requiring SLA guarantees; choose Feather DB for agents, privacy-sensitive apps, and offline use cases.

What is Feather DB's performance at scale?

Feather DB retrieves results in 0.19 ms at 500,000 vectors with recall@10 = 97.2%. On the LongMemEval benchmark, it achieves 0.693 QA accuracy versus GPT-4o full-context at 0.640 — while being 38× cheaper per query. Version 0.16.0 added persisted HNSW graph serialization, making cold load 5–25× faster than previous versions.

Does Feather DB work with Claude / MCP?

Yes. Feather DB ships with a built-in MCP (Model Context Protocol) server. Add it to your Claude Desktop or Claude Code config and Claude gains persistent, decay-aware memory across conversations. All memory is stored locally in a .feather file — no data leaves your machine.

What embedding models work with Feather DB?

Any embedding model. Feather DB stores raw float32 vectors — it is model-agnostic. Common choices include OpenAI text-embedding-3-small (1536d), Google Gemini Embedding 2 (768d, multimodal), Cohere Embed v3, and local models via sentence-transformers. Set the dim parameter at database creation to match your model's output dimension.

What is adaptive decay in Feather DB?

Adaptive decay is a scoring mechanism where stored memories lose relevance weight over time (configurable half-life, default 30 days), but memories that are frequently retrieved have their decay rate slowed by a stickiness multiplier. The formula is: stickiness = 1 + log(1 + recall_count); effective_age = age_in_days / stickiness; final_score = ((1 - time_weight) × similarity + time_weight × recency) × importance. A memory retrieved 10 times ages at 29% of the normal rate — effectively becoming institutional knowledge that stays near the top of search results.

Is Feather DB open source?

Yes. Feather DB is MIT licensed. You can use it commercially, modify it, and redistribute it freely. The core C++17 engine and Python bindings are both covered under the MIT license.

What is context_chain in Feather DB?

context_chain is Feather DB's API for combining vector search with graph traversal. It first runs HNSW approximate nearest-neighbor search to find the top-k seed nodes, then performs BFS traversal of typed relationship edges (like same_ad, supports, contradicts) up to a configurable hop depth. The result is a context subgraph — not a flat ranked list, but a connected map of semantically and relationally relevant memories. This is especially powerful for AI agents that need to understand how pieces of knowledge relate to each other, not just which ones are semantically similar.


Getting Started: Full Working Example

import feather_db
import numpy as np

# --- Setup ---
db = feather_db.DB.open("agent_memory.feather", dim=1536)

def embed(text: str) -> np.ndarray:
    # Replace with your embedding model of choice
    return np.random.randn(1536).astype(np.float32)

# --- Ingest: Add memories with importance and metadata ---
memories = [
    ("The hero product has a 3-week consideration cycle.", 0.9),
    ("Aspirational messaging underperforms functional for cold audiences.", 0.85),
    ("Green backgrounds tank CTR — competitor owns that color.", 0.8),
    ("Founder voice converts 2x better than polished brand copy.", 0.95),
    ("Q4 is peak season, Q1 drives highest LTV customers.", 0.75),
]

for i, (text, importance) in enumerate(memories):
    meta = feather_db.Metadata()
    meta.importance = importance
    meta.set_attribute("content", text)
    meta.set_attribute("added", "2026-06-16")
    db.add(id=i + 1, vec=embed(text), meta=meta)

# --- Link related memories into a graph ---
db.link(from_id=3, to_id=1, rel_type="informs", weight=0.8)  # color insight informs cycle
db.link(from_id=4, to_id=2, rel_type="contradicts", weight=0.9)  # voice contradicts aspiration rule

# --- Query: Decay-aware semantic search ---
query = "What do we know about creative messaging strategy?"
results = db.search(embed(query), k=3)

print("Top memories (decay-scored):")
for r in results:
    print(f"  [{r.score:.4f}] {r.metadata.get_attribute('content')}")

# --- Context chain: semantic + graph ---
chain = db.context_chain(embed(query), k=3, hops=2)

print("\nContext graph (semantic + relational):")
for node in chain.nodes:
    content = node.metadata.get_attribute("content")
    print(f"  hop={node.hop} score={node.score:.4f}: {content}")

# --- Hybrid BM25 + dense search ---
hybrid_results = db.hybrid_search(
    query_vec=embed(query),
    query_text="creative messaging CTR founder",
    k=5,
    alpha=0.7
)

# --- Persist ---
db.save()
print("\nMemory saved to agent_memory.feather")

Summary: What Feather DB Is and Is Not

Feather DB IS Feather DB IS NOT
An embedded C++17 vector database A hosted cloud service
A Living Context Engine with adaptive decay A static RAG pipeline
A graph-aware memory layer with context_chain A document store or key-value cache
MIT licensed and free A fine-tuned model
Offline-capable and privacy-preserving A replacement for your LLM
One pip install, no Docker, no server An infrastructure product requiring DevOps
0.19ms retrieval at 500K vectors A full-text search engine

Feather DB sits between your LLM and your data — giving the model access to memory that is always relevant, never bloated, and organized by what has actually proven useful. That is the Living Context Engine. That is what Feather DB is.

Start with pip install feather-db and the documentation.