Feather DB Context Graph: Typed Edges, BFS Traversal, and context_chain
Vector search finds similar memories. Graph traversal finds related ones. Feather DB's context graph combines both — typed edges, n-hop BFS, and a single API that returns the full context cluster around any query.
The limitation of vector search alone
Vector similarity finds semantically similar content. It's powerful but incomplete. Given a query about "the authentication bug," cosine similarity can find memory nodes whose embeddings are close to the query vector — nodes that discuss authentication, bugs, or both. But it cannot find a node titled "Fix merged in PR #114" if that fix's embedding doesn't happen to be semantically close to "authentication bug" in the embedding space.
In practice, knowledge has structure. A bug leads to a fix. A hypothesis leads to an experiment. A user preference supports a design decision. A new policy supersedes an old one. These relationships are real and load-bearing — knowing the bug without knowing the fix is half the picture. But the relationships live in the structure of knowledge, not in the embedding geometry. A flat vector search can't traverse them.
Feather DB's context graph layer adds explicit, typed relationships between memory nodes. Combined with HNSW search, it enables context_chain(): a single API that retrieves semantically similar nodes and then traverses their graph neighborhood to surface structurally related context.
The nine built-in edge types
Feather DB ships with nine built-in edge types, each encoding a distinct semantic relationship:
| Edge type | Meaning | Example |
|---|---|---|
supports | Node A provides evidence for Node B | benchmark result supports design decision |
contradicts | Node A conflicts with Node B | new finding contradicts prior hypothesis |
refines | Node A is a more precise version of Node B | specific preference refines general preference |
leads_to | Node A causally or temporally leads to Node B | bug report leads to PR |
same_session | Nodes A and B occurred in the same session | two messages in one conversation |
same_ad | Nodes A and B are from the same ad or campaign | ad copy and its performance data |
supersedes | Node A replaces Node B | updated preference supersedes old one |
causes | Node A is a cause of Node B | configuration change causes bug |
resolves | Node A resolves Node B | fix resolves bug report |
Edge types are directional. db.link(a, b, "leads_to") means a leads to b, not b leads to a. BFS traversal respects direction by default, though the API supports bidirectional traversal for undirected queries.
Creating edges with db.link()
Edges are created with a single call. There is no schema to define and no migration to run — edges can be added at any time between any two existing nodes.
import feather_db as fdb
import time
db = fdb.DB.open("coding_assistant.feather", dim=768)
def store(text: str, importance: float = 0.5) -> int:
nid = int(time.time() * 1000) % (2**31)
meta = fdb.Metadata(importance=importance)
meta.set_attribute("text", text)
db.add(id=nid, vec=embed(text), meta=meta)
return nid
# Store a sequence of related events
bug_id = store("Async race condition in payment handler — intermittent 500 errors", importance=0.8)
diagnosis_id = store("Root cause: missing await on db.commit() in payment_service.py line 247", importance=0.85)
fix_id = store("Fix: added await to db.commit(), also added explicit transaction rollback", importance=0.8)
pr_id = store("PR #114 merged — payment handler async fix, includes regression test", importance=0.75)
test_id = store("Regression test added: test_payment_handler_concurrent_commits", importance=0.6)
# Wire up the relationships
db.link(bug_id, diagnosis_id, edge_type="leads_to")
db.link(diagnosis_id, fix_id, edge_type="leads_to")
db.link(fix_id, pr_id, edge_type="leads_to")
db.link(pr_id, test_id, edge_type="leads_to")
db.link(fix_id, bug_id, edge_type="resolves")
db.link(diagnosis_id, bug_id, edge_type="causes") # root cause causes the bug
How context_chain() works
context_chain() is a two-phase retrieval algorithm:
Phase 1: ANN search. Run HNSW approximate nearest neighbor search to find the k nodes with highest final scores (combining semantic similarity, recency, stickiness, and importance). This is the seed set.
Phase 2: BFS traversal. Starting from each seed node, perform breadth-first search over the context graph up to hops steps. All nodes reachable within the hop limit are added to the result set, deduplicated, and returned sorted by their traversal distance (1-hop neighbors before 2-hop, etc.).
The result is a list of nodes that are either semantically similar to the query or structurally connected to semantically similar nodes — the full context cluster.
# Query: developer asks about the payment bug
query = "What was the issue with the payment handler?"
query_vec = embed(query)
# context_chain: ANN seed + BFS expansion
chain = db.context_chain(
query_vec,
k=3, # top-3 ANN seeds
hops=2, # expand 2 hops from each seed
half_life=30,
time_weight=0.3
)
for node in chain:
text = node.meta.get_attribute("text") if node.meta else ""
print(f"[score={node.score:.3f}] {text}")
# Output (paraphrased):
# [score=0.921] Async race condition in payment handler — intermittent 500 errors
# [score=0.887] Root cause: missing await on db.commit() in payment_service.py line 247
# [score=0.834] Fix: added await to db.commit(), also added explicit transaction rollback
# [score=0.791] PR #114 merged — payment handler async fix, includes regression test
# [score=0.743] Regression test added: test_payment_handler_concurrent_commits
The ANN search found the bug node (highest semantic similarity). BFS traversal from the bug node via leads_to edges surfaced the diagnosis, fix, PR, and test — even though the query phrase "payment handler" might not embed close to "regression test" or "PR #114."
Worked example: a coding assistant with linked decisions
Here is a more complete example: a coding assistant that links architectural decisions to their rationale and consequences.
import feather_db as fdb
import time
db = fdb.DB.open("arch_decisions.feather", dim=768)
def node(text, importance=0.6):
nid = int(time.time() * 1000) % (2**31)
time.sleep(0.001) # ensure unique ms timestamps
meta = fdb.Metadata(importance=importance)
meta.set_attribute("text", text)
db.add(id=nid, vec=embed(text), meta=meta)
return nid
# Architecture decision record
requirement = node("System must handle 10K concurrent WebSocket connections", 0.9)
benchmark = node("asyncio handles 10K concurrent connections at 0.8ms p99 on t3.xlarge", 0.8)
decision = node("Chose asyncio + FastAPI over thread-per-connection Flask", 0.9)
consequence = node("All handlers must be async — no blocking calls allowed in request path", 0.85)
violation = node("Bug: requests.get() call in auth middleware blocks event loop", 0.8)
resolution = node("Fix: replaced requests.get() with httpx.AsyncClient in auth.py", 0.8)
# Wire the knowledge graph
db.link(requirement, benchmark, "leads_to")
db.link(benchmark, decision, "supports")
db.link(decision, consequence, "leads_to")
db.link(consequence, violation, "leads_to")
db.link(resolution, violation, "resolves")
db.link(violation, consequence, "causes")
# Later: developer asks about the auth middleware bug
chain = db.context_chain(embed("auth middleware blocking"), k=2, hops=3)
# Surfaces: violation, consequence, decision, benchmark, resolution
# The developer sees the full chain: why async was chosen → what that requires →
# how the bug violated it → how it was fixed
Filtering traversal by edge type
BFS traversal can be filtered to follow only specific edge types. This is useful when you want to trace only causal chains, or only supersession history.
# Only follow 'leads_to' and 'resolves' edges in BFS
chain = db.context_chain(
query_vec,
k=3,
hops=2,
edge_types=["leads_to", "resolves"] # filter BFS to these types
)
# Only follow 'supersedes' edges — get the history of a fact
chain = db.context_chain(
query_vec,
k=1,
hops=5,
edge_types=["supersedes"] # trace the full update history
)
The context graph as first-class architecture
Most vector databases treat their storage as a flat collection of embedding points. Relationships — if supported at all — are implemented as metadata filters, not first-class graph traversal. This means the structural knowledge encoded in relationships is invisible to retrieval.
Feather DB treats the context graph as a first-class retrieval structure, co-equal with the HNSW index. context_chain() is not a post-processing step on top of vector search — it's a unified algorithm where graph structure and vector geometry both contribute to what gets surfaced.
For AI agent memory, this means the agent's knowledge is not a pile of semantically indexed fragments. It's a structured graph of facts, relationships, decisions, and consequences — and retrieval can traverse that structure to surface the right context for any query.
Install: pip install feather-db · GitHub: github.com/feather-store/feather