Living Context Engine with LangChain and LlamaIndex: Integration Guide
How to wire a Living Context Engine into a LangChain or LlamaIndex application. Adapter shape, retriever interface, and where the closed feedback loop fits in each framework.
Tutorial · LangChain 0.3+ · LlamaIndex 0.11+ · May 2026
What This Guide Covers
Most production AI applications use a framework — LangChain or LlamaIndex — for orchestration. Both treat retrieval as an interface: a retriever returns documents for a query. This guide shows how to plug a Living Context Engine into both, preserving the adaptive-scoring + typed-edge + write-back behavior the engine is designed for.
We'll use Feather DB as the underlying engine. The pattern transfers to any system that exposes the equivalent primitives.
The Adapter Shape (Both Frameworks)
Both LangChain and LlamaIndex expect a retriever to implement a method that takes a query string (or vector) and returns a list of documents. The minimum adapter is straightforward; the architectural decisions are about where the write-back path lives.
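Stripped of framework specifics, the adapter contract both libraries expect reduces to the same shape. The following is a sketch only; ContextDoc and ContextRetriever are illustrative names, not classes from either framework:
from typing import List, Protocol

class ContextDoc:
    """Illustrative container: retrieved text plus the provenance metadata
    (node id, score, hop, edge type) the engine attaches to it."""
    def __init__(self, text: str, metadata: dict):
        self.text = text
        self.metadata = metadata

class ContextRetriever(Protocol):
    """What both frameworks ultimately ask of a retriever adapter."""
    def retrieve(self, query: str) -> List[ContextDoc]: ...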
LangChain Integration
The Retriever
from typing import Any, Callable, List, Optional

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from feather_db import DB

class FeatherRetriever(BaseRetriever):
    """Read phase: pull a typed-edge context chain out of Feather DB."""
    db: Any                                  # open feather_db.DB handle
    embed_fn: Callable[[str], Any]           # text -> query vector
    k: int = 5
    hops: int = 2
    edge_types: Optional[List[str]] = None

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
query_vec = self.embed_fn(query)
chain = self.db.context_chain(
query_vec=query_vec,
k=self.k,
hops=self.hops,
edge_types=self.edge_types,
)
return [
Document(
page_content=node.metadata["text"],
metadata={
"node_id": node.id,
"score": node.score,
"hop": node.hop,
"edge_type": node.edge_type,
},
)
for node in chain.nodes
]
The Write-Back Callback
LangChain doesn't have a native "write retrieved-content provenance back to the store" hook. The cleanest pattern is a callback on the chain output:
from langchain_core.callbacks import BaseCallbackHandler

class ContextLoopCallback(BaseCallbackHandler):
    """Update phase: persist the chain's output and link it back to the
    nodes that were retrieved for this invocation."""

    def __init__(self, db, embed_fn, retriever):
        self.db = db
        self.embed_fn = embed_fn
        self.retriever = retriever
        self.last_context_ids = []

    def on_retriever_end(self, documents, **kwargs):
        # Remember which nodes fed this invocation so provenance can be linked.
        self.last_context_ids = [d.metadata["node_id"] for d in documents]

    def on_chain_end(self, outputs, **kwargs):
        # RetrievalQA puts its answer under "result"; other chains use "output" or "text".
        text = outputs.get("result") or outputs.get("output") or outputs.get("text")
if not text or not self.last_context_ids:
return
out_vec = self.embed_fn(text)
out_id = self.db.next_id()
self.db.add(
id=out_id, vec=out_vec, modality="text",
metadata={"text": text, "kind": "agent_output"},
)
for src_id in self.last_context_ids:
self.db.link(src_id, out_id, edge_type="derived_from")
self.db.save()
self.last_context_ids = []
Wiring It Up
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
db = DB.open("agent.feather", dim=1536)
embed_fn = lambda txt: openai_embed(txt) # your embed function
retriever = FeatherRetriever(db=db, embed_fn=embed_fn, k=5, hops=2)
callback = ContextLoopCallback(db, embed_fn, retriever)
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-5"),
    retriever=retriever,
)

# Pass the callback at invoke time: runtime callbacks are inherited by child
# runs (including the retriever), so on_retriever_end actually fires.
answer = qa.invoke(
    {"query": "draft a strategy response to brand-x"},
    config={"callbacks": [callback]},
)
Every chain invocation now runs the full four-phase loop: read (retriever), reason (LLM), update (write-back via the callback), and decay (handled silently inside the engine, offset whenever a node is reinforced by a later retrieval).
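To see the loop close, issue a follow-up query. The answer that was just written back is now an ordinary node with derived_from edges to its sources, so the same context_chain call can surface it. A quick check, using only the calls shown above:
# The previous answer now lives in the store as kind="agent_output", linked to
# its sources, so a related follow-up query can pull it into the context chain.
follow_up = db.context_chain(
    query_vec=embed_fn("what did we decide about brand-x?"),
    k=5,
    hops=2,
)
for node in follow_up.nodes:
    print(node.id, node.metadata.get("kind"), node.score)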
LlamaIndex Integration
The Retriever
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore, TextNode
class FeatherIndexRetriever(BaseRetriever):
    """Read phase: the same context-chain call, wrapped for LlamaIndex."""

    def __init__(self, db, embed_fn, k=5, hops=2, edge_types=None):
        super().__init__()
        self.db = db
        self.embed_fn = embed_fn
        self.k = k
        self.hops = hops
        self.edge_types = edge_types
def _retrieve(self, query_bundle):
query_vec = self.embed_fn(query_bundle.query_str)
chain = self.db.context_chain(
query_vec=query_vec,
k=self.k,
hops=self.hops,
edge_types=self.edge_types,
)
return [
NodeWithScore(
node=TextNode(
text=n.metadata["text"],
id_=str(n.id),
metadata={"hop": n.hop, "edge_type": n.edge_type},
),
score=n.score,
)
for n in chain.nodes
]
The Write-Back
LlamaIndex has hooks for post-response callbacks, but the most direct place to put the write-back is the query engine itself: subclass RetrieverQueryEngine and override _query (and _aquery if you run the engine async):
from llama_index.core.query_engine import RetrieverQueryEngine
class FeatherQueryEngine(RetrieverQueryEngine):
    """Update phase: after synthesis, persist the answer and link it to its
    sources. Override _aquery the same way if you use the async path."""

    def __init__(self, retriever, response_synthesizer, db, embed_fn, **kwargs):
        super().__init__(retriever=retriever, response_synthesizer=response_synthesizer, **kwargs)
        self.db = db
        self.embed_fn = embed_fn

    def _query(self, query_bundle):
        # Run the standard retrieve + synthesize path first.
        response = super()._query(query_bundle)
        text = str(response)
        # Node ids were stored as strings on the TextNodes; recover the engine ids.
        source_ids = [int(n.node.id_) for n in response.source_nodes]
out_vec = self.embed_fn(text)
out_id = self.db.next_id()
self.db.add(
id=out_id, vec=out_vec, modality="text",
metadata={"text": text, "kind": "agent_output"},
)
for src_id in source_ids:
self.db.link(src_id, out_id, edge_type="derived_from")
self.db.save()
return response
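Wiring It Up
Wiring mirrors the LangChain setup. A minimal sketch, assuming the OpenAI LLM wrapper and a default response synthesizer; swap in whatever LLM and embed function you actually use:
from feather_db import DB
from llama_index.core import get_response_synthesizer
from llama_index.llms.openai import OpenAI

db = DB.open("agent.feather", dim=1536)
embed_fn = lambda txt: openai_embed(txt)  # your embed function, as above

retriever = FeatherIndexRetriever(db=db, embed_fn=embed_fn, k=5, hops=2)
engine = FeatherQueryEngine(
    retriever=retriever,
    response_synthesizer=get_response_synthesizer(llm=OpenAI(model="gpt-5")),
    db=db,
    embed_fn=embed_fn,
)

response = engine.query("draft a strategy response to brand-x")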
Where the Two Frameworks Differ
- LangChain exposes the loop via callbacks. The write-back is async-friendly but a bit invisible — easy to forget which callback fires when.
- LlamaIndex encourages subclassing the query engine. The write-back is right next to the retrieval call — more visible and easier to reason about.
Neither framework currently treats "memory that gets written back" as a first-class abstraction. Both can host a Living Context Engine — but the integration is bring-your-own.
The Stack You End Up With
┌──────────────────────────────────────────┐
│ LangChain / LlamaIndex orchestration │
├──────────────────────────────────────────┤
│ FeatherRetriever (read) │
│ ContextLoopCallback / FeatherQueryEngine │
│ (reason + update + decay) │
├──────────────────────────────────────────┤
│ Feather DB engine │
│ (HNSW + typed graph + decay kernel) │
├──────────────────────────────────────────┤
│ agent.feather (single file) │
└──────────────────────────────────────────┘
Recommended Practice
Keep one .feather file per agent or per tenant. The orchestration framework lives at the application level; the engine file lives next to the agent's checkpoint. Both LangChain and LlamaIndex have memory-per-session abstractions you can map onto this — wire each session to its own file and isolation comes for free.
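A minimal sketch of that mapping, assuming session ids are filesystem-safe; the directory layout and helper name are illustrative, not part of the engine API:
from pathlib import Path
from feather_db import DB

AGENT_DIR = Path("agents")  # illustrative layout: one engine file per session

def open_session_db(session_id: str, dim: int = 1536) -> DB:
    # One .feather file per session or tenant: isolation by construction, with
    # the file sitting next to that agent's other checkpoint state.
    AGENT_DIR.mkdir(parents=True, exist_ok=True)
    return DB.open(str(AGENT_DIR / f"{session_id}.feather"), dim=dim)
Each session's retriever and callback (or query engine) are then built against that session's handle.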
Related: Build one from scratch in Python · Integrations docs.