# Feather DB + LangGraph: Agent Memory Across Graph Runs

> LangGraph checkpoints let you replay a run. Feather DB gives your graph semantic memory: find relevant past context by meaning, not position. Here's how to wire them together with FeatherMemoryNode as a first-class subgraph node.

- **Category**: Deploy
- **Read time**: 9 min read
- **Date**: June 16, 2026
- **Author**: Feather DB Engineering (Engineering Team)
- **URL**: https://getfeather.store/theory/feather-db-langgraph-integration

---

*Tutorial · LangGraph 0.2+ · Feather DB v0.16.0 · June 2026*

---

## The Gap in LangGraph's Persistence Model

LangGraph ships with a solid persistence story. The `MemorySaver` checkpointer snapshots your graph's full state dict after every step. You get replay: given a `thread_id`, you can resume an interrupted run or rewind to any checkpoint. That's useful for debugging and, with a durable checkpointer such as `SqliteSaver`, for long-running workflows that must survive restarts.

What checkpoints don't give you is *semantic recall across runs*. A checkpoint is a snapshot of a specific run's state. It doesn't let you ask: *"what did this agent learn about pricing strategy across the last 40 conversations?"* You can't query a checkpoint by meaning. You can only replay it by position.

The gap looks like this:

```text
LangGraph checkpoint store
  thread_id=abc123 → [state_t0, state_t1, state_t2, ...]   ← replay by position
  thread_id=def456 → [state_t0, state_t1, ...]

What's missing:
  "find everything relevant to 'pricing objections'" → ???
  across all threads, all time
```

Feather DB fills that gap. It sits alongside LangGraph's checkpointer, not replacing it, and adds a semantic memory layer that persists across runs, users, and sessions. The two systems are complementary: checkpoints for replay, Feather for recall.

## What Feather Adds: Semantic Memory, Not Replay

Feather DB is an embedded vector database with adaptive decay scoring.
Every insight your agent produces can be stored as a vector. At the start of the next run, a semantic search surfaces the most relevant past context, regardless of which thread generated it or how long ago it was stored.

Three properties make this useful in a LangGraph context:

- **Adaptive decay.** Memories that get retrieved repeatedly stay sharp. Memories that stop being relevant fade. No manual curation: the retrieval pattern becomes the memory signal.
- **Metadata filters.** Scope memory per user, per session, or per topic with `filter_attributes`. One `.feather` file can serve many tenants safely.
- **Fast cold start.** Parallel HNSW load (`FEATHER_LOAD_THREADS=8`) brings a 40K-vector index online in under 50ms, fast enough for serverless node execution.

## Integration Pattern: FeatherMemoryNode

The cleanest integration pattern treats Feather DB as two nodes in your StateGraph: a read node at the top of the graph and a write node at the bottom. Together they form a closed memory loop around every run.

```text
┌────────────────────────────────────────────┐
│                 StateGraph                 │
│                                            │
│  [memory_read] ─→ [agent] ─→ [memory_write]│
│        ↑                            │      │
│        └──── Feather DB ────────────┘      │
├────────────────────────────────────────────┤
│  LangGraph MemorySaver (checkpoints)       │
│  thread_id: replay by position             │
├────────────────────────────────────────────┤
│  agent.feather (semantic recall)           │
└────────────────────────────────────────────┘
```

The state carries a `memory_context` field that `memory_read` populates. Every downstream node can read it. `memory_write` stores the final agent output back to Feather, closing the loop.
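Before the full example, it can help to see the read → agent → write loop in isolation. The sketch below is a stdlib-only toy, not the Feather DB or LangGraph API: `ToyMemory`, `toy_embed`, and `run` are hypothetical stand-ins invented for illustration, and the "agent" just reports how many memories it was given.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_embed(text):
    """Toy embedding (letter frequencies). A real graph would call an embedding model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


@dataclass
class ToyMemory:
    """Hypothetical in-memory stand-in for a vector store (NOT the Feather DB API)."""
    items: list = field(default_factory=list)  # (vector, text, user_id) tuples

    def add(self, vec, text, user_id):
        self.items.append((vec, text, user_id))

    def search(self, query_vec, k, user_id):
        # Exact-match filter on user_id first, then rank by cosine similarity.
        scoped = [(v, t) for v, t, u in self.items if u == user_id]
        ranked = sorted(scoped, key=lambda vt: cosine(query_vec, vt[0]), reverse=True)
        return [t for _, t in ranked[:k]]


def memory_read(state, store):
    hits = store.search(toy_embed(state["user_query"]), k=2, user_id=state["user_id"])
    return {**state, "memory_context": "\n".join(hits)}


def agent(state):
    # Placeholder for the LLM call: report how many memories were injected.
    n = len(state["memory_context"].splitlines()) if state["memory_context"] else 0
    return {**state, "response": f"answered with {n} memories"}


def memory_write(state, store):
    text = f"Q: {state['user_query']} A: {state['response']}"
    store.add(toy_embed(text), text, state["user_id"])
    return state


def run(store, user_id, query):
    state = {"user_query": query, "user_id": user_id, "memory_context": "", "response": ""}
    return memory_write(agent(memory_read(state, store)), store)


store = ToyMemory()
first = run(store, "alice", "pricing strategy for enterprise")
second = run(store, "alice", "enterprise pricing objections")  # sees Alice's first turn
other = run(store, "bob", "enterprise pricing objections")     # Bob's scope is still empty
```

The second run for `alice` picks up her first turn, while `bob` starts with nothing because retrieval is scoped by `user_id`. That is the behaviour the two Feather nodes are meant to provide around a real agent.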
## Complete Working Example

### Install

```bash
pip install feather-db langgraph langchain-openai
```

### State definition

```python
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver


class AgentState(TypedDict):
    # User input for this run
    user_query: str
    # Feather DB populates this at the start of each run
    memory_context: str
    # The agent's final response
    response: str
    # Metadata for scoping memory (user_id, session_id, etc.)
    user_id: str
```

### Feather DB setup

```python
import os
import feather_db as fdb
import numpy as np
from openai import OpenAI

# Parallel HNSW load: 48ms cold start on 40K vectors (v0.16.0)
os.environ["FEATHER_LOAD_THREADS"] = "8"

openai_client = OpenAI()


def embed(text: str) -> np.ndarray:
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return np.array(resp.data[0].embedding, dtype=np.float32)


# One file for all agent memory, scoped per user via metadata filters
db = fdb.DB.open("agent_memory.feather", dim=1536)
```

### Memory read node

```python
def memory_read_node(state: AgentState) -> dict:
    """Retrieve semantically relevant past context at the start of each run."""
    query_vec = embed(state["user_query"])
    user_id = state.get("user_id", "default")

    # Scope to this user's memories with metadata filter
    results = db.search(
        query_vec,
        k=5,
        filter_attributes={"user_id": user_id}
    )

    if not results:
        return {"memory_context": ""}

    # Format retrieved memories into a context block
    context_parts = []
    for i, r in enumerate(results, 1):
        text = r.metadata.get_attribute("text")
        score = r.score
        context_parts.append(f"[Memory {i} | relevance={score:.3f}]\n{text}")

    memory_context = "\n\n".join(context_parts)
    return {"memory_context": memory_context}
```

### Agent node

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)


def agent_node(state: AgentState) -> dict:
    """Core agent reasoning: receives past context from Feather."""
    system_prompt = "You are a helpful assistant with access to relevant past context."
    messages = [{"role": "system", "content": system_prompt}]

    # Inject semantic memory from Feather if available
    if state.get("memory_context"):
        messages.append({
            "role": "system",
            "content": f"Relevant past context:\n\n{state['memory_context']}"
        })

    messages.append({"role": "user", "content": state["user_query"]})
    response = llm.invoke(messages)
    return {"response": response.content}
```

### Memory write node

```python
import time

_next_id = int(time.time() * 1000)  # simple monotonic ID


def memory_write_node(state: AgentState) -> dict:
    """Store the agent's response as a new memory in Feather DB."""
    global _next_id

    response_text = state["response"]
    user_id = state.get("user_id", "default")
    query = state["user_query"]

    # Store the (query, response) pair as a memory unit
    memory_text = f"Q: {query}\nA: {response_text}"
    vec = embed(memory_text)

    meta = fdb.Metadata(importance=0.7)
    meta.set_attribute("text", memory_text)
    meta.set_attribute("user_id", user_id)
    meta.set_attribute("kind", "agent_turn")
    meta.set_attribute("timestamp", str(int(time.time())))

    _next_id += 1
    db.add(id=_next_id, vec=vec, metadata=meta)
    db.save()

    return {}  # no state update: the write is a side effect
```

### Wiring the graph

```python
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

# Build the graph
builder = StateGraph(AgentState)
builder.add_node("memory_read", memory_read_node)
builder.add_node("agent", agent_node)
builder.add_node("memory_write", memory_write_node)

# Linear flow: read → agent → write
builder.set_entry_point("memory_read")
builder.add_edge("memory_read", "agent")
builder.add_edge("agent", "memory_write")
builder.add_edge("memory_write", END)

# LangGraph checkpointer for replay; runs alongside Feather
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```

### Running the graph
```python
config = {
    "configurable": {
        "thread_id": "user-alice-session-1"  # LangGraph checkpoint key
    }
}

result = graph.invoke(
    {
        "user_query": "What's our current pricing strategy for enterprise deals?",
        "user_id": "alice",
        "memory_context": "",
        "response": ""
    },
    config=config
)

print(result["response"])
```

On the first run, `memory_context` will be empty. On subsequent runs, across different sessions and different `thread_id`s, Feather surfaces past turns that are semantically relevant to the new query. LangGraph's `MemorySaver` handles replay within a thread; Feather handles recall across threads.

## Using Metadata Filters to Scope Memory Per User

A single `.feather` file can store memories for many users. The filter keeps retrieval scoped:

```python
# Only Alice's memories
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice"}
)

# Scope to a specific session
# (only matches memories written with a session_id attribute)
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice", "session_id": "q3-planning"}
)

# Scope to a topic tag
# (only matches memories written with kind="pricing_insight")
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice", "kind": "pricing_insight"}
)
```

Filter attributes are exact-match AND conditions applied before scoring. They only narrow the candidate set before HNSW traversal, so they add no overhead to unfiltered searches: unfiltered recall@10 stays at 97.2%.

## Adaptive Decay for Time-Sensitive State

Not all agent memory should age the same way. A short-term planning note from last Tuesday should fade faster than a core product insight from six months ago.
Feather's decay formula handles this with per-query `half_life` control:

```python
import feather_db as fdb

# Short-term plans: half-life of 7 days
# After 7 days, a plan that hasn't been recalled has a recency term of 0.5
short_term_cfg = fdb.ScoringConfig(half_life=7.0, weight=0.4, min=0.0)

recent_plans = db.search(
    query_vec,
    k=3,
    filter_attributes={"user_id": user_id, "kind": "short_term_plan"},
    scoring=short_term_cfg
)

# Long-term insights: half-life of 60 days
long_term_cfg = fdb.ScoringConfig(half_life=60.0, weight=0.2, min=0.0)

durable_insights = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": user_id, "kind": "strategic_insight"},
    scoring=long_term_cfg
)
```

The decay formula from `include/scoring.h`:

```text
stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)
final_score   = ((1 - time_weight) × similarity + time_weight × recency) × importance
```

A short-term plan recalled 5 times (stickiness = 1 + ln 6 ≈ 2.79) ages at about 36% of the normal rate (1 / 2.79). It stays sharp during the window when it matters, then fades once retrieval stops reinforcing it. No manual expiration logic.

## Combining LangGraph Checkpoints with Feather Recall

The two systems solve different problems. The right mental model:

| Capability | LangGraph MemorySaver | Feather DB |
| --- | --- | --- |
| Replay a specific run | Yes: full state snapshot | No |
| Resume interrupted run | Yes: resume from checkpoint | No |
| Find relevant past context by meaning | No | Yes: semantic search |
| Memory across different thread_ids | No | Yes: cross-thread recall |
| Memory that evolves with use | No | Yes: adaptive decay + stickiness |
| Per-user / per-tenant isolation | Via thread_id convention | Via metadata filter_attributes |

In production you'll want both. Use `MemorySaver` (or a `SqliteSaver` / `PostgresSaver`) for checkpoint durability and run recovery. Use Feather for the semantic layer that makes each new run informed by everything the agent has learned before.
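To make the decay formula from the adaptive decay section concrete, here is a small numeric sketch in plain Python. It transcribes the four lines quoted from `include/scoring.h` and assumes `log` means the natural logarithm, which is what the quoted stickiness of 2.79 for five recalls implies. The function names are illustrative, not part of the Feather DB API.

```python
import math


def stickiness(recall_count):
    # stickiness = 1 + log(1 + recall_count), natural log assumed
    return 1.0 + math.log(1.0 + recall_count)


def recency(age_in_days, half_life_days, recall_count=0):
    # Recalled memories age more slowly: divide the real age by stickiness.
    effective_age = age_in_days / stickiness(recall_count)
    return 0.5 ** (effective_age / half_life_days)


def final_score(similarity, age_in_days, half_life_days, time_weight,
                importance=1.0, recall_count=0):
    r = recency(age_in_days, half_life_days, recall_count)
    return ((1.0 - time_weight) * similarity + time_weight * r) * importance


# A 7-day-old plan scored with half_life=7, weight=0.4, importance=0.7
never_recalled = final_score(0.8, 7, 7, 0.4, importance=0.7)
recalled_5x = final_score(0.8, 7, 7, 0.4, importance=0.7, recall_count=5)

print(round(stickiness(5), 2))      # 2.79
print(recency(7, 7))                # 0.5
print(round(recency(7, 7, 5), 2))   # about 0.78
print(round(never_recalled, 3))     # 0.476
print(never_recalled < recalled_5x)  # True
```

Note that only the recency term halves after one half-life. The final score moves less, because similarity still carries `1 - time_weight` of the blend.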
## Production: Seeding Memory with add_batch()

If you're deploying an agent with a history of past conversations, don't loop over them with individual `db.add()` calls. Use `add_batch()`, which releases the GIL and builds the HNSW graph in parallel: 3.4× faster than sequential on a 4-core machine, 5–6× on 8 cores.

```python
import os

import feather_db as fdb
import numpy as np

os.environ["FEATHER_LOAD_THREADS"] = "8"  # parallel cold-start load
db = fdb.DB.open("agent_memory.feather", dim=1536)

# Load historical conversations from your data store
history = load_conversation_history()  # placeholder: returns a list of dicts

# Embed all turns in one batch call to your embedding API
texts = [f"Q: {h['query']}\nA: {h['response']}" for h in history]
vecs_list = embed_batch(texts)  # placeholder: your batched embed function
vecs = np.array(vecs_list, dtype=np.float32)

# Build metadata
metas = []
for h in history:
    m = fdb.Metadata(importance=0.7)
    m.set_attribute("text", f"Q: {h['query']}\nA: {h['response']}")
    m.set_attribute("user_id", h["user_id"])
    m.set_attribute("kind", "agent_turn")
    m.set_attribute("timestamp", str(h["timestamp"]))
    metas.append(m)

ids = list(range(len(history)))

# Single parallel call: 3.4× faster than a loop over db.add()
db.add_batch(ids, vecs, metas=metas)
db.save()

print(f"Seeded {len(history)} memories into agent_memory.feather")
```

At 50k turns × 1536-dim, `add_batch()` completes in ~10s on a 4-core machine. The subsequent `DB.open()` with `FEATHER_LOAD_THREADS=8` loads that index in under 2s. Serverless cold start on a 40K-vector index: 48ms (v0.16.0 parallel HNSW load).
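The seeding snippet leaves `embed_batch()` up to you. One possible shape, sketched below under assumptions, is a small chunking wrapper around whatever multi-input embedding call you already have. `embed_in_chunks`, `embed_many`, `chunk_size`, and `fake_embed_many` are hypothetical names invented here, and the fake embedder exists only so the sketch runs offline.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_chunks(
    texts: Sequence[str],
    embed_many: Callable[[List[str]], List[List[float]]],
    chunk_size: int = 256,
) -> List[List[float]]:
    """Embed `texts` in order, calling `embed_many` once per chunk."""
    vectors: List[List[float]] = []
    for chunk in chunked(texts, chunk_size):
        out = embed_many(list(chunk))
        if len(out) != len(chunk):
            raise ValueError("embedding function returned the wrong number of vectors")
        vectors.extend(out)
    return vectors


# Offline stand-in for a real embedding call, used only to exercise the helper.
calls = []


def fake_embed_many(batch):
    calls.append(len(batch))
    return [[float(len(t)), 0.0] for t in batch]


vecs = embed_in_chunks(["a", "bb", "ccc", "dddd", "eeeee"], fake_embed_many, chunk_size=2)
print(calls)      # [2, 2, 1]
print(len(vecs))  # 5
```

In a real seeding job, `embed_many` would wrap your provider's batch embedding request, and the chunk size would follow that provider's per-request limits. Keeping the output order aligned with `texts` matters because `add_batch()` pairs vectors with `ids` and `metas` by position.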
## Production-Ready Graph

Here's the full pattern with persistent SQLite checkpointing for production durability. Seed historical memory separately with `add_batch()`, as in the previous section.

```python
import os
import sqlite3
import time
from typing import TypedDict

import numpy as np
import feather_db as fdb
from openai import OpenAI
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.sqlite import SqliteSaver  # pip install langgraph-checkpoint-sqlite

# -- Config --
os.environ["FEATHER_LOAD_THREADS"] = "8"
FEATHER_PATH = "agent_memory.feather"
SQLITE_PATH = "checkpoints.sqlite"
DIM = 1536

openai_client = OpenAI()
llm = ChatOpenAI(model="gpt-4o", temperature=0)


def embed(text: str) -> np.ndarray:
    resp = openai_client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding, dtype=np.float32)


db = fdb.DB.open(FEATHER_PATH, dim=DIM)
_id_counter = [int(time.time() * 1000)]


# -- State --
class AgentState(TypedDict):
    user_query: str
    user_id: str
    memory_context: str
    response: str


# -- Nodes --
def memory_read_node(state: AgentState) -> dict:
    vec = embed(state["user_query"])
    results = db.search(vec, k=5, filter_attributes={"user_id": state["user_id"]})
    if not results:
        return {"memory_context": ""}
    parts = [
        f"[Memory {i} | score={r.score:.3f}]\n{r.metadata.get_attribute('text')}"
        for i, r in enumerate(results, 1)
    ]
    return {"memory_context": "\n\n".join(parts)}


def agent_node(state: AgentState) -> dict:
    msgs = [{"role": "system", "content": "You are a helpful assistant."}]
    if state.get("memory_context"):
        msgs.append({
            "role": "system",
            "content": f"Relevant past context:\n\n{state['memory_context']}"
        })
    msgs.append({"role": "user", "content": state["user_query"]})
    return {"response": llm.invoke(msgs).content}


def memory_write_node(state: AgentState) -> dict:
    text = f"Q: {state['user_query']}\nA: {state['response']}"
    vec = embed(text)
    meta = fdb.Metadata(importance=0.7)
    meta.set_attribute("text", text)
    meta.set_attribute("user_id", state["user_id"])
    meta.set_attribute("kind", "agent_turn")
    meta.set_attribute("timestamp", str(int(time.time())))
    _id_counter[0] += 1
    db.add(id=_id_counter[0], vec=vec, metadata=meta)
    db.save()
    return {}


# -- Graph --
builder = StateGraph(AgentState)
builder.add_node("memory_read", memory_read_node)
builder.add_node("agent", agent_node)
builder.add_node("memory_write", memory_write_node)
builder.set_entry_point("memory_read")
builder.add_edge("memory_read", "agent")
builder.add_edge("agent", "memory_write")
builder.add_edge("memory_write", END)

# In recent releases SqliteSaver.from_conn_string() is a context manager,
# so build the saver from an explicit connection that stays open.
conn = sqlite3.connect(SQLITE_PATH, check_same_thread=False)
checkpointer = SqliteSaver(conn)
graph = builder.compile(checkpointer=checkpointer)

# -- Invoke --
result = graph.invoke(
    {"user_query": "Summarise our Q2 pricing decisions", "user_id": "alice",
     "memory_context": "", "response": ""},
    config={"configurable": {"thread_id": "alice-q2-review"}}
)
print(result["response"])
```

## What You Get

With this pattern in place:

- Every graph run starts informed by semantically relevant past context: not just the last turn, but anything relevant across all prior runs.
- Memory that gets retrieved repeatedly stays sharp via adaptive decay. Memory that stops being relevant fades, with no manual cleanup.
- Short-term plans decay with a 7-day half-life, strategic insights with a 60-day half-life. You set the half-life per query.
- LangGraph checkpoints still handle replay and run recovery. Feather handles the semantic layer that checkpoints can't.
- `add_batch()` seeds production history in a single parallel call. Parallel HNSW load keeps cold starts under 50ms.

The `.feather` file lives alongside your graph. No infrastructure to provision. The agent's accumulated knowledge ships with it.
---

*Feather DB · [github.com/feather-store/feather](https://github.com/feather-store/feather) · `pip install feather-db`*

*Related: [LangChain + LlamaIndex integration](/blog/living-context-engine-langchain-llamaindex) · [add_batch() deep dive](/blog/feather-db-add-batch-parallel-ingestion) · [Parallel HNSW load](/blog/feather-db-parallel-hnsw-load)*

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/feather-db-langgraph-integration](https://getfeather.store/theory/feather-db-langgraph-integration). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*