# Feather DB + LangGraph: Agent Memory Across Graph Runs

> LangGraph checkpoints let you replay a run. Feather DB gives your graph semantic memory — find relevant past context by meaning, not position. Here's how to wire them together with FeatherMemoryNode as a first-class subgraph node.

- **Category**: Deploy
- **Read time**: 9 min read
- **Date**: June 16, 2026
- **Author**: Feather DB Engineering (Engineering Team)
- **URL**: https://getfeather.store/theory/feather-db-langgraph-integration

---

*Tutorial · LangGraph 0.2+ · Feather DB v0.16.0 · June 2026*

---

## The Gap in LangGraph's Persistence Model

LangGraph ships with a solid persistence story. The `MemorySaver` checkpointer serializes your graph's full state dict after every node execution. You get replay: given a `thread_id`, you can resume an interrupted run or rewind to any checkpoint. That's useful for debugging and for long-running workflows that must survive restarts.

What checkpoints don't give you is *semantic recall across runs*. A checkpoint is a snapshot of a specific run's state. It doesn't let you ask: *"what did this agent learn about pricing strategy across the last 40 conversations?"* You can't query a checkpoint by meaning. You can only replay it by position.

The gap looks like this:

```text
LangGraph checkpoint store
  thread_id=abc123 → [state_t0, state_t1, state_t2, ...]  ← replay by position
  thread_id=def456 → [state_t0, state_t1, ...]

What's missing:
  "find everything relevant to 'pricing objections'" → ??? across all threads, all time

```

Feather DB fills that gap. It sits alongside LangGraph's checkpointer — not replacing it — and adds a semantic memory layer that persists across runs, users, and sessions. The two systems are complementary: checkpoints for replay, Feather for recall.
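To make the distinction concrete, here is a toy sketch in plain Python. It uses neither LangGraph nor Feather DB: the dictionaries, the hand-made 2-d vectors, and the `replay` / `recall` helpers are hypothetical stand-ins that only illustrate "replay by position" versus "recall by meaning".

```python
import math

# Toy checkpoint store: thread_id -> ordered snapshots (replay by position).
checkpoints = {
    "abc123": [{"step": 0, "note": "greeted user"},
               {"step": 1, "note": "discussed pricing objections"}],
    "def456": [{"step": 0, "note": "asked about onboarding"}],
}

# Toy semantic store: (vector, text, thread_id), searchable across all threads.
# The 2-d vectors are hand-made placeholders for real embeddings.
memories = [
    ([0.9, 0.1], "Customer pushed back on enterprise pricing", "abc123"),
    ([0.1, 0.9], "Onboarding checklist was confusing", "def456"),
    ([0.8, 0.3], "Discount request for annual plan", "def456"),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def replay(thread_id, position):
    """Checkpoint-style access: you must already know the thread and the step."""
    return checkpoints[thread_id][position]

def recall(query_vec, k=2):
    """Memory-style access: rank every stored item by similarity to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[0]), reverse=True)
    return [(text, thread_id) for _, text, thread_id in ranked[:k]]

pricing_query = [1.0, 0.2]          # stand-in for embed("pricing objections")
top = recall(pricing_query)         # hits come from two different threads
snapshot = replay("abc123", 1)      # only reachable via thread_id + position
```

In this toy data the two best matches for the pricing query come from different threads, which is exactly the lookup a per-thread checkpoint store cannot express.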

## What Feather Adds: Semantic Memory, Not Replay

Feather DB is an embedded vector database with adaptive decay scoring. Every insight your agent produces can be stored as a vector. At the start of the next run, a semantic search surfaces the most relevant past context — regardless of which thread generated it or how long ago it was stored.

Three properties make this useful in a LangGraph context:

  - **Adaptive decay.** Memories that get retrieved repeatedly stay sharp. Memories that stop being relevant fade. No manual curation — the retrieval pattern becomes the memory signal.

  - **Metadata filters.** Scope memory per user, per session, or per topic with `filter_attributes`. One `.feather` file can serve many tenants safely.

  - **Fast cold start.** Parallel HNSW load (`FEATHER_LOAD_THREADS=8`) brings a 40K-vector index online in under 50ms — fast enough for serverless node execution.

## Integration Pattern: FeatherMemoryNode

The cleanest integration pattern treats Feather DB as two nodes in your StateGraph: a read node at the top of the graph and a write node at the bottom. Together they form a closed memory loop around every run.

```text
┌────────────────────────────────────────────┐
│  StateGraph                                │
│                                            │
│  [memory_read] ─→ [agent] ─→ [memory_write]│
│       ↑                            │       │
│       └──── Feather DB ────────────┘       │
├────────────────────────────────────────────┤
│  LangGraph MemorySaver (checkpoints)       │
│  thread_id: replay by position             │
├────────────────────────────────────────────┤
│  agent.feather  (semantic recall)          │
└────────────────────────────────────────────┘

```

The state carries a `memory_context` field that `memory_read` populates. Every downstream node can read it. `memory_write` stores the final agent output back to Feather, closing the loop.

## Complete Working Example

### Install

```bash
pip install feather-db langgraph langchain-openai
```

### State definition

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

class AgentState(TypedDict):
    # User input for this run
    user_query: str
    # Feather DB populates this at the start of each run
    memory_context: str
    # The agent's final response
    response: str
    # Metadata for scoping memory (user_id, session_id, etc.)
    user_id: str

```

### Feather DB setup

```python
import os
import feather_db as fdb
import numpy as np
from openai import OpenAI

# Parallel HNSW load — 48ms cold start on 40K vectors (v0.16.0)
os.environ["FEATHER_LOAD_THREADS"] = "8"

openai_client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return np.array(resp.data[0].embedding, dtype=np.float32)

# One file for all agent memory — scoped per user via metadata filters
db = fdb.DB.open("agent_memory.feather", dim=1536)

```

### Memory read node

```python
def memory_read_node(state: AgentState) -> dict:
    """Retrieve semantically relevant past context at the start of each run."""
    query_vec = embed(state["user_query"])
    user_id = state.get("user_id", "default")

    # Scope to this user's memories with metadata filter
    results = db.search(
        query_vec,
        k=5,
        filter_attributes={"user_id": user_id}
    )

    if not results:
        return {"memory_context": ""}

    # Format retrieved memories into a context block
    context_parts = []
    for i, r in enumerate(results, 1):
        text = r.metadata.get_attribute("text")
        score = r.score
        context_parts.append(f"[Memory {i} | relevance={score:.3f}]\n{text}")

    memory_context = "\n\n".join(context_parts)
    return {"memory_context": memory_context}

```

### Agent node

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

def agent_node(state: AgentState) -> dict:
    """Core agent reasoning — receives past context from Feather."""
    system_prompt = "You are a helpful assistant with access to relevant past context."

    messages = [{"role": "system", "content": system_prompt}]

    # Inject semantic memory from Feather if available
    if state.get("memory_context"):
        messages.append({
            "role": "system",
            "content": f"Relevant past context:\n\n{state['memory_context']}"
        })

    messages.append({"role": "user", "content": state["user_query"]})

    response = llm.invoke(messages)
    return {"response": response.content}

```

### Memory write node

```python
import time

_next_id = int(time.time() * 1000)  # simple monotonic ID

def memory_write_node(state: AgentState) -> dict:
    """Store the agent's response as a new memory in Feather DB."""
    global _next_id

    response_text = state["response"]
    user_id = state.get("user_id", "default")
    query = state["user_query"]

    # Store the (query, response) pair as a memory unit
    memory_text = f"Q: {query}\nA: {response_text}"
    vec = embed(memory_text)

    meta = fdb.Metadata(importance=0.7)
    meta.set_attribute("text", memory_text)
    meta.set_attribute("user_id", user_id)
    meta.set_attribute("kind", "agent_turn")
    meta.set_attribute("timestamp", str(int(time.time())))

    _next_id += 1
    db.add(id=_next_id, vec=vec, metadata=meta)
    db.save()

    return {}  # no state update — write is a side effect

```
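The module-level `_next_id` counter above is fine for a single-threaded script. If write nodes may run concurrently within one process, a small allocator like the sketch below could replace it. `IdAllocator` is a hypothetical helper, not part of Feather DB. It assumes only that IDs are plain integers, as in the `db.add(id=...)` call above, and it does not coordinate across processes.

```python
import itertools
import threading
import time
from typing import Optional

class IdAllocator:
    """Hand out strictly increasing integer IDs; safe to call from several threads."""

    def __init__(self, start: Optional[int] = None):
        # Default seed: wall clock in milliseconds, like the write node's _next_id.
        first = int(time.time() * 1000) if start is None else start
        self._counter = itertools.count(first + 1)
        self._lock = threading.Lock()

    def next_id(self) -> int:
        # The lock makes the "read and advance" step atomic across threads.
        with self._lock:
            return next(self._counter)

ids = IdAllocator(start=1000)   # fixed seed so the example is deterministic
first_id = ids.next_id()
second_id = ids.next_id()
```

Inside `memory_write_node`, the `global _next_id` bookkeeping would then shrink to a single `ids.next_id()` call.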

### Wiring the graph

```python
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

# Build the graph
builder = StateGraph(AgentState)
builder.add_node("memory_read", memory_read_node)
builder.add_node("agent", agent_node)
builder.add_node("memory_write", memory_write_node)

# Linear flow: read → agent → write
builder.set_entry_point("memory_read")
builder.add_edge("memory_read", "agent")
builder.add_edge("agent", "memory_write")
builder.add_edge("memory_write", END)

# LangGraph checkpointer for replay — runs alongside Feather
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

```

### Running the graph

```python
config = {
    "configurable": {
        "thread_id": "user-alice-session-1"   # LangGraph checkpoint key
    }
}

result = graph.invoke(
    {
        "user_query": "What's our current pricing strategy for enterprise deals?",
        "user_id": "alice",
        "memory_context": "",
        "response": ""
    },
    config=config
)

print(result["response"])

```

On the first run, `memory_context` will be empty. On subsequent runs — across different sessions, different `thread_id`s — Feather surfaces past turns that are semantically relevant to the new query. LangGraph's `MemorySaver` handles replay within a thread; Feather handles recall across threads.

## Using Metadata Filters to Scope Memory Per User

A single `.feather` file can store memories for many users. The filter keeps retrieval scoped:

```python
# Only Alice's memories
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice"}
)

# Scope to a specific session
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice", "session_id": "q3-planning"}
)

# Scope to a topic tag
results = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": "alice", "kind": "pricing_insight"}
)

```

Filter attributes are exact-match conditions combined with AND and applied before scoring. They narrow the candidate set ahead of HNSW traversal and leave the similarity ranking itself unchanged. Unfiltered searches pay no overhead, and unfiltered recall@10 stays at 97.2%.
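The exact-match AND semantics can be modelled in a few lines of plain Python. This is an illustrative model of the behaviour described here, not Feather DB's implementation. The `matches` and `prefilter` helpers and the sample records are hypothetical.

```python
def matches(attributes: dict, filter_attributes: dict) -> bool:
    """True only if every filter key is present with exactly the requested value."""
    return all(attributes.get(key) == value
               for key, value in filter_attributes.items())

def prefilter(records: list, filter_attributes: dict) -> list:
    """Keep the records that satisfy all filter conditions (logical AND)."""
    return [r for r in records if matches(r, filter_attributes)]

records = [
    {"user_id": "alice", "kind": "pricing_insight", "text": "Enterprise floor is negotiable"},
    {"user_id": "alice", "kind": "agent_turn", "text": "Q: hello\nA: hi"},
    {"user_id": "bob", "kind": "pricing_insight", "text": "Bob prefers monthly billing"},
]

alice_only = prefilter(records, {"user_id": "alice"})
alice_pricing = prefilter(records, {"user_id": "alice", "kind": "pricing_insight"})
no_filter = prefilter(records, {})   # an empty filter keeps every record
```

Adding a key to the filter can only shrink the candidate set, which is why the `user_id` condition is enough to keep one tenant's memories out of another tenant's results.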

## Adaptive Decay for Time-Sensitive State

Not all agent memory should age the same way. A short-term planning note from last Tuesday should fade faster than a core product insight from six months ago. Feather's decay formula handles this with per-query `half_life` control:

```python
import feather_db as fdb

# Short-term plans: half-life of 7 days
# After 7 days, a plan that hasn't been recalled has a recency term of 0.5
short_term_cfg = fdb.ScoringConfig(half_life=7.0, weight=0.4, min=0.0)
recent_plans = db.search(
    query_vec,
    k=3,
    filter_attributes={"user_id": user_id, "kind": "short_term_plan"},
    scoring=short_term_cfg
)

# Long-term insights: half-life of 60 days
long_term_cfg = fdb.ScoringConfig(half_life=60.0, weight=0.2, min=0.0)
durable_insights = db.search(
    query_vec,
    k=5,
    filter_attributes={"user_id": user_id, "kind": "strategic_insight"},
    scoring=long_term_cfg
)

```

The decay formula from `include/scoring.h`:

```text
stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)
final_score   = ((1 - time_weight) × similarity + time_weight × recency) × importance

```

A short-term plan recalled 5 times has stickiness = 1 + ln(6) ≈ 2.79, so it ages at about 36% of the normal rate (1 / 2.79). It stays sharp during the window when it matters, then fades once retrieval stops reinforcing it. No manual expiration logic.
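The formula is small enough to check by hand. The sketch below transcribes it into plain Python, assuming `log` is the natural logarithm (which matches the stickiness of 2.79 quoted for 5 recalls). The function names are illustrative and are not Feather DB's API.

```python
import math

def stickiness(recall_count: int) -> float:
    # stickiness = 1 + log(1 + recall_count), natural log assumed
    return 1.0 + math.log(1.0 + recall_count)

def recency(age_in_days: float, half_life_days: float, recall_count: int = 0) -> float:
    # Recalled memories age more slowly: divide the real age by stickiness.
    effective_age = age_in_days / stickiness(recall_count)
    return 0.5 ** (effective_age / half_life_days)

def final_score(similarity: float, age_in_days: float, half_life_days: float,
                time_weight: float, importance: float = 1.0,
                recall_count: int = 0) -> float:
    r = recency(age_in_days, half_life_days, recall_count)
    return ((1.0 - time_weight) * similarity + time_weight * r) * importance

# A 7-day-old memory scored with a 7-day half-life:
never_recalled = recency(7.0, 7.0, recall_count=0)   # 0.5, one full half-life
recalled_5x = recency(7.0, 7.0, recall_count=5)      # about 0.78, aged more slowly

# Blend with similarity 0.8, time_weight 0.4, importance 0.7:
score = final_score(similarity=0.8, age_in_days=7.0, half_life_days=7.0,
                    time_weight=0.4, importance=0.7)  # (0.48 + 0.20) * 0.7 = 0.476
```

Note that the half-life acts on the recency term only. With `time_weight=0.4`, a never-recalled memory at one half-life loses 0.2 of its pre-importance score, not half of its total.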

## Combining LangGraph Checkpoints with Feather Recall

The two systems solve different problems. The right mental model:

| Capability | LangGraph MemorySaver | Feather DB |
| --- | --- | --- |
| Replay a specific run | Yes: full state snapshot | No |
| Resume interrupted run | Yes: resume from checkpoint | No |
| Find relevant past context by meaning | No | Yes: semantic search |
| Memory across different `thread_id`s | No | Yes: cross-thread recall |
| Memory that evolves with use | No | Yes: adaptive decay + stickiness |
| Per-user / per-tenant isolation | Via `thread_id` convention | Via metadata `filter_attributes` |

In production you'll want both. Use `MemorySaver` (or a `SqliteSaver` / `PostgresSaver`) for checkpoint durability and run recovery. Use Feather for the semantic layer that makes each new run informed by everything the agent has learned before.

## Production: Seeding Memory with add_batch()

If you're deploying an agent with a history of past conversations, don't loop over them with individual `db.add()` calls. Use `add_batch()`, which releases the GIL and builds the HNSW graph in parallel — 3.4× faster than sequential on a 4-core machine, 5–6× on 8 cores.

```python
import os
import feather_db as fdb
import numpy as np

os.environ["FEATHER_LOAD_THREADS"] = "8"   # parallel cold-start load
db = fdb.DB.open("agent_memory.feather", dim=1536)

# Load historical conversations from your data store
history = load_conversation_history()   # returns list of dicts

# Embed all turns in one batch call to your embedding API
texts = [f"Q: {h['query']}\nA: {h['response']}" for h in history]
vecs_list = embed_batch(texts)  # your batched embed function
vecs = np.array(vecs_list, dtype=np.float32)

# Build metadata
metas = []
for h in history:
    m = fdb.Metadata(importance=0.7)
    m.set_attribute("text", f"Q: {h['query']}\nA: {h['response']}")
    m.set_attribute("user_id", h["user_id"])
    m.set_attribute("kind", "agent_turn")
    m.set_attribute("timestamp", str(h["timestamp"]))
    metas.append(m)

ids = list(range(len(history)))

# Single parallel call — 3.4× faster than a loop over db.add()
db.add_batch(ids, vecs, metas=metas)
db.save()

print(f"Seeded {len(history)} memories into agent_memory.feather")

```

At 50k turns × 1536-dim, `add_batch()` completes in ~10s on a 4-core machine. The subsequent `DB.open()` with `FEATHER_LOAD_THREADS=8` loads that index in under 2s. Serverless cold start on a 40K-vector index: 48ms (v0.16.0 parallel HNSW load).
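The seeding script leaves `embed_batch` to the reader. One possible shape is sketched below, under stated assumptions: `embed_many` is a hypothetical callable that embeds one chunk of texts per call, and `chunk_size=256` is an arbitrary choice rather than a documented limit. The stand-in embedder only keeps the example runnable offline.

```python
from typing import Callable, List, Sequence

def embed_batch(texts: Sequence[str],
                embed_many: Callable[[List[str]], List[List[float]]],
                chunk_size: int = 256) -> List[List[float]]:
    """Embed texts chunk by chunk, preserving input order."""
    vectors: List[List[float]] = []
    for start in range(0, len(texts), chunk_size):
        chunk = list(texts[start:start + chunk_size])
        chunk_vectors = embed_many(chunk)
        # Guard against an embedder that silently drops or duplicates items.
        if len(chunk_vectors) != len(chunk):
            raise ValueError(
                f"expected {len(chunk)} vectors, got {len(chunk_vectors)}")
        vectors.extend(chunk_vectors)
    return vectors

def fake_embed_many(chunk: List[str]) -> List[List[float]]:
    # Stand-in embedder: encodes only the text length, purely for illustration.
    return [[float(len(text)), 0.0] for text in chunk]

vecs_list = embed_batch(["a", "bb", "ccc"], fake_embed_many, chunk_size=2)
```

With the OpenAI client from the setup section, `embed_many` could wrap a single `openai_client.embeddings.create(model="text-embedding-3-small", input=chunk)` call and collect the `embedding` of each item in `resp.data`. To keep the one-argument call used in the seeding script, bind the embedder first, for example with `functools.partial`.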

## Production-Ready Graph

Here's the full pattern with persistent SQLite checkpointing for production durability. Seed historical memory beforehand with `add_batch()`, as in the previous section:

```python
import os
import time
import numpy as np
import feather_db as fdb
from openai import OpenAI
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.sqlite import SqliteSaver
from typing import TypedDict

# -- Config --
os.environ["FEATHER_LOAD_THREADS"] = "8"
FEATHER_PATH = "agent_memory.feather"
SQLITE_PATH  = "checkpoints.sqlite"
DIM          = 1536

openai_client = OpenAI()
llm = ChatOpenAI(model="gpt-4o", temperature=0)

def embed(text: str) -> np.ndarray:
    resp = openai_client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding, dtype=np.float32)

db = fdb.DB.open(FEATHER_PATH, dim=DIM)
_id_counter = [int(time.time() * 1000)]

# -- State --
class AgentState(TypedDict):
    user_query: str
    user_id: str
    memory_context: str
    response: str

# -- Nodes --
def memory_read_node(state: AgentState) -> dict:
    vec = embed(state["user_query"])
    results = db.search(vec, k=5, filter_attributes={"user_id": state["user_id"]})
    if not results:
        return {"memory_context": ""}
    parts = [
        f"[Memory {i} | score={r.score:.3f}]\n{r.metadata.get_attribute('text')}"
        for i, r in enumerate(results, 1)
    ]
    return {"memory_context": "\n\n".join(parts)}

def agent_node(state: AgentState) -> dict:
    msgs = [{"role": "system", "content": "You are a helpful assistant."}]
    if state.get("memory_context"):
        msgs.append({
            "role": "system",
            "content": f"Relevant past context:\n\n{state['memory_context']}"
        })
    msgs.append({"role": "user", "content": state["user_query"]})
    return {"response": llm.invoke(msgs).content}

def memory_write_node(state: AgentState) -> dict:
    text = f"Q: {state['user_query']}\nA: {state['response']}"
    vec  = embed(text)
    meta = fdb.Metadata(importance=0.7)
    meta.set_attribute("text", text)
    meta.set_attribute("user_id", state["user_id"])
    meta.set_attribute("kind", "agent_turn")
    meta.set_attribute("timestamp", str(int(time.time())))
    _id_counter[0] += 1
    db.add(id=_id_counter[0], vec=vec, metadata=meta)
    db.save()
    return {}

# -- Graph --
builder = StateGraph(AgentState)
builder.add_node("memory_read",  memory_read_node)
builder.add_node("agent",        agent_node)
builder.add_node("memory_write", memory_write_node)
builder.set_entry_point("memory_read")
builder.add_edge("memory_read",  "agent")
builder.add_edge("agent",        "memory_write")
builder.add_edge("memory_write", END)

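# In LangGraph 0.2+, SqliteSaver.from_conn_string() is a context manager, so
# pass an explicit connection instead. Needs: pip install langgraph-checkpoint-sqlite
import sqlite3
conn = sqlite3.connect(SQLITE_PATH, check_same_thread=False)
checkpointer = SqliteSaver(conn)
graph = builder.compile(checkpointer=checkpointer)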

# -- Invoke --
result = graph.invoke(
    {"user_query": "Summarise our Q2 pricing decisions", "user_id": "alice",
     "memory_context": "", "response": ""},
    config={"configurable": {"thread_id": "alice-q2-review"}}
)
print(result["response"])

```

## What You Get

With this pattern in place:

  - Every graph run starts informed by semantically relevant past context — not just the last turn, but anything relevant across all prior runs.

  - Memory that gets retrieved repeatedly stays sharp via adaptive decay. Memory that stops being relevant fades — no manual cleanup.

  - Short-term plans decay on a 7-day half-life; strategic insights on a 60-day half-life. You set the half-life per query.

  - LangGraph checkpoints still handle replay and run recovery. Feather handles the semantic layer that checkpoints can't.

  - `add_batch()` seeds production history in a single parallel call. Parallel HNSW load keeps cold starts under 50ms.

The `.feather` file lives alongside your graph. No infrastructure to provision. The agent's accumulated knowledge ships with it.

---

*Feather DB — [github.com/feather-store/feather](https://github.com/feather-store/feather) · `pip install feather-db`*

*Related: [LangChain + LlamaIndex integration](/blog/living-context-engine-langchain-llamaindex) · [add_batch() deep dive](/blog/feather-db-add-batch-parallel-ingestion) · [Parallel HNSW load](/blog/feather-db-parallel-hnsw-load)*

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/feather-db-langgraph-integration](https://getfeather.store/theory/feather-db-langgraph-integration). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*