# Feather DB Python Quickstart: From pip install to First Search

> Everything you need to get from zero to a working semantic search in one Python file. Install, open, add, search, save — plus real embeddings, namespaces, batch ingestion, and a complete chatbot memory example.

- **Category**: Deploy
- **Read time**: 8 min read
- **Date**: June 16, 2026
- **Author**: Feather DB (Engineering)
- **URL**: https://getfeather.store/theory/feather-db-python-quickstart-2026

---

## What you'll build

By the end of this guide you'll have a Python script that opens a `.feather` file, adds vectors with metadata, searches them with time-weighted adaptive scoring, and persists everything to disk. Then we'll extend it into a real chatbot memory backed by sentence-transformers or the Gemini API.

No server. No Docker. One file on disk.

## 1. Prerequisites

  - **Python 3.9 or newer** — check with `python --version`

  - **pip** — comes bundled with Python 3.9+

  - A terminal (macOS, Linux, or Windows WSL)

That's it. Feather DB ships a compiled C++/Rust core as a wheel, so there's nothing to build from source.

## 2. Install

```bash
pip install feather-db
```

The wheel includes the HNSW index, SIMD-optimised vector math (AVX2/AVX512 on x86, NEON on Apple Silicon), the BM25 hybrid layer, and the adaptive scoring engine. One command, no extra dependencies required for basic usage.

Verify the install:

```python
import feather_db as fdb
print(fdb.__version__)  # e.g. 0.16.0
```
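Section 10 relies on behaviour introduced in v0.16.0, so a script may want to check the installed version before depending on it. A minimal sketch, assuming `__version__` is a plain dotted numeric string like the `0.16.0` shown above; `parse_version` and `supports_fast_cold_load` are hypothetical helper names, not part of Feather DB:

```python
def parse_version(version):
    """Turn a dotted numeric string such as '0.16.0' into the tuple (0, 16, 0)."""
    return tuple(int(part) for part in version.split("."))

def supports_fast_cold_load(version):
    """True when the version is at least 0.16.0."""
    return parse_version(version) >= (0, 16, 0)

# In a real script you would pass fdb.__version__ here
print(supports_fast_cold_load("0.16.0"))  # True
print(supports_fast_cold_load("0.9.4"))   # False
```

Comparing integer tuples avoids the trap of comparing raw strings, where `"0.9.4"` would sort after `"0.16.0"`.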

## 3. Basic usage: open, add, search, save

This example uses random float arrays instead of real embeddings so you can run it immediately without any API key.

```python
import feather_db as fdb
import numpy as np

# Open (or create) a database file.
# dim= must match the dimension of your embedding vectors.
db = fdb.DB.open("quickstart.feather", dim=128)

# Add three vectors with integer IDs
rng = np.random.default_rng(42)

vec_a = rng.random(128).astype(np.float32)
vec_b = rng.random(128).astype(np.float32)
vec_c = rng.random(128).astype(np.float32)

db.add(id=1, vec=vec_a)
db.add(id=2, vec=vec_b)
db.add(id=3, vec=vec_c)

# Search: find the 2 nearest neighbours to a query vector
query = rng.random(128).astype(np.float32)
results = db.search(query, k=2)

for r in results:
    print(f"id={r.id}  score={r.score:.4f}")

# Persist to disk — the .feather file is now loadable in any future process
db.save()
db.close()
```

`DB.open()` creates the file if it doesn't exist, or loads it from disk if it does. `db.save()` writes the HNSW graph and all metadata atomically. `db.close()` flushes and releases the file handle.
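To make the `search()` result concrete, here is a brute-force sketch of what a k-nearest-neighbour query computes. It assumes cosine similarity as the similarity measure and ignores recency and importance; an HNSW index approximates this kind of ranking without scanning every vector. `cosine` and `brute_force_search` are hypothetical helpers written in plain Python, not Feather DB calls:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(vectors, query, k):
    """Return the k (id, score) pairs with the highest cosine similarity."""
    scored = [(id_, cosine(vec, query)) for id_, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

vectors = {
    1: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 0.0],
    3: [0.9, 0.1, 0.0],
}
top = brute_force_search(vectors, query=[1.0, 0.0, 0.0], k=2)
for id_, score in top:
    print(f"id={id_}  score={score:.4f}")
```

The exhaustive scan costs one similarity computation per stored vector, which is the cost an approximate index is designed to avoid.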

## 4. Real embeddings

Random vectors show the mechanics. Real embeddings make search semantically meaningful. Here are two drop-in options.

### Option A — sentence-transformers (local, no API key)

```bash
pip install sentence-transformers
```

```python
import feather_db as fdb
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim, ~90 MB download

def embed(text: str) -> np.ndarray:
    return model.encode(text, normalize_embeddings=True).astype(np.float32)

db = fdb.DB.open("semantic.feather", dim=384)

facts = [
    (1, "The Eiffel Tower is in Paris."),
    (2, "Python is a high-level programming language."),
    (3, "Feather DB stores vectors in a single .feather file."),
    (4, "The speed of light is approximately 299,792 km/s."),
]

for id_, text in facts:
    db.add(id=id_, vec=embed(text))

results = db.search(embed("Where is the Eiffel Tower?"), k=2)
for r in results:
    print(f"id={r.id}  score={r.score:.4f}")
# id=1  score=0.8341  ← correct top hit

db.save()
db.close()
```

### Option B — Gemini API (768-dim, free tier)

```bash
pip install google-generativeai
```

```python
import os
import feather_db as fdb
import numpy as np
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

def embed(text: str, task_type: str = "retrieval_document") -> np.ndarray:
    result = genai.embed_content(
        model="models/text-embedding-004",
        content=text,
        task_type=task_type
    )
    return np.array(result["embedding"], dtype=np.float32)

# Gemini text-embedding-004 outputs 768-dim vectors
db = fdb.DB.open("gemini_semantic.feather", dim=768)

db.add(id=1, vec=embed("User prefers concise answers."))
db.add(id=2, vec=embed("User is building a FastAPI backend."))
db.add(id=3, vec=embed("User's favourite language is Python."))

# Embed the query with the query-side task type
query_vec = embed("What programming language does the user like?", task_type="retrieval_query")
results = db.search(query_vec, k=2)
for r in results:
    print(f"id={r.id}  score={r.score:.4f}")

db.save()
db.close()
```

Both options produce `np.float32` arrays. The only thing that changes is the `dim=` argument you pass to `DB.open()`.
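Option A passes `normalize_embeddings=True`, which scales each vector to unit length. A small sketch in plain Python of why that is convenient: for unit-length vectors the dot product equals the cosine similarity. `l2_normalize` and `dot` are hypothetical helpers for illustration, and this guide does not state which distance metric Feather DB uses internally:

```python
import math

def l2_normalize(vec):
    """Scale a vector so its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

a = l2_normalize([3.0, 4.0])   # length 5 before scaling
b = l2_normalize([4.0, 3.0])   # length 5 before scaling

# Cosine of the original vectors is (12 + 12) / (5 * 5) = 0.96,
# and the dot product of the normalised vectors gives the same number.
print(round(dot(a, b), 6))  # 0.96
```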

## 5. Key API surface

Here's the core API you'll use in 95% of Feather DB programs:

```python
import feather_db as fdb
import numpy as np

# --- Open / close ---
db = fdb.DB.open("my.feather", dim=768)   # create or load
db.save()                                  # write to disk (call after mutations)
db.close()                                 # flush + release file handle

# --- Add a single vector ---
db.add(id=42, vec=np.zeros(768, dtype=np.float32))

# --- Add with metadata ---
meta = fdb.Metadata(importance=0.9)
db.add(id=43, vec=np.zeros(768, dtype=np.float32), meta=meta)

# --- Search ---
results = db.search(vec=np.zeros(768, dtype=np.float32), k=5)
for r in results:
    print(r.id, r.score)

# --- Count vectors in the index ---
print(db.count())
```

Every method is synchronous and thread-safe for concurrent reads. Writes serialize internally.

## 6. Metadata: Metadata(), set_attribute(), get_attribute()

`Metadata` attaches structured information to each vector. The `importance` field is a first-class multiplicative weight in the scoring formula: because it multiplies the final score, a higher value lifts a memory in the ranking at any age.

```python
import feather_db as fdb
import numpy as np

db = fdb.DB.open("meta_demo.feather", dim=128)
rng = np.random.default_rng(0)

# importance= is a float multiplier on the final score (default 1.0)
meta = fdb.Metadata(importance=1.5)

# set_attribute() stores arbitrary key-value string pairs
meta.set_attribute("text", "User explicitly stated they prefer dark mode.")
meta.set_attribute("source", "session-12")
meta.set_attribute("type", "preference")

db.add(id=1, vec=rng.random(128).astype(np.float32), meta=meta)
db.save()
db.close()   # release the file handle before reopening

# Reload from disk and read back attributes
db2 = fdb.DB.open("meta_demo.feather", dim=128)
results = db2.search(rng.random(128).astype(np.float32), k=1)
node = results[0]

if node.meta:
    print(node.meta.importance)              # 1.5
    print(node.meta.get_attribute("text"))   # "User explicitly stated..."
    print(node.meta.get_attribute("source")) # "session-12"

db2.close()
```

### Common mistake: don't use dict-style attribute assignment

```python
# WRONG — silently does nothing due to pybind11 copy semantics
meta.attributes["key"] = "value"

# CORRECT — always use set_attribute()
meta.set_attribute("key", "value")
```

The `meta.attributes` dict is a Python-side copy of the underlying C++ map. Assigning into it doesn't write back to the C++ object. `set_attribute()` does. Use it every time.
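The same behaviour can be reproduced in pure Python. This is a stand-in that mimics the copy-on-read pattern described above; `FakeMetadata` is hypothetical and is not how Feather DB's binding is actually written:

```python
class FakeMetadata:
    """Toy model: the real data lives in _store, and reads hand out a copy."""

    def __init__(self):
        self._store = {}

    @property
    def attributes(self):
        # A fresh dict on every access, like a C++ map converted to a Python dict
        return dict(self._store)

    def set_attribute(self, key, value):
        self._store[key] = value

    def get_attribute(self, key):
        return self._store.get(key)

meta = FakeMetadata()
meta.attributes["key"] = "value"      # mutates a throwaway copy
print(meta.get_attribute("key"))      # None

meta.set_attribute("key", "value")    # writes to the backing store
print(meta.get_attribute("key"))      # value
```

No exception is raised on the first assignment, which is why the mistake is easy to miss.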

## 7. Search parameters: k, half_life, time_weight

Feather DB's scoring formula is:

```python
stickiness    = 1 + log(1 + recall_count)
effective_age = age_days / stickiness
recency       = 0.5 ** (effective_age / half_life)
score         = ((1 - time_weight) * similarity + time_weight * recency) * importance
```

Three parameters control the retrieval behaviour:

  - **`k`**: how many results to return. Start at 5 for most applications.

  - **`half_life`**: days until an unrecalled memory's recency score halves. Default: `14`. Use 7 for news agents, 30–60 for personal assistants, 180–365 for architecture decisions.

  - **`time_weight`**: fraction of the score allocated to recency vs similarity. Default: `0.3`. At `0.0` the ranking depends only on similarity scaled by `importance`, which matches a standard vector DB when every memory has the same `importance`. At `1.0` only freshness and `importance` matter.
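The formula can be checked by hand with a few lines of plain Python. This sketch transcribes the four formula lines above, assuming `log` is the natural logarithm; `adaptive_score` is a hypothetical name, not a Feather DB function:

```python
import math

def adaptive_score(similarity, age_days, recall_count=0,
                   half_life=14.0, time_weight=0.3, importance=1.0):
    """Blend similarity and recency, then scale by importance."""
    stickiness = 1 + math.log(1 + recall_count)   # assumption: natural log
    effective_age = age_days / stickiness
    recency = 0.5 ** (effective_age / half_life)
    return ((1 - time_weight) * similarity + time_weight * recency) * importance

# A 14-day-old, never-recalled memory with similarity 0.8 and default settings:
# recency = 0.5, so score = 0.7 * 0.8 + 0.3 * 0.5 = 0.71
score = adaptive_score(similarity=0.8, age_days=14)
print(round(score, 2))  # 0.71

# With time_weight=0.0 recency drops out and only similarity * importance remains
pure = adaptive_score(similarity=0.8, age_days=14, time_weight=0.0, importance=1.5)
print(round(pure, 2))  # 1.2
```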

```python
import feather_db as fdb
import numpy as np

db = fdb.DB.open("params_demo.feather", dim=128)
rng = np.random.default_rng(7)
for i in range(20):
    db.add(id=i, vec=rng.random(128).astype(np.float32))

query = rng.random(128).astype(np.float32)

# Pure similarity — ignore time entirely
results_sim = db.search(query, k=5, time_weight=0.0)

# Default — 30% recency, 14-day half-life
results_default = db.search(query, k=5, half_life=14, time_weight=0.3)

# News agent — heavy recency, 7-day half-life
results_news = db.search(query, k=5, half_life=7, time_weight=0.5)

# Long-term knowledge base — 180-day half-life, low time weight
results_kb = db.search(query, k=5, half_life=180, time_weight=0.1)

db.close()
```

Tune `half_life` first — it has the biggest effect on which memories surface. Adjust `time_weight` only if you need freshness to compete with or override similarity.
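To get a feel for `half_life` before tuning it, the recency term can be tabulated on its own. A sketch under the same assumptions as the formula in this section (natural log in `stickiness`); `recency` is a hypothetical helper, not a Feather DB call:

```python
import math

def recency(age_days, half_life, recall_count=0):
    """Recency term of the scoring formula, between 0 and 1."""
    stickiness = 1 + math.log(1 + recall_count)   # assumption: natural log
    return 0.5 ** ((age_days / stickiness) / half_life)

# One row per half-life, one column per memory age in days
for half_life in (7, 14, 180):
    row = [round(recency(age, half_life), 3) for age in (1, 14, 60)]
    print(f"half_life={half_life:>3}: {row}")

# A memory that has been recalled decays more slowly than one that has not
print(recency(28, 14) < recency(28, 14, recall_count=5))  # True
```

A 14-day-old memory keeps half its recency at `half_life=14` but only a quarter at `half_life=7`, which is why this parameter changes the ranking so visibly.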

## 8. Namespaces: default and custom

Namespaces partition the index. Memories in namespace `"user-alice"` are completely invisible to searches in namespace `"user-bob"`. ANN traversal is scoped per namespace, so search latency scales with the number of memories for that user — not the total number of memories across all users.

```python
import feather_db as fdb
import numpy as np

db = fdb.DB.open("multi_user.feather", dim=128)
rng = np.random.default_rng(99)

def rand_vec():
    return rng.random(128).astype(np.float32)

# Default namespace — no namespace= argument needed
db.add(id=1, vec=rand_vec())

# Custom namespaces — hard tenant boundaries
db.add(id=10, vec=rand_vec(), namespace="user-alice")
db.add(id=11, vec=rand_vec(), namespace="user-alice")
db.add(id=20, vec=rand_vec(), namespace="user-bob")

query = rand_vec()

# Default namespace search — never returns user-alice or user-bob results
default_results = db.search(query, k=5)

# alice's namespace — bob's memories never appear
alice_results = db.search(query, k=5, namespace="user-alice")

# bob's namespace — alice's memories never appear
bob_results = db.search(query, k=5, namespace="user-bob")

db.save()
db.close()
```

One `.feather` file can hold thousands of namespaces. Use `namespace=user_id` for multi-tenant SaaS, `namespace=agent_id` for multi-agent systems, or leave it unset for single-user applications.
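Conceptually, a namespace behaves like a separate index selected by key. The toy model below illustrates the isolation guarantee with a dict of per-namespace stores and a brute-force dot-product search; `ToyIndex` is hypothetical and says nothing about how Feather DB lays out namespaces on disk:

```python
from collections import defaultdict

class ToyIndex:
    """In-memory stand-in: one independent id -> vector store per namespace."""

    def __init__(self):
        self._spaces = defaultdict(dict)

    def add(self, id_, vec, namespace="default"):
        self._spaces[namespace][id_] = vec

    def search(self, query, k, namespace="default"):
        # Only the chosen namespace is scanned, so cost tracks its size
        candidates = self._spaces.get(namespace, {})
        scored = sorted(
            candidates.items(),
            key=lambda item: sum(q * v for q, v in zip(query, item[1])),
            reverse=True,
        )
        return [id_ for id_, _ in scored[:k]]

index = ToyIndex()
index.add(1, [1.0, 0.0])
index.add(10, [1.0, 0.0], namespace="user-alice")
index.add(11, [0.0, 1.0], namespace="user-alice")
index.add(20, [1.0, 0.0], namespace="user-bob")

print(index.search([1.0, 0.0], k=5, namespace="user-alice"))  # [10, 11]
print(index.search([1.0, 0.0], k=5, namespace="user-bob"))    # [20]
print(index.search([1.0, 0.0], k=5))                          # [1]
```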

## 9. add_batch() for bulk ingestion

Sequential `db.add()` calls cross the Python/C++ boundary on every insert. For large corpora, use `add_batch()` — it releases the GIL and builds the HNSW graph in parallel across all available CPU cores.

```python
import feather_db as fdb
import numpy as np

db = fdb.DB.open("corpus.feather", dim=768)

# Prepare data as numpy arrays
N = 50_000
ids  = list(range(N))
vecs = np.random.randn(N, 768).astype(np.float32)

# Optional: attach metadata to each vector
metas = []
for i in range(N):
    m = fdb.Metadata(importance=0.8)
    m.set_attribute("source", "batch_import")
    m.set_attribute("chunk_index", str(i))
    metas.append(m)

# Single parallel call — ~3.4× faster than a sequential add() loop
db.add_batch(ids, vecs, metas=metas)
db.save()
db.close()
```

**Benchmark on a 4-core machine, 50k × 768-dim vectors:**

| Method | Time | Speedup |
| --- | --- | --- |
| Sequential `add()` loop | ~34s | 1× |
| `add_batch()` | ~10s | 3.4× |

On an 8-core machine expect 5–6×. Use `add_batch()` any time you're inserting more than ~1k vectors at once. Use sequential `add()` for real-time, single-item inserts (e.g. writing a new memory immediately after a conversation turn).
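For corpora too large to embed or hold in memory at once, one option is to feed `add_batch()` in slices. The sketch below only shows the slicing and assumes repeated `add_batch()` calls on the same database are allowed; `chunked` is a hypothetical helper and the batch size of 10,000 is an arbitrary starting point, not a Feather DB recommendation:

```python
def chunked(total, batch_size):
    """Yield (start, stop) index pairs that cover range(total) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total, batch_size):
        yield start, min(start + batch_size, total)

ids = list(range(25_000))
for start, stop in chunked(len(ids), batch_size=10_000):
    # In a real script: db.add_batch(ids[start:stop], vecs[start:stop])
    print(f"batch {start}:{stop} -> {stop - start} vectors")
```

The last slice is simply shorter than the others, so no padding is needed.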

## 10. v0.16.0 fast cold load: save() persists the graph, open() loads in 48ms

Before v0.16.0, `DB.open()` on a non-empty file rebuilt the HNSW graph from scratch. At 40k vectors that took ~7 seconds. v0.16.0 persists the compiled graph structure inside the `.feather` file so `open()` can memory-map it directly.

```python
import time
import feather_db as fdb
import numpy as np

# First run: build and save (happens once)
db = fdb.DB.open("fast_load.feather", dim=768)
vecs = np.random.randn(40_000, 768).astype(np.float32)
db.add_batch(list(range(40_000)), vecs)
db.save()   # persists HNSW graph — subsequent opens skip rebuild
db.close()

# Every subsequent run: graph loads from disk in ~48ms
t0 = time.perf_counter()
db2 = fdb.DB.open("fast_load.feather", dim=768)
elapsed = (time.perf_counter() - t0) * 1000
print(f"Loaded {db2.count()} vectors in {elapsed:.0f}ms")
# → Loaded 40000 vectors in 48ms

db2.close()
```

This matters most for serverless functions, Kubernetes pods with frequent restarts, and development loops where you restart the process after every code change. Call `db.save()` after any batch of mutations to keep the on-disk graph current.

Combine with `FEATHER_LOAD_THREADS` for even faster loads on larger indexes:

```python
import os
import feather_db as fdb

os.environ["FEATHER_LOAD_THREADS"] = "8"   # parallel HNSW reconstruction
db = fdb.DB.open("large_corpus.feather", dim=768)
```

## 11. Common mistakes

### meta.attributes[key] = val — silently does nothing

```python
# WRONG
meta = fdb.Metadata(importance=1.0)
meta.attributes["text"] = "User prefers Python"  # no-op — pybind11 copy issue

# CORRECT
meta = fdb.Metadata(importance=1.0)
meta.set_attribute("text", "User prefers Python")  # writes to C++ object
```

### Forgetting db.save() after mutations

```python
db.add(id=1, vec=vec)
# Process crashes or exits here — the add() is lost

# Correct pattern: save after every logical batch
db.add(id=1, vec=vec)
db.add(id=2, vec=vec2)
db.save()  # atomic write — all-or-nothing
```
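The "atomic write" comment describes an all-or-nothing guarantee. This guide does not show how `save()` achieves it; a common technique for that kind of guarantee is to write to a temporary file and then rename it over the target, sketched here with the standard library only. `atomic_write` is a hypothetical helper, not Feather DB code:

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace path with data so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())      # push bytes to disk before the rename
        os.replace(tmp_path, path)      # rename over the target in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "demo.bin")
    atomic_write(target, b"version 1")
    atomic_write(target, b"version 2")
    with open(target, "rb") as fh:
        final = fh.read()
    leftovers = os.listdir(workdir)

print(final)      # b'version 2'
print(leftovers)  # ['demo.bin']
```

If the process dies before `os.replace()`, the previous file is still intact, which is the property that makes saving after every logical batch safe.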

### Mixing dim= across open() calls

```python
import numpy as np
import feather_db as fdb

# Created with dim=768
db = fdb.DB.open("memory.feather", dim=768)
db.add(id=1, vec=np.zeros(768, dtype=np.float32))
db.save()
db.close()

# Reopened with wrong dim — raises ValueError
db2 = fdb.DB.open("memory.feather", dim=1536)  # mismatch!
```

The `dim=` is stored in the file header. Always pass the same value that was used when the file was first created.
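One way to avoid the mismatch is to keep the dimension in a single constant and validate vectors before they reach the database. A minimal sketch; `EMBED_DIM` and `check_dim` are hypothetical names, and the check is a client-side convenience rather than something Feather DB requires:

```python
EMBED_DIM = 768   # single source of truth; reuse it as DB.open(..., dim=EMBED_DIM)

def check_dim(vec, expected=EMBED_DIM):
    """Return vec unchanged, or raise if its length differs from expected."""
    if len(vec) != expected:
        raise ValueError(f"expected {expected}-dim vector, got {len(vec)}")
    return vec

check_dim([0.0] * 768)            # passes silently

try:
    check_dim([0.0] * 1536)       # wrong embedding model or stale constant
except ValueError as err:
    message = str(err)
    print(message)   # expected 768-dim vector, got 1536
```

Failing early with both numbers in the message makes it obvious whether the embedding model or the database file is the odd one out.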

## Complete example: chatbot memory

This is a minimal but complete persistent chatbot memory. It works with any embedding function — swap in `sentence-transformers`, Gemini, or OpenAI by replacing the `embed()` function.

```python
"""
chatbot_memory.py — persistent memory for a chatbot using Feather DB.

Install:
    pip install feather-db sentence-transformers openai

Usage:
    python chatbot_memory.py
"""
import os
import time
import numpy as np
import feather_db as fdb
from openai import OpenAI

# --- Embedding setup (swap this block for any provider) ---
from sentence_transformers import SentenceTransformer
_model = SentenceTransformer("all-MiniLM-L6-v2")

def embed(text: str) -> np.ndarray:
    return _model.encode(text, normalize_embeddings=True).astype(np.float32)

DIM = 384   # all-MiniLM-L6-v2 output dimension

# --- Feather DB setup ---
db = fdb.DB.open("chatbot_memory.feather", dim=DIM)

# --- Memory helpers ---
def remember(text: str, importance: float = 0.5, source: str = "chat") -> int:
    """Store a piece of text in persistent memory."""
    # Simple ID: millisecond timestamp folded into 31 bits (not collision-proof)
    node_id = int(time.time() * 1000) % (2 ** 31)
    meta = fdb.Metadata(importance=importance)
    meta.set_attribute("text", text)
    meta.set_attribute("source", source)
    meta.set_attribute("ts", str(time.time()))
    db.add(id=node_id, vec=embed(text), meta=meta)
    return node_id

def recall(query: str, k: int = 5) -> list[str]:
    """Retrieve the k most relevant memories for a query."""
    results = db.search(
        vec=embed(query),
        k=k,
        half_life=30,       # recency halves every 30 days if not recalled
        time_weight=0.3     # 30% recency, 70% semantic similarity
    )
    texts = []
    for r in results:
        if r.meta:
            t = r.meta.get_attribute("text")
            if t:
                texts.append(t)
    return texts

# --- Chat loop ---
llm = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def chat(user_message: str) -> str:
    # 1. Retrieve relevant memories
    memories = recall(user_message, k=5)
    memory_block = "\n".join(f"- {m}" for m in memories)

    # 2. Call the LLM with memory context injected
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a helpful assistant with persistent memory.\n"
                    "Relevant facts from past conversations:\n"
                    + (memory_block or "(no memories yet)")
                )
            },
            {"role": "user", "content": user_message}
        ]
    )
    reply = response.choices[0].message.content

    # 3. Write the exchange back to memory
    exchange = f"User: {user_message[:120]}  Assistant: {reply[:200]}"
    remember(exchange, importance=0.4, source="session")

    # 4. Save after every turn so the new memory survives a crash or restart
    db.save()

    return reply

# Seed some initial facts about the user (run once)
if db.count() == 0:
    remember("User's name is Ashwath.", importance=0.9, source="identity")
    remember("User is building an AI agent for customer support.", importance=0.8, source="context")
    remember("User prefers concise answers, no filler phrases.", importance=0.85, source="preference")
    db.save()
    print("Memory seeded. Starting chat...\n")

# Interactive loop
while True:
    try:
        user_input = input("You: ").strip()
    except (EOFError, KeyboardInterrupt):
        break
    if not user_input or user_input.lower() in ("exit", "quit"):
        break
    print(f"Assistant: {chat(user_input)}\n")

db.close()
print("Bye. Memories saved to chatbot_memory.feather.")
```

What this does on each turn:

  - Embeds the user message and searches the `.feather` file for the 5 most relevant memories, scored by a weighted blend of semantic similarity and recency, multiplied by importance.

  - Injects those memories into the system prompt so the LLM knows what it needs to know — without sending the entire history every time.

  - Writes the exchange back as a new memory node so knowledge accumulates across sessions.

  - Calls `db.save()` so nothing is lost if the process exits.

Restart the script and the memories survive. The chatbot remembers across processes, sessions, and deployments — because it's just a file.
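One caveat in `remember()`: IDs come from a millisecond timestamp taken modulo `2 ** 31`, so two memories stored in the same millisecond can receive the same ID, and the value wraps around roughly every 24.9 days. A hedged alternative is a counter that never hands out the same value twice within a process. `IdAllocator` is a hypothetical helper, and seeding it from `db.count()` assumes IDs were assigned sequentially from zero and nothing was deleted:

```python
import itertools
import threading

class IdAllocator:
    """Hand out strictly increasing integer IDs, safely across threads."""

    def __init__(self, start=0):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            return next(self._counter)

# In the chatbot this could be created as IdAllocator(start=db.count())
ids = IdAllocator(start=3)
first_four = [ids.next_id() for _ in range(4)]
print(first_four)  # [3, 4, 5, 6]
```

What happens when an existing ID is added a second time is not covered in this guide, so avoiding duplicates on the client side is the conservative choice.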

## What's next

  - **Namespaces** — serve multiple users from one file: [Namespace and Entity Design](/blog/feather-db-namespace-entity-design)

  - **Faster bulk ingest** — [add_batch(): 3.4× Faster Bulk Ingestion](/blog/feather-db-add-batch-parallel-ingestion)

  - **Adaptive scoring deep-dive** — [The Feather DB Adaptive Scoring Formula](/blog/feather-db-adaptive-scoring-explained)

  - **MCP integration** — use Feather DB as Claude Desktop's memory: [MCP + Claude Desktop Setup](/blog/feather-db-mcp-claude-desktop-setup)

  - **GitHub:** [github.com/feather-store/feather](https://github.com/feather-store/feather)

**Install:** `pip install feather-db`

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/feather-db-python-quickstart-2026](https://getfeather.store/theory/feather-db-python-quickstart-2026). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*