# Feather DB File Format: From v3 to v9 — What Changed

> Seven format versions, one embedded file. Feather DB's `.feather` format evolved from a plain float32+HNSW layout (v3) to a fully self-contained persisted-graph format (v9) that cold-loads a 500K-node index in 48ms. Here's every change that mattered.

- **Category**: Architecture
- **Read time**: 6 min read
- **Date**: June 18, 2026
- **Author**: Feather DB (Engineering)
- **URL**: https://getfeather.store/theory/feather-db-file-format-evolution

---

## One file that does everything

Feather DB has always been a single-file database. The entire persistent state — vectors, HNSW graph, typed edges, metadata, quantization factors, namespace tables — lives in one `.feather` binary. Copy it to move a database. Delete it to drop one.

That constraint has driven every file format decision since v3. The format has to be self-describing, backward-compatible, and fast to open. Across the seven versions from v3 through v9, it evolved to hit all three. This post covers every version, what it added, and the benchmark numbers behind the v8 and v9 changes.

## The version history at a glance

| Version | Released | Key change | HNSW on load |
|---|---|---|---|
| v3 | v0.10.0 | Initial stable format. Float32 vectors, HNSW graph, flat metadata. | Rebuild from scratch |
| v4 | v0.11.0 | Typed context graph edges. Breaking change from v3. | Rebuild from scratch |
| v5 | v0.12.0 | Recall count tracking added to metadata. Non-breaking. | Rebuild from scratch |
| v6 | v0.13.0 | Section index introduced. Prior versions were fully sequential. Breaking change. | Rebuild from scratch |
| v7 | v0.14.0 | Namespace table. Optimized variable-length HNSW neighbor list encoding. | Rebuild from scratch |
| v8 | v0.15.0 | Per-vector int8 scale factors. Metadata offset table for O(log n) lookup. Parallel load via `FEATHER_LOAD_THREADS`. | Parallel rebuild (4.79× faster) |
| v9 | v0.16.0 | Persisted HNSW graph embedded in file. Per-modality `persist_graph` flag. int8 modalities get int8 graph with exact round-trip. | Direct load, no rebuild |

## v3 to v7: the base era

The v3 format established the structure that all later versions built on: a fixed-width magic + version header, followed by sections for the vector store, HNSW graph, metadata, and context graph edges.

The critical design choice in this era — one that had to be revisited in v9 — was how the HNSW graph was stored. The file persisted the raw graph data (neighbor lists, level assignments, the global entry point), but the in-memory graph structure was always reconstructed from those bytes on `DB.open()`. Reconstruction means allocating the in-memory nodes, wiring up the neighbor pointers, and validating HNSW invariants. For a large index, this is expensive CPU work.

The v6 section index was the most structurally important change in this era. Before v6, sections were written sequentially with no directory: a reader had to scan from the start to find any section. The section index — a table of `(section_type, byte_offset, byte_length)` triples written immediately after the header — lets `DB.open()` seek directly to any section. It also made the format extensible: unknown section types are skipped, so future versions can add sections without breaking older readers.
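
To make the directory idea concrete, here is a minimal sketch of a section index in plain Python. The magic bytes, field widths, and section type numbers are all assumptions chosen for illustration; the actual `.feather` byte layout is not specified in this post.

```python
import io
import struct

MAGIC = b"FTHR"                   # hypothetical magic bytes
HEADER = struct.Struct("<4sII")   # magic, format version, section count
ENTRY = struct.Struct("<IQQ")     # section_type, byte_offset, byte_length

def write_file(version, sections):
    """Serialize a header, a section index, then the payloads.

    sections: list of (section_type, payload_bytes) pairs.
    """
    offset = HEADER.size + ENTRY.size * len(sections)
    out = io.BytesIO()
    out.write(HEADER.pack(MAGIC, version, len(sections)))
    for section_type, payload in sections:
        out.write(ENTRY.pack(section_type, offset, len(payload)))
        offset += len(payload)
    for _, payload in sections:
        out.write(payload)
    return out.getvalue()

def read_index(buf):
    """Parse the header and return (version, {section_type: (offset, length)})."""
    magic, version, count = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("unexpected magic bytes")
    index = {}
    for i in range(count):
        stype, off, length = ENTRY.unpack_from(buf, HEADER.size + i * ENTRY.size)
        index[stype] = (off, length)
    return version, index

def read_section(buf, index, section_type):
    """Jump straight to one section; return None if the file does not have it."""
    entry = index.get(section_type)
    if entry is None:
        return None
    off, length = entry
    return buf[off:off + length]
```

A reader built this way only looks up the section types it knows about, so an unfamiliar type in the index is never touched, and a missing type comes back as `None` rather than as a parse error.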

v7 added the namespace table and tightened the HNSW neighbor list encoding to variable-length arrays, reducing file size for sparse upper-level nodes.
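
The size effect of variable-length neighbor lists can be sketched as follows. This is an illustration under assumed conventions (uint32 node IDs, a uint16 count prefix, `0xFFFFFFFF` as padding), not Feather's actual encoding.

```python
import struct

PAD = 0xFFFFFFFF  # hypothetical filler value for unused slots

def encode_fixed(neighbor_lists, width):
    """Pad every list to `width` uint32 slots, regardless of how many are used."""
    out = bytearray()
    for ids in neighbor_lists:
        padded = list(ids) + [PAD] * (width - len(ids))
        out += struct.pack(f"<{width}I", *padded)
    return bytes(out)

def encode_variable(neighbor_lists):
    """Write a uint16 count followed by exactly that many uint32 IDs."""
    out = bytearray()
    for ids in neighbor_lists:
        out += struct.pack(f"<H{len(ids)}I", len(ids), *ids)
    return bytes(out)

def decode_variable(buf):
    """Inverse of encode_variable: walk the buffer one count-prefixed list at a time."""
    lists, pos = [], 0
    while pos < len(buf):
        (count,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        lists.append(list(struct.unpack_from(f"<{count}I", buf, pos)))
        pos += 4 * count
    return lists
```

For three sparse upper-level nodes with one, two, and zero neighbors, the fixed encoding at 16 slots takes 192 bytes while the variable encoding takes 18.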

## v8: int8 quantization and parallel load

v8 shipped with Feather DB v0.15.0 and addressed two problems simultaneously: memory footprint and cold start latency.

### int8 quantization support

v8 added a per-vector int8 quantization section to the file. Each float32 vector can be stored alongside its int8 encoding and a single float32 scale factor. When `set_int8_ram()` is called after `DB.open()`, Feather quantizes the in-memory graph in-place, keeping vectors as int8 with per-vector `max_abs` scale.

At 60K × 768-dim, RAM drops from 227 MB (float32) to 129 MB (int8) — a 1.76× reduction. Recall@10 moves from 0.972 to ~0.88. For context retrieval workloads where you're surfacing 5–10 chunks, 0.88 is fine.
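
Per-vector `max_abs` scaling can be sketched in a few lines of plain Python. The function names are hypothetical and the rounding and clamping choices are assumptions; Feather's internal quantizer may differ in detail.

```python
def quantize_int8(vector):
    """Map floats to int8 codes using one scale per vector (max_abs / 127)."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    if max_abs == 0.0:
        # All-zero (or empty) vector: any scale works, codes are all zero.
        return [0] * len(vector), 1.0
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale

def dequantize_int8(codes, scale):
    """Recover approximate floats; the error per component is at most scale / 2."""
    return [c * scale for c in codes]
```

Because the scale is derived from each vector's own largest component, a vector with small magnitudes keeps the same relative resolution as one with large magnitudes, at the cost of one extra float32 per vector.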

### Parallel HNSW load

The more operationally impactful v8 change was parallel graph reconstruction. The graph build phase — wiring up in-memory neighbor pointers from the serialized edge lists — parallelizes nearly linearly: each node's neighbor list is independent of every other node's reconstruction.

The `FEATHER_LOAD_THREADS` environment variable controls the thread count:

```python
import os
import feather_db as fdb

os.environ["FEATHER_LOAD_THREADS"] = "8"
db = fdb.DB.open("memory.feather", dim=768)
# Benchmark reference (40K × 128-dim index): ~1.7s at 8 threads vs 7.6s serial
```

At 8 threads, a 40K × 128-dim index drops from 7.6s to 1.7s (4.5×). The speedup saturates around 8–16 threads because the IO phase and cache contention become the bottleneck. But this was still a rebuild — every `DB.open()` call reconstructed the graph from the raw edge data. For a 500K-node index, that means 13.4s serial or 2.7s parallel. Every time.
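
The reason the wiring step partitions cleanly can be shown with a small sketch. The `Node` class and `rebuild_graph` function are hypothetical, and how Feather actually parses `FEATHER_LOAD_THREADS` is an assumption here.

```python
import os
from concurrent.futures import ThreadPoolExecutor

class Node:
    """Hypothetical in-memory node: an ID plus direct references to neighbors."""
    __slots__ = ("node_id", "neighbors")

    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = []

def rebuild_graph(edge_lists, threads=None):
    """Turn serialized edge lists {node_id: [neighbor_id, ...]} into linked nodes."""
    if threads is None:
        # Assumed parsing: a plain integer, defaulting to serial.
        threads = int(os.environ.get("FEATHER_LOAD_THREADS", "1"))

    # Allocate every node up front so wiring only needs lookups.
    nodes = {nid: Node(nid) for nid in edge_lists}

    def wire(nid):
        # Each task writes only to its own node, so tasks never conflict.
        nodes[nid].neighbors = [nodes[n] for n in edge_lists[nid]]

    with ThreadPoolExecutor(max_workers=max(1, threads)) as pool:
        list(pool.map(wire, edge_lists))  # drain the iterator to surface errors
    return nodes
```

In CPython this will not reproduce the reported speedup, because the interpreter lock serializes pure-Python work. The sketch only shows the structure: allocation happens once, and every wiring task reads shared state but writes to a single node.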

## v9: persisted HNSW graph

v9 ships with Feather DB v0.16.0. The change is direct: the HNSW graph is now embedded in the `.feather` file in its final in-memory representation. On load, Feather maps the persisted graph directly — no reconstruction, no pointer wiring, no invariant validation. The graph is ready to use as-is.

### What "persisted graph" means structurally

In v3–v8, the graph section stored raw edge data: neighbor ID lists per node, level assignments, the global entry point. Reconstruction turned that into the in-memory graph structure. In v9, the persisted graph section stores the in-memory structure itself — including the exact link lists that HNSW traversal reads during search. On load, Feather reads the section and has a live graph without any build step.
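
One way to picture a graph that is usable straight from its bytes is a flat, offset-indexed link array. The sketch below swaps in a generic CSR-style layout for illustration; it is not Feather's actual v9 section format, and it uses native byte order where a real file format would pin the endianness.

```python
from array import array

def save_flat(neighbor_lists):
    """Flatten per-node lists into an offsets array and one contiguous link array."""
    offsets, links = array("Q", [0]), array("I")
    for ids in neighbor_lists:
        links.extend(ids)
        offsets.append(len(links))  # end position of this node's links
    return offsets.tobytes(), links.tobytes()

def load_flat(offsets_bytes, links_bytes):
    """Loading is two bulk copies; no per-node objects are created or wired."""
    offsets, links = array("Q"), array("I")
    offsets.frombytes(offsets_bytes)
    links.frombytes(links_bytes)
    return offsets, links

def neighbors(offsets, links, node_id):
    """Read one node's link list directly out of the flat array."""
    return links[offsets[node_id]:offsets[node_id + 1]]
```

With this shape, the cost of opening the graph is proportional to the bytes copied rather than to the number of pointers that have to be allocated and connected.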

### Per-modality persist_graph flag

Persisting the graph is opt-in per modality. The v9 file header includes a per-modality `persist_graph` flag. When `True`, the full link list structure for that modality is embedded. When `False`, the modality falls back to the v8 behavior (parallel rebuild from edge data).

This matters because not every modality benefits equally. A modality with 5K vectors costs little to rebuild. A modality with 500K vectors is exactly where the 48ms vs 2.7s difference is felt.

### int8 modalities get int8 graphs

For modalities with int8 quantization enabled, v9 persists the link lists in int8 form. The round-trip is exact: the int8 graph written at `save()` is byte-identical to the int8 graph used during search before the save. No precision is lost moving through the file boundary.

### File size tradeoff

Persisting the link lists increases file size by approximately 25%. For a 500K-vector index at M=16, that is a meaningful number of extra bytes, but the cold load benchmarks in the next section make the tradeoff obvious.

## Cold load benchmarks across format versions

| Scenario | Load time | Speedup vs v8 serial |
|---|---|---|
| v8 file, serial rebuild (`FEATHER_LOAD_THREADS=1`) | 13.4s | 1× |
| v8 file, parallel rebuild (`FEATHER_LOAD_THREADS=8`) | 2.7s | 5× |
| v9 file, persisted graph load | 48ms | 279× (56× faster than v8 parallel) |
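
The speedup figures follow directly from the three load times. A quick check, using only the numbers reported above:

```python
serial_s, parallel_s, persisted_s = 13.4, 2.7, 0.048  # seconds

speedup_parallel = serial_s / parallel_s         # v8 parallel vs v8 serial
speedup_vs_serial = serial_s / persisted_s       # v9 persisted vs v8 serial
speedup_vs_parallel = parallel_s / persisted_s   # v9 persisted vs v8 parallel

print(f"{speedup_parallel:.2f}x, {speedup_vs_serial:.0f}x, {speedup_vs_parallel:.0f}x")
```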

48ms versus 2.7s. The difference is whether your serverless function warms up before the request times out, whether your Kubernetes pod is ready before the liveness probe fails, whether your dev server feels instant or sluggish after a restart.

## When the fast path does not apply

The 48ms load requires a v9 file with a persisted graph. Three conditions cause Feather to fall back to parallel rebuild instead:

- **`forget()` or `purge()` called since the last `save()`**. These mutate the graph. The persisted section is now stale. Feather marks the index dirty and rebuilds on the next load.

- **On-disk quantization enabled for the modality.** When vectors are stored in quantized form on disk, the link list geometry may not match what the full-precision graph would produce. Feather skips graph persistence for these modalities and falls back to rebuild.

- **The file is v3–v8.** v9 readers load older files without issue — they just rebuild the graph as those versions always did.
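
The conditions above, together with the per-modality `persist_graph` flag, amount to a small decision function. The sketch below is a hypothetical model of that logic; the `ModalityState` fields and the `load_path` name are invented for illustration and are not part of the Feather API.

```python
from dataclasses import dataclass

@dataclass
class ModalityState:
    """Hypothetical summary of what the loader knows about one modality."""
    file_version: int
    persist_graph: bool        # the per-modality flag from the v9 header
    dirty: bool                # forget()/purge() called since the last save()
    on_disk_quantized: bool    # vectors stored in quantized form on disk

def load_path(state):
    """Return "persisted" for the fast path, "rebuild" for the fallback."""
    if state.file_version < 9:
        return "rebuild"       # v3 to v8 files carry no persisted graph section
    if not state.persist_graph:
        return "rebuild"       # persistence was not opted into
    if state.dirty or state.on_disk_quantized:
        return "rebuild"       # persisted section is stale or was skipped
    return "persisted"
```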

The fix for the first case is straightforward: after any `forget()` or `purge()` batch, run `compact()` then `save()`. The `compact()` call defragments the graph after deletions, and the subsequent `save()` persists the clean graph at v9. The next `DB.open()` gets the 48ms path again.

```python
import feather_db as fdb

db = fdb.DB.open("memory.feather", dim=768)

# Delete stale entries
db.forget(old_node_ids)

# Re-enable fast load path
db.compact()
db.save("memory.feather")

# Next open() will use the persisted v9 graph
```

## Backward compatibility

v9 readers load v3–v8 files without modification. The section index design (introduced in v6) means the new persisted-graph section is simply absent in older files — the reader treats its absence as "fall back to rebuild." No migration tool required to read old files. Migration is only needed to get the new fast-load benefit, and it's a single `save()` call from a v0.16.0 install.

## The format philosophy

Each format version solved a specific problem that emerged at scale. v6's section index solved discoverability. v8's parallel load cut cold start time by roughly 5×. v9's persisted graph closes the remaining gap: large indexes that were fast to query but slow to re-initialize.

The constraint — one file, zero infrastructure — has held across every version. The complexity lives in the format. The API stays the same: `DB.open()`, `db.search()`, `db.save()`.

**Install:** `pip install feather-db` · **GitHub:** [github.com/feather-store/feather](https://github.com/feather-store/feather)

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/feather-db-file-format-evolution](https://getfeather.store/theory/feather-db-file-format-evolution). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*