# v0.15.3: Adaptive HNSW Index Capacity — 7.7× Less Memory

> Feather DB v0.15.3 drops the hardcoded max_elements=1,000,000 cap on every HNSW index. Indexes now start at 4096 slots and grow on demand. A 19-namespace workload went from 709 MB to 92 MB index overhead — a 7.7× reduction with zero API or file-format changes.

- **Category**: Performance
- **Read time**: 7 min read
- **Date**: June 17, 2026
- **Author**: Feather DB (Engineering)
- **URL**: https://getfeather.store/theory/feather-db-v0153-adaptive-hnsw-index-capacity

---

## The problem: every index was pre-allocated for a million vectors

Until v0.15.3, every HNSW index created inside Feather DB was initialized with `max_elements=1,000,000`. The hnswlib library pre-allocates neighbor-list storage for every slot up front. That means even an index with 100 vectors was eating memory sized for one million.

Feather DB uses a separate HNSW index per namespace per modality. An application with 19 namespaces — users, agents, topics, or any other logical boundary — would create 19 (or more) of these indexes. Each one reserved enough RAM for a million vectors. The overhead was paid unconditionally, regardless of how many vectors actually lived in each namespace.

In a real 19-namespace workload we measured before the fix:

| Metric | Before v0.15.3 | After v0.15.3 |
|---|---|---|
| HNSW index overhead | 709 MB | 92 MB |
| Improvement | — | **7.7× less** |

The fix didn't change a single line of user-facing API. No migration required. Same file format (v8).

## Why max_elements matters so much

hnswlib's `max_elements` parameter controls the size of several pre-allocated data structures inside the graph. The dominant cost is the neighbor-list array: for each of the `max_elements` slots, hnswlib reserves space for up to `M * 2` neighbor IDs at layer 0 and `M` neighbor IDs at upper layers (where M=16 by default). At a million elements, this is roughly 37 MB of neighbor-list storage alone per index — before a single vector is added.

With 19 namespaces, each namespace holding a separate index, the pre-allocated overhead was approximately 37 MB × 19 = 703 MB just for neighbor lists, plus ancillary data structures. Most of those indexes held dozens or hundreds of vectors, not millions.
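The arithmetic above is easy to reproduce. The following is a back-of-the-envelope sketch, not a measurement: the 37 MB per-index figure is taken from the text, and the function name `preallocated_overhead_mb` is made up for illustration.

```python
# Approximate neighbor-list pre-allocation per index at max_elements=1,000,000,
# as quoted in the text (an approximation, not a measured constant).
PER_INDEX_MB = 37


def preallocated_overhead_mb(num_indexes, per_index_mb=PER_INDEX_MB):
    """Estimate fixed index overhead when every index is pre-allocated."""
    return num_indexes * per_index_mb


print(preallocated_overhead_mb(19))  # 703
```

Because the cost is a fixed amount per index, it scales with the number of namespaces rather than with the number of vectors stored.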

The solution is to start small and grow.

## What changed internally

v0.15.3 introduces two constants and a resizing path:

```cpp
// C++ internals — src/db.cpp
static constexpr size_t INITIAL_MAX_ELEMENTS = 4096;

// When the index fills up, double its capacity
void DB::reserve(const std::string& ns, const std::string& modality) {
    auto& idx = modality_indices_[ns][modality];
    size_t current = idx.hnsw->getCurrentElementCount();
    size_t capacity = idx.hnsw->getMaxElements();

    if (current >= capacity) {
        size_t new_capacity = capacity * 2;
        idx.hnsw->resizeIndex(new_capacity);
    }
}
```

Every index now starts at 4,096 slots instead of 1,000,000. The `reserve()` helper is called in two places:

- **Before each individual `add()`** — checks if the index is at capacity and doubles it if so.

- **At the start of `add_batch()`** — reserves ahead of time for the entire batch size, avoiding mid-batch resizes that would stall the parallel thread pool.

Resizes are geometric (doubling). An index that grows from 4,096 to 1,000,000 elements resizes only ⌈log₂(1,000,000 / 4,096)⌉ = 8 times in total, so the amortized cost per insert is negligible.
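To make the doubling schedule concrete, here is a small sketch that lists the capacities an index would pass through. It assumes pure doubling from the 4,096-slot starting point described above; `resize_schedule` is a hypothetical helper, not part of Feather DB.

```python
INITIAL_MAX_ELEMENTS = 4096  # starting capacity from the post


def resize_schedule(target, initial=INITIAL_MAX_ELEMENTS):
    """Return every capacity visited while doubling until `target` fits."""
    capacities = [initial]
    while capacities[-1] < target:
        capacities.append(capacities[-1] * 2)
    return capacities


schedule = resize_schedule(1_000_000)
print(len(schedule) - 1)  # 8 resizes
print(schedule[-1])       # 1048576
```

The final capacity overshoots the target by under 5% here. In general, doubling leaves capacity at less than twice the element count once an index has grown past its initial size.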

## Compact integration: survivor tracking

The `compact()` operation removes deleted vectors from the HNSW graph and rewrites the index. Before v0.15.3, compaction left the index at `max_elements=1,000,000` regardless of how many vectors survived. Now, `compact()` tracks the survivor count and rebuilds the index starting from a capacity appropriate to that count:

```cpp
// After compaction, right-size the rebuilt index
size_t survivor_count = collect_survivors(ns, modality);
size_t new_capacity   = next_power_of_two(survivor_count);
new_capacity          = std::max(new_capacity, INITIAL_MAX_ELEMENTS);

rebuild_index(ns, modality, new_capacity);
```

This means a namespace that had 50,000 vectors, then compacted down to 8,000 after bulk deletions, will have an index sized for ~8,192 slots after compaction — not 1,000,000. Long-running applications that regularly compact will see sustained memory savings, not just at startup.
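The right-sizing rule can be checked with a few lines of Python. This is a sketch that mirrors the C++ snippet above under one assumption: that `next_power_of_two` returns the smallest power of two greater than or equal to its argument. `compacted_capacity` is a name invented for this example.

```python
INITIAL_MAX_ELEMENTS = 4096  # floor applied after compaction


def next_power_of_two(n):
    """Smallest power of two >= n (assumed semantics of the C++ helper)."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def compacted_capacity(survivor_count, floor=INITIAL_MAX_ELEMENTS):
    """Capacity chosen for the rebuilt index after compaction."""
    return max(next_power_of_two(survivor_count), floor)


print(compacted_capacity(8_000))  # 8192
print(compacted_capacity(47))     # 4096
```

Under this assumption, a survivor count that is already an exact power of two leaves no headroom, so the next insert would trigger one doubling.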

## Where the 7.7× number comes from

The benchmark that produced the headline number used a realistic multi-namespace setup: 19 namespaces with population sizes ranging from 47 to 3,800 vectors each. All namespaces used the default `"text"` modality. The measurement isolates HNSW index overhead by subtracting vector storage (float32 × dim × count, which is the same before and after).

| Namespace | Vector count | Index overhead before | Index overhead after |
|---|---|---|---|
| ns-01 | 47 | 37.3 MB | 0.24 MB |
| ns-02 | 120 | 37.3 MB | 0.24 MB |
| ns-05 | 512 | 37.3 MB | 0.48 MB |
| ns-11 | 1,800 | 37.3 MB | 1.8 MB |
| ns-19 | 3,800 | 37.3 MB | 3.7 MB |
| **Total (19 ns)** | **—** | **709 MB** | **92 MB** |

The savings are proportionally largest for small namespaces. A namespace with 47 vectors previously claimed the same 37 MB index budget as one with 3,800. After v0.15.3, index overhead scales with actual usage.
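The headline ratio and the isolation step can both be written out explicitly. The sketch below uses the two totals reported above. The helper names are hypothetical, and the `dim=768` in the usage line is borrowed from the Python example in this post, since the benchmark's dimension is not stated.

```python
def vector_storage_bytes(count, dim, bytes_per_value=4):
    """Raw float32 vector storage: 4 bytes x dim x count."""
    return count * dim * bytes_per_value


def index_overhead_bytes(total_bytes, count, dim):
    """Index overhead = total footprint minus raw vector storage."""
    return total_bytes - vector_storage_bytes(count, dim)


before_mb = 709  # total index overhead before v0.15.3
after_mb = 92    # total index overhead after v0.15.3
reduction = before_mb / after_mb
print(f"{reduction:.1f}x")  # 7.7x

# Illustrative only: raw storage for 3,800 vectors at an assumed dim of 768.
print(vector_storage_bytes(3_800, 768))  # 11673600
```

Subtracting the raw vector bytes matters because that term is identical before and after the change, so leaving it in would understate the relative improvement.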

## Who this helps most

Multi-namespace workloads are the primary beneficiary. The three most common patterns:

### 1. Per-user memory in a SaaS product

Each user gets their own namespace. A product with 50 active users in memory at once previously held 50 indexes pre-allocated for a million elements each — roughly 1.85 GB of index overhead before a single vector was added. After v0.15.3, a 50-user deployment with an average of 300 memories per user uses around 100 MB of index overhead total.

### 2. Per-agent memory in a multi-agent system

An orchestration layer running 10–30 specialized agents, each with its own namespace, would hit the pre-allocation problem at startup. With adaptive capacity, spawning 25 new agent namespaces allocates 25 × 4,096-slot indexes instead of 25 × 1,000,000-slot indexes.

### 3. Topic-partitioned knowledge bases

Applications that create one namespace per document category, project, or topic domain often have highly uneven namespace populations. A namespace for "onboarding docs" might have 30 chunks; one for "engineering specs" might have 5,000. Previously both paid the same overhead. Now the 30-chunk namespace uses a tiny fraction of the RAM.

## Python example: before and after

The following example creates 20 namespaces with small populations — exactly the shape of workload that was penalized most before v0.15.3. The code is identical before and after the upgrade; only the memory footprint changes.

```python
import feather_db as fdb
import numpy as np

db = fdb.DB.open("multi_tenant.feather", dim=768)

# 20 namespaces, each with a modest number of vectors
# Before v0.15.3: ~740 MB index overhead
# After  v0.15.3: ~5 MB index overhead
namespace_sizes = {
    f"user-{i:02d}": np.random.randint(50, 400)
    for i in range(20)
}

for ns, n_vectors in namespace_sizes.items():
    vecs = np.random.randn(n_vectors, 768).astype(np.float32)
    ids  = list(range(n_vectors))
    # add_batch() calls reserve() once for the full batch
    # — no mid-batch resizes
    db.add_batch(ids, vecs, namespace=ns)

db.save()

# Memory breakdown is now proportional to actual usage
for ns, n_vectors in namespace_sizes.items():
    results = db.search(
        np.random.randn(768).astype(np.float32),
        k=5,
        namespace=ns
    )
    print(f"{ns}: {n_vectors} vectors → {len(results)} results")
```

The search API, save/load behavior, and file format are unchanged. Existing `.feather` files from v0.15.x load correctly — when they load, each namespace's index is initialized to the smaller starting size and the loaded vectors are re-inserted, so the in-memory footprint after load is also smaller.

## Combined with int8 RAM quantization

v0.15.0 shipped in-RAM int8 quantization via `set_int8_ram()`, which reduces vector storage by 1.7×. The two features compose independently: adaptive capacity reduces index overhead (neighbor lists, element counts, layer assignments), while int8 quantization reduces vector storage (the raw float32 bytes). On a memory-constrained host, combining both is the right default:

```python
import os
import feather_db as fdb

os.environ["FEATHER_LOAD_THREADS"] = "8"  # parallel load (v0.15+)

db = fdb.DB.open("multi_tenant.feather", dim=768)

# Quantize each namespace's text modality in RAM
# (adaptive capacity already applied at index init and load)
for ns in db.list_namespaces():
    db.set_int8_ram(ns, "text", max_abs=1.0)

# Memory savings stack:
# - adaptive HNSW capacity:  7.7× less index overhead
# - int8 RAM quantization:   1.7× less vector storage
```
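For intuition about what int8 quantization does to a vector, here is a generic symmetric int8 scheme in plain Python. It is an assumption for illustration and not necessarily what `set_int8_ram()` does internally: the function names are hypothetical, and only the `max_abs=1.0` value comes from the call above.

```python
def quantize_int8(values, max_abs=1.0):
    """Map floats in [-max_abs, max_abs] to integer codes in [-127, 127]."""
    scale = max_abs / 127.0
    codes = []
    for v in values:
        clipped = max(-max_abs, min(max_abs, v))  # out-of-range values saturate
        codes.append(round(clipped / scale))
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]


codes, scale = quantize_int8([0.0, 1.0, -1.0, 2.5])
print(codes)  # [0, 127, -127, 127]
```

In a scheme like this, each stored value shrinks from four bytes to one, and the rounding error for in-range values is at most half of `scale`. How that maps onto the 1.7× figure reported for Feather DB depends on implementation details not covered in this post.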

## Resize cost

The `resizeIndex()` call in hnswlib reallocates the index's per-element storage at the new capacity, which may copy the existing data into a new block. This is O(capacity) in time, and the footprint grows by the storage for the `new_capacity - old_capacity` added slots. In practice:

- For a namespace growing from 4,096 to 8,192 slots: the resize is sub-millisecond.

- For a namespace that has grown to 500,000 and resizes to 1,000,000: the resize is on the order of 40–80 ms on a modern CPU, amortized over the 500,000 inserts that led up to it.

Batch ingestion via `add_batch()` avoids mid-batch resizes by calling `reserve()` with the full batch size before the first insert. For real-time single-item `add()` calls, resizes are rare and short enough to be invisible in practice: a single resize at 4,096 elements takes less than 1 ms.
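The difference between the two call paths can be illustrated with a toy model that tracks capacity only and stores no vectors. `CapacityModel` is hypothetical, and the batch policy it uses (double until the batch fits, then resize once) is an assumption about how an up-front reserve could work, not a description of Feather DB's internals.

```python
INITIAL_MAX_ELEMENTS = 4096  # starting capacity named in the post


class CapacityModel:
    """Toy stand-in for one index: counts elements and resizes only."""

    def __init__(self, initial=INITIAL_MAX_ELEMENTS):
        self.capacity = initial
        self.count = 0
        self.resizes = 0

    def reserve(self, incoming=1):
        # Assumption: double until the incoming items fit, then resize once.
        needed = self.count + incoming
        new_capacity = self.capacity
        while new_capacity < needed:
            new_capacity *= 2
        if new_capacity != self.capacity:
            self.capacity = new_capacity
            self.resizes += 1

    def add(self):
        self.reserve(1)
        self.count += 1

    def add_batch(self, n):
        self.reserve(n)  # one reservation covers the whole batch
        self.count += n


one_by_one = CapacityModel()
for _ in range(10_000):
    one_by_one.add()

batched = CapacityModel()
batched.add_batch(10_000)

print(one_by_one.capacity, one_by_one.resizes)  # 16384 2
print(batched.capacity, batched.resizes)        # 16384 1
```

Both paths end at the same capacity. The batch path simply pays for it in a single resize before any insert starts, which is what keeps a parallel ingest from stalling partway through.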

## No breaking changes

- **API:** No changes. All existing `add()`, `add_batch()`, `search()`, `compact()`, and `save()`/`open()` calls work identically.

- **File format:** Still v8. Files saved with v0.15.3 load on v0.15.0/v0.15.1/v0.15.2 and vice versa. The adaptive capacity is a runtime behavior, not a serialized property — indexes always save the current element count and re-initialize capacity on load.

- **Recall:** Unchanged. The HNSW graph structure (M=16, ef_construction=200, ef=50) is identical. Capacity management is entirely outside the graph traversal path.

## Upgrade

```bash
pip install feather-db==0.15.3
```

No code changes needed. The memory reduction is automatic on first run.

**GitHub:** [github.com/feather-store/feather](https://github.com/feather-store/feather)

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/feather-db-v0153-adaptive-hnsw-index-capacity](https://getfeather.store/theory/feather-db-v0153-adaptive-hnsw-index-capacity). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*