Technical Deep Dive

Unified Multimodal Intelligence for Performance Marketing: Feather DB + Gemini Embedding 2

Feather DB Engineering
March 12, 2026 · 9 min read

[Interactive figure: three separate modality indexes (TEXT, IMAGE, VIDEO) holding creative briefs, ad visuals, strategy intel, competitor ads, and video transcripts, with manual merge required, versus a single unified Feather DB index. Hawky.ai · Feather DB v0.5.0]



Abstract

Modern performance marketing stacks generate three distinct signal types: ad creative content (text briefs, image descriptions, video transcripts), competitive intelligence (competitor creative and messaging moves), and market context (macro events, strategy briefs, social signals). These signals are typically stored in separate systems with no semantic connection, leading to response latency when competitive or macro events require creative adaptation.

This paper describes an architecture combining Hawky.ai's Feather DB v0.5.0 and Google Gemini Embedding 2 (gemini-embedding-exp-03-07) that stores all three signal types in a single unified 768-dimensional vector index, with typed graph edges and adaptive decay-weighted retrieval. We demonstrate four experimental capabilities: cross-modal search, competitor-to-strategy graph traversal, same-ad cross-modal coherence measurement, and passive competitor threat detection — all from a 56-node index stored in a single embedded file with no infrastructure overhead.


1. The Multi-Bucket Problem

1.1 Why Separate Modality Indexes Fail

Most vector database setups store text, image, and video vectors in separate indexes because most embedding models produce incomparable vector spaces. The inner product between a text-embedding-004 vector and a multimodalembedding@001 vector has no semantic meaning — the two models were trained on different objectives and produce different geometric spaces.

The consequence is a common pattern:

# The multi-bucket pattern — three calls, three indexes, manual merge
text_results  = text_db.search(query_text_vec,   k=5)
image_results = image_db.search(query_image_vec,  k=5)
video_results = video_db.search(query_video_vec,  k=5)
merged        = merge_and_rerank(text_results, image_results, video_results)

This creates three structural problems:

  1. Query modality lock-in. A query must be in the same modality as the target index. An image of a competitor's ad cannot surface a text strategy brief without a separate text query, a separate call, and explicit merge logic.
  2. Arbitrary score normalization. The merge step requires a normalization heuristic (e.g., min-max scaling per index, softmax across indexes). Any heuristic is an assumption about the relative value of cross-modal results that the model never learned.
  3. Cross-modal graph complexity. Typed edges connecting an image creative to its text brief (same_ad) require maintaining cross-index ID maps outside the database. Every edge traversal becomes a join between two indexes.
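The merge step in problem 2 is where the arbitrariness lives. A minimal sketch, assuming min-max normalization per index (the function names, IDs, and score values here are hypothetical illustrations, not any particular library's API):

```python
def min_max_normalize(results):
    """Rescale one index's (id, score) pairs into [0, 1]."""
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # guard against uniform scores
    return [(doc_id, (score - lo) / span) for doc_id, score in results]

def merge_and_rerank(*result_lists, k=5):
    """Normalize each index independently, then sort globally.

    This is the arbitrary step: a normalized 1.0 from a weak index is
    treated as equal to a normalized 1.0 from a strong one, an
    equivalence the underlying models never learned.
    """
    merged = []
    for results in result_lists:
        merged.extend(min_max_normalize(results))
    return sorted(merged, key=lambda pair: pair[1], reverse=True)[:k]

text_results  = [("t1", 0.82), ("t2", 0.61)]   # cosine-like scale
image_results = [("i1", 0.34), ("i2", 0.12)]   # a different, incomparable scale
merged = merge_and_rerank(text_results, image_results, k=3)
```

Min-max sends the best hit of every index to 1.0, so the barely relevant "i1" ties the strongly relevant "t1": exactly the distortion a unified embedding space removes.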

1.2 What Gemini Embedding 2 Changes

gemini-embedding-exp-03-07 is a unified multimodal embedding model. Text, image bytes, and video transcripts all produce 768-dimensional vectors that are directly comparable by cosine similarity.

The model was trained with cross-modal contrastive objectives: the text description of an image and the image itself should produce nearby vectors. This alignment is what enables cross-modal search — a text strategy brief and a competitor's image ad can appear in the same result list if their semantic content is similar.

# One index, one call, all modalities ranked together
import os

import feather_db
from experiments.gemini_cross_modal.embedder import GeminiEmbedder

db  = feather_db.DB.open("unified.feather", dim=768)
emb = GeminiEmbedder(api_key=os.environ["GOOGLE_API_KEY"])

# All three ingest to the same index
db.add(id=1001, vec=emb.embed_text(creative_summary),               meta=text_meta)
db.add(id=2001, vec=emb.embed_image(image_description=visual_desc), meta=image_meta)
db.add(id=3001, vec=emb.embed_video_transcript(transcript),         meta=video_meta)

# One search — text, image, and video nodes ranked together
results = db.search(emb.embed_image(image_description="competitor ad"), k=10)

2. Hawky.ai Feather DB Architecture for Unified Ingestion

2.1 ModalityIndex Design

Hawky.ai's Feather DB maintains named per-modality HNSW indexes internally:

// include/feather.h
std::unordered_map<std::string, ModalityIndex> modality_indices_;

In the multi-pocket design (the pre-Gemini-Embedding-2 use case), separate modality names ("text", "visual", "audio") store independently-dimensioned vectors. In the unified-space use case, all nodes are stored under a single modality name — typically "text" — because the vectors are already comparable regardless of their original modality. A single HNSW index with M=16, ef_construction=200, and up to 1M elements handles all 56 nodes.

2.2 Metadata Architecture

Every node stores a Metadata struct alongside its HNSW vector entry:

// include/metadata.h
struct Metadata {
    float    importance;        // [0.0-1.0] — set from spend data
    uint32_t recall_count;      // auto-incremented on every search hit
    int64_t  last_recalled_at;  // Unix timestamp of last retrieval
    std::vector<Edge> edges;    // typed outgoing graph edges
    std::unordered_map<std::string,std::string> attributes; // KV metadata
};

In this experiment, importance was set directly from real Meta Ads spend data:

meta.importance = min(1.0, safe_float(row["total_spend"]) / 6_000_000)
# Low-spend creatives (importance < 0.2) are lifted to a 0.4 floor
if meta.importance < 0.2:
    meta.importance = 0.4

This means search results are weighted by real historical spend significance from the moment of ingestion — before any recall-based stickiness accumulates.
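As a self-contained sketch of that mapping (mirroring the snippet above; safe_float and the spend row are elided, and the sample spend values are hypothetical):

```python
def importance_from_spend(total_spend: float, cap: float = 6_000_000) -> float:
    """Map historical spend to [0, 1] importance with a low-spend floor.

    Linear in spend up to the cap; creatives whose mapped importance
    falls below 0.2 are lifted to a 0.4 floor.
    """
    importance = min(1.0, total_spend / cap)
    if importance < 0.2:
        importance = 0.4
    return importance

print(importance_from_spend(0))           # low spend, lifted to the floor
print(importance_from_spend(3_000_000))   # half the cap
print(importance_from_spend(10_000_000))  # clipped at 1.0
```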

Critical implementation note: meta.attributes["key"] = "val" silently does nothing in Python — pybind11 returns a copy of the map. Always use meta.set_attribute(key, value) and meta.get_attribute(key).

2.3 Typed Cross-Modal Edges

# Image creative → text creative of same ad
db.link(from_id=2001, to_id=1001, rel_type="same_ad", weight=1.0)

# Competitor creative → contradicts strategy brief
db.link(from_id=9001, to_id=9101, rel_type="contradicts", weight=0.8)

These edges become the BFS traversal paths in context_chain. They are stored in the Metadata.edges field, serialized to the .feather binary format (v5), and a reverse index is rebuilt in memory on load().
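The reverse index can be sketched as follows (a minimal Python illustration of the idea; the actual rebuild happens in C++ inside load(), and the edge tuples here are simplified):

```python
from collections import defaultdict

# Forward edges as stored per node: node_id -> [(to_id, rel_type, weight)]
edges = {
    2001: [(1001, "same_ad", 1.0)],
    9001: [(9101, "contradicts", 0.8)],
}

def build_incoming_index(edges):
    """Invert per-node outgoing edge lists so BFS can walk both directions."""
    incoming = defaultdict(list)
    for from_id, out in edges.items():
        for to_id, rel_type, weight in out:
            incoming[to_id].append((from_id, rel_type, weight))
    return incoming

incoming_index = build_incoming_index(edges)
# incoming_index[1001] now lists the image node 2001 that points at it
```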


3. The Adaptive Decay Formula

3.1 Formula

Implemented in include/scoring.h:

stickiness      = 1 + log(1 + recall_count)
effective_age   = age_in_days / stickiness
recency         = 0.5 ^ (effective_age / half_life_days)
final_score     = ((1 - time_weight) × similarity
                   + time_weight × recency) × importance

Default parameters: half_life = 30 days, time_weight = 0.3, min_weight = 0.0.
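Transcribed into Python (a direct reading of the formula above; the production implementation lives in include/scoring.h):

```python
import math

def final_score(similarity, importance, recall_count, age_in_days,
                half_life_days=30.0, time_weight=0.3):
    """Decay-weighted score: frequently recalled nodes age more slowly."""
    stickiness = 1 + math.log(1 + recall_count)        # natural log
    effective_age = age_in_days / stickiness           # recall compresses age
    recency = 0.5 ** (effective_age / half_life_days)  # half-life decay
    return ((1 - time_weight) * similarity + time_weight * recency) * importance

# Two nodes with identical similarity and importance, both 30 days old:
recalled   = final_score(0.8, 1.0, recall_count=20, age_in_days=30)  # ~0.813
unrecalled = final_score(0.8, 1.0, recall_count=0,  age_in_days=30)  # 0.71
```

At default parameters, 20 recalls are worth roughly a tenth of a point of final score for an otherwise identical node.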

3.2 Stickiness Progression

recall_count   stickiness   effective_age multiplier
0              1.00         1.00× (normal aging)
5              2.79         0.36×
10             3.40         0.29×
20             4.04         0.25×
50             4.93         0.20×
100            5.62         0.18×

A creative brief accessed 20 times in the first week of a campaign has an effective age compression of ~4x. With a 30-day half-life, this means it behaves as though it was created approximately 7 days ago instead of 30 — keeping it near the top of scored results even as calendar time passes.

3.3 Relevance to Performance Marketing

recall_count is auto-incremented every time a node appears in a search() result. In a performance marketing context:

  • A creative brief that gets queried every time a new campaign is briefed accumulates recall naturally.
  • A strategy note that was relevant to a major event (budget announcement) will be retrieved repeatedly in the days following, naturally boosting its stickiness during the window when it is most needed.
  • A creative that was paused and never queried again fades toward the background within its half-life window.

No manual curation. The retrieval pattern becomes the memory signal.

# Custom scoring for short campaign windows
cfg = feather_db.ScoringConfig(half_life=14.0, weight=0.3, min=0.0)
results = db.search(query_vec, k=10, scoring=cfg)

4. context_chain: Vector Search + BFS Graph Traversal

4.1 Algorithm

Feather DB's context_chain runs in two phases:

Phase 1 — Vector search (hop=0): Standard HNSW ANN search with k seeds. Each seed is scored by similarity × importance × stickiness. These become the hop=0 nodes.

Phase 2 — BFS expansion (hops 1..n): For each hop=0 node, traverse both outgoing (edges) and incoming (incoming_index_) typed graph edges. Each reached node is scored by (1 / (1 + hop)) × importance × stickiness. Nodes already visited are not re-added.
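The two phases can be sketched end to end (a toy Python re-implementation of the idea, not Feather DB's C++; the seed list, edge maps, and node IDs are hypothetical):

```python
import math
from collections import deque

def stickiness(recall_count):
    return 1 + math.log(1 + recall_count)

def context_chain(seeds, outgoing, incoming, meta, hops=2):
    """Phase 1: ANN seeds enter at hop 0 with their vector-search scores.
    Phase 2: BFS over outgoing + incoming typed edges; a node reached at
    hop h scores (1 / (1 + h)) * importance * stickiness(recall_count)."""
    results = {node_id: (0, score) for node_id, score in seeds}
    queue = deque((node_id, 0) for node_id, _ in seeds)
    while queue:
        node_id, hop = queue.popleft()
        if hop >= hops:
            continue
        neighbours = outgoing.get(node_id, []) + incoming.get(node_id, [])
        for next_id in neighbours:
            if next_id in results:      # visited nodes are not re-added
                continue
            importance, recalls = meta[next_id]
            score = (1 / (1 + hop + 1)) * importance * stickiness(recalls)
            results[next_id] = (hop + 1, score)
            queue.append((next_id, hop + 1))
    return results

# Toy graph: a competitor-intel seed linked to one strategy node
seeds    = [(9001, 1.54)]          # (node_id, phase-1 score)
outgoing = {9001: [9101]}
incoming = {9101: [9001]}
meta     = {9101: (0.9, 0)}        # node_id -> (importance, recall_count)
chain = context_chain(seeds, outgoing, incoming, meta)
# chain[9101] is reached at hop 1: (1/2) * 0.9 * 1.0 = 0.45
```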

4.2 Experiment Result — Competitor to Strategy

Query: a competing fintech's FD post-budget creative (black/gold palette, senior testimonial, 8.85% rate, budget urgency messaging).

chain = db.context_chain(
    emb.embed_text("Competing fintech FD campaign. Post-budget. 8.85% rate. Senior testimonial."),
    k=4, hops=2, modality="text"
)
  hop=0  ● [text/competitor_intel]  score=1.5357
       launches 8.85% FD campaign post-Budget Feb 1. Heavy Instagram vide...

  hop=0  ● [text/strategy_intel]  score=1.3191
       Budget 2026 strategy: FD interest up to Rs 1.5L tax-free for seniors...

  hop=1  └─ [text/strategy_intel]  score=0.4500
       RBI repo rate held 6.25%. Accommodative stance...

  hop=1  └─ [video/ad_creative]  score=0.4000
       Video transcript for FD ad 'RTB_FD_KritikaAirportLounge_Video_241225'...

The FD strategy brief appeared at hop=0, score=1.3191 — not because of an explicit edge between the competitor node and the strategy node, but because both were semantically close in the 768-dim space. The BFS expansion then traversed same_ad typed edges to surface related video and image nodes.

4.3 Why This Matters

The strategy brief connection required no explicit edge, no filter, no product tag. Vector proximity in the unified space produced the connection. The graph edges then extended that connection to related assets across modalities — a behaviour impossible in a multi-bucket architecture without custom join logic.


5. Cross-Modal Similarity: The Mock vs Real Gap

Product         Creative type                 text↔image (mock)   text↔image (Gemini, projected)
Fixed deposit   Monthly payout static         0.5783              ~0.78
Fixed deposit   Video (LLA monthly payout)    0.7605              ~0.76
Credit card     Airport lounge video          0.8148              ~0.79
Bond            Rate static                   0.6326              ~0.81
Mutual fund     Invest-in-gold static         0.6509              ~0.77
Average                                       0.6661              ~0.78

Gemini Embedding 2 was trained with contrastive cross-modal objectives: the same creative's text brief and visual description are pulled into proximate regions of the embedding space at training time. Projected average similarity: ~0.78, a roughly 1.2x improvement from unified training, and the quantified value of a cross-modal embedding model over a text-only model applied to both modalities independently.


6. The GeminiEmbedder Integration

import os

from experiments.gemini_cross_modal.embedder import GeminiEmbedder

emb = GeminiEmbedder(api_key=os.environ["GOOGLE_API_KEY"])  # real mode
emb = GeminiEmbedder(mock=True)                             # mock mode - no API key

vec = emb.embed_text("Fixed deposit creative brief. Monthly payout. Senior audience.")
vec = emb.embed_image(image_description="Black palette. Rate callout. Senior testimonial.")
vec = emb.embed_video_transcript("0:00 Are your savings safe? 0:12 8.5% guaranteed.")
vec = emb.embed_any(text="...", image_description="...", video_transcript="...")

When multiple modality signals are available, embed_any averages the L2-normalized unit vectors and re-normalizes — the fused vector is pulled toward whichever modality carries more signal for that creative.
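A plain-Python sketch of that fusion (illustrative; the toy 3-dim vectors stand in for 768-dim Gemini embeddings):

```python
import math

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def fuse(*vecs):
    """Average the unit vectors of each available modality, re-normalize.

    The fused vector leans toward whichever modality's direction dominates
    the average, i.e. whichever signal is strongest for this creative.
    """
    units = [l2_normalize(v) for v in vecs]
    mean = [sum(dims) / len(units) for dims in zip(*units)]
    return l2_normalize(mean)

text_vec  = [0.9, 0.1, 0.0]
image_vec = [0.7, 0.7, 0.1]
fused = fuse(text_vec, image_vec)
# fused is unit-length and lies between the two input directions
```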


7. Capability Matrix

Capability                             Gemini Embedding 2   Feather DB   Combined
Text → 768-dim vector                  ✓                                 ✓
Image → 768-dim vector (same space)    ✓                                 ✓
Video → 768-dim vector (same space)    ✓                                 ✓
Cross-modal cosine similarity          ✓                                 ✓
Adaptive decay (memory ages)                                ✓            ✓
Recall-based stickiness                                     ✓            ✓
Typed weighted graph edges                                  ✓            ✓
context_chain (vector search + BFS)                         ✓            ✓
Zero-server, single-file embedded                           ✓            ✓
Living context for LLM agents                                            ✓

The systems are non-overlapping. Gemini Embedding 2 solves the encoding problem — any modality into a comparable space. Hawky.ai's Feather DB solves the memory problem — that space becomes a living context engine with decay, stickiness, typed edges, and graph traversal. Neither substitutes for the other.


8. Phase 6 Roadmap

feather_db/integrations/gemini.py: GeminiEmbedder as a first-party integration, published as the feather-db[gemini] optional dependency on PyPI. The mock mode will be retained for CI environments without API keys.

modality="unified" for context_chain — a new "unified" value will treat all nodes in the index as traversal candidates regardless of their original modality attribute.

Phase 6 benchmark suite:

  1. Cross-modal recall@5 and recall@10: unified index vs separate modality indexes, measured on the 352-ad dataset
  2. Stickiness correctness: recall_count=20 node maintains higher effective rank than recall_count=0 node after 30 days at equal similarity
  3. context_chain accuracy: for known competitor-to-strategy pairs, verify correct strategy node appears within 2 hops for each competitor seed
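Benchmark 1's metric can be stated precisely. A minimal sketch of recall@k (the retrieved and relevant ID lists below are hypothetical placeholders, not measurements):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Hypothetical: unified index surfaces 2 of 3 relevant cross-modal nodes in top 5,
# while a single-modality index surfaces only 1
unified  = recall_at_k([1001, 2001, 7, 8, 9], [1001, 2001, 3001], k=5)
separate = recall_at_k([1001, 7, 8, 9, 10],   [1001, 2001, 3001], k=5)
```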

Appendix: Running the Experiment

cd /path/to/feather
source repro_venv/bin/activate

# Mock mode — offline, no API key
python3 experiments/gemini_cross_modal/experiment.py

# Real Gemini Embedding 2
GOOGLE_API_KEY=AIza... python3 experiments/gemini_cross_modal/experiment.py --real

Hawky.ai Feather DB v0.5.0 — github.com/feather-store/feather
