# 12 Creative Performance Metrics a Context Engine Can Auto-Track and Surface

> These 12 metrics are standard in performance marketing, but storing them in a context engine with temporal decay and semantic indexing makes them actionable at brief time, not just reportable after the fact.

- **Category**: Theory
- **Read time**: 9 min read
- **Date**: July 2, 2026
- **Author**: Feather DB (Engineering)
- **URL**: https://getfeather.store/theory/creative-performance-metrics-context-engine

---

## The difference between tracking and surfacing

Every performance marketing team tracks creative metrics. CTR, CPL, ROAS, hook rate, thumbstop rate, completion rate: these live in dashboards, spreadsheets, and ad platform reporting. They are tracked.

But tracking is not the same as surfacing. Surfacing means making the right metric, for the right creative precedent, available at the moment a brief is being written, automatically and without requiring the strategist to go looking for it.

A context engine does not replace metric tracking. It adds a retrieval layer on top of it: store each metric as metadata on the creative embedding, and at brief time retrieve the semantically relevant precedents along with their full metric profiles. The strategist starts every brief with a ranked list of the most relevant past creatives and their complete performance data, not a dashboard they have to navigate separately.

These are the 12 metrics worth storing in the context engine, how to store them, and what they enable at retrieval time.

## The 12 metrics and how they are stored

```python
import feather_db as fdb
from feather_db import MetaRecord

db = fdb.DB.open("brand_creative_metrics.feather", dim=768)

def ingest_creative_with_metrics(creative: dict, embedder):
    vec = embedder.embed(creative["copy"] + " " + creative["hook"])
    meta = MetaRecord()

    # 1. Click-through rate
    meta.set_attribute("ctr", creative["ctr"])

    # 2. Cost per lead / cost per acquisition
    meta.set_attribute("cpl", creative.get("cpl", 0.0))
    meta.set_attribute("cpa", creative.get("cpa", 0.0))

    # 3. Return on ad spend
    meta.set_attribute("roas", creative.get("roas", 0.0))

    # 4. Hook rate (% watching past 3 seconds)
    meta.set_attribute("hook_rate", creative.get("hook_rate", 0.0))

    # 5. Thumbstop rate (impressions that stopped scrolling)
    meta.set_attribute("thumbstop_rate", creative.get("thumbstop_rate", 0.0))

    # 6. Video completion rate
    meta.set_attribute("completion_rate", creative.get("completion_rate", 0.0))

    # 7. Frequency at fatigue
    meta.set_attribute("fatigue_frequency", creative.get("fatigue_frequency", 0.0))

    # 8. Days to fatigue
    meta.set_attribute("fatigue_day", creative.get("fatigue_day", 0))

    # 9. Spend at scale
    meta.set_attribute("spend", creative["spend"])

    # 10. Conversion rate (CTR to conversion)
    meta.set_attribute("conversion_rate", creative.get("conversion_rate", 0.0))

    # 11. Creative velocity (creatives tested per week in this period)
    meta.set_attribute("test_cohort_size", creative.get("cohort_size", 1))

    # 12. Incremental ROAS (lift over baseline)
    meta.set_attribute("incremental_roas", creative.get("incremental_roas", 0.0))

    # Importance: spend-weighted
    meta.set_attribute("importance", min(1.0, creative["spend"] / 100_000))

    db.add(
        id=creative["id"],
        vec=vec,
        text=creative["copy"],
        namespace="brand::creatives",
        meta=meta
    )
```

## Metrics 1–4: The core performance quartet

CTR, CPL, CPA, and ROAS are the primary outcome metrics. In a context engine, they are stored as metadata attributes so that retrieval can filter and rank by them. A query for "what are the highest-ROAS creatives semantically similar to this brief?" returns a ranked list with all four metrics attached: not just ROAS, but the full performance profile of each precedent.

The key difference from a dashboard view: in the context engine, these metrics are weighted by the importance and temporal decay of each record.
A creative with a 4.2x ROAS from 18 months ago ranks below a creative with a 3.8x ROAS from last month if the decay model weights recent evidence more heavily. The dashboard shows both at face value. The context engine weights them by confidence.

## Metrics 5–7: Attention quality indicators

Hook rate, thumbstop rate, and completion rate measure attention quality: not whether the ad converted, but how well it captured and held attention before conversion was possible. These metrics are particularly valuable in the context engine because they predict creative potential in new placements. A hook with a 38% hook rate and a 72% completion rate has demonstrated that it holds audience attention across all the placements it has run in. That attention quality signal is portable to new campaign contexts in a way that ROAS is not.

Hawky.ai reports a 20% CTR uplift at Univest within 7 days. Part of that gain comes from starting hook selection from attention-quality metrics (hook rate and thumbstop rate) rather than from ROAS alone. The context engine surfaces hooks with proven attention quality for semantically similar briefs, raising the floor of creative output from the first campaign cycle.

## Metrics 8–9: Fatigue and scale indicators

Days to fatigue and spend at scale are operational metrics that inform campaign management decisions. In the context engine, they enable a specific query: "for creatives like this one, how long do they typically run before fatigue appears, and at what spend level do they typically scale?" The answer, drawn from semantically similar precedents with accumulated outcome data, gives the campaign manager a realistic operating window before launching.

Without this memory, fatigue and scale parameters are estimated from general best practices. With memory, they are estimated from this brand's own track record with comparable creative territory, which is materially more accurate and brand-specific.
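
The operating-window question can be answered with a small aggregation over whatever precedents the search returns. The following is a minimal sketch, not part of Feather DB: it assumes each precedent has already been flattened into a plain dict carrying the `fatigue_day` and `spend` attributes from the ingestion step, and the helper name `operating_window` and the sample records are hypothetical. Because ingestion defaults a missing `fatigue_day` to 0, the sketch treats 0 as "not recorded" and leaves those records out.

```python
from statistics import median


def operating_window(precedents: list[dict]) -> dict:
    """Summarise fatigue timing and spend across similar past creatives.

    A fatigue_day of 0 is treated as "not recorded" (an assumption that
    mirrors the ingestion default), so those records are skipped.
    """
    observed = [p for p in precedents if p.get("fatigue_day", 0) > 0]
    if not observed:
        return {"n": 0, "median_fatigue_day": None, "median_spend": None}
    return {
        "n": len(observed),
        "median_fatigue_day": median(p["fatigue_day"] for p in observed),
        "median_spend": median(p["spend"] for p in observed),
    }


# Hypothetical precedents, for illustration only.
precedents = [
    {"id": "c1", "fatigue_day": 21, "spend": 18_000},
    {"id": "c2", "fatigue_day": 35, "spend": 42_000},
    {"id": "c3", "fatigue_day": 0, "spend": 5_000},  # fatigue not recorded
    {"id": "c4", "fatigue_day": 28, "spend": 30_000},
]

window = operating_window(precedents)
print(window)
```

The median is used rather than the mean so that one unusually long-running creative does not stretch the expected window; with only a handful of precedents, the count `n` is worth showing next to the estimate.
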
## Metrics 10–12: Conversion depth and incremental lift

Conversion rate (click-to-conversion), creative velocity (how many creatives were tested in the same period), and incremental ROAS (lift over baseline holdout) provide a deeper signal about creative causality. These are harder to collect consistently but are the most informative when available.

In the context engine, they allow the most sophisticated query type: "find me creatives with strong incremental ROAS that ran against a cohort of comparable test sizes". This filters out results that were produced by lucky conditions rather than creative quality.

## Surfacing metrics at brief time

```python
from feather_db import ScoringConfig

def get_metric_brief_context(brief_text: str, embedder) -> dict:
    q = embedder.embed(brief_text)

    # High-attention creatives for this brief direction
    attention_leaders = db.search(
        q, k=6,
        namespace="brand::creatives",
        scoring=ScoringConfig(half_life=120.0, weight=0.4, min=0.0),
        filters={"min_hook_rate": 0.28, "min_completion_rate": 0.65}
    )

    # High-conversion creatives
    conversion_leaders = db.search(
        q, k=6,
        namespace="brand::creatives",
        scoring=ScoringConfig(half_life=120.0, weight=0.4, min=0.0),
        filters={"min_conversion_rate": 0.04}
    )

    # Fatigue-resistant creatives
    durable_creatives = db.search(
        q, k=4,
        namespace="brand::creatives",
        scoring=ScoringConfig(half_life=180.0, weight=0.35, min=0.0),
        filters={"min_fatigue_day": 28}
    )

    return {
        "attention_leaders": attention_leaders,
        "conversion_leaders": conversion_leaders,
        "durable_creatives": durable_creatives,
    }
```

At a p50 ANN latency of 0.19ms, the three searches together complete in under 3ms. The brief generation model receives three ranked lists (attention-optimized, conversion-optimized, and durability-optimized precedents) and generates creative direction grounded in all three performance dimensions simultaneously.

## The 40x cost advantage at metric scale

Storing 12 metadata attributes per creative adds minimal overhead to the Feather DB index.
The HNSW vector index and metadata filtering run in the same embedded file: no separate database for metrics, no API calls per query, no per-query cloud cost. At 40x lower cost than equivalent RAG implementations and 5–6x faster cold load than alternatives, running a full 12-metric retrieval across a history of 500+ creatives costs fractions of a cent per query.

The LongMemEval score of 0.693 for Feather DB versus 0.640 for GPT-4o supports the view that multi-attribute retrieval with structured metadata outperforms unstructured context stuffing for the kind of reasoning that 12-metric creative analysis requires.

## FAQ

### Which of the 12 metrics is most important to store first?

Start with CTR, CPL or CPA, spend, and hook rate. These four provide the core signal for most brief-time retrieval queries. ROAS is the fifth to add. The remaining metrics (completion rate, thumbstop rate, fatigue day, conversion rate, incremental ROAS) add value incrementally as ingestion discipline improves. A context engine with 4–5 well-populated metrics per creative is more useful than one with 12 metrics per creative but inconsistent coverage.

### How does the context engine handle missing metric values for older creatives?

Missing values are stored as zero or null and handled gracefully by the filtering layer: a filter for `min_hook_rate: 0.28` excludes records with a zero or null hook rate from that specific query. Records with missing data still appear in unfiltered searches, ranked by their available metrics and importance weight. Partial data is better than no data; do not withhold ingestion because a creative's metric profile is incomplete.

### Can the context engine auto-track metrics from ad platform APIs?

Yes, with an ingestion pipeline. A script that pulls campaign data from the Meta Marketing API or Google Ads API and runs `db.add()` for each completed creative automates the tracking step. Feather DB handles the storage and retrieval; the pipeline handles the data pull.
The integration is typically under 100 lines of Python. Some teams run this nightly; others run it at campaign close. Either cadence works.

### How do I retrieve creatives that scored well on multiple metrics simultaneously?

Use stacked filters: `filters={"min_ctr": 0.02, "min_hook_rate": 0.28, "min_roas": 3.0}`. The metadata intelligence layer applies all filters before returning results. For cases where no single creative meets all thresholds simultaneously, the system returns the closest matches: records that meet most of the criteria, ranked by semantic similarity and importance. The filter is a hard gate; the scoring is the ranking mechanism within the filtered set.

### Does storing this many metrics per creative significantly increase the index file size?

No. Metadata attributes add negligible storage overhead compared to the vector embeddings themselves. A 768-dimension float32 embedding is approximately 3KB. Twelve metadata attributes at typical value types add under 200 bytes per record. A context engine with 10,000 creatives and 12 metrics each is under 50MB on disk, well within the range of embedded file storage on any hardware the context engine would run on.

---

*This is the machine-readable mirror of the theory post at [getfeather.store/theory/creative-performance-metrics-context-engine](https://getfeather.store/theory/creative-performance-metrics-context-engine). For the full Feather DB documentation, see [getfeather.store/llms-full.txt](https://getfeather.store/llms-full.txt).*