Technical Deep Dive2026-04-028 min read

Measuring Filter Accuracy Honestly: What 97.1% Actually Means

Accuracy numbers are easy to misread. Here is how to think about precision, recall, and the real meaning of a filter's headline figure.

filter accuracyprecision recallAI security metricsevaluationSprappy Filter

Numbers Need Context

A filter that advertises "97.1% accuracy" tells you almost nothing without context. Accuracy on what dataset? Measuring precision, recall, or some blend? Across which threat categories? An honest discussion of filter performance has to unpack the headline.

Precision vs Recall

Recall is the fraction of real threats the filter catches. High recall means few threats slip through. Precision is the fraction of flagged items that are actually threats. High precision means few false alarms.

These trade off. Crank sensitivity up and you catch more threats (high recall) but flag more benign prompts (low precision). Tune it down and the reverse. There is no single "accuracy" that captures both — you have to talk about the tradeoff.

What the Two Tiers Deliver

Sprappy Filter's pattern tier catches roughly 95% of clear-cut threats. That figure refers to unambiguous cases — the injections, structured PII, and known malware that look exactly like what they are. Pattern matching is high-precision on these because the signals are unmistakable.

The transformer cascade addresses the ambiguous middle band — the paraphrased, context-dependent prompts where patterns fall short — pushing combined accuracy to 97.1%. That number is meaningful precisely because it covers the hard cases, not just the easy ones.

What 97.1% Is Not

It is not 100%. No filter is. It does not mean 2.9% of all traffic is malicious and uncaught — most traffic is benign and trivially allowed. It means that on the threats the filter is evaluated against, it correctly resolves about 97 in 100, with the remainder being the genuinely hardest, most novel cases.

False Positives Cost Too

A filter that blocks legitimate prompts is its own problem. Users hit walls, support tickets pile up, and teams disable the filter. This is why the block/sanitize/allow design matters — sanitize lets you handle borderline cases without an outright block, reducing the user-facing cost of a false positive.

How to Measure on Your Traffic

Do not trust any single vendor number for your specific use case. Run the filter in log-only mode on real traffic at https://api.sprapp.com/v1/filter, then:

Sample the flagged prompts and label them by hand
Compute your own precision (how many flags were real)
Sample the allowed prompts and look for misses (recall proxy)
Tune per-category thresholds to your tolerance

The Honest Bottom Line

Treat headline accuracy as a starting point, not a guarantee. The pattern tier's 95% on clear-cut threats and the cascade's 97.1% on the ambiguous band are real, useful figures — but the right number for you is the one you measure on your own data. Build defense in depth so the remaining percentage is not catastrophic.

Accuracy honesty is itself a security property. A vendor who claims 100% is the one to distrust.

Written bySPRAPP Security

Debate vs Voting: Comparing Consensus Methods in AI Panels

Majority voting, peer review, and structured debate each reach consensus differently. Here is when to use each in SPRAPP Panel.

2025-08-027 min read

Technical Deep Dive

Model Routing Strategies: Sending the Right Question to the Right Model

Not every query needs every model. Smart routing matches questions to model strengths to save cost without losing quality.

2025-09-227 min read

Technical Deep Dive

Orchestrating a Panel: Fan-Out, Latency, and Parallelism

A panel is only as fast as its slowest model unless you orchestrate it well. Inside the engineering of parallel reasoning.

2026-01-287 min read

Technical Deep Dive

Why Gzip and Zstd Fail on Small Payloads (And What to Do Instead)

General-purpose compressors carry fixed overhead that wipes out savings on strings under 1KB. Here is why smoltext exists.

2025-07-036 min read

← Back to News

Technical Deep Dive2026-04-028 min read

Measuring Filter Accuracy Honestly: What 97.1% Actually Means

Accuracy numbers are easy to misread. Here is how to think about precision, recall, and the real meaning of a filter's headline figure.

filter accuracyprecision recallAI security metricsevaluationSprappy Filter

Numbers Need Context

Precision vs Recall

What the Two Tiers Deliver

What 97.1% Is Not

False Positives Cost Too

How to Measure on Your Traffic

Do not trust any single vendor number for your specific use case. Run the filter in log-only mode on real traffic at https://api.sprapp.com/v1/filter, then: