# ADR 0006: Pairing rules for cross-detector list-grammar in audit validators

**Status:** Accepted at v1.3.0 — applies to all future audit validators
in the `eval_toolkit.audit_*` flat-module family.

**Date:** 2026-05-26

**Deciders:** Brandon Behring (author), `/exploring-options` 2-round
review during #81 implementation, consumer-feedback audit Round 14.

**Supersedes:** N/A. **Superseded by:** N/A.

## Context

[ADR 0005](0005-structured-keys-for-audit-validators.md) introduced
a two-layer correctness model for audit validators:

1. **Layer 1 — Identity** (`BindingKey` frozen dataclass) — v1.1.0.
2. **Layer 2 — Scope** (content-type + context-keyword filters via
   `scope="narrative"`) — v1.1.0 + v1.2.0.

The v1.2.0 release shipped four context-aware filters (T1 delta,
T2 floor, T3 consume-on-match per-sentence, T4 sentence-boundary
detector-pair reject) under `scope="narrative"`, achieving 93%
total noise reduction (95 → 7) on the consumer's writeup. ADR 0005's
"Future work (deferred)" section explicitly named two remaining
failure modes — *sentence-boundary unawareness* (closed by v1.2.0's
T4) and *multi-detector list parsing in dense prose* (still
deferred). Consumer adoption at `prompt-injection-detection-submission@v1.3.12`
reduced their dogfood to **4 residual warnings**, all in this
deferred category.

Issue [#81](https://github.com/brandon-behring/eval-toolkit/issues/81)
documented the 4 residuals as three distinct prose patterns:

- **Pattern A — "for X" postfix**: `"versus 0.364 [...] for the
  frozen probe and 0.291 [...] for TF-IDF + LR"`. The validator's
  proximity-based detector pairing mis-attributes 0.291 to the
  nearer prior detector mention; the `"for TF-IDF + LR"` postfix
  is the authoritative binding signal.
- **Pattern B — possessive `'s`**: `"LoRA's pooled OOD AUROC is
  0.383 against frozen probe's 0.515"`. The `'s` construction
  isn't part of detector-alias regex matching; cross-detector
  confusion follows.
- **Pattern C — group subject**: `"... 0.38 AUROC, ~0.6 drop for
  the trained detectors; frozen probe's gap is 0.91 → 0.515"`.
  The value 0.38 belongs to "trained detectors" (a multi-detector
  group), not the next-mentioned single detector.

A fourth pattern emerged during v1.3.0 dogfood:

- **Pattern D — metric-axis confusion**: `"than the AUPRC delta
  suggests: LoRA's pooled OOD AUROC is 0.383"`. The proximity-based
  metric check finds "AUPRC" from a delta clause earlier in the
  prose, even though AUROC is the metric semantically owning
  0.383. This is symmetric to detector-axis pairing — the same
  positional heuristic fails for metric-axis in dense prose.

These four patterns are **pairing-rule problems**, not identity or
scope problems. They require a third correctness layer that
operates **on top of** identity + scope.

## Decision

Introduce **Layer 3 — pairing rules** as the third correctness
layer for audit validators:

| Layer | Correctness dimension | Mechanism | Release |
|---|---|---|---|
| 1 | Identity | Structured key with named fields | v1.1.0 |
| 2 | Scope | Content-type + context-keyword filters | v1.1.0 + v1.2.0 |
| **3** | **Pairing** | **Override or suppress proximity-based detector / metric pairing under explicit grammar cues** | **v1.3.0** |

Layer 3 ships under the existing `scope="narrative"` bundle (no new
public kwargs). Tier-1 ADDITIVE per
[ADR 0003](0003-stability-contract-and-gate3-methodology.md):
`scope="all"` callers see zero behavior change.

### Four pairing rules

**Pattern A — `"for {detector}"` postfix override.**
When a candidate value is followed (within +50 chars) by a `"for
{detector_alias}"` construct AND no other value pattern lies
between the value and the postfix (excluding values in CI brackets
per v1.1.0's `scope="narrative"` content-type filter), the postfix is
authoritative:
- If the postfix names THIS binding's detector → confirm pairing
  (bypass proximity check).
- If it names a DIFFERENT canonical detector → skip (the other
  detector's loop iteration will claim the value).
- If unresolved → fall through to proximity.
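
The three outcomes above can be sketched as a small tri-state helper. This is a minimal illustration under stated assumptions, not the code in `audit_value_bindings.py`: the names `Override`, `postfix_override`, and `DETECTOR_ALIASES` are hypothetical, and the value and CI-bracket regexes assume values written as `0.xxx` decimals and brackets written as `[...]`.

```python
import re
from enum import Enum


class Override(Enum):
    CONFIRM = "confirm"        # postfix names this binding's detector
    SKIP = "skip"              # postfix names a different canonical detector
    UNRESOLVED = "unresolved"  # no usable postfix; caller falls back to proximity


# Hypothetical alias table: lowercase alias -> canonical detector name.
DETECTOR_ALIASES = {
    "frozen probe": "frozen_probe",
    "tf-idf + lr": "tfidf_lr",
    "lora": "lora",
}
VALUE_RE = re.compile(r"\b\d\.\d+\b")   # assumed value shape, e.g. 0.291
BRACKET_RE = re.compile(r"\[[^\]]*\]")  # assumed CI-bracket shape


def postfix_override(text, value_end, this_detector, window=50):
    post = text[value_end:value_end + window]
    nearest = None  # (offset in post, canonical detector) of the first "for X"
    for alias, canonical in DETECTOR_ALIASES.items():
        pattern = rf"\bfor\s+(?:the\s+)?{re.escape(alias)}(?!\w)"
        m = re.search(pattern, post, re.IGNORECASE)
        if m and (nearest is None or m.start() < nearest[0]):
            nearest = (m.start(), canonical)
    if nearest is None:
        return Override.UNRESOLVED
    # Blank out bracketed spans so CI bounds do not count as intervening values.
    between = BRACKET_RE.sub(lambda b: " " * len(b.group()), post[:nearest[0]])
    if VALUE_RE.search(between):
        return Override.UNRESOLVED
    return Override.CONFIRM if nearest[1] == this_detector else Override.SKIP


text = "versus 0.364 [...] for the frozen probe and 0.291 [...] for TF-IDF + LR"
end = text.index("0.291") + len("0.291")
print(postfix_override(text, end, "tfidf_lr"))      # Override.CONFIRM
print(postfix_override(text, end, "frozen_probe"))  # Override.SKIP
```

In the sketch, the `frozen_probe` iteration skips 0.291 and leaves it for the `tfidf_lr` iteration to claim, which mirrors the second bullet.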

**Pattern B — `"{detector}'s"` possessive override.**
Same mechanics as Pattern A, but scanning the 80 chars before the
value for `"{alias}'s"`. The LAST possessive in the pre-window is
authoritative IF its end position is within 30 chars of the value
start (covers both immediate `"frozen probe's 0.515"` and
short-clause `"LoRA's ... AUROC is 0.383"`). Last-match — not
first — is critical: an earlier possessive belonging to a different
preceding value must not bleed into a later value's check.
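
A minimal sketch of the last-match rule, assuming a hypothetical alias table and a helper named `possessive_owner` (neither is the validator's actual API). It returns the canonical detector named by the last possessive in the pre-window, or `None` when nothing qualifies; a caller would then confirm, skip, or fall through to proximity as in Pattern A.

```python
import re

# Hypothetical alias table: lowercase alias -> canonical detector name.
DETECTOR_ALIASES = {"lora": "lora", "frozen probe": "frozen_probe"}
_ALTERNATION = "|".join(
    re.escape(alias) for alias in sorted(DETECTOR_ALIASES, key=len, reverse=True)
)
POSSESSIVE_RE = re.compile(rf"(?<!\w)({_ALTERNATION})'s(?!\w)", re.IGNORECASE)


def possessive_owner(text, value_start, pre_window=80, max_gap=30):
    """Canonical detector named by the last possessive before the value, else None."""
    pre = text[max(0, value_start - pre_window):value_start]
    last = None
    for match in POSSESSIVE_RE.finditer(pre):
        last = match  # keep overwriting: last match wins, not first
    if last is None:
        return None
    if len(pre) - last.end() > max_gap:
        return None  # possessive is too far from the value to be authoritative
    return DETECTOR_ALIASES[last.group(1).lower()]


text = "LoRA's pooled OOD AUROC is 0.383 against frozen probe's 0.515"
print(possessive_owner(text, text.index("0.383")))  # lora
print(possessive_owner(text, text.index("0.515")))  # frozen_probe
```

A first-match rule would attribute 0.515 to LoRA here, because `"LoRA's"` is still inside the 80-char pre-window of the second value.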

**Pattern C — group-subject suppression.**
When prose contains `"for the {trained|frozen|baseline|all|both|other}
detectors"` within ±60 chars of the value AND in the same sentence
as the value (no sentence boundary between the phrase and the
value), the value refers to a multi-detector group
statement that doesn't bind to a single canonical detector. The
candidate is SUPPRESSED (no override). Multi-detector inference is
deferred to v1.4.0+.
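
The suppression check can be sketched as follows. This is an illustration only: `is_group_statement` and `naive_sentence_breaks` are hypothetical names, and the naive boundary detector is a stand-in for the validator's own sentence-positions infrastructure, which this sketch takes as an input.

```python
import re

GROUP_RE = re.compile(
    r"\bfor the (?:trained|frozen|baseline|all|both|other) detectors\b",
    re.IGNORECASE,
)


def naive_sentence_breaks(text):
    """Assumed stand-in: offsets of '.', '!' or '?' followed by whitespace."""
    return [m.start() for m in re.finditer(r"[.!?](?=\s)", text)]


def is_group_statement(text, value_start, value_end, sentence_breaks, window=60):
    """True when a group-subject phrase shares a sentence with the value."""
    lo = max(0, value_start - window)
    hi = min(len(text), value_end + window)
    for m in GROUP_RE.finditer(text, lo, hi):
        left, right = sorted((value_start, m.start()))
        # Suppress only if no sentence boundary sits between phrase and value.
        if not any(left <= b < right for b in sentence_breaks):
            return True
    return False


same = "0.38 AUROC, ~0.6 drop for the trained detectors; frozen probe's gap is 0.91"
print(is_group_statement(same, 0, 4, naive_sentence_breaks(same)))  # True

split = "Scores dropped for the trained detectors. LoRA reached 0.383 on OOD."
start = split.index("0.383")
print(is_group_statement(split, start, start + 5, naive_sentence_breaks(split)))  # False
```

What counts as a sentence boundary (for example, whether a semicolon does) is decided by the positions the caller supplies, not by this helper.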

**Pattern D — metric-axis nearest-pairing.**
Symmetric to detector-axis pairing. Pre-collects ALL metric
positions per file (across consumer-supplied `metric_aliases`,
not just metrics tied to canonical bindings). Requires the NEAREST
metric mention to the value (in text order: the last mention before
the value or the first mention after it, whichever is closer)
to be THIS binding's canonical metric. Catches prose with multiple
metrics in close proximity where the v1.2.0 window-based proximity
check picks up the wrong metric.
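
A minimal sketch of nearest-metric selection, assuming a hypothetical alias map shaped like `metric_aliases` (lowercase alias to canonical metric). The helper names `metric_positions` and `nearest_metric` are illustrative and are not the existing `_nearest_canonical_key` helper; resolving a distance tie in favor of the mention before the value is also an assumption.

```python
import re


def metric_positions(text, metric_aliases):
    """Collect (start, end, canonical_metric) for every alias mention in the text."""
    spans = []
    for alias, canonical in metric_aliases.items():
        pattern = rf"(?<!\w){re.escape(alias)}(?!\w)"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            spans.append((m.start(), m.end(), canonical))
    return sorted(spans)


def nearest_metric(spans, value_start, value_end):
    """Canonical metric of the closest mention: last before or first after the value."""
    before = [s for s in spans if s[1] <= value_start]
    after = [s for s in spans if s[0] >= value_end]
    candidates = []  # (distance to value, canonical metric)
    if before:
        last = max(before, key=lambda s: s[1])
        candidates.append((value_start - last[1], last[2]))
    if after:
        first = min(after, key=lambda s: s[0])
        candidates.append((first[0] - value_end, first[2]))
    if not candidates:
        return None
    # min() keeps the earlier entry on a tie, so "before" wins (assumption).
    return min(candidates, key=lambda c: c[0])[1]


text = "than the AUPRC delta suggests: LoRA's pooled OOD AUROC is 0.383"
start = text.index("0.383")
spans = metric_positions(text, {"auprc": "auprc", "auroc": "auroc"})
print(nearest_metric(spans, start, start + 5))  # auroc
```

If `auroc` is left out of the alias map, the sketch sees only the AUPRC mention and returns `auprc`, which illustrates why unbound metrics still need an alias entry.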

### Why suppression (Pattern C) rather than inference

ADR 0005's deferred-work section framed multi-detector list parsing
as a 200+ LOC parser-level problem. v1.3.0 takes a simpler path:
when prose explicitly names a multi-detector group (`"for the
trained detectors"`), the validator SUPPRESSES the candidate
rather than trying to infer which detectors own the value. This
matches the architectural pattern of v1.2.0's T1/T2 (recognize a
context cue, skip the candidate) and avoids the high-risk
multi-detector iteration path (~250 LOC, MODERATE-HIGH risk per
the Round 12 Explore analysis). Inference can be added as a
Layer 3 extension (multi-detector iteration) in v1.4.0+ if
consumer demand emerges.

## Scope of this ADR

**Applies to the `audit_*` flat-module family only.** Other parts
of the codebase (e.g., `MetricSpec`, `harness.evaluate`) are NOT
retroactively forced to adopt pairing-rule mechanics. Audit
validators are a coherent subfamily that share the closed-config
pattern + the consumer-prose-aware mission.

Future pairing-rule additions (e.g., enumeration parsing for the
`"X scored Y, Z, and W for A, B, C respectively"` pattern, or
multi-detector inference replacing Pattern C's suppression) join
Layer 3 as additional rule families under the same architectural
slot.

## Consequences

### Positive

- **Closes #81's 4 residuals.** Consumer-side dogfood reaches 0
  warnings (down from 4); HARD-gate promotion of
  `audit_value_bindings` becomes credible. Combined with v1.1.0 +
  v1.2.0, **100% reduction vs the pre-fix v1.0.5 baseline** on
  the consumer's writeup.
- **Architectural consistency.** Layer 3 is opt-in via the
  existing `scope="narrative"` bundle (no new kwargs);
  backward-compat preserved for `scope="all"` callers.
- **Symmetric metric-axis pairing.** Pattern D extends the
  positional pairing model from detector-axis to metric-axis,
  using the existing `_nearest_canonical_key` helper. Establishes
  "axis-by-axis nearest-pairing" as a reusable Layer 3 building
  block.
- **Bypass + confirm semantics.** Pattern A/B overrides are
  authoritative: they CONFIRM pairing (bypass proximity check)
  when they match THIS binding's detector. Avoids the bug where
  override + proximity disagree and the value is wrongly rejected.
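
The bypass-and-confirm semantics in the last bullet can be sketched as a precedence rule: an override outcome, when present, outranks the proximity result instead of being ANDed with it. The `Override` enum and `accept_candidate` are hypothetical names used only for illustration.

```python
from enum import Enum


class Override(Enum):
    CONFIRM = "confirm"        # grammar cue names this binding's detector
    SKIP = "skip"              # grammar cue names a different detector
    UNRESOLVED = "unresolved"  # no grammar cue applies


def accept_candidate(override, proximity_ok):
    """Decide whether a value is paired with this binding's detector."""
    if override is Override.CONFIRM:
        return True   # bypass proximity entirely
    if override is Override.SKIP:
        return False  # another detector's iteration will claim the value
    return proximity_ok  # no cue: fall back to the proximity heuristic


# A conjunctive rule (override_ok and proximity_ok) would reject this case.
print(accept_candidate(Override.CONFIRM, proximity_ok=False))  # True
```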

### Negative

- **Layer 3 adds ~150 LOC.** Pattern helpers, per-call regex
  builds, inner-loop wiring. Within the flat-module
  maintainability bar (comparable to v1.2.0's T1-T4 = ~150 LOC).
- **Cross-layer coupling.** Pattern A's intervening-value check and
  Pattern C's sentence-boundary check reuse the existing
  exclusion-ranges and sentence-positions infrastructure from
  v1.1.0/v1.2.0, so changes to bracket-exclusion logic or
  sentence-detection logic now also affect Layer 3 correctness.
  Mitigation: unit tests pin each rule's behavior under known
  prose patterns.
- **Pattern D requires `metric_aliases` for unbound metrics.**
  When prose mentions a metric (e.g., AUROC) that has no canonical
  binding, the consumer must still pass it in `metric_aliases`
  for Pattern D to recognize it. Without the alias, Pattern D
  falls through to the binding's own metric (legacy v1.2.0
  behavior). Documented in v1.3.0 CHANGELOG.

### Future work (post-v1.3.0)

- **Multi-detector inference for Pattern C.** Replace suppression
  with multi-detector ownership iteration: when "for the trained
  detectors" is found, iterate the value-comparison block once per
  detector in the implied group. ~250 LOC; MODERATE-HIGH risk.
  Track as v1.4.0+ if consumer demand emerges.
- **Enumeration parsing.** Prose like `"X scored Y, Z, and W for
  A, B, C respectively"` requires positional alignment between two
  lists. Not addressed by v1.3.0. Track as v1.4.0+ if needed.
- **Markdown AST parsing** (ADR 0005 §A4) — v2.0 territory.

## Alternatives considered

### A1 — Markdown AST parsing

Rejected for v1.x per ADR 0005 §A4. Too heavy, fragile to markdown
dialects, ~1000+ LOC dependency. Stays v2.0 territory.

### A2 — Pattern A + B only (defer C and D)

Closes 3 of 4 residuals. ~100 LOC. Rejected because the consumer's
HARD-gate promotion is blocked on ALL 4 warnings; a 75% close-rate
release would not unblock the consumer-side workflow that motivated
this work.

### A3 — Multi-detector inference for Pattern C (instead of suppression)

~250 LOC; replaces the simple "for the trained detectors" suppression
with explicit iteration over implied group detectors. Rejected for
v1.3.0 because suppression closes the same false positives at ~30
LOC; inference's marginal value isn't worth the complexity until
consumer prose surfaces a case where suppression hides a real bug.

### A4 — Public kwargs for pairing rules

Add `list_connectives: Sequence[str] | None = None`,
`possessive_patterns: ...`, etc. — let consumers extend the
hardcoded sets. Rejected per ADR 0005 §4 reasoning: YAGNI without
concrete consumer demand. The hardcoded frozensets cover the
consumer's actual prose patterns; runtime extension can be added
in a future v1.3.x patch if needed.

### A5 — Layer 3 as a separate `scope="strict"` tier

A new `scope` value that's `narrative` + pairing rules.
Rejected because it creates an ordering relationship consumers
must remember (`all` ⊂ `narrative` ⊂ `strict`) and grows the API
surface. The existing `scope="narrative"` bundle already
represents "opt-in narrative-prose-aware correctness mode"; Layer
3 fits within that mental model.

## Cross-references

- [ADR 0001](0001-flat-module-layout.md) — flat-module layout
  still applies; Layer 3 helpers live in `audit_value_bindings.py`
  alongside the v1.2.0 helpers, not in a subpackage.
- [ADR 0003](0003-stability-contract-and-gate3-methodology.md) —
  Tier-1 ADDITIVE classification. No new public kwargs; no
  signature drift; only `__version__` and the inner-loop logic
  change.
- [ADR 0005](0005-structured-keys-for-audit-validators.md) —
  introduces Layer 1 + 2 and explicitly defers Layer 3 to v1.3.0+.
  This ADR is the formal closure of that deferred work.
- [Round 14 audit findings](../audit_findings.md) — captures
  the v1.3.0 cycle dogfood + the four-pattern taxonomy.
- Issue [#81](https://github.com/brandon-behring/eval-toolkit/issues/81)
  — consumer-filed signal that triggered this ADR.