Quarto writeup clarity rewrite and canonical reviewer figure slate

Published

May 19, 2026

ADR-062: Quarto Writeup Clarity and Canonical Figures

Status

Accepted (2026-05-19; implementation patch for the live Quarto site).

Context

ADR-061 fixed navigation, but the first reading path still assumed too much ML and evaluation vocabulary. A first-time reviewer had to infer why prompt injection matters, what the classifier was asked to do, what AUPRC, AUROC, and FPR mean, and how to read the plots.

Inspection also found that docs/plots/F1-F7.svg had been generated through the scaffold path in scripts/render_figures.py. Those plots were useful for testing the rendering pipeline, but they were not safe as reviewer-facing evidence because their numbers did not come from the canonical evals/ artifacts.

Decision

Rebuild the reviewer path around plain-language interpretation:

  • Problem first: prompt injection is untrusted text trying to override an LLM system’s instructions.
  • Result second: no evaluated rung clearly beats the random-classifier AUPRC floor (the positive-class prevalence) on the pooled cross-family OOD slice.
  • Evidence third: exact tables plus a small canonical plot slate.
  • Methodology/process detail remains available, but below the first-path story.
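
To make the "random AUPRC floor" concrete, here is a minimal sketch in plain Python. It is an illustration, not the project's evaluation code: the function names `average_precision` and `prevalence` are hypothetical, the data is synthetic, and tied scores are ordered arbitrarily. It shows that a scorer with no signal lands near the positive-class prevalence, which is why prevalence is the baseline a rung has to beat.

```python
import random


def prevalence(labels):
    """Fraction of positive examples; the AUPRC a random scorer is expected to reach."""
    return sum(labels) / len(labels)


def average_precision(labels, scores):
    """Step-wise average precision, a standard estimate of AUPRC.

    Returns None for single-class slices, where the metric is undefined
    (an assumption made here so such slices can be reported as N/A).
    """
    n_pos = sum(labels)
    if n_pos == 0 or n_pos == len(labels):
        return None
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic slice: roughly 10% positives, scores carry no signal.
    labels = [1 if rng.random() < 0.10 else 0 for _ in range(20000)]
    scores = [rng.random() for _ in labels]
    print("prevalence baseline:", round(prevalence(labels), 3))
    print("AUPRC of random scores:", round(average_precision(labels, scores), 3))
```

Because the floor moves with the class balance, an AUPRC value is only interpretable next to the prevalence of the slice it was computed on.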

The reviewer-facing plot slate becomes five figures:

  1. Pooled OOD AUPRC by rung vs the prevalence baseline.
  2. Frozen-probe vs LoRA paired AUPRC deltas on comparable both-class slices.
  3. Per-slice AUPRC grid, with single-class slices explicitly marked N/A.
  4. Detection-threshold transfer against the 1% FPR target.
  5. Calibration comparison using ECE and Brier.
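
The quantities behind figures 4 and 5 can be sketched in a few lines of plain Python. This is a hedged illustration under stated assumptions, not the code in scripts/render_figures.py or eval-toolkit: all function names are hypothetical, a score strictly above the threshold counts as a detection, and the ECE shown is one common binary variant with equal-width bins.

```python
import math


def threshold_at_fpr(neg_scores, target_fpr=0.01):
    """Pick a threshold on reference negatives so that FPR <= target_fpr."""
    if not neg_scores:
        raise ValueError("need at least one negative score")
    ranked = sorted(neg_scores, reverse=True)
    # Number of negatives allowed to score strictly above the threshold.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    allowed = min(allowed, len(ranked) - 1)
    return ranked[allowed]


def fpr_at_threshold(neg_scores, threshold):
    """Fraction of negatives flagged (score strictly above the threshold)."""
    return sum(s > threshold for s in neg_scores) / len(neg_scores)


def brier_score(labels, probs):
    """Mean squared error between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def expected_calibration_error(labels, probs, n_bins=10):
    """Weighted gap between mean predicted probability and observed positive rate per bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    n = len(labels)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        frac_pos = sum(y for y, _ in bucket) / len(bucket)
        mean_prob = sum(p for _, p in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(frac_pos - mean_prob)
    return ece


if __name__ == "__main__":
    # Synthetic negatives: the threshold is set in-distribution, then reused on shifted data.
    in_dist = [i / 100 for i in range(100)]
    shifted = [s + 0.05 for s in in_dist]
    t = threshold_at_fpr(in_dist, target_fpr=0.01)
    print("in-distribution FPR:", fpr_at_threshold(in_dist, t))
    print("transferred FPR:", fpr_at_threshold(shifted, t))
```

Threshold transfer asks whether a threshold chosen to meet the 1% FPR target on one distribution still meets it on another; in the synthetic example the shifted negatives exceed the target even though the threshold itself is unchanged.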

scripts/render_figures.py must read canonical artifacts by default. The --scaffold mode is retained for smoke tests only and refuses to write to docs/plots.
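
One way such a guard could look is sketched below. Only the --scaffold flag and the docs/plots path come from this ADR; the --out-dir option, the helper names, and the error message are assumptions made for illustration and may not match the actual script.

```python
import argparse
from pathlib import Path

# Directory that holds reviewer-facing figures (from the ADR).
PROTECTED_DIR = Path("docs/plots")


def is_protected(out_dir, protected=PROTECTED_DIR):
    """True if out_dir is the protected directory or sits anywhere beneath it."""
    out = Path(out_dir).resolve()
    prot = Path(protected).resolve()
    return out == prot or prot in out.parents


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Render reviewer figures (sketch).")
    parser.add_argument(
        "--scaffold",
        action="store_true",
        help="use synthetic scaffold data; for smoke tests only",
    )
    # Hypothetical option: the real script may choose its output path differently.
    parser.add_argument("--out-dir", type=Path, default=PROTECTED_DIR)
    args = parser.parse_args(argv)
    if args.scaffold and is_protected(args.out_dir):
        # parser.error prints the message and exits with status 2.
        parser.error("--scaffold output must not be written under docs/plots")
    return args
```

Making canonical artifacts the default and failing loudly on the scaffold path means a smoke-test figure cannot silently replace reviewer-facing evidence.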

Consequences

  • The live site becomes clearer for the intended hiring-manager audience.
  • The main figures now align with the numerical tables and provenance sidecars.
  • F6/F7 remain historical Phase 4 concepts, but they are no longer embedded in the main reviewer path unless regenerated from canonical artifacts in a future ADR.
  • The figure renderer still uses eval-toolkit primitives where available: set_plot_style, PALETTE, plot_lift_ci, plot_slice_metric_heatmap, and save_figure. Remaining matplotlib code is project-specific composition.

Alternatives Considered

  • Caption-only fix: rejected because better captions would not change where the numbers came from; scaffold-derived plots should not stay in the reviewer path, however strong the prose around them.
  • Tables-only results page: rejected because the hiring-manager path needs visual support. Exact values still belong in tables, shown alongside the figures.
  • Keep ADR/process detail prominent: rejected for the first reading path; methodology rigor remains linked and auditable, just not front-loaded.