# Migrating to v0.47

The v0.47 release follows the v0.46 scorecard surface with a **breaking
consolidation** of the sweep API and a Tier-2 Protocol cleanup. It also
completes the advanced-6 character-injection suite promised in the v0.43
forward-look and lands the Round 6 audit follow-on items.

If you're jumping from v0.45 (or earlier) and have not yet migrated through
v0.46, read `migration/v0.46.md` first.

## What's removed at v0.47 (BREAKING)

### 1. Top-level scalar metric imports — hard removal

The v0.46 ``__getattr__`` shim that kept these reachable with a
``DeprecationWarning`` has been deleted:

```text
# v0.46 (still worked with warning):
from eval_toolkit import pr_auc, roc_auc, brier_score
from eval_toolkit import (
    expected_calibration_error,
    expected_calibration_error_debiased,
    expected_calibration_error_equal_mass,
    expected_calibration_error_l2,
    expected_calibration_error_l2_debiased,
)

# v0.47 (every name above is gone):
from eval_toolkit import pr_auc         # ImportError
import eval_toolkit; eval_toolkit.pr_auc  # AttributeError
```
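Which exception you see depends on how the name is reached. The sketch below illustrates the mechanism with a throwaway module called ``demo_pkg`` (a hypothetical stand-in, not ``eval_toolkit`` itself): a PEP 562 module-level ``__getattr__`` keeps a name reachable with a warning, and once that hook is gone a ``from`` import fails with ``ImportError`` while plain attribute access fails with ``AttributeError``. The shim body and warning text are assumptions made for illustration.

```python
import sys
import types
import warnings


def build_demo_pkg(with_shim: bool) -> types.ModuleType:
    """Register a throwaway module named 'demo_pkg' (hypothetical stand-in)."""
    mod = types.ModuleType("demo_pkg")
    if with_shim:
        def _shim(name: str):
            # PEP 562: consulted only when normal attribute lookup fails.
            if name == "pr_auc":
                warnings.warn(
                    "demo_pkg.pr_auc is deprecated",
                    DeprecationWarning,
                    stacklevel=2,
                )
                return lambda y_true, y_score: 0.0  # placeholder, not a real metric
            raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")

        mod.__getattr__ = _shim
    sys.modules["demo_pkg"] = mod
    return mod


def try_access() -> tuple[str, str, int]:
    """Report the outcome of a `from` import, of attribute access, and the
    number of DeprecationWarnings emitted along the way."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            from demo_pkg import pr_auc  # noqa: F401
            from_result = "ok"
        except ImportError:
            from_result = "ImportError"
        try:
            sys.modules["demo_pkg"].pr_auc
            attr_result = "ok"
        except AttributeError:
            attr_result = "AttributeError"
    n_deprecations = sum(
        issubclass(w.category, DeprecationWarning) for w in caught
    )
    return from_result, attr_result, n_deprecations


build_demo_pkg(with_shim=True)
print(try_access())   # shim present: both paths work, each one warns
build_demo_pkg(with_shim=False)
print(try_access())   # shim removed: both paths fail, with different exceptions
```

In practice this means ``except AttributeError`` guards around a ``from eval_toolkit import …`` line will not catch the v0.47 failure; catch ``ImportError`` there instead.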

**Migration (primary path — preferred):**

```python
import numpy as np
from eval_toolkit import scorecard, metric_specs as ms

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=200)
y_score = np.clip(y_true + rng.normal(0, 0.3, size=200), 0, 1)

r = scorecard(y_true, y_score, metrics=[ms.pr_auc, ms.brier])
value = r["pr_auc"].value
ci = r["pr_auc"].ci   # BootstrapCI | None
if ci is not None:
    print(f"PR-AUC: {value:.3f}  CI: [{ci.ci_low:.3f}, {ci.ci_high:.3f}]")
else:
    print(f"PR-AUC: {value:.3f}  (no CI computed)")
```

**Migration (escape hatch — internal API per ADR 0002):**

```python
from eval_toolkit.metrics import pr_auc, roc_auc, brier_score
# Same scalar-function signature as v0.45 and earlier.
```

For the 3 ECE variants that do not have a first-party ``metric_specs``
equivalent (``expected_calibration_error_debiased`` / ``_l2`` /
``_l2_debiased``), the submodule path is the only stable way to reach
them. ``metric_specs.ece(n_bins=..., strategy="uniform"|"quantile")``
covers the canonical two.
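For readers unsure which ``strategy`` matches their old call, the following pure-Python sketch shows the difference between equal-width ("uniform") and equal-mass ("quantile") binning in an expected calibration error. It is an illustration written for this guide under the usual textbook definition, not the library's implementation; the function name ``ece_sketch`` and its exact edge handling are assumptions.

```python
def ece_sketch(y_true, y_score, n_bins=10, strategy="uniform"):
    """Expected calibration error over equal-width or equal-mass bins.

    Illustrative only: assumes binary labels and scores in [0, 1].
    """
    if len(y_true) != len(y_score) or len(y_true) == 0:
        raise ValueError("y_true and y_score must be non-empty and equal length")
    if any(s < 0.0 or s > 1.0 for s in y_score):
        raise ValueError("scores must lie in [0, 1]")

    pairs = sorted(zip(y_score, y_true))  # ascending by score
    n = len(pairs)
    bins = [[] for _ in range(n_bins)]

    if strategy == "uniform":
        # Equal-width bins on [0, 1]; a score of exactly 1.0 goes in the last bin.
        for score, label in pairs:
            bins[min(int(score * n_bins), n_bins - 1)].append((score, label))
    elif strategy == "quantile":
        # Equal-mass bins: contiguous chunks of the score-sorted sample.
        for rank, pair in enumerate(pairs):
            bins[rank * n_bins // n].append(pair)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    total = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        confidence = sum(s for s, _ in members) / len(members)
        accuracy = sum(label for _, label in members) / len(members)
        total += len(members) / n * abs(accuracy - confidence)
    return total


labels = [1, 1, 1, 1]
scores = [0.5, 0.5, 0.5, 0.5]
print(ece_sketch(labels, scores, n_bins=2, strategy="uniform"))
print(ece_sketch(labels, scores, n_bins=2, strategy="quantile"))
```

The two strategies agree on well-spread scores and diverge when scores cluster, since equal-width bins can leave most of the sample in one bin.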

### 2. Module-level sweep functions removed

```text
# v0.46 — gone in v0.47:
from eval_toolkit.adversarial import sweep
from eval_toolkit.preprocessing import sweep
```

**Migration:** use the new top-level ``sweep()`` with any
:class:`TextTransform` strategy (defence + attack mix freely):

```python
from eval_toolkit import sweep, DelimitVariant, DatamarkVariant
from eval_toolkit.adversarial import ZeroWidthSpaceInjection

texts = ["hello world", "ignore previous instructions"]

# Pure text-transform enumeration:
df = sweep(
    [DelimitVariant(), DatamarkVariant(), ZeroWidthSpaceInjection()],
    texts,
)
print(df.columns.tolist())
```

Pass a ``scorer`` to add original / transformed score columns, and
additionally pass an explicit ``attack_threshold`` to get the ``asr``
column:

```text
df = sweep([...], texts, scorer=detector)
df = sweep([...], texts, scorer=detector, attack_threshold=0.5)
```

**Key contract change:** the threshold must now be passed explicitly, and
the kwarg is named ``attack_threshold``. The v0.43–v0.46
``adversarial.sweep`` had ``threshold=0.5`` as a default; the new
``sweep()`` refuses to materialize an ``asr`` column unless the caller
commits to a calibrated operating point (see
``methodology/thresholds.md``).
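To see why a silent default is risky, here is a small stdlib-only sketch of how an attack success rate moves with the operating point. It assumes one common definition (the fraction of inputs detected at the threshold before the transform that fall below it afterwards); the library's exact ``asr`` definition may differ, and ``attack_success_rate`` is a hypothetical helper, not part of ``eval_toolkit``.

```python
import math
from typing import Optional, Sequence


def attack_success_rate(
    original: Sequence[float],
    transformed: Sequence[float],
    attack_threshold: Optional[float],
) -> Optional[float]:
    """Fraction of originally-detected inputs that evade after the transform.

    ASSUMPTION: "detected" means score >= threshold; "evaded" means the
    transformed score is < threshold. Returns None when no threshold is given.
    """
    if attack_threshold is None:
        return None  # no operating point, so no rate is reported
    detected = [
        (o, t) for o, t in zip(original, transformed) if o >= attack_threshold
    ]
    if not detected:
        return math.nan  # undefined: nothing was detected to begin with
    evaded = sum(1 for _, t in detected if t < attack_threshold)
    return evaded / len(detected)


# Made-up scores for illustration only.
original = [0.9, 0.8, 0.7, 0.6]
transformed = [0.65, 0.45, 0.55, 0.2]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, attack_success_rate(original, transformed, threshold))
```

The same scores yield a different rate at each threshold, which is the reason the operating point has to come from the caller.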

### 3. SimpleNamespace shortcuts removed

```text
# v0.46 — gone in v0.47:
from eval_toolkit.adversarial import character_injection
from eval_toolkit.preprocessing import spotlighting

character_injection.zero_width_space("hello")
spotlighting.delimit("hello")
```

**Migration:**

```python
from eval_toolkit import DelimitVariant
from eval_toolkit.adversarial import ZeroWidthSpaceInjection
from eval_toolkit.preprocessing import delimit

ZeroWidthSpaceInjection().transform("hello")
delimit("hello")
DelimitVariant().transform("hello")   # equivalent
```

### 4. ``CharacterInjectionStrategy`` Protocol removed

The per-module Protocol was redundant with the new top-level
:class:`TextTransform` Protocol that ships in v0.47 (Decision K).

```text
# v0.46:
from eval_toolkit.adversarial import CharacterInjectionStrategy
isinstance(my_strategy, CharacterInjectionStrategy)

# v0.47:
from eval_toolkit import TextTransform
isinstance(my_strategy, TextTransform)
```

Every existing adversarial dataclass continues to satisfy
``TextTransform`` structurally — no source changes required in concrete
classes.

## What's added at v0.47

### Top-level ``TextTransform`` Protocol

The 9th strict Tier-2 Protocol per ADR 0003 (Decision M):

```python
from eval_toolkit import TextTransform

# Structural subtyping — any class with name: str + transform(text) -> str
# satisfies the Protocol without inheriting from it.
```
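The structural check can be tried without installing anything. The sketch below defines a local ``TextTransformLike`` Protocol that mirrors the shape described above (``name: str`` plus ``transform(text) -> str``); it is a hypothetical stand-in for the real ``TextTransform``, and the two example classes are made up for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class TextTransformLike(Protocol):
    """Local stand-in mirroring the shape described in this guide."""

    name: str

    def transform(self, text: str) -> str: ...


@dataclass(frozen=True)
class UppercaseTransform:
    """Does not inherit from the Protocol; it matches by shape alone."""

    name: str = "uppercase"

    def transform(self, text: str) -> str:
        return text.upper()


class MissingName:
    """Has transform() but no `name` attribute, so it does not match."""

    def transform(self, text: str) -> str:
        return text


print(isinstance(UppercaseTransform(), TextTransformLike))
print(isinstance(MissingName(), TextTransformLike))
```

Note that a runtime ``isinstance`` check against a Protocol only verifies that the members exist; it does not validate their types or signatures, so a static type checker remains the stronger guard.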

### 3 preprocessing dataclasses

``DelimitVariant``, ``DatamarkVariant``, ``EncodeVariant`` — frozen +
``slots=True`` wrappers over the existing ``delimit`` / ``datamark`` /
``encode`` functions, satisfying ``TextTransform``:

```python
from eval_toolkit import DelimitVariant, DatamarkVariant, EncodeVariant

DelimitVariant(delimiter="<<").transform("hello")     # "<<hello>>"
DatamarkVariant(marker="^").transform("a b")          # "a^ b"
EncodeVariant(encoding="base64").transform("hello")   # "aGVsbG8="
```

### 6 advanced character-injection techniques

Closes the v0.43.0 CHANGELOG forward-look ("scheduled for v0.43.1" — a
version that never shipped) per Decision Q11→11.3:

```python
from eval_toolkit import (
    BidiRTLInjection,        # U+202E…U+202C override block
    TagStrippingInjection,   # <…> tag removal (idempotent)
    SynonymSubstitution,     # whitelisted-word swap, seed-deterministic
    TokenSplittingInjection, # mid-word single-space insertion (was `TokenSplitting`; renamed at v0.49)
    UnicodeNormalizationInjection, # NFC / NFD / NFKC / NFKD (was `UnicodeNormalization`; renamed at v0.49)
    InvisibleCharsInjection, # 5 invisible code points
)
```
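The Unicode mechanics behind three of these techniques can be reproduced with the standard library alone. The helpers below are illustrative sketches with hypothetical names; they are not the library's classes and do not claim to match their exact output (for example, which invisible code points are used or where they are placed).

```python
import unicodedata

RLO = "\u202e"   # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"   # POP DIRECTIONAL FORMATTING
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def bidi_wrap(text: str) -> str:
    """Wrap text in an RTL override block; renderers display it reversed."""
    return f"{RLO}{text}{PDF}"


def interleave_invisible(text: str, char: str = ZWSP) -> str:
    """Insert an invisible code point between every pair of characters."""
    return char.join(text)


def normalize(text: str, form: str = "NFKC") -> str:
    """Apply one of the four Unicode normalization forms."""
    return unicodedata.normalize(form, text)


sample = "abc"
print(len(sample), len(bidi_wrap(sample)), len(interleave_invisible(sample)))

# NFKC folds compatibility characters: the single "fi" ligature becomes two letters.
print(normalize("\ufb01", "NFKC") == "fi")

# NFD decomposes precomposed characters: e-acute becomes "e" plus a combining accent.
print(len(normalize("\u00e9", "NFD")))
```

All three transforms leave the text looking the same or similar to a human reader while changing the code-point sequence a detector sees.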

``ADVANCED_TECHNIQUES`` (a 6-tuple) and ``ALL_TECHNIQUES`` (a 12-tuple:
core 6 + advanced 6) are exported from ``eval_toolkit.adversarial`` for
convenience.

### Round 6 audit follow-on (per ``docs/source/audit_findings.md``)

- **Decision R6-A**: ``scorecard(seed=None)`` docstring rewritten to document
  the deterministic-by-default contract.
- **Decision R6-B**: ``scorecard()`` raises ``ValueError`` on duplicate
  ``MetricSpec.name``.
- **Decision R6-C**: ``Scorecard.to_pandas()`` MultiIndex schema gains
  ``n_resamples`` + ``method`` columns (additive; lossless against
  ``BootstrapCI.to_dict()``).
- **Decision R6-D**: ``tests/test_public_api.py`` drift guard now captures
  Tier-2 Protocol method signatures.
- **Decision R6-F5**: ``_evaluate_spec()`` no longer swallows
  ``MemoryError`` / ``RecursionError`` / ``KeyboardInterrupt`` /
  ``SystemExit`` into per-cell ``status="error"`` cells.
- **Decision R6-H**: ``metric_specs.make_spec_name(prefix, **kwargs)``
  helper for custom parameterized ``MetricSpec`` name canonicalization.
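
Two of these decisions (R6-B and R6-F5) describe error-handling behaviour that is easy to picture in code. The sketch below is a guess at the general pattern, not the library's ``_evaluate_spec()``; ``check_unique_names`` and ``evaluate_cell`` are hypothetical helpers. One detail worth knowing: ``KeyboardInterrupt`` and ``SystemExit`` derive from ``BaseException`` and so bypass ``except Exception`` on their own, whereas ``MemoryError`` and ``RecursionError`` are ``Exception`` subclasses and need an explicit re-raise.

```python
from typing import Any, Callable, Iterable

# Exception subclasses that should abort the run rather than become a cell.
FATAL = (MemoryError, RecursionError)


def check_unique_names(names: Iterable[str]) -> None:
    """Raise ValueError if any metric name appears more than once."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    if duplicates:
        raise ValueError(f"duplicate metric names: {duplicates}")


def evaluate_cell(fn: Callable[..., Any], *args: Any) -> dict:
    """Run one metric; ordinary failures become an error cell, fatal ones propagate."""
    try:
        return {"status": "ok", "value": fn(*args)}
    except FATAL:
        raise  # resource exhaustion is not a per-metric problem
    except Exception as exc:
        # KeyboardInterrupt / SystemExit never reach this branch.
        return {"status": "error", "error": f"{type(exc).__name__}: {exc}"}


def divide(a: float, b: float) -> float:
    return a / b


print(evaluate_cell(divide, 1.0, 2.0))
print(evaluate_cell(divide, 1.0, 0.0))
```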

## Migration checklist

Before bumping the pin to ``eval-toolkit==0.47.0``:

- [ ] Replace ``from eval_toolkit import pr_auc`` (and friends) with
  ``scorecard(...)`` OR ``from eval_toolkit.metrics import …``.
- [ ] Replace ``from eval_toolkit.adversarial import sweep`` with
  ``from eval_toolkit import sweep`` + pass ``TextTransform`` strategies.
- [ ] Replace ``from eval_toolkit.preprocessing import sweep`` with the
  top-level ``sweep()``.
- [ ] Replace ``character_injection.<name>(text)`` /
  ``spotlighting.<name>(text)`` namespace shortcuts with the concrete
  class or functional API.
- [ ] Replace ``CharacterInjectionStrategy`` references with
  ``TextTransform``.
- [ ] If you called ``adversarial.sweep(texts, scorer)`` and relied on the
  ``asr`` column, pass ``attack_threshold=<float>`` explicitly to the new
  ``sweep()``; the old ``threshold=0.5`` default no longer applies.
- [ ] Run your test suite against the new pin; the v0.46→v0.47 transition
  surfaces removed-symbol ``from`` imports as ``ImportError`` at
  module-load time, and attribute-style access (``eval_toolkit.pr_auc``)
  as ``AttributeError`` when that line executes.
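
To find affected call sites before running the suite, a short ``ast``-based scan can flag the removed ``from`` imports listed in this guide. This is a minimal sketch: the helper names are hypothetical, the symbol table is copied from the sections above, and it only catches ``from … import …`` statements, not attribute access such as ``eval_toolkit.pr_auc`` or usage behind aliases.

```python
import ast
from pathlib import Path

# Removed symbols, keyed by the module they used to be imported from.
REMOVED = {
    "eval_toolkit": {
        "pr_auc",
        "roc_auc",
        "brier_score",
        "expected_calibration_error",
        "expected_calibration_error_debiased",
        "expected_calibration_error_equal_mass",
        "expected_calibration_error_l2",
        "expected_calibration_error_l2_debiased",
    },
    "eval_toolkit.adversarial": {
        "sweep",
        "character_injection",
        "CharacterInjectionStrategy",
    },
    "eval_toolkit.preprocessing": {"sweep", "spotlighting"},
}


def find_removed_imports(source: str, filename: str = "<string>"):
    """Return sorted (line, module, name) tuples for removed `from` imports."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.ImportFrom)
            and node.level == 0  # absolute imports only
            and node.module in REMOVED
        ):
            for alias in node.names:
                if alias.name in REMOVED[node.module]:
                    hits.append((node.lineno, node.module, alias.name))
    return sorted(hits)


def scan_tree(root: str):
    """Yield (path, line, module, name) for every hit under a directory."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        for line, module, name in find_removed_imports(text, str(path)):
            yield str(path), line, module, name


example = (
    "from eval_toolkit import pr_auc, scorecard\n"
    "from eval_toolkit.adversarial import sweep\n"
    "from eval_toolkit import sweep\n"
)
for hit in find_removed_imports(example):
    print(hit)
```

The third line of the example is not flagged because the top-level ``sweep`` is the v0.47 replacement, not a removed symbol.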

## What's next (v0.48 polish; v1.0 stability)

The remaining v1.0-prep work is split between v0.48 and v1.0 per the plan:

- **v0.48** — ``metrics_at_threshold`` key normalization,
  ``BootstrapCI.to_dict()`` rewrite, lazy-extras message audit,
  docstring example sweep, ADRs 0001 + 0003 finalized, Round 5/Round 7
  packet-drift fixes.
- **v1.0** — stability commitment; no new code; final ADR pass; all 4
  gates closed.

See ``~/.claude/plans/evaluate-all-the-work-twinkly-kite.md`` for the
master plan.