---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
mystnb:
  execution_mode: 'off'
---

# Worked example: PyTorch + LoRA `Scorer` adapter

> **What this shows.** How to wrap a PyTorch transformer with a LoRA
> adapter as a `Scorer` for eval-toolkit's harness: batched inference,
> GPU/CPU placement, deterministic-mode setup, and returning a numpy array.
>
> **Optional dependencies.** This example requires
> [`torch`](https://pytorch.org/) ≥ 2.5 and
> [`transformers`](https://huggingface.co/docs/transformers/) (and
> [`peft`](https://huggingface.co/docs/peft/) if you actually fine-tune
> with LoRA). The toolkit's core does *not* depend on any of these;
> they're consumer-side concerns.
>
> Code blocks below that need torch are marked skip (see the comment at
> the top of each block) so Sybil doesn't try to execute them in CI
> without torch installed. The toolkit's own `tests/` includes a guarded
> smoke test (`pytest.importorskip("torch")`) that does run end-to-end if
> torch is available; see the verification step in the v0.7.0 plan.

## Setup (CPU baseline; runs in CI)

```{code-cell}
import numpy as np
import pandas as pd

from eval_toolkit import EvalSlice, evaluate
```

## Minimal `Scorer` Protocol shape

The Protocol is just `predict_proba(X) -> np.ndarray`:

```{code-cell}
class _UniformBaseline:
    """Reference shape: not a real model, just shows the Protocol."""

    version = "0.0.0"

    def predict_proba(self, X: list[str]) -> np.ndarray:
        rng = np.random.default_rng(42)
        return rng.uniform(0, 1, size=len(X))


df = pd.DataFrame({"text": [f"row_{i}" for i in range(40)],
                   "label": [i % 2 for i in range(40)]})
slice_ = EvalSlice(name="test", df=df)
result = evaluate({"u": _UniformBaseline()}, [slice_], run_id="proto-shape")
print(f"PR-AUC: {result.by_slice['test']['by_scorer']['u']['pr_auc']:.3f}")
```

Anything implementing this is a valid `Scorer`. The PyTorch adapter
below is the same shape, with the work happening inside `predict_proba`.

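
Before wiring a heavier model into the harness, it can help to check that an object really has this shape. The sketch below is not part of eval-toolkit: `ScorerLike`, `check_scorer_output`, and `ConstantScorer` are hypothetical names made up for this page, and the 1-D `(len(X),)` shape and `[0, 1]` range checks are assumptions inferred from the baseline above rather than documented guarantees.

```python
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class ScorerLike(Protocol):
    """Hypothetical stand-in for the Scorer shape (not the toolkit's class)."""

    def predict_proba(self, X: list[str]) -> np.ndarray: ...


def check_scorer_output(scorer: ScorerLike, X: list[str]) -> np.ndarray:
    """Call predict_proba once and verify the assumed output contract."""
    # runtime_checkable only checks that the method exists, not its signature.
    if not isinstance(scorer, ScorerLike):
        raise TypeError("scorer must define predict_proba(X)")
    scores = scorer.predict_proba(X)
    if not isinstance(scores, np.ndarray):
        raise TypeError(f"expected np.ndarray, got {type(scores).__name__}")
    if scores.shape != (len(X),):
        raise ValueError(f"expected shape ({len(X)},), got {scores.shape}")
    if np.any((scores < 0) | (scores > 1)):
        raise ValueError("scores must lie in [0, 1]")
    return scores


class ConstantScorer:
    """Toy scorer used only to exercise the check."""

    version = "0.0.0"

    def predict_proba(self, X: list[str]) -> np.ndarray:
        return np.full(len(X), 0.5)


checked = check_scorer_output(ConstantScorer(), ["a", "b", "c"])
print(checked.shape, checked.dtype)
```

Running a check like this on a handful of rows catches the "returned a torch tensor" and "returned a 2-D array" mistakes before a full evaluation run.
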
## Transformer `Scorer` skeleton

The pattern: a class wrapping `(tokenizer, model, device, batch_size)`,
with `predict_proba` doing tokenize → forward → softmax → numpy in
batches.

```{code-cell}
# Requires: torch, transformers. Marked skip for Sybil.
from __future__ import annotations

import os

# IMPORTANT: set CUBLAS_WORKSPACE_CONFIG BEFORE importing torch, so it is in
# place before any CUDA context exists.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# One-time, BEFORE first CUDA op:
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False


class TransformerScorer:
    """eval_toolkit.Scorer wrapping a HuggingFace classification model.

    Run with:
        scorer = TransformerScorer(
            "protectai/deberta-v3-base-prompt-injection-v2",
            device="cuda" if torch.cuda.is_available() else "cpu",
            batch_size=16,
        )
    """

    version = "v2-2025-q4"  # bump when the underlying checkpoint changes

    def __init__(self, model_name: str, *, device: str = "cpu",
                 batch_size: int = 16, max_length: int = 512) -> None:
        self.model_name = model_name
        self.device = device
        self.batch_size = batch_size
        self.max_length = max_length
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = (
            AutoModelForSequenceClassification.from_pretrained(model_name)
            .to(device)
            .eval()
        )

    @torch.no_grad()
    def predict_proba(self, X: list[str]) -> np.ndarray:
        """Return P(positive) for each text.
        Batched + numpy-out."""
        if not X:
            # np.concatenate raises on an empty list; return an empty array.
            return np.empty(0, dtype=np.float32)
        out = []
        for i in range(0, len(X), self.batch_size):
            batch = X[i : i + self.batch_size]
            enc = self.tokenizer(
                batch,
                padding=True,
                truncation=True,
                max_length=self.max_length,
                return_tensors="pt",
            ).to(self.device)
            logits = self.model(**enc).logits  # shape (batch, n_classes)
            # .float() is a no-op in fp32; under fp16/bf16 it keeps softmax
            # stable and lets .numpy() work (numpy has no bfloat16 dtype).
            probs = torch.softmax(logits.float(), dim=-1)[:, 1]  # P(injection)
            out.append(probs.cpu().numpy())
        return np.concatenate(out)
```

## With LoRA adapters

If you fine-tuned with [PEFT](https://huggingface.co/docs/peft/), load
the base model + adapter and otherwise reuse the wrapper above:

```{code-cell}
# Requires: torch, transformers, peft. Marked skip for Sybil.
from peft import PeftModel  # noqa


class LoRATransformerScorer(TransformerScorer):
    """Loads base model + LoRA adapter; predict_proba reused unchanged."""

    version = "lora-2026-q1"

    def __init__(self, base_model_name: str, lora_path: str, **kwargs) -> None:
        super().__init__(base_model_name, **kwargs)
        self.model = PeftModel.from_pretrained(self.model, lora_path).eval()
```

## With `SliceAwareScorer` for cost control

LLM-judge and large-transformer scorers are expensive, so typical
production runs skip them on lower-priority subgroup slices and run them
only on headline slices such as `test`. Implement
[`SliceAwareScorer`](../api/harness.md)'s `should_score_slice` hook:

```{code-cell}
# Requires: torch, transformers. Marked skip for Sybil.
class CostControlledTransformerScorer(TransformerScorer):
    """Skip subgroup slices to save GPU minutes."""

    version = "v2-cost-controlled"

    SLICE_ALLOW_LIST: frozenset[str] = frozenset({
        "test", "ood_lakera", "ood_llmail",  # full-cost slices
    })

    def should_score_slice(self, slice_name: str) -> bool:
        return slice_name in self.SLICE_ALLOW_LIST
```

The harness honors `should_score_slice` automatically (see
[`evaluate(...)`](../api/harness.md)); skipped slices are recorded as
`RunResult.by_slice[name]["by_scorer"][scorer_name] = {"skipped": "..."}`.

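
To make the hook's contract concrete, here is a toy loop showing one way a harness could consult `should_score_slice`. It is a stand-in written for this page, not eval-toolkit's `evaluate(...)`: `score_slices`, `AllowListScorer`, the slice names, and the skip-reason string are all hypothetical, and the real harness may record more than this.

```python
import numpy as np


class AllowListScorer:
    """Toy scorer (hypothetical) that only wants to run on allow-listed slices."""

    version = "0.0.0"
    SLICE_ALLOW_LIST = frozenset({"test"})

    def predict_proba(self, X: list[str]) -> np.ndarray:
        return np.full(len(X), 0.5)

    def should_score_slice(self, slice_name: str) -> bool:
        return slice_name in self.SLICE_ALLOW_LIST


def score_slices(scorers: dict, slices: dict[str, list[str]]) -> dict:
    """Stand-in loop: ask each scorer's hook before paying for inference."""
    results: dict = {}
    for slice_name, texts in slices.items():
        by_scorer: dict = {}
        for scorer_name, scorer in scorers.items():
            # Scorers without the hook are always scored.
            hook = getattr(scorer, "should_score_slice", None)
            if hook is not None and not hook(slice_name):
                by_scorer[scorer_name] = {"skipped": "should_score_slice returned False"}
                continue
            by_scorer[scorer_name] = {"scores": scorer.predict_proba(texts)}
        results[slice_name] = {"by_scorer": by_scorer}
    return results


results = score_slices(
    {"toy": AllowListScorer()},
    {"test": ["a", "b"], "subgroup_a": ["c"]},
)
print(results["subgroup_a"]["by_scorer"]["toy"])
```

The point of the design is that the skip decision lives on the scorer, so the same slice list can be passed to cheap and expensive scorers alike.
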
## DataLoader worker seeding (when training, not just inference)

If your `Scorer.fit(...)` trains a transformer, the DataLoader worker
seeding rules in
[reproducibility.md §"DataLoader worker seeding"](../methodology/reproducibility.md#pytorch-determinism)
apply:

```{code-cell}
# Requires: torch. Marked skip for Sybil.
import random  # noqa

import numpy as np  # noqa
import torch  # noqa
from torch.utils.data import DataLoader  # noqa


def seed_worker(worker_id: int) -> None:
    """Per-worker reseed used as DataLoader.worker_init_fn."""
    seed = torch.initial_seed() % 2**32
    np.random.seed(seed)
    random.seed(seed)


# generator = torch.Generator()
# generator.manual_seed(42)
# loader = DataLoader(
#     dataset, batch_size=32, shuffle=True,
#     worker_init_fn=seed_worker,
#     generator=generator,
# )
```

Without the `worker_init_fn` and `generator` arguments, every fold's data
shuffle is non-deterministic regardless of `set_global_seeds(42)`.

## fp16 / bf16: trade-offs

Mixed precision speeds up inference 1.5–3× on modern GPUs but introduces
small numerical noise: bootstrap CIs absorb it, bit-identity checks do
not. Two implications:

- **Calibrate at inference precision.** If you'll deploy in bf16, fit
  [`fit_temperature`](../api/calibration.md) on bf16 logits, not fp32
  logits cast to bf16.
- **Don't compare ECE across precision levels.** A 0.001–0.005 ECE
  delta is well within fp16/bf16 noise on moderate-size eval sets.

```{code-cell}
# Requires: torch. Marked skip for Sybil.
# bf16 inference:
#   self.model = self.model.to(dtype=torch.bfloat16)
#   logits.float()  # cast back to fp32 BEFORE softmax for numerical stability
```

## Putting the LoRA scorer into the harness

Drop-in replacement for `_UniformBaseline` in the
[PI walkthrough](prompt_injection_walkthrough.md):

```{code-cell}
# Requires: torch, transformers, peft. Marked skip for Sybil.
# from eval_toolkit import evaluate_folded, SourceDisjointKFoldSplitter
# NOTE: `parent_slice` and the two leakage-check classes used below are not
# defined in this snippet; import/define them as in your own pipeline.
#
# scorer = LoRATransformerScorer(
#     base_model_name="microsoft/deberta-v3-base",
#     lora_path="my/adapter",
#     device="cuda",
#     batch_size=32,
# )
#
# result = evaluate_folded(
#     {"deberta_lora": scorer},
#     SourceDisjointKFoldSplitter(source_col="source", k=3, seed=42),
#     parent_slice,
#     run_id="lora-v1",
#     leakage_checks=[NormalizedFormLeakageCheck(), CrossSplitLeakageCheck()],
#     on_leakage="record",
# )
```

## Copy-paste minimal scorer (CPU, no LoRA, no PEFT)

If you just want to start scoring with a public injection-detector
checkpoint:

```{code-cell}
# Requires: torch, transformers. Marked skip for Sybil.
# import os; os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
# import torch
# torch.use_deterministic_algorithms(True, warn_only=True)
#
# scorer = TransformerScorer(
#     "protectai/deberta-v3-base-prompt-injection-v2",
#     device="cpu",
#     batch_size=8,
# )
# # Now plug into evaluate(...) as in the PI walkthrough.
```

## Pitfalls / Common mistakes

- **`CUBLAS_WORKSPACE_CONFIG` set after import.** Has no effect once
  the CUDA context exists. Set it in `os.environ` BEFORE `import torch`.
- **Calling `predict_proba` outside `torch.no_grad()`.** Builds the
  autograd graph; OOMs fast on long inputs. The decorator above
  prevents this.
- **`model.train()` left on during eval.** Dropout is still active;
  every `predict_proba` call returns slightly different scores.
  `model.eval()` is mandatory.
- **Returning a torch tensor from `predict_proba`.** The harness expects
  `np.ndarray`. Always `.cpu().numpy()` at the end.
- **fp16 inference with `torch.softmax` on small logits.** Underflows
  to 0 in the tail; you lose calibration in the [0, 0.01] regime. Cast
  to fp32 *before* softmax.

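
The last pitfall can be reproduced without a GPU. The sketch below uses NumPy's `float16` as a stand-in for torch fp16. That is an assumption made for portability: torch kernels differ in how they accumulate internally, but the half-precision output has the same representable range, so the tail probability rounds to zero in the same way. The `softmax` helper and the example logits are illustrative only.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax that stays in the input dtype."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# One row of two-class logits where the model is very sure of class 0.
logits_fp16 = np.array([[12.0, -12.0]], dtype=np.float16)

# Softmax computed and stored in half precision.
p_fp16 = softmax(logits_fp16)[0, 1]

# Same logits, cast to fp32 BEFORE softmax (the fix recommended above).
p_fp32 = softmax(logits_fp16.astype(np.float32))[0, 1]

print(f"fp16 P(positive): {p_fp16!r}")
print(f"fp32 P(positive): {p_fp32!r}")
```

In half precision the positive-class probability comes out as exactly zero, while the fp32 path keeps a tiny but nonzero value. Any calibration step that takes logs or fits a temperature on these scores sees different inputs in the two cases.
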
## See also

- [extending.md §"Implementing a Scorer"](../extending.md#scorer)
- [methodology/calibration.md §"PyTorch & transformer specifics"](../methodology/calibration.md#pytorch)
- [methodology/reproducibility.md §"PyTorch determinism"](../methodology/reproducibility.md#pytorch-determinism)
- [PyTorch 2.8 reproducibility notes](https://docs.pytorch.org/docs/stable/notes/randomness.html)
- HuggingFace [`AutoModelForSequenceClassification`](https://huggingface.co/docs/transformers/main/en/model_doc/auto#transformers.AutoModelForSequenceClassification) / [PEFT LoRA](https://huggingface.co/docs/peft/main/en/conceptual_guides/lora)