Worked example: PyTorch + LoRA Scorer adapter
What this shows. How to wrap a PyTorch transformer with a LoRA adapter as a Scorer for eval-toolkit’s harness: batched inference, GPU/CPU placement, deterministic-mode setup, returning a numpy array.

Optional dependencies. This example requires torch ≥ 2.5 and transformers (and peft if you actually fine-tune with LoRA). The toolkit’s core does not depend on any of these; they’re consumer-side concerns.

Code blocks below are marked <!-- skip: next --> so Sybil doesn’t try to execute them in CI without torch installed. The toolkit’s own tests/ includes a guarded smoke test (pytest.importorskip("torch")) that does run end-to-end if torch is available; see verification step in the v0.7.0 plan.
Setup (CPU baseline; runs in CI)
import numpy as np
import pandas as pd
from eval_toolkit import EvalSlice, evaluate
Minimal Scorer Protocol shape
The Protocol is just predict_proba(X) -> np.ndarray:
class _UniformBaseline:
    """Reference shape: not a real model, just shows the Protocol."""

    version = "0.0.0"

    def predict_proba(self, X: list[str]) -> np.ndarray:
        rng = np.random.default_rng(42)
        return rng.uniform(0, 1, size=len(X))


df = pd.DataFrame({"text": [f"row_{i}" for i in range(40)],
                   "label": [i % 2 for i in range(40)]})
slice_ = EvalSlice(name="test", df=df)
result = evaluate({"u": _UniformBaseline()}, [slice_], run_id="proto-shape")
print(f"PR-AUC: {result.by_slice['test']['by_scorer']['u']['pr_auc']:.3f}")
Anything implementing this is a valid Scorer. The PyTorch adapter
below is the same shape, with the work happening inside
predict_proba.
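To make the shape concrete, the contract can be written down and checked structurally. The sketch below is illustrative only: `ScorerLike`, `check_scorer_output`, and `ConstantScorer` are hypothetical names, not part of eval-toolkit, and the assumed contract (a 1-D numpy array of probabilities in [0, 1], one per input) is inferred from the baseline above rather than taken from the toolkit's own Protocol definition.

```python
from typing import Protocol, runtime_checkable

import numpy as np


@runtime_checkable
class ScorerLike(Protocol):
    """Hypothetical stand-in for the toolkit's Scorer Protocol."""

    def predict_proba(self, X: list[str]) -> np.ndarray: ...


def check_scorer_output(scorer: ScorerLike, X: list[str]) -> np.ndarray:
    """Call predict_proba and assert the assumed output contract."""
    scores = scorer.predict_proba(X)
    assert isinstance(scores, np.ndarray), "expected a numpy array"
    assert scores.shape == (len(X),), "expected one score per input"
    assert np.all((scores >= 0.0) & (scores <= 1.0)), "expected probabilities"
    return scores


class ConstantScorer:
    """Toy scorer used only to exercise the check."""

    version = "0.0.0"

    def predict_proba(self, X: list[str]) -> np.ndarray:
        return np.full(len(X), 0.5)


scorer = ConstantScorer()
scores = check_scorer_output(scorer, ["a", "b", "c"])
```

Running a check like this against a new adapter on a handful of texts is a cheap way to catch shape or dtype mistakes before a full evaluation run.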
Transformer Scorer skeleton
The pattern: a class wrapping (tokenizer, model, device, batch_size),
with predict_proba doing tokenize → forward → softmax → numpy in
batches.
# Requires: torch, transformers. Marked skip for Sybil.
from __future__ import annotations

import os

# IMPORTANT: set CUBLAS_WORKSPACE_CONFIG BEFORE importing torch.cuda.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# One-time, BEFORE first CUDA op:
torch.use_deterministic_algorithms(True, warn_only=True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False


class TransformerScorer:
    """eval_toolkit.Scorer wrapping a HuggingFace classification model.

    Run with:
        scorer = TransformerScorer(
            "protectai/deberta-v3-base-prompt-injection-v2",
            device="cuda" if torch.cuda.is_available() else "cpu",
            batch_size=16,
        )
    """

    version = "v2-2025-q4"  # bump when the underlying checkpoint changes

    def __init__(self, model_name: str, *, device: str = "cpu",
                 batch_size: int = 16, max_length: int = 512) -> None:
        self.model_name = model_name
        self.device = device
        self.batch_size = batch_size
        self.max_length = max_length
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = (
            AutoModelForSequenceClassification.from_pretrained(model_name)
            .to(device)
            .eval()
        )

    @torch.no_grad()
    def predict_proba(self, X: list[str]) -> np.ndarray:
        """Return P(positive) for each text. Batched + numpy-out."""
        out = []
        for i in range(0, len(X), self.batch_size):
            batch = X[i : i + self.batch_size]
            enc = self.tokenizer(
                batch,
                padding=True,
                truncation=True,
                max_length=self.max_length,
                return_tensors="pt",
            ).to(self.device)
            logits = self.model(**enc).logits  # shape (batch, n_classes)
            probs = torch.softmax(logits, dim=-1)[:, 1]  # P(injection)
            out.append(probs.cpu().numpy())
        return np.concatenate(out)
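The batching loop can be sanity-checked without torch or a GPU by swapping the model for a cheap stand-in. The sketch below is an illustration under stated assumptions, not toolkit code: `StubBatchedScorer` is hypothetical, and its "model" is just a deterministic function of text length. It mirrors the slice, score, concatenate structure of predict_proba and checks that scores keep input order and do not depend on batch_size. The empty-input guard is an addition: np.concatenate raises on an empty list, which the loop above would hit for an empty X.

```python
import numpy as np


class StubBatchedScorer:
    """Hypothetical stand-in that mimics the batching loop, without torch."""

    def __init__(self, batch_size: int = 16) -> None:
        self.batch_size = batch_size

    def _score_batch(self, batch: list[str]) -> np.ndarray:
        # Fake two-class "logits": longer texts lean towards class 1.
        lengths = np.array([len(t) for t in batch], dtype=np.float64)
        logits = np.stack([np.zeros_like(lengths), lengths - 5.0], axis=1)
        # Numerically stable softmax over the class axis.
        shifted = logits - logits.max(axis=1, keepdims=True)
        exp = np.exp(shifted)
        probs = exp / exp.sum(axis=1, keepdims=True)
        return probs[:, 1]

    def predict_proba(self, X: list[str]) -> np.ndarray:
        if not X:
            # Guard: np.concatenate([]) would raise ValueError.
            return np.empty(0, dtype=np.float64)
        out = []
        for i in range(0, len(X), self.batch_size):
            out.append(self._score_batch(X[i : i + self.batch_size]))
        return np.concatenate(out)


texts = ["x" * n for n in range(1, 11)]
small = StubBatchedScorer(batch_size=3).predict_proba(texts)
large = StubBatchedScorer(batch_size=64).predict_proba(texts)
```

With a real transformer, different padding lengths across batch sizes may introduce tiny floating-point differences, so a comparison like this is better done with a tolerance (for example np.allclose) than with exact equality.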
With LoRA adapters
If you fine-tuned with PEFT, load the base model + adapter and otherwise reuse the wrapper above:
# Requires: torch, transformers, peft. Marked skip for Sybil.
from peft import PeftModel  # noqa


class LoRATransformerScorer(TransformerScorer):
    """Loads base model + LoRA adapter; predict_proba reused unchanged."""

    version = "lora-2026-q1"

    def __init__(self, base_model_name: str, lora_path: str, **kwargs) -> None:
        super().__init__(base_model_name, **kwargs)
        self.model = PeftModel.from_pretrained(self.model, lora_path).eval()
With SliceAwareScorer for cost control
LLM-judge and large-transformer scorers are expensive — typical
production runs skip them on inexpensive subgroup slices and only run
them on the headline test slice. Implement
SliceAwareScorer’s
should_score_slice hook:
# Requires: torch, transformers. Marked skip for Sybil.
class CostControlledTransformerScorer(TransformerScorer):
    """Skip subgroup slices to save GPU minutes."""

    version = "v2-cost-controlled"

    SLICE_ALLOW_LIST: frozenset[str] = frozenset({
        "test", "ood_lakera", "ood_llmail",  # full-cost slices
    })

    def should_score_slice(self, slice_name: str) -> bool:
        return slice_name in self.SLICE_ALLOW_LIST
The harness honors should_score_slice automatically — see
evaluate(...); skipped slices
land in RunResult.by_slice[name].by_scorer[scorer_name] = {"skipped": "..."}.
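For intuition, this is roughly what honoring such a hook could look like inside a harness loop. It is a sketch, not the toolkit's evaluate(): `score_slices` and `AllowListScorer` are hypothetical, slices are plain lists of texts rather than EvalSlice objects, and the skip reason string is a placeholder because the real harness's wording is not shown here.

```python
from typing import Any


def score_slices(scorers: dict[str, Any], slices: dict[str, list[str]]) -> dict:
    """Illustrative loop: consult should_score_slice before scoring."""
    results: dict[str, dict[str, Any]] = {}
    for slice_name, texts in slices.items():
        by_scorer: dict[str, Any] = {}
        for scorer_name, scorer in scorers.items():
            # Scorers without the hook are treated as "score everything".
            hook = getattr(scorer, "should_score_slice", None)
            if hook is not None and not hook(slice_name):
                # Placeholder reason; the real harness's text may differ.
                by_scorer[scorer_name] = {"skipped": "not in allow-list"}
                continue
            by_scorer[scorer_name] = {"scores": list(scorer.predict_proba(texts))}
        results[slice_name] = {"by_scorer": by_scorer}
    return results


class AllowListScorer:
    """Toy scorer that only opts in to the headline slice."""

    SLICE_ALLOW_LIST = frozenset({"test"})

    def should_score_slice(self, slice_name: str) -> bool:
        return slice_name in self.SLICE_ALLOW_LIST

    def predict_proba(self, X: list[str]) -> list[float]:
        return [0.5 for _ in X]


results = score_slices(
    {"expensive": AllowListScorer()},
    {"test": ["a", "b"], "subgroup_short": ["c"]},
)
```

The point of the design is that the expensive predict_proba call is never made for a skipped slice, while the result still records that the skip happened.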
DataLoader worker seeding (when training, not just inference)
If your Scorer.fit(...) trains a transformer, the DataLoader worker
seeding rules in
reproducibility.md §“DataLoader worker seeding”
apply:
# Requires: torch. Marked skip for Sybil.
import random  # noqa

import numpy as np  # noqa
import torch  # noqa
from torch.utils.data import DataLoader  # noqa


def seed_worker(worker_id: int) -> None:
    """Per-worker reseed used as DataLoader.worker_init_fn."""
    seed = torch.initial_seed() % 2**32
    np.random.seed(seed)
    random.seed(seed)


# generator = torch.Generator()
# generator.manual_seed(42)
# loader = DataLoader(
#     dataset, batch_size=32, shuffle=True,
#     worker_init_fn=seed_worker,
#     generator=generator,
# )
Without these two arguments, every fold’s data shuffle is
non-deterministic regardless of set_global_seeds(42).
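A small way to see the effect of the generator argument is to build the loader twice from the same seed and compare the sample order. The sketch below assumes torch is installed; `epoch_order` and the toy integer dataset are hypothetical and exist only for the demonstration. It uses num_workers=0 to stay lightweight, so worker_init_fn is passed but not actually invoked here.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader


def seed_worker(worker_id: int) -> None:
    """Reseed numpy and random inside each DataLoader worker."""
    seed = torch.initial_seed() % 2**32
    np.random.seed(seed)
    random.seed(seed)


def epoch_order(seed: int) -> list[int]:
    """Return the shuffled sample order of one epoch for a given seed."""
    generator = torch.Generator()
    generator.manual_seed(seed)
    loader = DataLoader(
        list(range(20)),   # toy dataset: the integers 0..19
        batch_size=4,
        shuffle=True,
        num_workers=0,     # worker_init_fn only runs when num_workers > 0
        worker_init_fn=seed_worker,
        generator=generator,
    )
    return [int(x) for batch in loader for x in batch]


order_a = epoch_order(42)
order_b = epoch_order(42)
```

Two loaders built from the same explicit seed visit the samples in the same order, independent of whatever else has consumed the global torch RNG in between.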
fp16 / bf16: trade-offs
Mixed-precision speeds up inference 1.5–3× on modern GPUs but introduces small numerical noise: bootstrap CIs absorb it, but bit-identity checks do not. Two implications:

Calibrate at inference precision. If you’ll deploy in bf16, run fit_temperature on bf16 logits, not fp32 logits cast to bf16.

Don’t compare ECE across precision levels. A 0.001–0.005 ECE delta is well within fp16/bf16 noise on moderate-size eval sets.
# Requires: torch. Marked skip for Sybil.
# bf16 inference:
# self.model = self.model.to(dtype=torch.bfloat16)
# logits.float() # cast back to fp32 BEFORE softmax for numerical stability
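The reason for casting up before softmax can be shown with a toy calculation. The sketch below uses numpy's float16 as a stand-in for torch half precision (an assumption made so it runs without a GPU), with logits chosen deliberately so that the tail probability falls below what fp16 can represent.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax that keeps the input dtype."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# A confident two-class prediction: P(class 1) is tiny but not zero.
logits_fp32 = np.array([[10.0, -10.0]], dtype=np.float32)
logits_fp16 = logits_fp32.astype(np.float16)

p_fp32 = softmax(logits_fp32)[0, 1]                     # softmax in fp32
p_fp16 = softmax(logits_fp16)[0, 1]                     # softmax in fp16
p_cast = softmax(logits_fp16.astype(np.float32))[0, 1]  # cast up, then softmax
```

Here the fp16 softmax returns exactly zero for the tail class, while casting the same fp16 logits to fp32 first recovers a small positive probability. bf16 shares fp32's exponent range, so its issue is coarse mantissa precision rather than underflow, but the same cast-before-softmax habit applies.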
Putting the LoRA scorer into the harness
Drop-in replacement for _UniformBaseline in the PI
walkthrough:
# Requires: torch, transformers, peft. Marked skip for Sybil.
# from eval_toolkit import evaluate_folded, SourceDisjointKFoldSplitter
# NormalizedFormLeakageCheck, CrossSplitLeakageCheck, and parent_slice must
# also be in scope; their imports and construction are not shown here.
#
# scorer = LoRATransformerScorer(
#     base_model_name="microsoft/deberta-v3-base",
#     lora_path="my/adapter",
#     device="cuda",
#     batch_size=32,
# )
#
# result = evaluate_folded(
#     {"deberta_lora": scorer},
#     SourceDisjointKFoldSplitter(source_col="source", k=3, seed=42),
#     parent_slice,
#     run_id="lora-v1",
#     leakage_checks=[NormalizedFormLeakageCheck(), CrossSplitLeakageCheck()],
#     on_leakage="record",
# )
Copy-paste minimal scorer (CPU, no LoRA, no PEFT)
If you just want to start scoring with a public injection-detector checkpoint:
# Requires: torch, transformers. Marked skip for Sybil.
# import os; os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
# import torch
# torch.use_deterministic_algorithms(True, warn_only=True)
#
# scorer = TransformerScorer(
#     "protectai/deberta-v3-base-prompt-injection-v2",
#     device="cpu",
#     batch_size=8,
# )
# # Now plug into evaluate(...) as in the PI walkthrough.
Pitfalls / Common mistakes
CUBLAS_WORKSPACE_CONFIG set after import. Has no effect once the CUDA context exists. Set in os.environ BEFORE import torch.

Calling predict_proba outside torch.no_grad(). Builds the autograd graph; OOMs fast on long inputs. The decorator above prevents this.

model.train() left on during eval. Dropout is still active; every predict_proba call returns slightly different scores. model.eval() is mandatory.

Returning a torch tensor from predict_proba. The harness expects np.ndarray. Always .cpu().numpy() at the end.

fp16 inference with torch.softmax on small logits. Underflows to 0 in the tail; you lose calibration in the [0, 0.01] regime. Cast to fp32 before softmax.