eval-toolkit

eval_toolkit.sweep

.. currentmodule:: eval_toolkit

.. autofunction:: sweep

© Copyright 2026, Brandon Behring.
