eval_toolkit.metrics
|
Built-in immutable sequence. |
|
Outcome of operating-point selection at a given criterion. |
|
Murphy 1973 [#murphy]_ decomposition of the Brier score. |
|
|
|
Expected calibration error on equal-width probability bins. |
|
Bias-corrected L1 ECE via Monte-Carlo simulation under the null hypothesis of perfect calibration (in the spirit of Roelofs 2022).
|
ECE on equal-mass (quantile) bins. |
|
Equal-mass L2 ECE: root mean squared bin-level miscalibration.
|
Bias-corrected L2 ECE per Kumar 2019 [#kumar]_ §3.3. |
|
Bundle PR-AUC, ROC-AUC, three operating-point F1 scores, and per-stratum recall (if strata are provided).
|
Precision, recall, F1, accuracy, and confusion-matrix counts (TN, FP, FN, TP) at a fixed threshold.
|
|
|
Project precision under a different positive-class prior. |
|
PR-AUC on the central [q_low, q_high] range of any 1-D stratifier. |
|
Full vs trimmed PR-AUC report with a gap-flag (SDD reporting convention). |
|
|
|
Threshold-free score-distribution summary. |
|
Operating metrics for all-positive or all-negative slices. |
|
Recall (TPR) per categorical stratum. |
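
Two of the summaries above are compact enough to illustrate directly: ECE on equal-width bins, and precision projected to a different positive-class prior. The following is a minimal sketch of the textbook definitions, not the eval_toolkit implementation. The function names `ece_equal_width` and `precision_at_prior`, their parameters, and the default of 10 bins are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def ece_equal_width(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """L1 ECE on equal-width bins over [0, 1] (hypothetical helper).

    ECE = sum_b (n_b / n) * |mean(y)_b - mean(p)_b|, which simplifies to
    sum_b |sum(y)_b - sum(p)_b| / n. Empty bins contribute nothing.
    """
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")
    sum_y = [0.0] * n_bins
    sum_p = [0.0] * n_bins
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is assigned to the last bin.
        b = min(int(p * n_bins), n_bins - 1)
        sum_y[b] += y
        sum_p[b] += p
    return sum(abs(sy - sp) for sy, sp in zip(sum_y, sum_p)) / n


def precision_at_prior(tpr: float, fpr: float, prior: float) -> float:
    """Precision implied by a fixed (TPR, FPR) at a new positive-class prior.

    Assumes the class-conditional score distributions, and therefore TPR
    and FPR at the chosen threshold, do not change when the prior shifts.
    """
    denom = tpr * prior + fpr * (1.0 - prior)
    return tpr * prior / denom if denom > 0 else float("nan")


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.3, 0.7, 0.9]
    print("ECE (2 bins):", ece_equal_width(labels, scores, n_bins=2))
    # The same operating point evaluated at a balanced and at a rare prior.
    print("precision at prior 0.50:", precision_at_prior(0.8, 0.1, 0.50))
    print("precision at prior 0.01:", precision_at_prior(0.8, 0.1, 0.01))
```

The prior projection shows why precision measured on a balanced evaluation set can overstate precision in deployment: with TPR and FPR held fixed, lowering the prior shrinks the true-positive term relative to the false-positive term.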