`eval_toolkit.metrics`

`DEFAULT_ASSUMED_PRIORS`	Tuple of default assumed priors (a built-in immutable sequence).
`SINGLE_CLASS_INCOMPATIBLE_METRICS`	Metrics that are mathematically undefined on single-class slices.
`ThresholdResult`	Outcome of operating-point selection at a given criterion.
`brier_decomposition`	Murphy (1973) decomposition of the Brier score.
`brier_score`
`expected_calibration_error`	Expected calibration error on equal-width probability bins.
`expected_calibration_error_debiased`	Bias-corrected L1 ECE via Monte Carlo simulation under the null hypothesis (in the spirit of Roelofs 2022).
`expected_calibration_error_equal_mass`	ECE on equal-mass (quantile) bins.
`expected_calibration_error_l2`	Equal-mass L2 ECE: root mean squared bin-level miscalibration.
`expected_calibration_error_l2_debiased`	Bias-corrected L2 ECE per Kumar (2019), §3.3.
`headline_metrics`	Bundle of PR-AUC, ROC-AUC, F1 at three operating points, and per-stratum recall (if provided).
`is_metric_defined_for_slice`	Return `True` iff a metric is well-defined for the given class distribution.
`metrics_at_threshold`	Precision, recall, F1, accuracy, and confusion-matrix counts (TN, FP, FN, TP) at a fixed threshold.
`pr_auc`
`precision_at_prior`	Project precision under a different positive-class prior.
`quantile_stratified_pr_auc`	PR-AUC on the central [q_low, q_high] range of any 1-D stratifier.
`quantile_stratified_report`	Full vs trimmed PR-AUC report with a gap-flag (SDD reporting convention).
`roc_auc`
`score_distribution_summary`	Threshold-free score-distribution summary.
`single_class_threshold_metrics`	Operating metrics for all-positive or all-negative slices.
`stratified_recall`	Recall (TPR) per categorical stratum.
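
To make the equal-width calibration entry above concrete, here is a minimal sketch of an L1 expected calibration error for binary labels. It is not the code behind `expected_calibration_error`: the function name `ece_equal_width`, its parameters, and the choice of 10 bins are assumptions made for illustration, and the library's actual signature and binning details may differ.

```python
from typing import Sequence


def ece_equal_width(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Hypothetical sketch: L1 ECE on equal-width bins over [0, 1].

    Assumes binary labels in {0, 1} and predicted positive-class
    probabilities in [0, 1].
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")

    # Per-bin accumulators: sample count, sum of probabilities, sum of labels.
    counts = [0] * n_bins
    prob_sums = [0.0] * n_bins
    label_sums = [0.0] * n_bins

    for label, prob in zip(y_true, y_prob):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); prob == 1.0 is
        # clamped into the last bin.
        b = min(int(prob * n_bins), n_bins - 1)
        counts[b] += 1
        prob_sums[b] += prob
        label_sums[b] += label

    n = len(y_true)
    ece = 0.0
    for count, prob_sum, label_sum in zip(counts, prob_sums, label_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        # Weighted gap between observed frequency and mean confidence.
        ece += (count / n) * abs(label_sum / count - prob_sum / count)
    return ece
```

Under these assumptions, `ece_equal_width([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9])` gives a value of about 0.1, since each occupied bin is off by 0.1 and the two bins carry equal weight. An equal-mass variant would replace the fixed bin edges with quantiles of `y_prob`.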

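Projecting precision under a different positive-class prior, as the `precision_at_prior` entry describes, can be sketched with the standard label-shift identity: if the true-positive and false-positive rates are assumed unchanged, precision at prior π is TPR·π / (TPR·π + FPR·(1 − π)). The helper below is hypothetical; its name, arguments, and NaN convention are assumptions for illustration and are not taken from the library.

```python
import math


def precision_under_prior(tpr: float, fpr: float, prior: float) -> float:
    """Hypothetical sketch: precision re-projected to a target prior.

    Assumes TPR and FPR were measured at a fixed threshold and stay
    the same when only the class balance changes.
    """
    for name, value in (("tpr", tpr), ("fpr", fpr), ("prior", prior)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    tp_mass = tpr * prior          # expected true-positive share
    fp_mass = fpr * (1.0 - prior)  # expected false-positive share
    predicted_positive = tp_mass + fp_mass
    if predicted_positive == 0.0:
        # Nothing is predicted positive, so precision is undefined.
        return math.nan
    return tp_mass / predicted_positive
```

With `tpr=0.8` and `fpr=0.1`, this sketch yields a precision of roughly 0.89 at a balanced prior of 0.5 but only about 0.07 at a prior of 0.01, which is why precision measured on a balanced evaluation set can overstate performance at deployment prevalence.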