slideflow.biscuit¶

This module contains an official implementation of BISCUIT, an uncertainty quantification and confidence thresholding algorithm for whole-slide images. The original implementation, which includes instructions for reproducing experimental results reported in the manuscript, is available on GitHub.

This module requires the slideflow-noncommercial package, which can be installed with:

pip install slideflow-noncommercial

See Uncertainty Quantification for more information.

find_cv(project, label, outcome, epoch=None, k=3)[source]¶

Finds paths to cross-validation models.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – Number of k-fold iterations. Defaults to 3.

Returns:

Paths to cross-validation models.

Return type:

list(str)

get_model_results(path, epoch, outcome)[source]¶

Reads results/metrics from a trained model.

Parameters:

path (str) – Path to model.
outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh

Return type:

dict

biscuit.Experiment¶

class Experiment(train_project, eval_projects=None, outcome='cohort', outcome1='LUAD', outcome2='LUSC', outdir='results')[source]¶

Supervises uncertainty thresholding experiments.

display(self, df, eval_dfs, hue='uq', palette='tab10', relplot_uq_compare=True, boxplot_uq_compare=True, ttest_uq_groups=['all', 'include'], prefix='')¶

Creates plots from assembled results and exports results to CSV.

Parameters:

df (pandas.DataFrame) – Cross-validation results metrics, as generated by results()
eval_dfs (dict(pandas.DataFrame)) – Dict of external eval dataset names (keys) mapped to pandas DataFrame of result metrics (values).
hue (str, optional) – Comparison to show with different hue on plots. Defaults to ‘uq’.
palette (str, optional) – Seaborn color palette. Defaults to ‘tab10’.
relplot_uq_compare (bool, optional) – For the Relplot display, ensure non-UQ and UQ results are generated from the same models/preds.
boxplot_uq_compare (bool, optional) – For the boxplot display, ensure non-UQ and UQ results are generated from the same models/preds.
ttest_uq_groups (list(str)) – UQ groups to compare via t-test. Defaults to [‘all’, ‘include’].
prefix (str, optional) – Prefix to use when saving figures. Defaults to empty string.

Returns:

None

plot_uq_calibration(self, label, tile_uq, slide_uq, slide_pred, epoch=1)¶

Plots a graph of predictions vs. uncertainty.

Parameters:

label (str) – Experiment label.
tile_uq (float) – Tile-level uncertainty threshold.
slide_uq (float) – Slide-level uncertainty threshold.
slide_pred (float) – Slide-level prediction threshold.
epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Returns:

None

results(self, exp_to_run, uq=True, eval=True, plot=False)¶

Assembles results from experiments, applies UQ thresholding, and returns pandas dataframes with metrics.

Parameters:

exp_to_run (list) – List of experiment IDs to search for results.
uq (bool, optional) – Apply UQ thresholds. Defaults to True.
eval (bool, optional) – Calculate results of external evaluation models. Defaults to True.
plot (bool, optional) – Show plots. Defaults to False.

Returns:

Cross-validation results (pandas.DataFrame) and external evaluation results (pandas.DataFrame).

Return type:

pandas.DataFrame

thresholds_from_nested_cv(self, label, outer_k=3, inner_k=5, id=None, threshold_params=None, epoch=1, tile_filename='tile_predictions_val_epoch1.csv', y_true=None, y_pred=None, uncertainty=None)¶

Detects tile- and slide-level UQ thresholds and slide-level prediction thresholds from nested cross-validation.

train(self, hp, label, filters=None, save_predictions='csv', validate_on_batch=32, validation_steps=32, **kwargs)¶

Train outer cross-validation models.

Parameters:

hp (slideflow.ModelParams) – Hyperparameters object.
label (str) – Experimental label.
filters (dict, optional) – Dataset filters to use for selecting slides. See slideflow.Dataset.filter() for more information. Defaults to None.
save_predictions (bool or str, optional) – Save validation predictions to model folder. Defaults to ‘csv’.

Keyword Arguments:

validate_on_batch (int) – Frequency of validation checks during training, in steps. Defaults to 32.
validation_steps (int) – Number of validation steps to perform during each mid-training evaluation check. Defaults to 32.
**kwargs – All remaining keyword arguments are passed to slideflow.Project.train().

Returns:

None

train_nested_cv(self, hp, label, outer_k=3, inner_k=5, **kwargs)¶

Train models using nested cross-validation (outer_k=3, inner_k=5), skipping already-generated models.

Parameters:

hp (slideflow.ModelParams) – Hyperparameters object.
label (str) – Experimental label.

Keyword Arguments:

outer_k (int) – Number of outer cross-folds. Defaults to 3.
inner_k (int) – Number of inner cross-folds. Defaults to 5.
**kwargs – All remaining keyword arguments are passed to slideflow.Project.train().

Returns:

None
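To make the fold structure concrete, the following is a minimal pure-Python sketch of how nested cross-validation partitions a set of slides with outer_k=3 and inner_k=5. The helpers `k_folds` and `nested_cv_plan` are hypothetical names used only for illustration; they are not part of slideflow, and the round-robin split here ignores the patient-level and label-balancing considerations that a real project would need.

```python
def k_folds(items, k):
    """Split items into k near-equal folds using a round-robin assignment."""
    return [items[i::k] for i in range(k)]


def nested_cv_plan(items, outer_k=3, inner_k=5):
    """Build a nested cross-validation plan (illustrative only).

    Each outer fold is held out once. The remaining outer-training items are
    split again into inner folds, so no inner model ever sees the outer
    validation items.
    """
    plan = []
    outer = k_folds(items, outer_k)
    for i, outer_val in enumerate(outer):
        outer_train = [x for j, fold in enumerate(outer) if j != i for x in fold]
        inner = k_folds(outer_train, inner_k)
        inner_splits = []
        for m, inner_val in enumerate(inner):
            inner_train = [x for n, fold in enumerate(inner) if n != m for x in fold]
            inner_splits.append((inner_train, inner_val))
        plan.append({
            "outer_train": outer_train,
            "outer_val": outer_val,
            "inner": inner_splits,
        })
    return plan


# Hypothetical slide identifiers.
slides = [f"slide_{i:02d}" for i in range(30)]
plan = nested_cv_plan(slides, outer_k=3, inner_k=5)
n_inner_models = sum(len(p["inner"]) for p in plan)
print(len(plan), "outer folds,", n_inner_models, "inner models")
```

With the default settings this yields 3 outer models and 15 inner models. Under this layout, thresholds found on inner folds can be applied to the matching outer validation fold without that fold having influenced them.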

biscuit.hp¶

nature2022()[source]¶

Hyperparameters used in the associated manuscript.

Dolezal, J.M., Srisuwananukorn, A., Karpeyev, D. et al. Uncertainty-informed deep learning models enable high-confidence predictions for digital histopathology. Nat Commun 13, 6572 (2022). https://doi.org/10.1038/s41467-022-34025-x

Returns:

sf.ModelParams

biscuit.threshold¶

apply(df, tile_uq, slide_uq, tile_pred=0.5, slide_pred=0.5, plot=False, keep='high_confidence', title=None, patients=None, level='slide')[source]¶

Apply pre-calculated tile- and group-level uncertainty thresholds.

Parameters:

df (pandas.DataFrame) – Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.
tile_uq (float) – Tile-level uncertainty threshold.
slide_uq (float) – Slide-level uncertainty threshold.
tile_pred (float, optional) – Tile-level prediction threshold. Defaults to 0.5.
slide_pred (float, optional) – Slide-level prediction threshold. Defaults to 0.5.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
keep (str, optional) – Either ‘high_confidence’ or ‘low_confidence’. Cohort to keep after thresholding. Defaults to ‘high_confidence’.
title (str, optional) – Title for uncertainty plot. Defaults to None.
patients (dict, optional) – Dictionary mapping slides to patients. Adds a ‘patient’ column in the tile prediction dataframe, enabling patient-level thresholding. Defaults to None.
level (str, optional) – Either ‘slide’ or ‘patient’. Level at which to apply threshold. If ‘patient’, requires patient dict be supplied. Defaults to ‘slide’.

Returns:

Dictionary of results, with keys auc, percent_incl, accuracy, sensitivity, and specificity

DataFrame of thresholded group-level predictions
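The general idea of two-stage uncertainty thresholding can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the module's implementation: the helper `threshold_sketch` and its result keys are hypothetical, tiles are kept when their uncertainty is strictly below the tile threshold, slide-level prediction and uncertainty are taken as the mean over the retained tiles, and slides are kept when their uncertainty is strictly below the slide threshold.

```python
from collections import defaultdict
from statistics import mean


def threshold_sketch(tiles, tile_uq, slide_uq, slide_pred=0.5):
    """Two-stage uncertainty thresholding (illustrative assumptions only).

    tiles: list of dicts with keys 'slide', 'y_true', 'y_pred', 'uncertainty'.
    Returns (results, kept), where kept maps slide -> slide-level record.
    """
    by_slide = defaultdict(list)
    for tile in tiles:
        by_slide[tile["slide"]].append(tile)

    kept = {}
    for slide, rows in by_slide.items():
        # Stage 1: discard tiles whose uncertainty reaches the tile threshold.
        confident = [r for r in rows if r["uncertainty"] < tile_uq]
        if not confident:
            continue
        # Stage 2: aggregate retained tiles, then threshold the slide.
        slide_unc = mean(r["uncertainty"] for r in confident)
        if slide_unc >= slide_uq:
            continue
        score = mean(r["y_pred"] for r in confident)
        label = confident[0]["y_true"]
        kept[slide] = {
            "y_true": label,
            "y_pred": score,
            "uncertainty": slide_unc,
            "correct": int(score >= slide_pred) == label,
        }

    n_correct = sum(1 for rec in kept.values() if rec["correct"])
    results = {
        "fraction_included": len(kept) / len(by_slide),
        "accuracy": n_correct / len(kept) if kept else float("nan"),
    }
    return results, kept


# Hypothetical tile-level predictions for three slides.
tiles = [
    {"slide": "A", "y_true": 1, "y_pred": 0.90, "uncertainty": 0.05},
    {"slide": "A", "y_true": 1, "y_pred": 0.80, "uncertainty": 0.10},
    {"slide": "A", "y_true": 1, "y_pred": 0.30, "uncertainty": 0.40},
    {"slide": "B", "y_true": 0, "y_pred": 0.20, "uncertainty": 0.08},
    {"slide": "B", "y_true": 0, "y_pred": 0.60, "uncertainty": 0.30},
    {"slide": "C", "y_true": 1, "y_pred": 0.55, "uncertainty": 0.18},
    {"slide": "C", "y_true": 1, "y_pred": 0.45, "uncertainty": 0.19},
]
results, kept = threshold_sketch(tiles, tile_uq=0.2, slide_uq=0.15)
print(results, sorted(kept))
```

In this toy data, slide C is excluded because its mean uncertainty over retained tiles stays above the slide threshold, which corresponds to the high-confidence cohort selected by keep=‘high_confidence’.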

detect(df, tile_uq='detect', slide_uq='detect', tile_pred='detect', slide_pred='detect', plot=False, patients=None)[source]¶

Detect optimal tile- and slide-level uncertainty thresholds.

Parameters:

df (pandas.DataFrame) – Tile-level predictions. Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.
tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.
slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.
tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.
slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.

Returns:

Dictionary with tile- and slide-level UQ and prediction thresholds, with keys ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’

Float: Slide-level AUROC

from_cv(dfs, **kwargs)[source]¶

Finds the optimal tile and slide-level thresholds from a set of nested cross-validation experiments.

Parameters:

dfs (list(DataFrame)) – List of DataFrames with tile predictions, containing headers ‘y_true’, ‘y_pred’, ‘uncertainty’, ‘slide’, and ‘patient’.

Keyword Arguments:

tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.
slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.
tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.
slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.

Returns:

Dictionary with tile- and slide-level UQ and prediction thresholds, with keys ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’

plot_uncertainty(df, kind, threshold=None, title=None)[source]¶

Plots figure of tile or slide-level predictions vs. uncertainty.

Parameters:

df (pandas.DataFrame) – Processed dataframe containing columns ‘uncertainty’, ‘correct’, ‘y_pred’.
kind (str) – Kind of plot. If ‘tile’, subsample to only 1000 points. Included in title.
threshold (float, optional) – Uncertainty threshold. Defaults to None.
title (str, optional) – Title for plots. Defaults to None.

Returns:

None

process_group_predictions(df, pred_thresh, level)[source]¶

From a given dataframe of tile-level predictions, calculate group-level predictions and uncertainty.

process_tile_predictions(df, pred_thresh=0.5, patients=None)[source]¶

Load and process tile-level predictions from CSV.

Parameters:

df (pandas.DataFrame) – Unprocessed DataFrame from reading tile-level predictions.
pred_thresh (float or str, optional) – Tile-level prediction threshold. If ‘detect’, will auto-detect via Youden’s J. Defaults to 0.5.
patients (dict, optional) – Dict mapping slides to patients, used for patient-level thresholding. Defaults to None.

Returns:

pandas.DataFrame, tile prediction threshold

biscuit.utils¶

auc(y_true, y_pred)[source]¶

Calculate the area under the receiver operating characteristic curve (AUC / AUROC).

Parameters:

y_true (np.ndarray) – True labels.
y_pred (np.ndarray) – Predictions.

Returns:

AUC

Return type:

float
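For intuition, AUROC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes that quantity directly in pure Python. The helper name `pairwise_auc` is hypothetical, and this quadratic-time version is meant only to illustrate the definition, not to mirror how this module computes AUC.

```python
def pairwise_auc(y_true, y_pred):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute AUROC.")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
y_pred = [0.1, 0.4, 0.35, 0.8]
example_auc = pairwise_auc(y_true, y_pred)
print(example_auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```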

auc_and_threshold(y_true, y_pred)[source]¶

Calculates AUC and optimal threshold (via Youden’s J)

Parameters:

y_true (np.ndarray) – Y true (labels).
y_pred (np.ndarray) – Y pred (predictions).

Returns:

AUC (float) and optimal threshold (float)

Return type:

float
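Youden's J statistic is defined as sensitivity + specificity - 1, and the optimal threshold is the cutoff that maximizes it. A minimal pure-Python sketch of that selection is shown here. The helper `youden_threshold` is a hypothetical name, and treating a score as positive when it is greater than or equal to the cutoff is an assumption of this sketch rather than a documented behavior of this module.

```python
def youden_threshold(y_true, y_pred):
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    A sample is called positive when its score is >= the candidate threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both classes must be present.")
    best_thresh, best_j = None, float("-inf")
    for thresh in sorted(set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p >= thresh)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p >= thresh)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_thresh, best_j = thresh, j
    return best_thresh, best_j


y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.7, 0.9]
threshold, j = youden_threshold(y_true, y_pred)
print(threshold, j)  # 0.4 0.75
```

At a cutoff of 0.4 all four positives are detected and only one negative (score 0.6) is misclassified, giving J = 1.0 - 0.25 = 0.75.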

df_from_cv(project, label, outcome, epoch=None, k=3, y_true=None, y_pred=None, uncertainty=None)[source]¶

Loads tile predictions from cross-fold models & renames columns.

Parameters:

project (sf.Project) – Slideflow project.
label (str) – Experimental label.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – K-fold iteration. Defaults to 3.
outcome (str, optional) – Outcome name.
y_true (str, optional) – Column name for ground truth labels. Defaults to {outcome}_y_true0.
y_pred (str, optional) – Column name for predictions. Defaults to {outcome}_y_pred1.
uncertainty (str, optional) – Column name for uncertainty. Defaults to {outcome}_y_uncertainty1.

Returns:

DataFrame for each k-fold.

Return type:

list(DataFrame)

eval_exists(project, label, outcome, epoch=1)[source]¶

Check if matching eval exists.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Returns:

If eval exists

Return type:

bool

find_cv(project, label, outcome, epoch=None, k=3)[source]¶

Finds paths to cross-validation models.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – Number of k-fold iterations. Defaults to 3.

Returns:

Paths to cross-validation models.

Return type:

list(str)

find_cv_early_stop(project, label, outcome, k=3)[source]¶

Detects early stop batch from cross-val trained models.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
k (int, optional) – Number of k-fold iterations. Defaults to 3.
outcome (str) – Outcome name.

Returns:

Early stop batch.

Return type:

int

find_eval(project, label, outcome, epoch=1)[source]¶

Finds matching eval directory.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Raises:

MultipleModelsFoundError – If multiple matches are found.
ModelNotFoundError – If no match is found.

Returns:

path to eval directory

Return type:

str

find_model(project, label, outcome, epoch=None, kfold=None)[source]¶

Searches for a model in a project model directory.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str) – Outcome name.
epoch (int, optional) – Epoch to search for. If not None, returns path to the saved model. If None, returns path to parent model folder. Defaults to None.
kfold (int, optional) – K-fold iteration. Defaults to None.

Raises:

MultipleModelsFoundError – If multiple potential matches are found.
ModelNotFoundError – If no matching model is found.

Returns:

Path to matching model.

Return type:

str

get_model_results(path, epoch, outcome)[source]¶

Reads results/metrics from a trained model.

Parameters:

path (str) – Path to model.
outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh

Return type:

dict

get_eval_results(path, outcome)[source]¶

Reads results/metrics from a trained model.

Parameters:

path (str) – Path to model.
outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh

Return type:

dict

model_exists(project, label, outcome, epoch=None, kfold=None)[source]¶

Check if matching model exists.

Parameters:

project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
kfold (int, optional) – K-fold iteration. Defaults to None.

Returns:

If model exists

Return type:

bool

prediction_metrics(y_true, y_pred, threshold)[source]¶

Calculate prediction metrics (AUC, sensitivity/specificity, etc)

Parameters:

y_true (np.ndarray) – True labels.
y_pred (np.ndarray) – Predictions.
threshold (float) – Prediction threshold.

Returns:

Prediction metrics.

Return type:

dict
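As an illustration of how threshold-dependent metrics are derived from a confusion matrix, the sketch below computes accuracy, sensitivity, and specificity for binary labels in pure Python. The helper name `binary_metrics`, its returned keys, and the greater-than-or-equal comparison against the threshold are assumptions made for this example; the dictionary returned by the module's own function may differ.

```python
def binary_metrics(y_true, y_pred, threshold):
    """Accuracy, sensitivity, and specificity at a given prediction threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_pred):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


metrics = binary_metrics(
    y_true=[0, 0, 0, 1, 1],
    y_pred=[0.2, 0.6, 0.1, 0.4, 0.9],
    threshold=0.5,
)
print(metrics)
```

Here two of three negatives and one of two positives are classified correctly, so accuracy is 3/5, sensitivity is 1/2, and specificity is 2/3.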

read_group_predictions(path)[source]¶

Reads patient- or slide-level predictions CSV or parquet file, returning y_true and y_pred.

Expects a binary categorical outcome.

Compatible with Slideflow 1.1 and 1.2.

truncate_colormap(cmap, minval=0.0, maxval=1.0, n=100)[source]¶

Truncates matplotlib colormap.

biscuit.delong¶

fastDeLong(predictions_sorted_transposed, label_1_count)[source]¶

The fast version of DeLong’s method for computing the covariance of unadjusted AUC.

Parameters:

predictions_sorted_transposed – a 2D numpy.array[n_classifiers, n_examples], sorted such that the examples with label “1” are first

Returns:

(AUC value, DeLong covariance)

delong_roc_variance(ground_truth, predictions)[source]¶

Computes ROC AUC variance for a single set of predictions

Parameters:

ground_truth – np.array of 0 and 1
predictions – np.array of floats of the probability of being class 1

delong_roc_test(ground_truth, predictions_one, predictions_two)[source]¶

Computes log(p-value) for hypothesis that two ROC AUCs are different

Parameters:

ground_truth – np.array of 0 and 1
predictions_one – predictions of the first model, np.array of floats of the probability of being class 1
predictions_two – predictions of the second model, np.array of floats of the probability of being class 1