slideflow.biscuit

This module contains an official implementation of BISCUIT, an uncertainty quantification and confidence thresholding algorithm for whole-slide images. The original implementation, which includes instructions for reproducing experimental results reported in the manuscript, is available on GitHub.

See Uncertainty Quantification for more information.

find_cv(project, label, outcome, epoch=None, k=3)[source]

Finds paths to cross-validation models.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • outcome (str, optional) – Outcome name.

  • epoch (int, optional) – Epoch number of saved model. Defaults to None.

  • k (int, optional) – Number of k-fold iterations. Defaults to 3.

Returns:

Paths to cross-validation models.

Return type:

list(str)

get_model_results(path, epoch, outcome)[source]

Reads results/metrics from a trained model.

Parameters:
  • path (str) – Path to model.

  • epoch (int) – Epoch number of saved model.

  • outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh.

Return type:

dict

biscuit.Experiment

class Experiment(train_project, eval_projects=None, outcome='cohort', outcome1='LUAD', outcome2='LUSC', outdir='results')[source]

Supervises uncertainty thresholding experiments.

display(self, df, eval_dfs, hue='uq', palette='tab10', relplot_uq_compare=True, boxplot_uq_compare=True, ttest_uq_groups=['all', 'include'], prefix='')

Creates plots from assembled results and exports results to CSV.

Parameters:
  • df (pandas.DataFrame) – Cross-validation results metrics, as generated by results()

  • eval_dfs (dict(pandas.DataFrame)) – Dict of external eval dataset names (keys) mapped to pandas DataFrame of result metrics (values).

  • hue (str, optional) – Comparison to show with different hue on plots. Defaults to ‘uq’.

  • palette (str, optional) – Seaborn color palette. Defaults to ‘tab10’.

  • relplot_uq_compare (bool, optional) – For the Relplot display, ensure non-UQ and UQ results are generated from the same models/preds.

  • boxplot_uq_compare (bool, optional) – For the boxplot display, ensure non-UQ and UQ results are generated from the same models/preds.

  • ttest_uq_groups (list(str)) – UQ groups to compare via t-test. Defaults to [‘all’, ‘include’].

  • prefix (str, optional) – Prefix to use when saving figures. Defaults to empty string.

Returns:

None

plot_uq_calibration(self, label, tile_uq, slide_uq, slide_pred, epoch=1)

Plots a graph of predictions vs. uncertainty.

Parameters:
  • label (str) – Experiment label.

  • tile_uq (float) – Tile-level uncertainty threshold.

  • slide_uq (float) – Slide-level uncertainty threshold.

  • slide_pred (float) – Slide-level prediction threshold.

  • epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Returns:

None

results(self, exp_to_run, uq=True, eval=True, plot=False)

Assembles results from experiments, applies UQ thresholding, and returns pandas dataframes with metrics.

Parameters:
  • exp_to_run (list) – List of experiment IDs to search for results.

  • uq (bool, optional) – Apply UQ thresholds. Defaults to True.

  • eval (bool, optional) – Calculate results of external evaluation models. Defaults to True.

  • plot (bool, optional) – Show plots. Defaults to False.

Returns:

Cross-val results, followed by external eval results (pandas.DataFrame).

Return type:

pandas.DataFrame

thresholds_from_nested_cv(self, label, outer_k=3, inner_k=5, id=None, threshold_params=None, epoch=1, tile_filename='tile_predictions_val_epoch1.csv', y_true=None, y_pred=None, uncertainty=None)

Detects tile- and slide-level UQ thresholds and slide-level prediction thresholds from nested cross-validation.

train(self, hp, label, filters=None, save_predictions='csv', validate_on_batch=32, validation_steps=32, **kwargs)

Train outer cross-validation models.

Parameters:
  • hp (slideflow.ModelParams) – Hyperparameters object.

  • label (str) – Experimental label.

  • filters (dict, optional) – Dataset filters to use for selecting slides. See slideflow.Dataset.filter() for more information. Defaults to None.

  • save_predictions (bool or str, optional) – Save validation predictions to model folder. Defaults to ‘csv’.

Keyword Arguments:
  • validate_on_batch (int) – Frequency of validation checks during training, in steps. Defaults to 32.

  • validation_steps (int) – Number of validation steps to perform during each mid-training evaluation check. Defaults to 32.

  • **kwargs – All remaining keyword arguments are passed to slideflow.Project.train().

Returns:

None

train_nested_cv(self, hp, label, outer_k=3, inner_k=5, **kwargs)

Train models using nested cross-validation (outer_k=3, inner_k=5), skipping already-generated models.

Parameters:
  • hp (slideflow.ModelParams) – Hyperparameters object.

  • label (str) – Experimental label.

Keyword Arguments:
  • outer_k (int) – Number of outer cross-folds. Defaults to 3.

  • inner_k (int) – Number of inner cross-folds. Defaults to 5.

  • **kwargs – All remaining keyword arguments are passed to slideflow.Project.train().

Returns:

None

biscuit.hp

nature2022()[source]

Hyperparameters used in the associated manuscript.

Dolezal, J.M., Srisuwananukorn, A., Karpeyev, D. et al. Uncertainty-informed deep learning models enable high-confidence predictions for digital histopathology. Nat Commun 13, 6572 (2022). https://doi.org/10.1038/s41467-022-34025-x

Returns:

sf.ModelParams

biscuit.threshold

apply(df, tile_uq, slide_uq, tile_pred=0.5, slide_pred=0.5, plot=False, keep='high_confidence', title=None, patients=None, level='slide')[source]

Apply pre-calculated tile- and group-level uncertainty thresholds.

Parameters:
  • df (pandas.DataFrame) – Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.

  • tile_uq (float) – Tile-level uncertainty threshold.

  • slide_uq (float) – Slide-level uncertainty threshold.

  • tile_pred (float, optional) – Tile-level prediction threshold. Defaults to 0.5.

  • slide_pred (float, optional) – Slide-level prediction threshold. Defaults to 0.5.

  • plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.

  • keep (str, optional) – Either ‘high_confidence’ or ‘low_confidence’. Cohort to keep after thresholding. Defaults to ‘high_confidence’.

  • title (str, optional) – Title for uncertainty plot. Defaults to None.

  • patients (dict, optional) – Dictionary mapping slides to patients. Adds a ‘patient’ column in the tile prediction dataframe, enabling patient-level thresholding. Defaults to None.

  • level (str, optional) – Either ‘slide’ or ‘patient’. Level at which to apply threshold. If ‘patient’, requires patient dict be supplied. Defaults to ‘slide’.

Returns:

Dictionary of results, with keys auc, percent_incl, accuracy, sensitivity, and specificity.

DataFrame of thresholded group-level predictions.
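
The two-stage thresholding described for apply() can be illustrated with a minimal, dependency-free sketch. This is not the library's implementation: the function name apply_thresholds, the list-of-dicts input (in place of a DataFrame), the strict "<" comparisons, and aggregation by the mean are all assumptions made for illustration, and only two of the documented result keys are computed.

```python
from collections import defaultdict
from statistics import mean


def apply_thresholds(tiles, tile_uq, slide_uq, slide_pred=0.5):
    """Filter tiles, then slides, by uncertainty (illustrative sketch only).

    `tiles` is a list of dicts with keys 'slide', 'y_true', 'y_pred', and
    'uncertainty'. Mean aggregation is an assumption of this sketch.
    """
    # Step 1: keep only tiles whose uncertainty is below the tile threshold.
    by_slide = defaultdict(list)
    for t in tiles:
        if t["uncertainty"] < tile_uq:
            by_slide[t["slide"]].append(t)

    # Step 2: aggregate the surviving tiles into slide-level values.
    slides = {
        slide: {
            "y_true": kept[0]["y_true"],
            "y_pred": mean(t["y_pred"] for t in kept),
            "uncertainty": mean(t["uncertainty"] for t in kept),
        }
        for slide, kept in by_slide.items()
    }

    # Step 3: keep only slides below the slide-level uncertainty threshold.
    confident = {s: v for s, v in slides.items() if v["uncertainty"] < slide_uq}

    # Step 4: binarize slide predictions and score the retained slides.
    n_correct = sum(
        int(v["y_pred"] > slide_pred) == v["y_true"] for v in confident.values()
    )
    n_slides = len({t["slide"] for t in tiles})
    results = {
        "percent_incl": len(confident) / n_slides if n_slides else 0.0,
        "accuracy": n_correct / len(confident) if confident else float("nan"),
    }
    return results, confident
```

Slides that lose all of their tiles in step 1, or that exceed slide_uq in step 3, count against percent_incl, which is why the metric is computed against the full set of input slides.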

detect(df, tile_uq='detect', slide_uq='detect', tile_pred='detect', slide_pred='detect', plot=False, patients=None)[source]

Detect optimal tile- and slide-level uncertainty thresholds.

Parameters:
  • df (pandas.DataFrame) – Tile-level predictions. Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.

  • tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.

  • slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.

  • tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.

  • slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’ will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.

  • plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.

  • patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.

Returns:

Dictionary with tile- and slide-level UQ and prediction thresholds, with keys: ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’.

Float: Slide-level AUROC

from_cv(dfs, **kwargs)[source]

Finds the optimal tile and slide-level thresholds from a set of nested cross-validation experiments.

Parameters:

dfs (list(DataFrame)) – List of DataFrames with tile predictions, containing headers ‘y_true’, ‘y_pred’, ‘uncertainty’, ‘slide’, and ‘patient’.

Keyword Arguments:
  • tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.

  • slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.

  • tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.

  • slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’ will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.

  • plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.

  • patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.

Returns:

Dictionary with tile- and slide-level UQ and prediction thresholds, with keys: ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’.

plot_uncertainty(df, kind, threshold=None, title=None)[source]

Plots figure of tile or slide-level predictions vs. uncertainty.

Parameters:
  • df (pandas.DataFrame) – Processed dataframe containing columns ‘uncertainty’, ‘correct’, ‘y_pred’.

  • kind (str) – Kind of plot. If ‘tile’, subsample to only 1000 points. Included in title.

  • threshold (float, optional) – Uncertainty threshold. Defaults to None.

  • title (str, optional) – Title for plots. Defaults to None.

Returns:

None

process_group_predictions(df, pred_thresh, level)[source]

From a given dataframe of tile-level predictions, calculate group-level predictions and uncertainty.

process_tile_predictions(df, pred_thresh=0.5, patients=None)[source]

Load and process tile-level predictions from CSV.

Parameters:
  • df (pandas.DataFrame) – Unprocessed DataFrame from reading tile-level predictions.

  • pred_thresh (float or str, optional) – Tile-level prediction threshold. If ‘detect’, will auto-detect via Youden’s J. Defaults to 0.5.

  • patients (dict, optional) – Dict mapping slides to patients, used for patient-level thresholding. Defaults to None.

Returns:

pandas.DataFrame, tile prediction threshold

biscuit.utils

auc(y_true, y_pred)[source]

Calculate the area under the receiver operating characteristic curve (AUC / AUROC).

Parameters:
  • y_true (np.ndarray) – True labels.

  • y_pred (np.ndarray) – Predictions.

Returns:

AUC

Return type:

Float
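
For intuition, AUROC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way in plain Python; the function name auc_pairwise is hypothetical, and the library function may compute the same quantity differently.

```python
def auc_pairwise(y_true, y_pred):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative.")
    # Count concordant pairs; a tie contributes half a pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise form costs O(m·n) comparisons for m positives and n negatives, so it is suited to small examples rather than tile-level data.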

auc_and_threshold(y_true, y_pred)[source]

Calculates AUC and optimal threshold (via Youden’s J)

Parameters:
  • y_true (np.ndarray) – Y true (labels).

  • y_pred (np.ndarray) – Y pred (predictions).

Returns:

AUC, followed by the optimal threshold (float).

Return type:

float
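
Youden's J statistic is J = sensitivity + specificity − 1, and the optimal threshold is the cutoff that maximizes it. A minimal sketch of that search follows; the name youden_threshold, the ">=" convention for calling a prediction positive, and the tie-breaking rule (lowest cutoff wins) are assumptions of this example rather than documented behavior.

```python
def youden_threshold(y_true, y_pred):
    """Return (cutoff, J) for the cutoff maximizing Youden's J."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both classes must be present.")

    best_j, best_thresh = -1.0, None
    # Each distinct prediction is a candidate cutoff ("positive" if >= cutoff).
    for thresh in sorted(set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p >= thresh)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p < thresh)
        j = tp / n_pos + tn / n_neg - 1  # sensitivity + specificity - 1
        if j > best_j:
            best_j, best_thresh = j, thresh
    return best_thresh, best_j
```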

df_from_cv(project, label, outcome, epoch=None, k=3, y_true=None, y_pred=None, uncertainty=None)[source]

Loads tile predictions from cross-fold models & renames columns.

Parameters:
  • project (sf.Project) – Slideflow project.

  • label (str) – Experimental label.

  • epoch (int, optional) – Epoch number of saved model. Defaults to None.

  • k (int, optional) – K-fold iteration. Defaults to 3.

  • outcome (str, optional) – Outcome name.

  • y_true (str, optional) – Column name for ground truth labels. Defaults to {outcome}_y_true0.

  • y_pred (str, optional) – Column name for predictions. Defaults to {outcome}_y_pred1.

  • uncertainty (str, optional) – Column name for uncertainty. Defaults to {outcome}_y_uncertainty1.

Returns:

DataFrame for each k-fold.

Return type:

list(DataFrame)
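
The default column names listed above follow a simple pattern based on the outcome name. The sketch below builds that mapping and applies it to a single row. The helper names default_columns and rename_row are hypothetical, and the target names 'y_true', 'y_pred', and 'uncertainty' are inferred from the columns that the thresholding functions require.

```python
def default_columns(outcome, y_true=None, y_pred=None, uncertainty=None):
    """Map source column names to the names expected for thresholding."""
    return {
        y_true or f"{outcome}_y_true0": "y_true",
        y_pred or f"{outcome}_y_pred1": "y_pred",
        uncertainty or f"{outcome}_y_uncertainty1": "uncertainty",
    }


def rename_row(row, mapping):
    """Rename the keys of one tile-prediction row; other keys pass through."""
    return {mapping.get(key, key): value for key, value in row.items()}
```

With a pandas DataFrame, the same mapping could be passed to DataFrame.rename(columns=...).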

eval_exists(project, label, outcome, epoch=1)[source]

Check if matching eval exists.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Returns:

If eval exists

Return type:

bool

find_cv(project, label, outcome, epoch=None, k=3)[source]

Finds paths to cross-validation models.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • outcome (str, optional) – Outcome name.

  • epoch (int, optional) – Epoch number of saved model. Defaults to None.

  • k (int, optional) – Number of k-fold iterations. Defaults to 3.

Returns:

Paths to cross-validation models.

Return type:

list(str)

find_cv_early_stop(project, label, outcome, k=3)[source]

Detects early stop batch from cross-val trained models.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • k (int, optional) – Number of k-fold iterations. Defaults to 3.

  • outcome (str) – Outcome name.

Returns:

Early stop batch.

Return type:

int

find_eval(project, label, outcome, epoch=1)[source]

Finds matching eval directory.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • outcome (str, optional) – Outcome name.

  • epoch (int, optional) – Epoch number of saved model. Defaults to 1.

Raises:
  • MultipleModelsFoundError – If multiple matches are found.

  • ModelNotFoundError – If no match is found.

Returns:

path to eval directory

Return type:

str

find_model(project, label, outcome, epoch=None, kfold=None)[source]

Searches for a model in a project model directory.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • outcome (str) – Outcome name.

  • epoch (int, optional) – Epoch to search for. If not None, returns path to the saved model. If None, returns path to parent model folder. Defaults to None.

  • kfold (int, optional) – K-fold iteration. Defaults to None.

Raises:
  • MultipleModelsFoundError – If multiple potential matches are found.

  • ModelNotFoundError – If no matching model is found.

Returns:

Path to matching model.

Return type:

str

get_model_results(path, epoch, outcome)[source]

Reads results/metrics from a trained model.

Parameters:
  • path (str) – Path to model.

  • epoch (int) – Epoch number of saved model.

  • outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh.

Return type:

dict

get_eval_results(path, outcome)[source]

Reads results/metrics from a trained model.

Parameters:
  • path (str) – Path to model.

  • outcome (str) – Outcome name.

Returns:

Dict of results with the keys pt_auc, pt_ap, slide_auc, slide_ap, tile_auc, tile_ap, opt_thresh.

Return type:

dict

model_exists(project, label, outcome, epoch=None, kfold=None)[source]

Check if matching model exists.

Parameters:
  • project (slideflow.Project) – Project.

  • label (str) – Experimental label.

  • outcome (str, optional) – Outcome name.

  • epoch (int, optional) – Epoch number of saved model. Defaults to None.

  • kfold (int, optional) – K-fold iteration. Defaults to None.

Returns:

If model exists

Return type:

bool

prediction_metrics(y_true, y_pred, threshold)[source]

Calculate prediction metrics (AUC, sensitivity/specificity, etc)

Parameters:
  • y_true (np.ndarray) – True labels.

  • y_pred (np.ndarray) – Predictions.

  • threshold (float) – Prediction threshold.

Returns:

Prediction metrics.

Return type:

dict
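
As an illustration of thresholded metrics, the sketch below derives accuracy, sensitivity, and specificity from a confusion matrix at a fixed cutoff. The function name binary_metrics, the strict ">" comparison, and the returned keys are assumptions of this example; the library's returned dict may contain different or additional entries.

```python
def binary_metrics(y_true, y_pred, threshold):
    """Accuracy, sensitivity, and specificity at a fixed prediction cutoff."""
    calls = [int(p > threshold) for p in y_pred]
    tp = sum(1 for t, c in zip(y_true, calls) if t == 1 and c == 1)
    tn = sum(1 for t, c in zip(y_true, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(y_true, calls) if t == 0 and c == 1)
    fn = sum(1 for t, c in zip(y_true, calls) if t == 1 and c == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against an absent class to avoid dividing by zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```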

read_group_predictions(path)[source]

Reads patient- or slide-level predictions CSV or parquet file, returning y_true and y_pred.

Expects a binary categorical outcome.

Compatible with Slideflow 1.1 and 1.2.

truncate_colormap(cmap, minval=0.0, maxval=1.0, n=100)[source]

Truncates matplotlib colormap.

biscuit.delong

fastDeLong(predictions_sorted_transposed, label_1_count)[source]

The fast version of DeLong’s method for computing the covariance of unadjusted AUC.

Parameters:

predictions_sorted_transposed – a 2D numpy.array[n_classifiers, n_examples] sorted such that the examples with label “1” are first

Returns:

(AUC value, DeLong covariance)

Reference:

Sun, X. and Xu, W. Fast Implementation of DeLong’s Algorithm for Comparing the Areas Under Correlated Receiver Operating Characteristic Curves. IEEE Signal Processing Letters 21(11), 1389–1393 (2014).

delong_roc_variance(ground_truth, predictions)[source]

Computes ROC AUC variance for a single set of predictions

Parameters:
  • ground_truth – np.array of 0 and 1

  • predictions – np.array of floats of the probability of being class 1
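
For reference, the DeLong variance of a single AUC can be written in terms of structural components: for each positive example, the fraction of negatives it outranks, and for each negative example, the fraction of positives that outrank it. The sketch below is a naive O(m·n) version in plain Python, not the fast midrank-based algorithm cited above; the name delong_variance_naive is hypothetical.

```python
from statistics import mean, variance


def delong_variance_naive(ground_truth, predictions):
    """Return (AUC, DeLong variance) for one set of predictions."""
    pos = [p for t, p in zip(ground_truth, predictions) if t == 1]
    neg = [p for t, p in zip(ground_truth, predictions) if t == 0]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("Need at least two examples of each class.")

    def psi(x, y):
        # Pairwise kernel: 1 if the positive wins, 0.5 on a tie, else 0.
        return 1.0 if x > y else 0.5 if x == y else 0.0

    # Structural components for positives (v10) and negatives (v01).
    v10 = [mean(psi(x, y) for y in neg) for x in pos]
    v01 = [mean(psi(x, y) for x in pos) for y in neg]

    auc = mean(v10)
    # Sample variances of the components, scaled by class sizes.
    var = variance(v10) / len(pos) + variance(v01) / len(neg)
    return auc, var
```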

delong_roc_test(ground_truth, predictions_one, predictions_two)[source]

Computes log(p-value) for hypothesis that two ROC AUCs are different

Parameters:
  • ground_truth – np.array of 0 and 1

  • predictions_one – predictions of the first model, np.array of floats of the probability of being class 1

  • predictions_two – predictions of the second model, np.array of floats of the probability of being class 1
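
The final step of a DeLong comparison is a two-sided z-test on the difference between the two AUCs. The sketch below shows that step alone, taking the AUCs, their variances, and their covariance as already-computed inputs. The function name delong_log10_p is hypothetical, and the use of a base-10 logarithm is an assumption, since the base is not stated above.

```python
import math


def delong_log10_p(auc_one, auc_two, var_one, var_two, covariance):
    """Two-sided z-test on an AUC difference, returned as log10(p-value)."""
    var_diff = var_one + var_two - 2 * covariance
    if var_diff <= 0:
        raise ValueError("Variance of the AUC difference must be positive.")
    z = abs(auc_one - auc_two) / math.sqrt(var_diff)
    # Two-sided normal tail: p = 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
    p = math.erfc(z / math.sqrt(2))
    # Guard against underflow for extremely large z.
    return math.log10(p) if p > 0 else float("-inf")
```

Working on a log scale keeps very small p-values representable when the two AUCs differ by many standard errors.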