slideflow.biscuit¶
This module contains an official implementation of BISCUIT, an uncertainty quantification and confidence thresholding algorithm for whole-slide images. The original implementation, which includes instructions for reproducing experimental results reported in the manuscript, is available on GitHub.
This module requires the slideflow-noncommercial package, which can be installed with:
pip install slideflow-noncommercial
See Uncertainty Quantification for more information.
- find_cv(project, label, outcome, epoch=None, k=3)[source]¶
Finds paths to cross-validation models.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – Number of k-fold iterations. Defaults to 3.
- Returns:
Paths to cross-validation models.
- Return type:
biscuit.Experiment¶
- class Experiment(train_project, eval_projects=None, outcome='cohort', outcome1='LUAD', outcome2='LUSC', outdir='results')[source]¶
Supervises uncertainty thresholding experiments.
- display(self, df, eval_dfs, hue='uq', palette='tab10', relplot_uq_compare=True, boxplot_uq_compare=True, ttest_uq_groups=['all', 'include'], prefix='')¶
Creates plots from assembled results, exports results to CSV.
- Parameters:
df (pandas.DataFrame) – Cross-validation results metrics, as generated by results()
eval_dfs (dict(pandas.DataFrame)) – Dict of external eval dataset names (keys) mapped to pandas DataFrame of result metrics (values).
hue (str, optional) – Comparison to show with different hue on plots. Defaults to ‘uq’.
palette (str, optional) – Seaborn color palette. Defaults to ‘tab10’.
relplot_uq_compare (bool, optional) – For the Relplot display, ensure non-UQ and UQ results are generated from the same models/preds.
boxplot_uq_compare (bool, optional) – For the boxplot display, ensure non-UQ and UQ results are generated from the same models/preds.
ttest_uq_groups (list(str)) – UQ groups to compare via t-test. Defaults to [‘all’, ‘include’].
prefix (str, optional) – Prefix to use when saving figures. Defaults to empty string.
- Returns:
None
- plot_uq_calibration(self, label, tile_uq, slide_uq, slide_pred, epoch=1)¶
Plots a graph of predictions vs. uncertainty.
- results(self, exp_to_run, uq=True, eval=True, plot=False)¶
Assembles results from experiments, applies UQ thresholding, and returns pandas dataframes with metrics.
- Parameters:
- Returns:
Cross-val results, pandas.DataFrame: External eval results
- Return type:
pandas.DataFrame
- thresholds_from_nested_cv(self, label, outer_k=3, inner_k=5, id=None, threshold_params=None, epoch=1, tile_filename='tile_predictions_val_epoch1.csv', y_true=None, y_pred=None, uncertainty=None)¶
Detects tile- and slide-level UQ thresholds and slide-level prediction thresholds from nested cross-validation.
- train(self, hp, label, filters=None, save_predictions='csv', validate_on_batch=32, validation_steps=32, **kwargs)¶
Train outer cross-validation models.
- Parameters:
hp (slideflow.ModelParams) – Hyperparameters object.
label (str) – Experimental label.
filters (dict, optional) – Dataset filters to use for selecting slides. See slideflow.Dataset.filter() for more information. Defaults to None.
save_predictions (bool or str, optional) – Save validation predictions to model folder. Defaults to ‘csv’.
- Keyword Arguments:
validate_on_batch (int) – Frequency of validation checks during training, in steps. Defaults to 32.
validation_steps (int) – Number of validation steps to perform during each mid-training evaluation check. Defaults to 32.
**kwargs – All remaining keyword arguments are passed to slideflow.Project.train().
- Returns:
None
- train_nested_cv(self, hp, label, outer_k=3, inner_k=5, **kwargs)¶
Train models using nested cross-validation (outer_k=3, inner_k=5), skipping already-generated models.
- Parameters:
hp (slideflow.ModelParams) – Hyperparameters object.
label (str) – Experimental label.
- Keyword Arguments:
outer_k (int) – Number of outer cross-folds. Defaults to 3.
inner_k (int) – Number of inner cross-folds. Defaults to 5.
**kwargs – All remaining keyword arguments are passed to slideflow.Project.train().
- Returns:
None
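The nested scheme named above can be sketched as follows. This is a generic illustration of nested cross-validation with outer_k=3 and inner_k=5, not the splitting logic slideflow uses: the helper names k_fold_indices and nested_cv_splits, the round-robin fold assignment, and the 0-based fold numbering are all assumptions made for the example.

```python
def k_fold_indices(items, k):
    """Split items into k roughly equal folds (round-robin assignment)."""
    return [items[i::k] for i in range(k)]


def nested_cv_splits(slides, outer_k=3, inner_k=5):
    """Yield (outer_fold, inner_fold, train, val, held_out) tuples."""
    outer_folds = k_fold_indices(slides, outer_k)
    for o, held_out in enumerate(outer_folds):
        # Everything outside the held-out outer fold is available for inner CV.
        outer_train = [s for j, fold in enumerate(outer_folds) if j != o for s in fold]
        inner_folds = k_fold_indices(outer_train, inner_k)
        for i, val in enumerate(inner_folds):
            train = [s for j, fold in enumerate(inner_folds) if j != i for s in fold]
            yield o, i, train, val, held_out


slides = [f"slide{n:02d}" for n in range(30)]
splits = list(nested_cv_splits(slides))
print(len(splits))  # 15 = 3 outer folds x 5 inner folds
```

Each outer fold is held out in turn, and thresholds would be chosen using only the inner folds of the remaining data, so the held-out slides never influence threshold selection.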
biscuit.hp¶
- nature2022()[source]¶
Hyperparameters used in the associated manuscript.
Dolezal, J.M., Srisuwananukorn, A., Karpeyev, D. et al. Uncertainty-informed deep learning models enable high-confidence predictions for digital histopathology. Nat Commun 13, 6572 (2022). https://doi.org/10.1038/s41467-022-34025-x
- Returns:
sf.ModelParams
biscuit.threshold¶
- apply(df, tile_uq, slide_uq, tile_pred=0.5, slide_pred=0.5, plot=False, keep='high_confidence', title=None, patients=None, level='slide')[source]¶
Apply pre-calculated tile- and group-level uncertainty thresholds.
- Parameters:
df (pandas.DataFrame) – Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.
tile_uq (float) – Tile-level uncertainty threshold.
slide_uq (float) – Slide-level uncertainty threshold.
tile_pred (float, optional) – Tile-level prediction threshold. Defaults to 0.5.
slide_pred (float, optional) – Slide-level prediction threshold. Defaults to 0.5.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
keep (str, optional) – Either ‘high_confidence’ or ‘low_confidence’. Cohort to keep after thresholding. Defaults to ‘high_confidence’.
title (str, optional) – Title for uncertainty plot. Defaults to None.
patients (dict, optional) – Dictionary mapping slides to patients. Adds a ‘patient’ column in the tile prediction dataframe, enabling patient-level thresholding. Defaults to None.
level (str, optional) – Either ‘slide’ or ‘patient’. Level at which to apply threshold. If ‘patient’, requires patient dict be supplied. Defaults to ‘slide’.
- Returns:
- Dictionary of results, with keys auc, percent_incl, accuracy, sensitivity, and specificity.
- DataFrame of thresholded group-level predictions.
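A minimal sketch of two-level uncertainty thresholding, using plain dictionaries instead of a DataFrame. It is not the module's implementation. The following are assumptions made for illustration: tiles at or above tile_uq are discarded, slide-level prediction and uncertainty are the means over the remaining tiles, and slides at or above slide_uq are treated as low confidence. The function name apply_thresholds and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def apply_thresholds(tiles, tile_uq, slide_uq, slide_pred=0.5):
    """tiles: list of dicts with keys 'slide', 'y_true', 'y_pred', 'uncertainty'."""
    # Step 1: keep only tiles whose uncertainty is below the tile-level threshold.
    by_slide = defaultdict(list)
    for t in tiles:
        if t["uncertainty"] < tile_uq:
            by_slide[t["slide"]].append(t)

    # Step 2: aggregate the confident tiles per slide, then keep only slides
    # whose mean uncertainty is below the slide-level threshold.
    kept = {}
    for slide, ts in by_slide.items():
        slide_unc = mean(t["uncertainty"] for t in ts)
        if slide_unc < slide_uq:
            score = mean(t["y_pred"] for t in ts)
            kept[slide] = {
                "y_true": ts[0]["y_true"],
                "y_pred": score,
                "label": int(score > slide_pred),
                "uncertainty": slide_unc,
            }
    n_slides = len({t["slide"] for t in tiles})
    return kept, len(kept) / n_slides


tiles = [
    {"slide": "A", "y_true": 1, "y_pred": 0.90, "uncertainty": 0.05},
    {"slide": "A", "y_true": 1, "y_pred": 0.80, "uncertainty": 0.10},
    {"slide": "A", "y_true": 1, "y_pred": 0.30, "uncertainty": 0.40},
    {"slide": "B", "y_true": 0, "y_pred": 0.20, "uncertainty": 0.10},
    {"slide": "B", "y_true": 0, "y_pred": 0.60, "uncertainty": 0.50},
    {"slide": "C", "y_true": 0, "y_pred": 0.55, "uncertainty": 0.25},
    {"slide": "C", "y_true": 0, "y_pred": 0.45, "uncertainty": 0.28},
]
kept, fraction = apply_thresholds(tiles, tile_uq=0.3, slide_uq=0.2)
print(sorted(kept), round(fraction, 2))
```

In this toy data, slide C passes the tile-level filter but is excluded at the slide level, which is the case that percent_incl is meant to capture.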
- detect(df, tile_uq='detect', slide_uq='detect', tile_pred='detect', slide_pred='detect', plot=False, patients=None)[source]¶
Detect optimal tile- and slide-level uncertainty thresholds.
- Parameters:
df (pandas.DataFrame) – Tile-level predictions. Must contain columns ‘y_true’, ‘y_pred’, and ‘uncertainty’.
tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.
slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.
tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.
slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.
- Returns:
- Dictionary with tile- and slide-level UQ and prediction thresholds, with keys: ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’
- Float: Slide-level AUROC
- from_cv(dfs, **kwargs)[source]¶
Finds the optimal tile and slide-level thresholds from a set of nested cross-validation experiments.
- Parameters:
dfs (list(DataFrame)) – List of DataFrames with tile predictions, containing headers ‘y_true’, ‘y_pred’, ‘uncertainty’, ‘slide’, and ‘patient’.
- Keyword Arguments:
tile_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level uncertainty threshold. If float, will use the specified tile-level uncertainty threshold.
slide_uq (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level uncertainty threshold. If float, will use the specified slide-level uncertainty threshold.
tile_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect tile-level prediction threshold. If float, will use the specified tile-level prediction threshold.
slide_pred (str or float) – Either ‘detect’ or float. If ‘detect’, will detect slide-level prediction threshold. If float, will use the specified slide-level prediction threshold.
plot (bool, optional) – Plot slide-level uncertainty. Defaults to False.
patients (dict, optional) – Dict mapping slides to patients. Required for patient-level thresholding.
- Returns:
- Dictionary with tile- and slide-level UQ and prediction thresholds, with keys: ‘tile_uq’, ‘tile_pred’, ‘slide_uq’, ‘slide_pred’
- plot_uncertainty(df, kind, threshold=None, title=None)[source]¶
Plots figure of tile or slide-level predictions vs. uncertainty.
- Parameters:
df (pandas.DataFrame) – Processed dataframe containing columns ‘uncertainty’, ‘correct’, ‘y_pred’.
kind (str) – Kind of plot. If ‘tile’, subsample to only 1000 points. Included in title.
threshold (float, optional) – Uncertainty threshold. Defaults to None.
title (str, optional) – Title for plots. Defaults to None.
- Returns:
None
- process_group_predictions(df, pred_thresh, level)[source]¶
From a given dataframe of tile-level predictions, calculate group-level predictions and uncertainty.
- process_tile_predictions(df, pred_thresh=0.5, patients=None)[source]¶
Load and process tile-level predictions from CSV.
- Parameters:
df (pandas.DataFrame) – Unprocessed DataFrame from reading tile-level predictions.
pred_thresh (float or str, optional) – Tile-level prediction threshold. If ‘detect’, will auto-detect via Youden’s J. Defaults to 0.5.
patients (dict, optional) – Dict mapping slides to patients, used for patient-level thresholding. Defaults to None.
- Returns:
pandas.DataFrame, tile prediction threshold
biscuit.utils¶
- auc(y_true, y_pred)[source]¶
Calculate the area under the receiver operating characteristic curve (AUC / AUROC).
- Parameters:
y_true (np.ndarray) – True labels.
y_pred (np.ndarray) – Predictions.
- Returns:
AUC
- Return type:
Float
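For intuition, AUROC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it that way in pure Python. It illustrates the quantity only and says nothing about how this module computes it; the name pairwise_auc is hypothetical.

```python
def pairwise_auc(y_true, y_pred):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A pair counts 1 if the positive outranks the negative, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```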
- auc_and_threshold(y_true, y_pred)[source]¶
Calculates AUC and optimal threshold (via Youden’s J)
- Parameters:
y_true (np.ndarray) – Y true (labels).
y_pred (np.ndarray) – Y pred (predictions).
- Returns:
AUC (float) and optimal threshold (float)
- Return type:
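Youden's J statistic is sensitivity + specificity - 1, and the optimal threshold is the cutoff that maximizes it. A minimal sketch of that search follows. The name youden_threshold, the "score >= threshold means positive" convention, and the choice of the lowest cutoff on ties are assumptions for illustration.

```python
def youden_threshold(y_true, y_pred):
    """Return (threshold, J) for the cutoff that maximizes Youden's J."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Both classes must be present")
    best_j, best_thresh = -1.0, None
    for thresh in sorted(set(y_pred)):
        # Call a sample positive when its score is at or above the candidate.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p >= thresh)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p >= thresh)
        j = tp / n_pos - fp / n_neg  # sensitivity + specificity - 1
        if j > best_j:
            best_j, best_thresh = j, thresh
    return best_thresh, best_j


thresh, j = youden_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(thresh, j)  # 0.35 0.5
```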
- df_from_cv(project, label, outcome, epoch=None, k=3, y_true=None, y_pred=None, uncertainty=None)[source]¶
Loads tile predictions from cross-fold models & renames columns.
- Parameters:
project (sf.Project) – Slideflow project.
label (str) – Experimental label.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – K-fold iteration. Defaults to 3.
outcome (str, optional) – Outcome name.
y_true (str, optional) – Column name for ground truth labels. Defaults to {outcome}_y_true0.
y_pred (str, optional) – Column name for predictions. Defaults to {outcome}_y_pred1.
uncertainty (str, optional) – Column name for uncertainty. Defaults to {outcome}_y_uncertainty1.
- Returns:
DataFrame for each k-fold.
- Return type:
list(DataFrame)
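The default column names listed above follow a fixed pattern built from the outcome name. The sketch below builds that mapping and applies it to a single row held as a dictionary. It assumes the short target names are ‘y_true’, ‘y_pred’, and ‘uncertainty’, the columns the thresholding functions require. The helpers default_columns and rename_row are hypothetical and are not part of the module.

```python
def default_columns(outcome):
    """Map the documented default column names to short names."""
    return {
        f"{outcome}_y_true0": "y_true",
        f"{outcome}_y_pred1": "y_pred",
        f"{outcome}_y_uncertainty1": "uncertainty",
    }


def rename_row(row, outcome):
    """Rename matching keys in one row; other keys are left untouched."""
    mapping = default_columns(outcome)
    return {mapping.get(col, col): value for col, value in row.items()}


row = {
    "slide": "slide01",
    "cohort_y_true0": 1,
    "cohort_y_pred1": 0.83,
    "cohort_y_uncertainty1": 0.04,
}
renamed = rename_row(row, "cohort")
print(renamed)
```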
- eval_exists(project, label, outcome, epoch=1)[source]¶
Check if matching eval exists.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
epoch (int, optional) – Epoch number of saved model. Defaults to 1.
- Returns:
If eval exists
- Return type:
- find_cv(project, label, outcome, epoch=None, k=3)[source]¶
Finds paths to cross-validation models.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to None.
k (int, optional) – Number of k-fold iterations. Defaults to 3.
- Returns:
Paths to cross-validation models.
- Return type:
- find_cv_early_stop(project, label, outcome, k=3)[source]¶
Detects early stop batch from cross-val trained models.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
k (int, optional) – Number of k-fold iterations. Defaults to 3.
outcome (str) – Outcome name.
- Returns:
Early stop batch.
- Return type:
- find_eval(project, label, outcome, epoch=1)[source]¶
Finds matching eval directory.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str, optional) – Outcome name.
epoch (int, optional) – Epoch number of saved model. Defaults to 1.
- Raises:
MultipleModelsFoundError – If multiple matches are found.
ModelNotFoundError – If no match is found.
- Returns:
path to eval directory
- Return type:
- find_model(project, label, outcome, epoch=None, kfold=None)[source]¶
Searches for a model in a project model directory.
- Parameters:
project (slideflow.Project) – Project.
label (str) – Experimental label.
outcome (str) – Outcome name.
epoch (int, optional) – Epoch to search for. If not None, returns path to the saved model. If None, returns path to parent model folder. Defaults to None.
kfold (int, optional) – K-fold iteration. Defaults to None.
- Raises:
MultipleModelsFoundError – If multiple potential matches are found.
ModelNotFoundError – If no matching model is found.
- Returns:
Path to matching model.
- Return type:
- model_exists(project, label, outcome, epoch=None, kfold=None)[source]¶
Check if matching model exists.
- prediction_metrics(y_true, y_pred, threshold)[source]¶
Calculate prediction metrics (AUC, sensitivity/specificity, etc)
- Parameters:
y_true (np.ndarray) – True labels.
y_pred (np.ndarray) – Predictions.
threshold (float) – Prediction threshold.
- Returns:
Prediction metrics.
- Return type:
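As an illustration of the threshold-dependent metrics named above, the sketch below binarizes predictions at a threshold and derives accuracy, sensitivity, and specificity from the confusion counts. The function name binary_metrics, the "score > threshold means positive" convention, and the returned keys are assumptions. The actual return structure of prediction_metrics is not documented here.

```python
def binary_metrics(y_true, y_pred, threshold):
    """Accuracy, sensitivity, and specificity at a given prediction threshold."""
    labels = [int(p > threshold) for p in y_pred]
    tp = sum(1 for t, l in zip(y_true, labels) if t == 1 and l == 1)
    tn = sum(1 for t, l in zip(y_true, labels) if t == 0 and l == 0)
    fp = sum(1 for t, l in zip(y_true, labels) if t == 0 and l == 1)
    fn = sum(1 for t, l in zip(y_true, labels) if t == 1 and l == 0)

    def ratio(num, den):
        # Undefined ratios (for example, no positives present) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


metrics = binary_metrics([1, 1, 1, 0, 0], [0.9, 0.6, 0.4, 0.3, 0.7], threshold=0.5)
print(metrics)
```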
biscuit.delong¶
- fastDeLong(predictions_sorted_transposed, label_1_count)[source]¶
The fast version of DeLong’s method for computing the covariance of unadjusted AUC.
- Parameters:
predictions_sorted_transposed – a 2D numpy.array[n_classifiers, n_examples] sorted such that the examples with label “1” come first
- Returns:
(AUC value, DeLong covariance)
- delong_roc_variance(ground_truth, predictions)[source]¶
Computes ROC AUC variance for a single set of predictions
- Parameters:
ground_truth – np.array of 0 and 1
predictions – np.array of floats of the probability of being class 1
- delong_roc_test(ground_truth, predictions_one, predictions_two)[source]¶
Computes log(p-value) for hypothesis that two ROC AUCs are different
- Parameters:
ground_truth – np.array of 0 and 1
predictions_one – predictions of the first model, np.array of floats of the probability of being class 1
predictions_two – predictions of the second model, np.array of floats of the probability of being class 1
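The comparison performed by a DeLong test can be sketched in pure Python as follows. This is a direct O(m·n) version of DeLong's method, not the fast algorithm, and it is not this module's code. The function name delong_log10_p, the helper names, the sample data, and the choice of a base-10 logarithm are assumptions made for illustration.

```python
import math


def _psi(x, y):
    """Pairwise kernel: 1 if positive outranks negative, 0.5 on a tie, else 0."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(pos, neg):
    """AUC plus DeLong structural components for one set of predictions."""
    m, n = len(pos), len(neg)
    v10 = [sum(_psi(x, y) for y in neg) / n for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / m for y in neg]
    return sum(v10) / m, v10, v01


def _cov(a, b):
    """Sample covariance (denominator len - 1)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_log10_p(ground_truth, preds_one, preds_two):
    """Return (auc_one, auc_two, log10 of the two-sided p-value)."""
    pos_idx = [i for i, t in enumerate(ground_truth) if t == 1]
    neg_idx = [i for i, t in enumerate(ground_truth) if t == 0]
    m, n = len(pos_idx), len(neg_idx)
    if m < 2 or n < 2:
        raise ValueError("Need at least two examples of each class")
    (auc1, a10, a01), (auc2, b10, b01) = (
        _components([p[i] for i in pos_idx], [p[i] for i in neg_idx])
        for p in (preds_one, preds_two)
    )
    # Variance of each AUC and their covariance, from the structural components.
    var1 = _cov(a10, a10) / m + _cov(a01, a01) / n
    var2 = _cov(b10, b10) / m + _cov(b01, b01) / n
    cov12 = _cov(a10, b10) / m + _cov(a01, b01) / n
    var_diff = var1 + var2 - 2 * cov12
    if var_diff <= 0:
        raise ValueError("AUC difference has zero variance; test is undefined")
    z = abs(auc1 - auc2) / math.sqrt(var_diff)
    p = math.erfc(z / math.sqrt(2))  # two-sided p-value under a normal approximation
    return auc1, auc2, (math.log10(p) if p > 0 else float("-inf"))


truth = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.35, 0.4, 0.3, 0.2, 0.1]
model_b = [0.6, 0.4, 0.7, 0.2, 0.5, 0.3, 0.65, 0.1]
auc_a, auc_b, log10_p = delong_log10_p(truth, model_a, model_b)
print(auc_a, auc_b, log10_p)
```

Because both models are scored on the same examples, the covariance term matters: ignoring it would treat the two AUCs as independent and misstate the variance of their difference.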