slideflow.util

This module contains a variety of utility functions used throughout the package.

class EasyDict[source]

Convenience class that behaves like a dict but allows access with the attribute syntax.
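The attribute-style access described above can be illustrated with a minimal sketch. The class name `AttrDict` and the keys below are hypothetical; this is one common way to build such a class and is not necessarily how `EasyDict` is implemented.

```python
from typing import Any


class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name: str, value: Any) -> None:
        self[name] = value

    def __delattr__(self, name: str) -> None:
        del self[name]


cfg = AttrDict(tile_px=299, tile_um=302)
cfg.backend = "tensorflow"   # attribute assignment stores a dict key
tile_px = cfg.tile_px        # attribute access reads a dict key
```

Because the object is still a `dict`, it can be passed to anything that expects a mapping, such as `json.dumps`.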

class ImgBatchSpeedColumn(batch_size=1, *args, **kwargs)[source]

Renders human-readable transfer speed.

__init__(batch_size=1, *args, **kwargs)[source]

render(task: Task) → Text[source]

Show data transfer speed.

class MultiprocessProgress(pb)[source]

Wrapper for a rich.progress bar that can be shared across processes.

__init__(pb)[source]

class MultiprocessProgressTracker(tasks)[source]

Wrapper for a rich.progress tracker that can be shared across processes.

__init__(tasks)[source]

class TileExtractionProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]

get_renderables()[source]

Get a number of renderables for the progress display.

class TileExtractionSpeedColumn(table_column: Column | None = None)[source]

Renders human-readable transfer speed.

render(task: Task) → Text[source]

Show data transfer speed.

about(console=None) → None[source]

Print a summary of the slideflow version and active backends.

Example
>>> sf.about()
╭=======================╮
│       Slideflow       │
│    Version: 2.1.0     │
│  Backend: tensorflow  │
│ Slide Backend: cucim  │
│ https://slideflow.dev │
╰=======================╯
Parameters:

console (rich.console.Console, optional) – Active console, if one exists. Defaults to None.

batch(iterable: List, n: int = 1) → Iterable[source]

Separates an iterable into batches of maximum size n.
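As an illustration of batching a list into chunks of at most n items, here is a minimal sketch. The name `batch_list` is hypothetical, and the slicing approach is an assumption about how such a helper could be written, not the library's implementation.

```python
from typing import Iterator, List


def batch_list(items: List, n: int = 1) -> Iterator[List]:
    """Yield consecutive slices of `items`, each holding at most `n` elements."""
    for start in range(0, len(items), n):
        yield items[start:start + n]


batches = list(batch_list([1, 2, 3, 4, 5], n=2))
```

The final batch may be shorter than n when the list length is not a multiple of n.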

batch_generator(iterable: Iterable, n: int = 1) → Iterable[source]

Separates an iterable into batches of maximum size n.
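For iterables that cannot be sliced or measured (generators, for example), batching can be done lazily. The sketch below uses `itertools.islice`; the name `batch_iterable` is hypothetical and the approach is an assumption, not the library's code.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_iterable(iterable: Iterable, n: int = 1) -> Iterator[List]:
    """Lazily yield lists of at most `n` items drawn from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return  # source exhausted
        yield chunk


squares = (i * i for i in range(5))  # a generator: no len(), no slicing
lazy_batches = list(batch_iterable(squares, n=2))
```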

choice_input(prompt, valid_choices, default=None, multi_choice=False, input_type=<class 'str'>)[source]

Prompts user for multi-choice input.

download_from_tcga(uuid: str, dest: str, message: str = 'Downloading...') → None[source]

Download a file from TCGA (GDC) by UUID.

getLoggingLevel()[source]

Return the current logging level.

get_ensemble_model_config(model_path: str) → Dict[source]

Loads ensemble model configuration JSON file.

get_gan_config(model_path: str) → Dict[source]

Loads a GAN training_options.json for an associated network PKL.

get_model_config(model_path: str) → Dict[source]

Loads model configuration JSON file.

get_model_normalizer(model_path: str) → StainNormalizer | None[source]

Loads and fits normalizer using configuration at a model path.

get_preprocess_fn(model_path: str)[source]

Returns a function which preprocesses a uint8 image for a model.

Parameters:

model_path (str) – Path to a saved Slideflow model.

Returns:

A function which accepts a single image or batch of uint8 images, and returns preprocessed (and stain normalized) float32 images.

get_relative_tfrecord_paths(root: str, directory: str = '') → List[str][source]

Returns relative tfrecord paths with respect to the given directory.

get_slide_paths(slides_dir: str) → List[str][source]

Get all slide paths from a given directory containing slides.

get_slides_from_model_manifest(model_path: str, dataset: str | None = None) → List[str][source]

Get list of slides from a model manifest.

Parameters:
  • model_path (str) – Path to model from which to load the model manifest.

  • dataset (str, optional) – ‘training’ or ‘validation’. If provided, only slides from this dataset are returned. Defaults to None (all slides).

Returns:

List of slide names.

Return type:

list(str)

get_valid_model_dir(root: str) → List[source]

Returns the path of the first nested directory under root. This only works when the nested directory name starts with a five-digit number (for example, “00000-foldername”).

Examples

If root contains three directories:

  • root/00000-foldername/

  • root/00001-foldername/

  • root/00002-foldername/

The function returns “root/00000-foldername/”.
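The lookup described in this example can be sketched as follows. The helper name `first_numbered_subdir`, the regular expression, and the use of sorted order to decide which directory comes "first" are all assumptions made for illustration; the actual function may differ, including in its return type.

```python
import os
import re
import tempfile
from typing import Optional


def first_numbered_subdir(root: str) -> Optional[str]:
    """Return the first subdirectory of `root` whose name starts with five digits."""
    pattern = re.compile(r"^\d{5}")
    matches = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name)) and pattern.match(name)
    )
    return os.path.join(root, matches[0]) if matches else None


# Recreate the directory layout from the example in a temporary location.
with tempfile.TemporaryDirectory() as root:
    for name in ("00002-foldername", "00000-foldername", "00001-foldername"):
        os.mkdir(os.path.join(root, name))
    found_name = os.path.basename(first_numbered_subdir(root))
```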

global_path(root: str, path_string: str)[source]

Returns global path from a local path.

is_model(path: str) → bool[source]

Checks if the given path is a valid Slideflow model.

is_project(path: str) → bool[source]

Checks if the given path is a valid Slideflow project.

is_simclr_model_path(path: Any) → bool[source]

Checks if the given path is a valid SimCLR model or checkpoint.

is_slide(path: str) → bool[source]

Checks if the given path is a supported slide.

is_tensorflow_model_path(path: str) → bool[source]

Checks if the given path is a valid Slideflow/Tensorflow model.

is_torch_model_path(path: str) → bool[source]

Checks if the given path is a valid Slideflow/PyTorch model.

is_uq_model(model_path: str) → bool[source]

Checks if the given model path points to a UQ-enabled model.

isnumeric(val: Any) → bool[source]

Check if the given value is numeric (numpy or python).

Tensors will return False.

Specifically checks if the value is a python int or float, or if the value is a numpy array with a numeric dtype (int or float).

load_json(filename: str) → Any[source]

Reads JSON data from file.

load_predictions(path: str, **kwargs) → DataFrame[source]

Loads a ‘csv’, ‘parquet’, or ‘feather’ file into a pandas DataFrame.

Parameters:

path (str) – Path to the file to be read.

Returns:

The dataframe read from the path.

Return type:

df (pd.DataFrame)

location_heatmap(locations: ndarray, values: ndarray, slide: str, tile_px: int, tile_um: int | str, outdir: str, *, interpolation: str | None = 'bicubic', cmap: str = 'inferno', norm: str | None = None, background: str = 'min') → Dict[str, Dict[str, float]][source]

Generate a heatmap for a slide.

Parameters:
  • locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.

  • values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.

  • slide (str) – Path to corresponding slide.

  • tile_px (int) – Tile pixel size.

  • tile_um (int or str) – Tile size in microns (int) or magnification (str, e.g. “20x”).

  • outdir (str) – Directory in which to save heatmap.

Keyword Arguments:
  • interpolation (str, optional) – Interpolation strategy for smoothing heatmap. Defaults to ‘bicubic’.

  • cmap (str, optional) – Matplotlib colormap for heatmap. Can be any valid matplotlib colormap. Defaults to ‘inferno’.

  • norm (str, optional) – Normalization strategy for assigning heatmap values to colors. Either ‘two_slope’, or any other valid value for the norm argument of matplotlib.pyplot.imshow. If ‘two_slope’, normalizes values less than 0 and greater than 0 separately. Defaults to None.

log_manifest(train_tfrecords: List[str] | None = None, val_tfrecords: List[str] | None = None, *, labels: Dict[str, Any] | None = None, filename: str | None = None, remove_extension: bool = True) → str[source]

Saves the training manifest in CSV format and returns it as a string.

Parameters:
  • train_tfrecords (list(str), optional) – List of training TFRecords. Defaults to None.

  • val_tfrecords (list(str), optional) – List of validation TFRecords. Defaults to None.

Keyword Arguments:
  • labels (dict, optional) – TFRecord outcome labels. Defaults to None.

  • filename (str, optional) – Path to CSV file to save. Defaults to None.

  • remove_extension (bool, optional) – Remove file extension from slide names. Defaults to True.

Returns:

Saved manifest in str format.

Return type:

str

make_dir(_dir: str) → None[source]

Makes a directory if one does not already exist, in a manner compatible with multithreading.
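A common way to create a directory safely when several threads or processes may attempt it at once is `os.makedirs` with `exist_ok=True`, which avoids the race between checking for the directory and creating it. The sketch below assumes that approach; the helper name `ensure_dir` and the paths are hypothetical.

```python
import os
import tempfile


def ensure_dir(path: str) -> None:
    """Create `path` (and any parents) without failing if it already exists."""
    # exist_ok=True means a concurrent creator winning the race is not an error.
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "tiles", "slide1")
    ensure_dir(target)
    ensure_dir(target)  # second call is a harmless no-op
    created = os.path.isdir(target)
```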

map_values_to_slide_grid(locations: ndarray, values: ndarray, wsi: WSI, background: str = 'min', *, interpolation: str | None = 'bicubic')[source]

Map heatmap values to a slide grid, using tile location information.

md5(path: str) → str[source]

Calculate and return MD5 checksum for a file.
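A file checksum of this kind is usually computed by streaming the file in chunks so that large slides or TFRecords do not need to fit in memory. This is a minimal sketch under that assumption; the name `file_md5` and the chunk size are illustrative choices.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until an empty bytes object is returned.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as f:
        f.write(b"hello")
    checksum = file_md5(sample)
```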

multi_warn(arr: List, compare: Callable, msg: Callable | str) → int[source]

Logs multiple warnings.

Parameters:
  • arr (List) – Array to compare.

  • compare (Callable) – Comparison to perform on array. If True, will warn.

  • msg (str or Callable) – Warning message.

Returns:

Number of warnings.

Return type:

int
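The pattern described here (test each item, warn when the comparison is true, and return the number of warnings) can be sketched as below. The name `warn_each`, the logger name, and the assumption that a callable `msg` receives the offending item are all illustrative and not confirmed by this page.

```python
import logging
from typing import Any, Callable, List, Union

log = logging.getLogger("example")  # hypothetical logger name


def warn_each(arr: List[Any],
              compare: Callable[[Any], bool],
              msg: Union[str, Callable[[Any], str]]) -> int:
    """Log one warning per item for which `compare(item)` is true; return the count."""
    count = 0
    for item in arr:
        if compare(item):
            # Assumption: a callable message is formatted from the item itself.
            text = msg(item) if callable(msg) else msg
            log.warning(text)
            count += 1
    return count


n_warned = warn_each([1, -2, 3, -4], lambda x: x < 0,
                     lambda x: f"Negative value: {x}")
```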

path_input(prompt: str, root: str, default: str | None = None, create_on_invalid: bool = False, filetype: str | None = None, verify: bool = True) → str[source]

Prompts user for directory input.

path_to_ext(path: str) → str[source]

Returns extension of a file path string.

path_to_name(path: str) → str[source]

Returns name of a file, without extension, from a given full path string.
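The two path helpers above can be approximated with `os.path`. The function names below are hypothetical, and whether the real `path_to_ext` includes the leading dot is not stated on this page; this sketch assumes it does not.

```python
import os


def name_without_ext(path: str) -> str:
    """File name with the directory and the final extension removed."""
    return os.path.splitext(os.path.basename(path))[0]


def ext_of(path: str) -> str:
    """Final extension of the path, without the leading dot (an assumption)."""
    return os.path.splitext(path)[1].lstrip(".")


slide_name = name_without_ext("/data/slides/slide1.svs")
slide_ext = ext_of("/data/slides/slide1.svs")
```

Note that `os.path.splitext` only removes the last extension, so a name such as `archive.tar.gz` would yield `archive.tar` and `gz` in this sketch.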

read_annotations(path: str) → Tuple[List[str], List[Dict]][source]

Read an annotations file.

relative_path(path: str, root: str)[source]

Returns a relative path, from a given root directory.

setLoggingLevel(level)[source]

Set the logging level.

Uses standard python logging levels:

  • 50: CRITICAL

  • 40: ERROR

  • 30: WARNING

  • 20: INFO

  • 10: DEBUG

  • 0: NOTSET

Parameters:

level (int) – Logging level numeric value.
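The numeric values listed above are those of Python's standard `logging` module. The sketch below shows how they behave on an ordinary logger; the logger name `example_pipeline` is hypothetical, and nothing here depends on Slideflow itself.

```python
import logging

logger = logging.getLogger("example_pipeline")  # hypothetical logger name
logger.setLevel(logging.INFO)                   # same as setLevel(20)

current_level = logger.getEffectiveLevel()
level_name = logging.getLevelName(current_level)

# Messages below the configured level are filtered out.
debug_enabled = logger.isEnabledFor(logging.DEBUG)      # 10 < 20
warning_enabled = logger.isEnabledFor(logging.WARNING)  # 30 >= 20
```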

set_ignore_sigint()[source]

Ignore keyboard interrupts.

split_list(a: List, n: int) → List[List][source]

Splits a list into n components.
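One way to split a list into n parts of nearly equal size is to use `divmod` so that the first few parts absorb any remainder. This is a sketch of that technique under the assumption that parts should be contiguous and balanced; the name `split_evenly` is hypothetical and the library's behavior for uneven splits is not documented here.

```python
from typing import List


def split_evenly(a: List, n: int) -> List[List]:
    """Split `a` into `n` contiguous parts whose lengths differ by at most one."""
    k, m = divmod(len(a), n)  # k items per part, m parts get one extra item
    return [
        a[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
        for i in range(n)
    ]


parts = split_evenly([1, 2, 3, 4, 5, 6, 7], 3)
```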

tfrecord_heatmap(tfrecord: str, slide: str, tile_px: int, tile_um: int | str, tile_dict: Dict[int, float], outdir: str, **kwargs) → Dict[str, Dict[str, float]][source]

Creates a tfrecord-based WSI heatmap using a dictionary of tile values for heatmap display.

Parameters:
  • tfrecord (str) – Path to tfrecord.

  • slide (str) – Path to whole-slide image.

  • tile_dict (dict) – Dictionary mapping tfrecord indices to a tile-level value for display in heatmap format.

  • tile_px (int) – Tile width in pixels.

  • tile_um (int or str) – Tile width in microns (int) or magnification (str, e.g. “20x”).

  • outdir (str) – Path to directory in which to save images.

Returns:

Dictionary mapping slide names to a dict of statistics (mean, median).

to_onehot(val: int, max: int) → ndarray[source]

Converts a value to a one-hot encoding.

Parameters:
  • val (int) – Value to encode.

  • max (int) – Maximum value (length of the one-hot encoding).
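To make the encoding concrete, here is a minimal sketch in plain Python. The documented function returns an ndarray; a list stands in for it here to keep the example dependency-free, and the name `onehot_sketch` is hypothetical.

```python
from typing import List


def onehot_sketch(val: int, length: int) -> List[int]:
    """Return a list of `length` zeros with a single 1 at index `val`."""
    if not 0 <= val < length:
        raise ValueError(f"val must be in [0, {length}), got {val}")
    return [1 if i == val else 0 for i in range(length)]


encoded = onehot_sketch(2, 5)
```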

update_results_log(results_log_path: str, model_name: str, results_dict: Dict) → None[source]

Dynamically update results_log when recording training metrics.

write_json(data: Any, filename: str) → None[source]

Write data to JSON file.
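Writing and reading JSON, as `write_json` and `load_json` describe, amounts to a round trip through the standard `json` module. The sketch below assumes that approach; the helper names, the indentation setting, and the example keys are hypothetical.

```python
import json
import os
import tempfile
from typing import Any


def save_json(data: Any, filename: str) -> None:
    """Serialize `data` to a JSON file."""
    with open(filename, "w") as f:
        json.dump(data, f, indent=1)


def read_json(filename: str) -> Any:
    """Deserialize JSON data from a file."""
    with open(filename, "r") as f:
        return json.load(f)


params = {"tile_px": 299, "tile_um": 302, "outcomes": ["example_label"]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "params.json")
    save_json(params, path)
    restored = read_json(path)
```

Only JSON-serializable values survive the round trip unchanged; tuples, for instance, come back as lists.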

yes_no_input(prompt: str, default: str = 'no') → bool[source]

Prompts user for yes/no input.