slideflow.util¶
This module contains a variety of utility functions used throughout the package.
- class EasyDict[source]¶
Convenience class that behaves like a dict but allows access with the attribute syntax.
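As a rough illustration of the idea, a dict subclass can forward attribute access to its keys. The class name `AttrDict` below is hypothetical; this is a minimal sketch of the pattern, not Slideflow's implementation.

```python
from typing import Any


class AttrDict(dict):
    """Hypothetical dict subclass that exposes its keys as attributes."""

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name: str, value: Any) -> None:
        # Attribute writes are stored as dictionary items.
        self[name] = value

    def __delattr__(self, name: str) -> None:
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


config = AttrDict(tile_px=299)
config.tile_um = 302  # attribute-style write, stored as config["tile_um"]
```

Both access styles refer to the same underlying item, so `config.tile_px` and `config["tile_px"]` are interchangeable in this sketch.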
- class FeatureExtractionProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]¶
- class ImgBatchSpeedColumn(batch_size=1, *args, **kwargs)[source]¶
Renders human readable transfer speed.
- class LabeledMofNCompleteColumn(unit: str, *args, **kwargs)[source]¶
Renders a completion column with labels.
- class MultiprocessProgress(pb)[source]¶
Wrapper for a rich.progress bar that can be shared across processes.
- class MultiprocessProgressTracker(tasks)[source]¶
Wrapper for a rich.progress tracker that can be shared across processes.
- class TileExtractionProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]¶
- class TileExtractionSpeedColumn(table_column: Column | None = None)[source]¶
Renders human readable transfer speed.
- class ValidJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
- default(obj)[source]¶
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
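To show how such an override is used in practice, the sketch below wraps the same `default` logic in a hypothetical subclass named `IterableEncoder` and passes it to `json.dumps` through the standard `cls` argument. It uses only the standard library and is not Slideflow's `ValidJSONEncoder`.

```python
import json


class IterableEncoder(json.JSONEncoder):
    """Hypothetical encoder that serializes arbitrary iterables as lists."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class raise the TypeError for unsupported objects.
        return super().default(o)


# A range is not serializable by default, so default() is consulted.
encoded = json.dumps({"tiles": range(3)}, cls=IterableEncoder)
```

Objects that are neither natively serializable nor iterable still raise a TypeError, because the call falls through to the base implementation.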
- about(console=None) None [source]¶
Print a summary of the slideflow version and active backends.
- Example
>>> sf.about()
╭=======================╮
│       Slideflow       │
│    Version: 3.0.0     │
│    Backend: torch     │
│ Slide Backend: cucim  │
│ https://slideflow.dev │
╰=======================╯
- Parameters:
console (rich.console.Console, optional) – Active console, if one exists. Defaults to None.
- batch(iterable: List, n: int = 1) Iterable [source]¶
Separates an iterable into batches of maximum size n.
- batch_generator(iterable: Iterable, n: int = 1) Iterable [source]¶
Separates an iterable into batches of maximum size n.
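The batching behavior described for `batch` and `batch_generator` can be sketched with the standard library. The helper name `batched_sketch` is hypothetical, and whether Slideflow returns lists, tuples, or slices for each batch is not stated here, so the list output below is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched_sketch(iterable: Iterable[T], n: int = 1) -> Iterator[List[T]]:
    """Yield successive lists of at most n items (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(iterable)
    while True:
        # Pull up to n items; an empty chunk means the input is exhausted.
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk


batches = list(batched_sketch(range(7), n=3))
```

The final batch is shorter than n whenever the input length is not a multiple of n.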
- bin_values_to_slide_grid(locations: ndarray, values: ndarray, wsi: WSI, background: str = 'min') ndarray [source]¶
Bin heatmap values to a slide grid, using tile location information.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
wsi (slideflow.wsi.WSI) – WSI object.
- Keyword Arguments:
background (str, optional) – Background strategy for heatmap. Can be ‘min’, ‘mean’, ‘median’, ‘max’, or ‘mask’. Defaults to ‘min’.
- choice_input(prompt, valid_choices, default=None, multi_choice=False, input_type=<class 'str'>)[source]¶
Prompts user for multi-choice input.
- create_triangles(vertices, hole_vertices=None, hole_points=None)[source]¶
Tessellate a complex polygon, possibly with holes.
- Parameters:
vertices – A list of vertices [(x1, y1), (x2, y2), …] defining the polygon boundary.
hole_points – An optional list of points [(hx1, hy1), (hx2, hy2), …] inside each hole in the polygon.
- Returns:
A numpy array of vertices for the tessellated triangles.
- download_from_tcga(uuid: str, dest: str, message: str = 'Downloading...') None [source]¶
Download a file from TCGA (GDC) by UUID.
- get_ensemble_model_config(model_path: str) Dict [source]¶
Loads ensemble model configuration JSON file.
- get_gan_config(model_path: str) Dict [source]¶
Loads a GAN training_options.json for an associated network PKL.
- get_model_normalizer(model_path: str) StainNormalizer | None [source]¶
Loads and fits normalizer using configuration at a model path.
- get_preprocess_fn(model_path: str)[source]¶
Returns a function which preprocesses a uint8 image for a model.
- Parameters:
model_path (str) – Path to a saved Slideflow model.
- Returns:
A function which accepts a single image or batch of uint8 images, and returns preprocessed (and stain normalized) float32 images.
- get_relative_tfrecord_paths(root: str, directory: str = '') List[str] [source]¶
Returns relative tfrecord paths with respect to the given directory.
- get_slide_paths(slides_dir: str) List[str] [source]¶
Get all slide paths from a given directory containing slides.
- get_slides_from_model_manifest(model_path: str, dataset: str | None = None) List[str] [source]¶
Get list of slides from a model manifest.
- get_valid_model_dir(root: str) List [source]¶
Returns the path of the first nested directory in root. This only works when the nested directory name starts with a five-digit number, for example “00000-foldername”.
- Examples
If the root contains three directories:
root/00000-foldername/
root/00001-foldername/
root/00002-foldername/
The function returns “root/00000-foldername/”
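The lookup described above can be approximated with the standard library. The helper `first_numbered_subdir` is hypothetical, and both the sorting by name and the `None` result for an empty match are assumptions rather than documented Slideflow behavior.

```python
import os
import re
import tempfile
from typing import Optional

# Matches names that begin with exactly the pattern "five digits".
FIVE_DIGIT_PREFIX = re.compile(r"^\d{5}")


def first_numbered_subdir(root: str) -> Optional[str]:
    """Return the first subdirectory whose name starts with five digits."""
    candidates = sorted(
        name for name in os.listdir(root)
        if FIVE_DIGIT_PREFIX.match(name)
        and os.path.isdir(os.path.join(root, name))
    )
    return os.path.join(root, candidates[0]) if candidates else None


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("00001-foldername", "00000-foldername", "notes"):
        os.mkdir(os.path.join(root, name))
    found_name = os.path.basename(first_numbered_subdir(root))
```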
- infer_stride(locations, wsi)[source]¶
Infer the stride of a location grid from a set of tile locations.
- Parameters:
locations (np.ndarray) – Nx2 array of locations
wsi (slideflow.wsi.WSI) – WSI object
- Returns:
inferred stride divisor in pixels
- Return type:
- is_simclr_model_path(path: Any) bool [source]¶
Checks if the given path is a valid SimCLR model or checkpoint.
- is_tensorflow_model_path(path: str) bool [source]¶
Checks if the given path is a valid Slideflow/Tensorflow model.
- is_tile_size_compatible(tile_px1: int, tile_um1: str | int, tile_px2: int, tile_um2: str | int) bool [source]¶
Check whether tile sizes are compatible.
- Compatibility is defined as:
Equal size in pixels
If tile width (tile_um) is defined in microns (int) for both, these must be equal
If tile width (tile_um) is defined as a magnification (str) for both, these must be equal
If one is defined in microns and the other as a magnification, the magnification calculated from the micron width must be within +/- 2 of the stated magnification.
Example 1:
- tile_px1=299, tile_um1=302
- tile_px2=299, tile_um2=304
- Incompatible (unequal micron width)
Example 2:
- tile_px1=299, tile_um1=10x
- tile_px2=299, tile_um2=9x
- Incompatible (unequal magnification)
Example 3:
- tile_px1=299, tile_um1=302
- tile_px2=299, tile_um2=10x
- Compatible (first has an equivalent magnification of 9.9x, which is within +/- 2 of 10x)
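The rules and examples above can be sketched as a small standalone function. The helper names are hypothetical, and the conversion from microns to magnification (10x taken as 1 micron per pixel) is an assumption chosen because it reproduces the 9.9x figure in Example 3; Slideflow's own conversion may differ.

```python
from typing import Union

TileWidth = Union[int, str]


def approx_magnification(tile_px: int, tile_um: int) -> float:
    """Estimate magnification, assuming 10x equals 1 micron per pixel."""
    return 10 / (tile_um / tile_px)


def sizes_compatible(tile_px1: int, tile_um1: TileWidth,
                     tile_px2: int, tile_um2: TileWidth) -> bool:
    """Apply the compatibility rules listed above (hypothetical helper)."""
    if tile_px1 != tile_px2:
        return False
    # Both in microns, or both as magnification strings: require equality.
    if type(tile_um1) is type(tile_um2):
        return tile_um1 == tile_um2
    # Mixed case: convert the micron width and compare within +/- 2.
    if isinstance(tile_um1, str):
        stated, calculated = tile_um1, approx_magnification(tile_px2, tile_um2)
    else:
        stated, calculated = tile_um2, approx_magnification(tile_px1, tile_um1)
    return abs(calculated - float(stated.rstrip("xX"))) <= 2


example_1 = sizes_compatible(299, 302, 299, 304)
example_2 = sizes_compatible(299, "10x", 299, "9x")
example_3 = sizes_compatible(299, 302, 299, "10x")
```

Under these assumptions the three calls reproduce the outcomes of Examples 1 to 3 in order.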
- Parameters:
tile_px1 (int) – Tile size (in pixels) of first slide.
tile_um1 (int or str) – Tile size (in microns) of first slide. Can also be expressed as a magnification level, e.g. '10x'.
tile_px2 (int) – Tile size (in pixels) of second slide.
tile_um2 (int or str) – Tile size (in microns) of second slide. Can also be expressed as a magnification level, e.g. '10x'.
- Returns:
Whether the tile sizes are compatible.
- Return type:
bool
- is_torch_model_path(path: str) bool [source]¶
Checks if the given path is a valid Slideflow/PyTorch model.
- is_uq_model(model_path: str) bool [source]¶
Checks if the given model path points to a UQ-enabled model.
- isnumeric(val: Any) bool [source]¶
Check if the given value is numeric (numpy or python).
Tensors will return False.
Specifically checks if the value is a python int or float, or if the value is a numpy array with a numeric dtype (int or float).
- load_predictions(path: str, **kwargs) DataFrame [source]¶
Loads a ‘csv’, ‘parquet’ or ‘feather’ file to a pandas dataframe.
- Parameters:
path (str) – Path to the file to be read.
- Returns:
The dataframe read from the path.
- Return type:
df (pd.DataFrame)
- location_heatmap(locations: ndarray, values: ndarray, slide: str, tile_px: int, tile_um: int | str, filename: str, *, interpolation: str | None = 'bicubic', cmap: str = 'inferno', norm: str | None = None, background: str = 'min') None [source]¶
Generate a heatmap for a slide.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
slide (str) – Path to corresponding slide.
tile_px (int) – Tile pixel size.
filename (str) – Destination filename for heatmap.
- Keyword Arguments:
interpolation (str, optional) – Interpolation strategy for smoothing heatmap. Defaults to ‘bicubic’.
cmap (str, optional) – Matplotlib colormap for heatmap. Can be any valid matplotlib colormap. Defaults to ‘inferno’.
norm (str, optional) – Normalization strategy for assigning heatmap values to colors. Either ‘two_slope’, or any other valid value for the norm argument of matplotlib.pyplot.imshow. If ‘two_slope’, normalizes values less than 0 and greater than 0 separately. Defaults to None.
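To make the ‘two_slope’ option more concrete, the sketch below scales negative and positive values separately so that 0 always lands at the midpoint of the color range. This is an illustrative pure-Python interpretation of the description above, with a hypothetical function name; it is not the code Slideflow or matplotlib runs.

```python
from typing import List, Sequence


def two_slope_scale(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1] with 0 pinned at 0.5 (illustrative only)."""
    lowest = min(min(values), 0.0)
    highest = max(max(values), 0.0)
    scaled = []
    for v in values:
        if v < 0:
            # Negative side: [lowest, 0) maps onto [0, 0.5).
            scaled.append(0.5 - 0.5 * (v / lowest))
        elif v > 0:
            # Positive side: (0, highest] maps onto (0.5, 1].
            scaled.append(0.5 + 0.5 * (v / highest))
        else:
            scaled.append(0.5)
    return scaled


scaled = two_slope_scale([-4.0, -2.0, 0.0, 0.5, 1.0])
```

Because each side has its own slope, a small positive range and a large negative range both use half of the colormap.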
- log_manifest(train_tfrecords: List[str] | None = None, val_tfrecords: List[str] | None = None, *, labels: Dict[str, Any] | None = None, filename: str | None = None, remove_extension: bool = True) str [source]¶
Saves the training manifest in CSV format and returns as a string.
- Parameters:
- Keyword Arguments:
- Returns:
Saved manifest in str format.
- Return type:
str
- make_dir(_dir: str) None [source]¶
Makes a directory if one does not already exist, in a manner compatible with multithreading.
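One common way to get this behavior with the standard library is `os.makedirs` with `exist_ok=True`, which avoids the race in a separate check-then-create step. The function name `make_dir_sketch` is hypothetical, and this is only an assumption about how such a helper could work, not Slideflow's implementation.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_dir_sketch(path: str) -> None:
    """Create path if it is missing; tolerate concurrent creation."""
    # exist_ok=True means an already existing directory is not an error,
    # so two threads racing to create the same path both succeed.
    os.makedirs(path, exist_ok=True)


# Several threads request the same directory at once.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "tiles", "slide1")
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(make_dir_sketch, [target] * 8))
    created = os.path.isdir(target)
```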
- map_values_to_slide_grid(locations: ndarray, values: ndarray, wsi: WSI, background: str = 'min', *, interpolation: str | None = 'bicubic') ndarray [source]¶
Map heatmap values to a slide grid, using tile location information.
- Parameters:
locations (np.ndarray) – Array of shape (n_tiles, 2) containing x, y coordinates for all image tiles. Coordinates represent the center for an associated tile, and must be in a grid.
values (np.ndarray) – Array of shape (n_tiles,) containing heatmap values for each tile.
wsi (slideflow.wsi.WSI) – WSI object.
- Keyword Arguments:
- path_input(prompt: str, root: str, default: str | None = None, create_on_invalid: bool = False, filetype: str | None = None, verify: bool = True) str [source]¶
Prompts user for directory input.
- path_to_name(path: str) str [source]¶
Returns name of a file, without extension, from a given full path string.
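A comparable result can be obtained with `os.path`, as sketched below. The helper name is hypothetical, and how Slideflow treats multi-part extensions (such as `.tar.gz`) is not stated here; this sketch removes only the final extension.

```python
import os


def name_without_extension(path: str) -> str:
    """Return the file name from path with its final extension removed."""
    return os.path.splitext(os.path.basename(path))[0]


name = name_without_extension("/data/slides/slide1.svs")
```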
- setLoggingLevel(level)[source]¶
Set the logging level.
Uses standard python logging levels:
50: CRITICAL
40: ERROR
30: WARNING
20: INFO
10: DEBUG
0: NOTSET
- Parameters:
level (int) – Logging level numeric value.
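The numeric values listed above are the constants defined by Python's `logging` module, so either form can be used. The sketch below shows the equivalence with the standard library only; the logger name "slideflow" is an assumption made for illustration.

```python
import logging

# Named constants and their numeric values, as listed above.
levels = {
    name: getattr(logging, name)
    for name in ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")
}

# "slideflow" as the logger name is an assumption for this example.
logger = logging.getLogger("slideflow")
logger.setLevel(logging.WARNING)  # equivalent to logger.setLevel(30)

# With the level at WARNING, INFO messages are no longer emitted.
info_suppressed = not logger.isEnabledFor(logging.INFO)
```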
- tfrecord_heatmap(tfrecord: str, slide: str, tile_px: int, tile_um: int | str, tile_dict: Dict[int, float], filename: str, **kwargs) None [source]¶
Creates a tfrecord-based WSI heatmap using a dictionary of tile values for heatmap display.
- Parameters:
tfrecord (str) – Path to tfrecord.
slide (str) – Path to whole-slide image.
tile_dict (dict) – Dictionary mapping tfrecord indices to a tile-level value for display in heatmap format.
tile_px (int) – Tile width in pixels.
tile_um (int or str) – Tile width in microns (int) or magnification (str, e.g. “20x”).
filename (str) – Destination filename for heatmap.
- tile_size_label(tile_px: int, tile_um: str | int) str [source]¶
Return the string label of the given tile size.
- update_results_log(results_log_path: str, model_name: str, results_dict: Dict) None [source]¶
Dynamically update results_log when recording training metrics.