slideflow.slide

This module contains classes to load slides and extract tiles. For optimal performance, tile extraction should generally not be performed by instantiating these classes directly, but by calling either slideflow.Project.extract_tiles() or slideflow.Dataset.extract_tiles(), which include performance optimizations and additional functionality.

slideflow.WSI

class WSI(path: str, tile_px: int, tile_um: int | str, stride_div: int = 1, *, enable_downsample: bool = True, roi_dir: str | None = None, rois: List[str] | None = None, roi_method: str = 'auto', roi_filter_method: str | float = 'center', origin: str | Tuple[int, int] = (0, 0), pb: Progress | None = None, verbose: bool = True, use_edge_tiles: bool = False, mpp: float | None = None, simplify_roi_tolerance: float | None = None, artifact_labels: List[str] | None = None, **reader_kwargs: Any)

Loads a slide and its annotated region of interest (ROI).

Parameters:
  • path (str) – Path to slide.

  • tile_px (int) – Size of tiles to extract, in pixels.

  • tile_um (int or str) – Size of tiles to extract, in microns (int) or magnification (str, e.g. “20x”).

  • stride_div (int, optional) – Stride divisor for tile extraction (1 = no tile overlap; 2 = 50% overlap, etc). Defaults to 1.

  • enable_downsample (bool, optional) – Allow use of downsampled intermediate layers in the slide image pyramid, which greatly improves tile extraction speed. May result in artifacts for slides whose intermediate pyramid layers were incompletely generated. Defaults to True.

  • roi_dir (str, optional) – Directory in which to search for ROI CSV files. Defaults to None.

  • rois (list(str)) – Alternatively, a list of ROI paths can be explicitly provided. Defaults to None.

  • roi_method (str) – Either ‘inside’, ‘outside’, ‘auto’, or ‘ignore’. Determines how ROIs are used to extract tiles. If ‘inside’ or ‘outside’, will extract tiles in/out of an ROI, and raise errors.MissingROIError if an ROI is not available. If ‘auto’, will extract tiles inside an ROI if available, and across the whole-slide if no ROI is found. If ‘ignore’, will extract tiles across the whole-slide regardless of whether an ROI is available. Defaults to ‘auto’.

  • roi_filter_method (str or float) – Method of filtering tiles with ROIs. Either ‘center’ or float (0-1). If ‘center’, tiles are filtered with ROIs based on the center of the tile. If float, tiles are filtered based on the proportion of the tile inside the ROI, and roi_filter_method is interpreted as a threshold. If the proportion of a tile inside the ROI is greater than this number, the tile is included. For example, if roi_filter_method=0.7, a tile that is 80% inside of an ROI will be included, and a tile that is 50% inside of an ROI will be excluded. Defaults to ‘center’.

  • origin (str or tuple(int, int)) – Offset the starting grid (x, y). Either a tuple of ints or ‘random’. Defaults to (0, 0).

  • pb (Progress, optional) – Multiprocessing capable Progress instance; will update progress bar during tile extraction if provided.

  • verbose (bool, optional) – Controls verbosity of output. If False, suppresses warnings about slide skipping when ROIs are missing. Defaults to True.

  • mpp (float, optional) – Override the microns-per-pixel value for the slide. Defaults to None (auto-detects).

  • ignore_missing_mpp (bool, optional) – If a slide does not have microns-per-pixel (MPP) information stored in EXIF data (key 65326), set the MPP to a default value (sf.slide.DEFAULT_JPG_MPP). If False and MPP data is missing, raises sf.errors.SlideMissingMPPError.

  • use_bounds (bool) – If True, use the slide bounds to determine the slide dimensions. This will crop out unscanned white space. If a tuple of int, interprets the bounds as (top_left_x, top_left_y, width, height). If False, use the full slide dimensions. Only available when using Libvips (SF_SLIDE_BACKEND=libvips). Defaults to False.

  • transforms (list(int), optional) – List of transforms to apply to the slide before establishing coordinate grid. Options include any combination of ROTATE_90_CLOCKWISE, ROTATE_180_CLOCKWISE, ROTATE_270_CLOCKWISE, FLIP_HORIZONTAL, and FLIP_VERTICAL. Only available when using Libvips (SF_SLIDE_BACKEND=libvips). Defaults to None.

  • artifact_labels (list(str), optional) – List of ROI issue labels to treat as artifacts. If this is not None, all ROIs with a matching label are inverted with ROI.invert(). Defaults to None.
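
The relationship between tile_um, stride_div, and roi_filter_method described above can be sketched in plain Python. This is only an illustration of the documented behavior, not Slideflow's implementation: tile_geometry and keep_tile are hypothetical helpers, and the rounding of the tile width is an assumption.

```python
def tile_geometry(tile_um, slide_mpp, stride_div=1):
    """Return (extract_px, stride_px, overlap) for a tile size in microns."""
    # Width of one tile in full-resolution (level 0) pixels (rounding assumed).
    extract_px = round(tile_um / slide_mpp)
    # Distance between the origins of neighboring tiles.
    stride_px = extract_px / stride_div
    # Fraction of a tile shared with its neighbor along one axis.
    overlap = 1 - 1 / stride_div
    return extract_px, stride_px, overlap


def keep_tile(fraction_inside_roi, roi_filter_method):
    """Apply the float form of roi_filter_method as a threshold."""
    return fraction_inside_roi > roi_filter_method


# A 256-micron tile on a slide scanned at 0.5 microns-per-pixel, stride_div=2.
extract_px, stride_px, overlap = tile_geometry(256, 0.5, stride_div=2)
print(extract_px, stride_px, overlap)

# The example from the text: threshold 0.7, tiles 80% and 50% inside an ROI.
print(keep_tile(0.8, 0.7), keep_tile(0.5, 0.7))
```

With stride_div=1 the overlap is 0, matching "no tile overlap"; with stride_div=2 it is 0.5, matching "50% overlap".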

Attributes

WSI.dimensions

Dimensions of highest-magnification level (width, height)

WSI.qc_mask

Returns union of all QC masks.

WSI.levels

List of dict, with metadata for each level.

WSI.level_dimensions

List of list, with dimensions for each slide level.

WSI.level_downsamples

Downsample of each level (starts at 1, increases with lower mag).

WSI.level_mpp

Microns-per-pixel (MPP) for each level.

WSI.properties

Dictionary of metadata loaded from the slide.

WSI.slide

Backend-specific slide object.

WSI.vendor

Slide scanner vendor, if available.
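
As a rough illustration of how level_downsamples and level_mpp relate, the sketch below assumes that the MPP of a level is the base MPP multiplied by that level's downsample factor, and that a reader would pick the most downsampled level that is still at least as fine as the requested MPP. Both helpers are hypothetical and are not part of Slideflow.

```python
def level_mpp(base_mpp, level_downsamples):
    """MPP of each pyramid level, assuming mpp scales with the downsample."""
    return [base_mpp * d for d in level_downsamples]


def best_level(target_mpp, mpps):
    """Index of the coarsest level whose MPP does not exceed target_mpp."""
    candidates = [i for i, m in enumerate(mpps) if m <= target_mpp]
    # Fall back to the highest-magnification level if none qualifies.
    return max(candidates, key=lambda i: mpps[i]) if candidates else 0


mpps = level_mpp(0.25, [1, 4, 16])
print(mpps)
print(best_level(1.18, mpps))
```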

Methods

align_to(self, slide: WSI, apply: bool = True, *, finetune_depth: Sequence[float] | None = None, normalizer: str | None = 'reinhard_mask', allow_errors: bool = False) Tuple[Tuple[int, int], float]

Align this slide to another slide.

Alignment is performed by first aligning thumbnails at low magnification (mpp = 8), then progressively fine-tuning alignment at increasing magnification (mpp = 1, 0.5, 0.25), focused on a dense tissue region. The densest tissue region is identified using the QC mask, if available, otherwise via Otsu thresholding.

Parameters:
  • slide (slideflow.WSI) – Slide to align to.

  • apply (bool) – Whether to apply the alignment to the slide.

Keyword Arguments:
  • finetune_depth (Sequence[float], optional) – List of microns-per-pixel (MPP) values at which to fine-tune alignment. Defaults to [1, 0.5, 0.25].

  • normalizer (str, optional) – Stain normalization method to use. Defaults to ‘reinhard_mask’.

  • allow_errors (bool) – Whether to allow and ignore alignment errors when finetuning at higher magnification. Defaults to False.

Returns:

Tuple of (x, y) offset and MSE of initial alignment.

Raises:
  • TypeError – If slide is not a slideflow.WSI object.

  • AlignmentError – If initial, thumbnail-based alignment fails, or if finetuning alignment fails at any magnification and allow_errors is False.

align_tiles_to(self, slide: WSI, normalizer: str | None = 'reinhard_mask', *, allow_errors: bool = True, mask_on_fail: bool = True, align_by: str = 'fit', ignore_outliers=True, num_workers: int | None = None, **kwargs) ndarray

Align tiles to another slide.

Differs from slideflow.WSI.align_to() in that it aligns each tile individually, rather than the slide as a whole. This is useful when aligning slides with distortion, whose alignment may drift across the slide.

Parameters:
  • slide (slideflow.WSI) – Slide to align to.

  • normalizer (str, optional) – Stain normalization method to use.

Keyword Arguments:
  • allow_errors (bool) – Whether to allow and ignore alignment errors when fine-tuning alignment fails at any magnification. Defaults to True.

  • mask_on_fail (bool) – Whether to mask tiles that fail alignment. Defaults to True.

  • align_by (str) – Either ‘tile’ or ‘fit’. If ‘tile’, tiles are aligned individually. If ‘fit’, tiles are aligned by fitting a plane to the alignment of all tiles. Defaults to ‘fit’.

  • ignore_outliers (bool) – Whether to ignore outliers when fitting a plane to tile alignment. Defaults to True.

  • **kwargs – Keyword arguments passed to slideflow.WSI.align_to().

Raises:

ValueError – If align_by is not ‘tile’ or ‘fit’.

Returns:

Alignment grid, with shape = (grid_x, grid_y, 2).

Return type:

np.ndarray
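
The ‘fit’ option is described as fitting a plane to the alignment of all tiles. A minimal sketch of such a fit follows, assuming ordinary least squares over one offset component (for example, the x-offset of each tile as a function of its grid position). The helpers det3 and fit_plane are hypothetical; Slideflow's actual fitting and outlier handling may differ.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples."""
    n = len(samples)
    sx = sum(x for x, _, _ in samples)
    sy = sum(y for _, y, _ in samples)
    sz = sum(z for _, _, z in samples)
    sxx = sum(x * x for x, _, _ in samples)
    syy = sum(y * y for _, y, _ in samples)
    sxy = sum(x * y for x, y, _ in samples)
    sxz = sum(x * z for x, _, z in samples)
    syz = sum(y * z for _, y, z in samples)
    # Normal equations A @ [a, b, c] = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    det = det3(A)
    if det == 0:
        raise ValueError("Tile grid is degenerate; cannot fit a plane.")
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in A]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / det)
    return tuple(coeffs)


# Hypothetical x-offsets (pixels) measured on a 3 x 3 tile grid, drifting
# linearly across the slide.
samples = [(gx, gy, 2 * gx - gy + 3) for gx in range(3) for gy in range(3)]
a, b, c = fit_plane(samples)
print(a, b, c)
```

Fitting a plane smooths per-tile noise while still capturing drift across the slide, which is the situation the method description mentions for slides with distortion.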

apply_qc_mask(self, mask: ndarray | QCMask | None = None, filter_threshold: float | None = None, *, is_roi: bool = False) Image

Apply custom slide-level QC by filtering grid coordinates.

The mask should have a shape (height, width) proportional to the slide’s dimensions.

If the mask is numerical, the mask is thresholded at filter_threshold, with values above the threshold indicating a region to discard.

If the mask is a boolean array, True indicates a region to discard and False indicates a region to keep.

If the mask is a QCMask, the filter_threshold is ignored.

Parameters:
  • mask (np.ndarray or slideflow.slide.QCMask, optional) – Boolean QC mask array or QCMask object. If None, will re-apply the current masks. Defaults to None.

  • filter_threshold (float) – Percent of a tile detected as background that will trigger a tile to be discarded. Only used if mask is an np.ndarray. Defaults to 0.6.

Keyword Arguments:

is_roi (bool) – Whether the mask is an ROI mask. Only used if mask is an np.ndarray. Defaults to False.

Returns:

Image of applied QC mask.

Return type:

Image
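
The masking rules above can be illustrated with a small pure-Python sketch. It assumes a strict "greater than" comparison both when thresholding a numerical mask and when deciding whether a tile is discarded; to_discard_mask and discard_tile are hypothetical helpers, not Slideflow functions.

```python
def to_discard_mask(mask, filter_threshold=0.6):
    """Convert a numerical or boolean 2D mask into a boolean discard mask."""
    return [
        [v if isinstance(v, bool) else v > filter_threshold for v in row]
        for row in mask
    ]


def discard_tile(discard_mask, x0, y0, size, filter_threshold=0.6):
    """True if the masked fraction of the tile window exceeds the threshold."""
    window = [discard_mask[y][x0:x0 + size] for y in range(y0, y0 + size)]
    masked = sum(sum(row) for row in window)
    return masked / (size * size) > filter_threshold


T, F = True, False
bool_mask = [
    [T, T, F, F],
    [T, F, F, F],
    [F, F, F, F],
    [F, F, T, T],
]
# Top-left 2x2 tile is 75% masked; bottom-right 2x2 tile is 50% masked.
print(discard_tile(bool_mask, 0, 0, 2), discard_tile(bool_mask, 2, 2, 2))

# A numerical mask is thresholded first; values above 0.6 mark discard regions.
print(to_discard_mask([[0.9, 0.2], [0.7, 0.6]]))
```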

apply_segmentation(self, segmentation: sf.cellseg.Segmentation) None

Apply cell segmentation to the slide.

This sets the coordinates to the centroids of the segmentation.

Parameters:

segmentation (slideflow.cellseg.Segmentation) – Segmentation object to apply.

area(self) float

Calculate area (mm^2) of slide that passes QC masking.

build_generator(self, *, shuffle: bool = True, whitespace_fraction: float | None = None, whitespace_threshold: float | None = None, grayspace_fraction: float | None = None, grayspace_threshold: float | None = None, normalizer: str | StainNormalizer | None = None, normalizer_source: str | None = None, context_normalize: bool = False, num_threads: int | None = None, num_processes: int | None = None, show_progress: bool = False, img_format: str = 'numpy', full_core: bool = False, yolo: bool = False, draw_roi: bool = False, pool: Pool | None = None, dry_run: bool = False, lazy_iter: bool = False, shard: Tuple[int, int] | None = None, max_tiles: int | None = None, from_centroids: bool = False, apply_masks: bool = True, deterministic: bool = True) Callable | None

Builds a tile generator to extract tiles from this slide.

Keyword Arguments:
  • shuffle (bool) – Shuffle images during extraction.

  • whitespace_fraction (float, optional) – Range 0-1. Defaults to 1. Discard tiles with this fraction of whitespace. If 1, will not perform whitespace filtering.

  • whitespace_threshold (int, optional) – Range 0-255. Defaults to 230. Threshold above which a pixel (RGB average) is whitespace.

  • grayspace_fraction (float, optional) – Range 0-1. Defaults to 0.6. Discard tiles with this fraction of grayspace. If 1, will not perform grayspace filtering.

  • grayspace_threshold (float, optional) – Range 0-1. Defaults to 0.05. Pixels in HSV format with saturation below this threshold are considered grayspace.

  • normalizer (str, optional) – Normalization strategy to use on image tiles. Defaults to None.

  • normalizer_source (str, optional) – Stain normalization preset or path to a source image. Valid presets include ‘v1’, ‘v2’, and ‘v3’. If None, will use the default preset (‘v3’). Defaults to None.

  • context_normalize (bool) – If normalizing, use context from the rest of the slide when calculating stain matrix concentrations. Defaults to False (normalize each image tile as separate images).

  • num_threads (int) – If specified, will extract tiles with a ThreadPool using the specified number of threads. Cannot supply both num_threads and num_processes. Libvips is particularly slow with ThreadPools. Defaults to None in the Libvips backend, and the number of CPU cores when using cuCIM.

  • num_processes (int) – If specified, will extract tiles with a multiprocessing pool using the specified number of processes. Cannot supply both num_threads and num_processes. With the libvips backend, this defaults to half the number of CPU cores, and with cuCIM, this defaults to None.

  • show_progress (bool, optional) – Show a progress bar.

  • img_format (str, optional) – Image format. Either ‘numpy’, ‘jpg’, or ‘png’. Defaults to ‘numpy’.

  • yolo (bool, optional) – Include yolo-formatted tile-level ROI annotations in the return dictionary, under the key ‘yolo’. Defaults to False.

  • draw_roi (bool, optional) – Draws ROIs onto extracted tiles. Defaults to False.

  • dry_run (bool, optional) – Determine tiles that would be extracted, but do not export any images. Defaults to False.

  • max_tiles (int, optional) – Only extract this many tiles per slide. Defaults to None.

  • from_centroids (bool) – Extract tiles from cell segmentation centroids, rather than in a grid-wise pattern. Requires that cell segmentation has already been applied with WSI.apply_segmentation(). Defaults to False.

  • apply_masks (bool) – Apply cell segmentation masks to tiles. Ignored if cell segmentation has not been applied to the slide. Defaults to True.

  • deterministic (bool) – Return tile images in reproducible, deterministic order. May slightly slow iteration. Defaults to True.

  • shard (tuple(int, int), optional) – If provided, will only extract tiles from the shard with index shard[0] out of shard[1] shards. Defaults to None.

Returns:

A generator that yields a dictionary with the keys:

  • "image": image data.

  • "yolo": yolo-formatted annotations, (x_center, y_center, width, height), optional.

  • "grid": (x, y) grid coordinates of the tile.

  • "loc": (x, y) coordinates of tile center, in base (level=0) dimension.
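
The whitespace and grayspace filters described in the keyword arguments can be sketched with the standard library alone. This is an illustrative approximation: the helper names are hypothetical, pixels are given as a flat list of (R, G, B) tuples in the 0-255 range, and a tile is assumed to be discarded only when a fraction strictly exceeds its limit (which is consistent with a limit of 1 disabling the filter).

```python
import colorsys


def whitespace_fraction(pixels, threshold=230):
    """Fraction of pixels whose RGB average is above the threshold."""
    white = sum(1 for r, g, b in pixels if (r + g + b) / 3 > threshold)
    return white / len(pixels)


def grayspace_fraction(pixels, threshold=0.05):
    """Fraction of pixels whose HSV saturation is below the threshold."""
    gray = sum(
        1 for r, g, b in pixels
        if colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] < threshold
    )
    return gray / len(pixels)


def passes_filters(pixels, whitespace_max=1.0, grayspace_max=0.6):
    """True if the tile would be kept under both filters."""
    return (whitespace_fraction(pixels) <= whitespace_max
            and grayspace_fraction(pixels) <= grayspace_max)


tissue = (200, 80, 150)   # saturated, not white
white = (250, 250, 250)   # whitespace, and also zero saturation
gray = (128, 128, 128)    # zero saturation, but not white

mostly_tissue = [tissue] * 3 + [white]
mostly_gray = [tissue] + [gray] * 3
print(passes_filters(mostly_tissue), passes_filters(mostly_gray))
```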

dim_to_mpp(self, dimensions: Tuple[float, float]) float

get_tile_mask(self, index, sparse_mask) ndarray

Get a mask for a tile, given a sparse mask.

Examples

>>> import slideflow as sf
>>> from slideflow.cellseg import seg_utils, Segmentation
>>> segmentation = Segmentation(...)
>>> wsi = sf.WSI(...)
>>> wsi.apply_segmentation(segmentation)
>>> sparse_mask = seg_utils.sparse_mask(segmentation.masks)
>>> wsi.get_tile_mask(0, sparse_mask)
<numpy.ndarray>

Parameters:
  • index (int) – Index of tile.

  • sparse_mask (scipy.sparse.csr_matrix) – Sparse mask.

Returns:

Mask for tile.

Return type:

numpy.ndarray

get_tile_dataframe(self) DataFrame

Build a dataframe of tiles and associated ROI labels.

Returns:

Pandas dataframe of all tiles, with the following columns:

  • loc_x: X-coordinate of tile center

  • loc_y: Y-coordinate of tile center

  • grid_x: X grid index of the tile

  • grid_y: Y grid index of the tile

  • roi_name: Name of the ROI if tile is in an ROI, else None

  • roi_desc: Description of the ROI if tile is in ROI, else None

  • label: ROI label, if present.

extract_cells(self, tfrecord_dir: str | None = None, tiles_dir: str | None = None, img_format: str = 'jpg', report: bool = True, apply_masks: bool = True, **kwargs) SlideReport | None

Extract tiles from cell segmentation centroids.

Parameters:
  • tfrecord_dir (str) – If provided, saves tiles into a TFRecord file (named according to slide name) here.

  • tiles_dir (str) – If provided, saves loose images into a subdirectory (per slide name) here.

  • img_format (str) – ‘png’ or ‘jpg’. Format of images for internal storage in tfrecords. PNG (lossless) format recommended for fidelity, JPG (lossy) for efficiency. Defaults to ‘jpg’.

  • report (bool) – Generate and return PDF report of tile extraction.

  • apply_masks (bool) – Apply cell segmentation masks to the extracted tiles. Defaults to True.

Keyword Arguments:

**kwargs – All keyword arguments are passed to WSI.extract_tiles().

extract_tiles(self, tfrecord_dir: str | None = None, tiles_dir: str | None = None, img_format: str = 'jpg', report: bool = True, **kwargs) SlideReport | None

Extracts tiles from slide using the build_generator() method, saving tiles into a TFRecord file or as loose JPG tiles in a directory.

Parameters:
  • tfrecord_dir (str) – If provided, saves tiles into a TFRecord file (named according to slide name) here.

  • tiles_dir (str) – If provided, saves loose images in a subdirectory (per slide name) here.

  • img_format (str) – ‘png’ or ‘jpg’. Format of images for internal storage in tfrecords. PNG (lossless) format recommended for fidelity, JPG (lossy) for efficiency. Defaults to ‘jpg’.

Keyword Arguments:
  • whitespace_fraction (float, optional) – Range 0-1. Defaults to 1. Discard tiles with this fraction of whitespace. If 1, will not perform whitespace filtering.

  • whitespace_threshold (int, optional) – Range 0-255. Defaults to 230. Threshold above which a pixel (RGB average) is whitespace.

  • grayspace_fraction (float, optional) – Range 0-1. Defaults to 0.6. Discard tiles with this fraction of grayspace. If 1, will not perform grayspace filtering.

  • grayspace_threshold (float, optional) – Range 0-1. Defaults to 0.05. Pixels in HSV format with saturation below this threshold are considered grayspace.

  • normalizer (str, optional) – Normalization to use on image tiles. Defaults to None.

  • normalizer_source (str, optional) – Stain normalization preset or path to a source image. Valid presets include ‘v1’, ‘v2’, and ‘v3’. If None, will use the default preset (‘v3’). Defaults to None.

  • full_core (bool, optional) – Extract an entire detected core, rather than subdividing into image tiles. Defaults to False.

  • shuffle (bool) – Shuffle images during extraction.

  • yolo (bool, optional) – Export yolo-formatted tile-level ROI annotations (.txt) in the tile directory. Requires that tiles_dir is set. Defaults to False.

  • draw_roi (bool, optional) – Draws ROIs onto extracted tiles. Defaults to False.

  • dry_run (bool, optional) – Determine tiles that would be extracted, but do not export any images. Defaults to None.

  • num_threads (int) – If specified, will extract tiles with a ThreadPool using the specified number of threads. Cannot supply both num_threads and num_processes. Libvips is particularly slow with ThreadPools. Defaults to None in the Libvips backend, and the number of CPU cores when using cuCIM.

  • num_processes (int) – If specified, will extract tiles with a multiprocessing pool using the specified number of processes. Cannot supply both num_threads and num_processes. With the libvips backend, this defaults to half the number of CPU cores, and with cuCIM, this defaults to None.

export_rois(self, dest: str | None = None) str

Export loaded ROIs to a given destination, in CSV format.

ROIs are exported with the columns ‘roi_name’, ‘x_base’, and ‘y_base’. Coordinates are in base dimension (level 0) of the slide.

Parameters:

dest (str) – Path to destination folder. If not provided, will export ROIs in the current folder. Defaults to None.

Returns:

None

has_rois(self) bool

Checks if the slide has loaded ROIs and they are not being ignored.

load_csv_roi(self, path: str, *, process: bool = True, scale: int = 1, skip_invalid: bool = True, simplify_tolerance: float | None = None) int

Load ROIs from a CSV file.

CSV file must contain headers ‘ROI_name’, ‘X_base’, and ‘Y_base’.

Any previously loaded ROIs are cleared prior to loading.

Parameters:

path (str) – Path to CSV file.

Keyword Arguments:
  • process (bool) – Process ROIs after loading. Defaults to True.

  • scale (int) – Scale factor to apply to ROI coordinates. Defaults to 1.
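
To make the expected CSV layout concrete, the sketch below builds a small file with the required headers and parses it with the standard csv module. It assumes one polygon vertex per row, with rows grouped by ROI_name, and that scale multiplies each coordinate; read_roi_csv is a hypothetical stand-in, not the method itself.

```python
import csv
import io
from collections import defaultdict

csv_text = """ROI_name,X_base,Y_base
roi_0,100,100
roi_0,400,100
roi_0,400,300
roi_1,900,900
roi_1,950,900
roi_1,950,980
"""


def read_roi_csv(handle, scale=1):
    """Group (x, y) vertices by ROI name, applying a coordinate scale."""
    rois = defaultdict(list)
    for row in csv.DictReader(handle):
        x = float(row["X_base"]) * scale
        y = float(row["Y_base"]) * scale
        rois[row["ROI_name"]].append((x, y))
    return dict(rois)


rois = read_roi_csv(io.StringIO(csv_text))
for name, points in rois.items():
    print(name, points)
```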

load_json_roi(self, path: str, *, scale: int = 1, process: bool = True, skip_invalid: bool = True) int

Load ROIs from a JSON file.

JSON file must contain a ‘shapes’ key, with a list of dictionaries containing a ‘points’ key, whose value is a list of (x, y) coordinates.

Parameters:
  • path (str) – Path to JSON file.

  • scale (int) – Scale factor to apply to ROI coordinates. Defaults to 1.

  • process (bool) – Process ROIs after loading. Defaults to True.
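
The JSON structure described above can be illustrated with the standard json module. The document below contains only the keys named in the text (‘shapes’ and ‘points’); read_roi_json is a hypothetical helper, and treating scale as a multiplier on each coordinate is an assumption.

```python
import json

doc = {
    "shapes": [
        {"points": [[10, 10], [50, 10], [50, 40]]},
        {"points": [[100, 100], [120, 100], [120, 130]]},
    ]
}
json_text = json.dumps(doc)


def read_roi_json(text, scale=1):
    """Return one list of (x, y) vertices per entry in 'shapes'."""
    data = json.loads(text)
    return [
        [(x * scale, y * scale) for x, y in shape["points"]]
        for shape in data["shapes"]
    ]


shapes = read_roi_json(json_text)
print(len(shapes), shapes[0])
```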

load_roi_array(self, array: ndarray, *, process: bool = True, label: str | None = None, name: str | None = None, allow_errors: bool = False, simplify_tolerance: float | None = None) int

Load an ROI from a numpy array.

Parameters:

array (np.ndarray) – Array of shape (n_points, 2) containing the coordinates of the ROI shape, in base (level=0) dimension.

Keyword Arguments:

process (bool) – Process ROIs after loading. Defaults to True.

mpp_to_dim(self, mpp: float) Tuple[int, int]

predict(self, model: str, **kwargs) Tuple[ndarray, ndarray | None]

Generate a whole-slide prediction from a saved model.

Parameters:

model (str) – Path to saved model trained in Slideflow.

Keyword Arguments:
  • batch_size (int, optional) – Batch size for calculating predictions. Defaults to 32.

  • num_threads (int, optional) – Number of tile worker threads. Cannot supply both num_threads (uses thread pool) and num_processes (uses multiprocessing pool). Defaults to CPU core count.

  • num_processes (int, optional) – Number of child processes to spawn for multiprocessing pool. Defaults to None (does not use multiprocessing).

  • img_format (str, optional) – Image format (png, jpg) to use when extracting tiles from slide. Must match the image format the model was trained on. If ‘auto’, will use the format logged in the model params.json. Defaults to ‘auto’.

  • device (torch.device, optional) – PyTorch device. Defaults to initializing a new CUDA device.

  • generator_kwargs (dict, optional) – Keyword arguments passed to the slideflow.WSI.build_generator().

Returns:

  • np.ndarray – Predictions for each outcome, with shape = (num_classes,).

  • np.ndarray, optional – Uncertainty for each outcome, if the model was trained with uncertainty, with shape = (num_classes,).

Return type:

Tuple[np.ndarray, np.ndarray | None]

preview(self, rois: bool = True, thumb_kwargs: Dict | None = None, low_res: bool = True, **kwargs) Image | None

Performs a dry run of tile extraction without saving any images, returning a PIL image of the slide thumbnail annotated with a grid of tiles that were marked for extraction.

Parameters:

rois (bool, optional) – Draw ROI annotation(s) onto the image. Defaults to True.

Keyword Arguments:
  • whitespace_fraction (float, optional) – Range 0-1. Defaults to 1. Discard tiles with this fraction of whitespace. If 1, will not perform whitespace filtering.

  • whitespace_threshold (int, optional) – Range 0-255. Defaults to 230. Threshold above which a pixel (RGB average) is considered whitespace.

  • grayspace_fraction (float, optional) – Range 0-1. Defaults to 0.6. Discard tiles with this fraction of grayspace. If 1, will not perform grayspace filtering.

  • grayspace_threshold (float, optional) – Range 0-1. Defaults to 0.05. Pixels in HSV format with saturation below this threshold are considered grayspace.

  • full_core (bool, optional) – Extract an entire detected core, rather than subdividing into image tiles. Defaults to False.

  • num_threads (int) – Number of threads to allocate to workers.

  • yolo (bool, optional) – Export yolo-formatted tile-level ROI annotations (.txt) in the tile directory. Requires that tiles_dir is set. Defaults to False.

  • thumb_kwargs (Optional[Dict], optional) – Keyword arguments to pass to the thumb method. Defaults to None.

  • low_res (bool, optional) – Use low resolution thumbnail. Defaults to True.

process_rois(self)

Process loaded ROIs and apply to the slide grid.

Returns:

Number of ROIs processed.

Return type:

int

show_alignment(self, slide: WSI, mpp: float = 4) Image

Show aligned thumbnail of another slide.

square_thumb(self, width: int = 512, use_associated_image: bool = True, **kwargs) Image

Returns a square thumbnail of the slide, with black bar borders.

Parameters:

width (int) – Width/height of thumbnail in pixels.

Returns:

PIL image
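
A square thumbnail with black bar borders amounts to letterboxing. The arithmetic can be sketched as follows, assuming the slide is scaled so its longer side equals width and is then centered on a black square canvas; square_thumb_layout is a hypothetical helper and the centering is an assumption.

```python
def square_thumb_layout(slide_w, slide_h, width=512):
    """Return ((new_w, new_h), (offset_x, offset_y)) for a letterboxed thumb."""
    # Scale so the longer side fills the square canvas.
    scale = width / max(slide_w, slide_h)
    new_w = max(1, round(slide_w * scale))
    new_h = max(1, round(slide_h * scale))
    # Center the scaled image; the remainder is filled with black bars.
    offset_x = (width - new_w) // 2
    offset_y = (width - new_h) // 2
    return (new_w, new_h), (offset_x, offset_y)


size, offset = square_thumb_layout(40000, 20000, width=512)
print(size, offset)
```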

qc(self, method: str | Callable | List[Callable], *, blur_radius: int = 3, blur_threshold: float = 0.02, filter_threshold: float = 0.6, blur_mpp: float | None = None, pool: Pool | None = None) Image | None

Applies quality control to a slide, performing filtering based on a whole-slide image thumbnail.

‘blur’ method filters out blurry or out-of-focus slide sections. ‘otsu’ method filters out background based on automatic saturation thresholding in the HSV colorspace. ‘both’ applies both methods of filtering.

Parameters:
  • method (str, Callable, list(Callable)) – Quality control method(s). If a string, may be ‘blur’, ‘otsu’, or ‘both’. If a callable (or list of callables), each must accept a sf.WSI object and return a np.ndarray (dtype=np.bool).

  • blur_radius (int, optional) – Blur radius. Only used if method is ‘blur’ or ‘both’. Defaults to 3.

  • blur_threshold (float, optional) – Blur threshold. Only used if method is ‘blur’ or ‘both’. Defaults to 0.02.

  • filter_threshold (float) – Percent of a tile detected as background that will trigger a tile to be discarded. Defaults to 0.6.

  • blur_mpp (float, optional) – Size of WSI thumbnail on which to perform blur QC, in microns-per-pixel. Defaults to 4 times the tile extraction MPP (e.g. for a tile_px/tile_um combination at 10X effective magnification, where tile_px=tile_um, the default blur_mpp would be 4, or effective magnification 2.5x). Only used if method is ‘blur’ or ‘both’.

Returns:

Image of applied QC mask.

Return type:

Image
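
The default blur_mpp described above is simple arithmetic, shown here as a sketch. The helper names are hypothetical, and the conversion to effective magnification uses the convention implied by the example in the text (1 micron-per-pixel corresponds to roughly 10x).

```python
def default_blur_mpp(tile_px, tile_um):
    """Four times the tile extraction MPP, as stated for blur_mpp."""
    tile_mpp = tile_um / tile_px
    return 4 * tile_mpp


def effective_magnification(mpp):
    """Approximate magnification, assuming 1 micron-per-pixel is about 10x."""
    return 10 / mpp


# tile_px == tile_um gives a tile MPP of 1 (about 10x).
blur_mpp = default_blur_mpp(tile_px=256, tile_um=256)
print(blur_mpp, effective_magnification(blur_mpp))
```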

remove_qc(self) None

remove_roi(self, idx: int | List[int], *, process: bool = True) None

Remove an ROI from the slide.

Parameters:

idx (int, list(int)) – Index or indices of the ROI(s) to remove.

Keyword Arguments:

process (bool) – Process ROIs after removing. Defaults to True.

tensorflow(self, img_format: str = 'numpy', incl_slidenames: bool = False, incl_loc: str | None = None, shuffle: bool = True, **kwargs) Any

Create a TensorFlow Dataset which extracts tiles from this slide.

Parameters:
  • img_format (str, optional) – Image format for returned image tiles. Options include ‘png’, ‘jpg’, and ‘numpy’. Defaults to ‘numpy’.

  • incl_slidenames (bool, optional) – Yield slide names for each image tile. Defaults to False.

  • incl_loc (Optional[str], optional) – Yield image tile location with each image tile. Options include True, ‘coord’, or ‘grid’. If True or ‘coord’, will return X/Y coordinates of the tile center in the slide’s highest magnification layer. If ‘grid’, returns the grid indices for the tile. Defaults to None.

  • shuffle (bool, optional) – Shuffle image tiles. Defaults to True.

Returns:

tf.data.Dataset

Yields:

Iterator[Any] – Items yielded by the Dataset are in dictionary format, with the keys:

  • ‘image_raw’: Contains the image (jpg, png, or numpy)

  • ‘slide’: Slide name (if incl_slidenames=True)

  • ‘loc_x’: Image tile center x location (if incl_loc provided)

  • ‘loc_y’: Image tile center y location (if incl_loc provided)

torch(self, img_format: str = 'numpy', incl_slidenames: bool = False, incl_loc: str | None = None, shuffle: bool = True, infinite: bool = False, to_tensor: bool = True, **kwargs) Any

Create a PyTorch iterator which extracts tiles from this slide.

Parameters:
  • img_format (str, optional) – Image format for returned image tiles. Options include ‘png’, ‘jpg’, and ‘numpy’. Defaults to ‘numpy’.

  • incl_slidenames (bool, optional) – Yield slide names for each image tile. Defaults to False.

  • incl_loc (Optional[str], optional) – Yield image tile location with each image tile. Options include True, ‘coord’, or ‘grid’. If True or ‘coord’, will return X/Y coordinates of the tile center in the slide’s highest magnification layer. If ‘grid’, returns the grid indices for the tile. Defaults to None.

  • shuffle (bool, optional) – Shuffle image tiles. Defaults to True.

Returns:

An iterator which yields image tiles as Torch tensors.

Yields:

Iterator[Any] – Items yielded by the Dataset are in dictionary format, with the keys:

  • ‘image_raw’: Contains the image as a Tensor (jpg, png, or numpy)

  • ‘slide’: Slide name (if incl_slidenames=True)

  • ‘loc_x’: Image tile center x location (if incl_loc provided)

  • ‘loc_y’: Image tile center y location (if incl_loc provided)

thumb(self, mpp: float | None = None, width: int | None = None, *, coords: List[int] | None = None, rect_linewidth: int = 2, rect_color: str = 'black', rois: bool = False, linewidth: int = 2, color: str = 'black', use_associated_image: bool = False, low_res: bool = False) Image

Generate a PIL Image of the slide thumbnail, with ROI overlay.

Parameters:
  • mpp (float, optional) – Microns-per-pixel, used to determine thumbnail size.

  • width (int, optional) – Goal thumbnail width (alternative to mpp).

  • coords (list(int), optional) – List of tile extraction coordinates to show as rectangles on the thumbnail, in [(x_center, y_center), …] format. Defaults to None.

  • rois (bool, optional) – Draw ROIs onto thumbnail. Defaults to False.

  • linewidth (int, optional) – Width of ROI line. Defaults to 2.

  • color (str, optional) – Color of ROI. Defaults to black.

  • use_associated_image (bool) – Use the associated thumbnail image in the slide, rather than reading from a pyramid layer.

  • low_res (bool) – Create thumbnail from the lowest-magnification pyramid layer. Defaults to False.

Returns:

PIL image
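
How mpp or width might determine the thumbnail size can be sketched as below. This assumes the thumbnail is the level-0 image scaled uniformly, by base_mpp / mpp when mpp is given or by width / base_width otherwise. thumb_dimensions is a hypothetical helper; the actual sizing logic may differ.

```python
def thumb_dimensions(base_dims, base_mpp, mpp=None, width=None):
    """Return (width, height) of a thumbnail for a target mpp or width."""
    base_w, base_h = base_dims
    if mpp is not None:
        # A larger MPP means fewer pixels cover the same tissue.
        scale = base_mpp / mpp
    elif width is not None:
        scale = width / base_w
    else:
        raise ValueError("Provide either mpp or width.")
    return round(base_w * scale), round(base_h * scale)


# A 40000 x 20000 slide scanned at 0.25 microns-per-pixel.
print(thumb_dimensions((40000, 20000), 0.25, mpp=8))
print(thumb_dimensions((40000, 20000), 0.25, width=1000))
```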

verify_alignment(self, slide: WSI, mpp: float = 4) float

Verify alignment to another slide by calculating MSE.
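
Mean squared error (MSE) between two equally sized grayscale thumbnails can be computed as in the sketch below. Images are represented as nested lists of pixel intensities, and mean_squared_error is a hypothetical helper; how Slideflow prepares the thumbnails before comparison is not shown here.

```python
def mean_squared_error(image_a, image_b):
    """Average squared per-pixel difference between two 2D images."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("Images must have the same shape.")
    squared = [
        (pa - pb) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    ]
    return sum(squared) / len(squared)


reference = [[0, 10], [20, 30]]
shifted = [[0, 10], [20, 34]]
print(mean_squared_error(reference, reference))
print(mean_squared_error(reference, shifted))
```

A lower MSE indicates closer agreement between the two thumbnails; identical images give 0.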

view(self)

Open the slide in Slideflow Studio for interactive display.

See Slideflow Studio: Live Visualization for more information.

Other functions

predict(slide: str, model: str, *, stride_div: int = 1, **kwargs) Tuple[ndarray, ndarray | None]

Generate a whole-slide prediction from a saved model.

Parameters:
  • slide (str) – Path to slide.

  • model (str) – Path to saved model trained in Slideflow.

Keyword Arguments:
  • stride_div (int, optional) – Divisor for stride when convolving across the slide. Defaults to 1.

  • roi_dir (str, optional) – Directory in which slide ROI is contained. Defaults to None.

  • rois (list, optional) – List of paths to slide ROIs. Alternative to providing roi_dir. Defaults to None.

  • roi_method (str) – Either ‘inside’, ‘outside’, ‘auto’, or ‘ignore’. Determines how ROIs are used to extract tiles. If ‘inside’ or ‘outside’, will extract tiles in/out of an ROI, and raise errors.MissingROIError if an ROI is not available. If ‘auto’, will extract tiles inside an ROI if available, and across the whole-slide if no ROI is found. If ‘ignore’, will extract tiles across the whole-slide regardless of whether an ROI is available. Defaults to ‘auto’.

  • batch_size (int, optional) – Batch size for calculating predictions. Defaults to 32.

  • num_threads (int, optional) – Number of tile worker threads. Cannot supply both num_threads (uses thread pool) and num_processes (uses multiprocessing pool). Defaults to CPU core count.

  • num_processes (int, optional) – Number of child processes to spawn for multiprocessing pool. Defaults to None (does not use multiprocessing).

  • enable_downsample (bool, optional) – Enable the use of downsampled slide image layers. Defaults to True.

  • img_format (str, optional) – Image format (png, jpg) to use when extracting tiles from slide. Must match the image format the model was trained on. If ‘auto’, will use the format logged in the model params.json. Defaults to ‘auto’.

  • generator_kwargs (dict, optional) – Keyword arguments passed to the slideflow.WSI.build_generator().

  • device (torch.device, optional) – PyTorch device. Defaults to initializing a new CUDA device.

Returns:

  • np.ndarray – Predictions for each outcome, with shape = (num_classes,).

  • np.ndarray, optional – Uncertainty for each outcome, if the model was trained with uncertainty, with shape = (num_classes,).

Return type:

Tuple[np.ndarray, np.ndarray | None]