slideflow.SlideMap

slideflow.SlideMap assists with visualizing tiles and slides in two-dimensional space.

Once a model has been trained, tile-level predictions and intermediate layer activations can be calculated across an entire dataset with slideflow.DatasetFeatures. The slideflow.SlideMap class can then perform dimensionality reduction on these dataset-wide activations, plotting tiles and slides in two-dimensional space. Visualizing the distribution and clustering of tile-level and slide-level layer activations can help reveal underlying structures in the dataset and shared visual features among classes.

The primary method of use is to first generate a slideflow.DatasetFeatures object from a trained model, then call slideflow.DatasetFeatures.map_activations(), which returns an instance of slideflow.SlideMap.

ftrs = sf.DatasetFeatures(model='/path/', ...)
slide_map = ftrs.map_activations()

Alternatively, to map slides from a dataset in two-dimensional space using pre-calculated x and y coordinates, use the slideflow.SlideMap.from_xy() class method. In addition to X and Y, this method requires supplying tile-level metadata in the form of a list of dicts. Each dict must contain the name of the origin slide and the tile index in the slide TFRecord.

x = np.array(...)
y = np.array(...)
slides = ['slide1', 'slide1', 'slide5', ...]
slide_map = sf.SlideMap.from_xy(x=x, y=y, slides=slides)

class SlideMap(*, parametric_umap: bool = False)

Two-dimensional slide map for visualization & backend for mosaic maps.

Slides are mapped in 2D either explicitly with pre-specified coordinates, or with dimensionality reduction from post-convolutional layer activations, provided by slideflow.DatasetFeatures.

Backend for mapping slides into two dimensional space. Can use a DatasetFeatures object to map slides according to UMAP of features, or map according to pre-specified coordinates.

Can be initialized with three methods: from precalculated X/Y coordinates, from a DatasetFeatures object, or from a saved map.

Examples

Build a SlideMap from a DatasetFeatures object

dts_ftrs = sf.DatasetFeatures(model, dataset)
slidemap = sf.SlideMap.from_features(dts_ftrs)

Build a SlideMap from prespecified coordinates

x = np.array(...)
y = np.array(...)
slides = ['slide1', 'slide1', 'slide5', ...]
slidemap = sf.SlideMap.from_xy(
    x=x, y=y, slides=slides
)

Load a saved SlideMap

slidemap = sf.SlideMap.load('/directory/')
Parameters:

slides (list(str)) – List of slide names

Methods

activations(self) ndarray

Return associated DatasetFeatures activations as a numpy array corresponding to the points on this SlideMap.

build_mosaic(self, tfrecords: List[str] | None = None, **kwargs) Mosaic

Build a mosaic map.

Parameters:

tfrecords (list(str), optional) – List of tfrecord paths. If SlideMap was created using DatasetFeatures, this argument is not required.

Keyword Arguments:
  • num_tiles_x (int, optional) – Mosaic map grid size. Defaults to 50.

  • tile_select (str, optional) – ‘first’, ‘nearest’, or ‘centroid’. Determines how to choose a tile for display on each grid space. If ‘first’, will display the first valid tile in a grid space (fastest; recommended). If ‘nearest’, will display tile nearest to center of grid space. If ‘centroid’, for each grid, will calculate which tile is nearest to centroid tile_meta. Defaults to ‘nearest’.

  • tile_meta (dict, optional) – Tile metadata, used for tile_select. Dictionary should have slide names as keys, mapped to list of metadata (length of list = number of tiles in slide). Defaults to None.

  • normalizer ((str or slideflow.norm.StainNormalizer), optional) – Normalization strategy to use on image tiles. Defaults to None.

  • normalizer_source (str, optional) – Stain normalization preset or path to a source image. Valid presets include ‘v1’, ‘v2’, and ‘v3’. If None, will use the default preset (‘v3’). Defaults to None.
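
The difference between the ‘first’ and ‘nearest’ settings of tile_select can be illustrated with a small standalone sketch. It does not use slideflow and is not the library's implementation: the function select_tiles is hypothetical, it assumes coordinates already lie in the range 0 to 1, and it assumes a square grid with num_tiles_x cells per side. The ‘centroid’ mode is omitted because it depends on tile_meta.

```python
from math import hypot


def select_tiles(points, num_tiles_x=50, tile_select="nearest"):
    """Pick one point index per occupied grid cell (hypothetical helper).

    points: list of (x, y) coordinates, assumed to lie in [0, 1].
    Returns a dict mapping (col, row) grid cells to a point index.
    """
    if tile_select not in ("first", "nearest"):
        raise ValueError(f"Unsupported tile_select: {tile_select!r}")
    size = 1.0 / num_tiles_x
    chosen = {}  # (col, row) -> (point index, distance to cell center)
    for idx, (x, y) in enumerate(points):
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x / size), num_tiles_x - 1)
        row = min(int(y / size), num_tiles_x - 1)
        dist = hypot(x - (col + 0.5) * size, y - (row + 0.5) * size)
        cell = (col, row)
        if cell not in chosen:
            # 'first' keeps this point; 'nearest' may replace it later.
            chosen[cell] = (idx, dist)
        elif tile_select == "nearest" and dist < chosen[cell][1]:
            chosen[cell] = (idx, dist)
    return {cell: idx for cell, (idx, _) in chosen.items()}


points = [(0.10, 0.10), (0.24, 0.26), (0.90, 0.90)]
first = select_tiles(points, num_tiles_x=2, tile_select="first")
nearest = select_tiles(points, num_tiles_x=2, tile_select="nearest")
```

In this toy input the first two points share a grid cell: ‘first’ keeps point 0 because it was seen first, while ‘nearest’ keeps point 1 because it sits closer to the cell center.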

cluster(self, n_clusters: int) None

Performs K-means clustering on data and adds to metadata labels.

Clusters are saved to self.data[‘cluster’]. Requires that SlideMap was generated via DatasetFeatures.

Examples

Perform K-means clustering and apply cluster labels.

slidemap.cluster(n_clusters=5)
slidemap.plot()

Parameters:

n_clusters (int) – Number of clusters for K means clustering.
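
For readers unfamiliar with K-means, the following is a minimal sketch of the underlying idea (Lloyd's algorithm) in plain Python. It is an illustration only, not the code slideflow runs: the function kmeans is hypothetical, and the deterministic initialization from the first n_clusters points is a simplification chosen for reproducibility.

```python
def kmeans(points, n_clusters, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic initialization: use the first n_clusters points.
    centroids = [tuple(p) for p in points[:n_clusters]]
    labels = [None] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its closest centroid.
        new_labels = [
            min(
                range(n_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments are stable, so the algorithm has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # Leave an empty cluster's centroid where it was.
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = kmeans(features, n_clusters=2)
```

The resulting labels play the same role as the values stored in the ‘cluster’ column: one integer per point, which can then be used to color a plot.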

neighbors(self, slide_categories: Dict | None = None, algorithm: str = 'kd_tree', method: str = 'map', pca_dim: int = 100) None

Calculates neighbors among tiles in this map, assigning neighbor statistics to the tile metadata fields ‘num_unique_neighbors’ and ‘percent_matching_categories’.

Parameters:
  • slide_categories (dict, optional) – Maps slides to categories. Defaults to None. If provided, will be used to calculate ‘percent_matching_categories’ statistic.

  • algorithm (str, optional) – NearestNeighbor algorithm, either ‘kd_tree’, ‘ball_tree’, or ‘brute’. Defaults to ‘kd_tree’.

  • method (str, optional) – Either ‘map’, ‘pca’, or ‘features’. How neighbors are determined. If ‘map’, calculates neighbors based on UMAP coordinates. If ‘features’, calculates neighbors on the full feature space. If ‘pca’, reduces features into pca_dim space. Defaults to ‘map’.

  • pca_dim (int, optional) – Number of dimensions to which features are reduced when method is ‘pca’. Defaults to 100.
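
As a rough illustration of what neighbor statistics of this kind can look like, the sketch below runs a brute-force nearest-neighbor search in plain Python. Everything in it is an assumption made for illustration: the function neighbor_stats, the n_neighbors argument, and the two output keys are hypothetical, and the exact definitions slideflow uses for ‘num_unique_neighbors’ and ‘percent_matching_categories’ may differ.

```python
def neighbor_stats(coords, slides, slide_categories, n_neighbors=2):
    """Per-point neighbor statistics from a brute-force search (sketch).

    coords: list of coordinate tuples, one per tile.
    slides: list of slide names, parallel to coords.
    slide_categories: dict mapping slide name -> category.
    """
    stats = []
    for i, p in enumerate(coords):
        # Rank every other point by squared distance and keep the closest.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, coords[j])),
        )[:n_neighbors]
        own_category = slide_categories[slides[i]]
        matching = sum(slide_categories[slides[j]] == own_category for j in others)
        stats.append({
            # Assumed meaning: distinct slides represented among the neighbors.
            "unique_neighbor_slides": len({slides[j] for j in others}),
            # Assumed meaning: share of neighbors in the same category (0 to 1).
            "fraction_matching": matching / len(others),
        })
    return stats


coords = [(0, 0), (0, 1), (1, 0), (10, 10)]
slides = ["a", "a", "b", "c"]
categories = {"a": "tumor", "b": "tumor", "c": "normal"}
stats = neighbor_stats(coords, slides, categories, n_neighbors=2)
```

A point whose neighbors all share its category scores 1.0, while an isolated point surrounded by tiles from another category scores 0.0. Tree-based searches such as ‘kd_tree’ return the same neighbors as a brute-force search, only faster on large maps.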

filter(self, slides: List[str]) None

Filters map to only show tiles from the given slides.

Parameters:

slides (list(str)) – List of slide names.
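
Conceptually, filtering restricts the map to rows whose slide name is in the supplied list. The sketch below shows that idea on a plain list of dicts. The function filter_rows and the row layout are hypothetical, and unlike the method above, which returns None, the sketch returns a new list.

```python
def filter_rows(rows, slides):
    """Keep only rows whose 'slide' value is in the given slide names."""
    wanted = set(slides)  # Set membership keeps each lookup cheap.
    return [row for row in rows if row["slide"] in wanted]


rows = [
    {"slide": "slide1", "x": 0.1, "y": 0.2},
    {"slide": "slide5", "x": 0.7, "y": 0.4},
    {"slide": "slide1", "x": 0.3, "y": 0.9},
]
visible = filter_rows(rows, ["slide1"])
```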

umap_transform(self, array: ndarray, *, dim: int = 2, n_neighbors: int = 50, min_dist: float = 0.1, metric: str = 'cosine', **kwargs: Any) ndarray

Transforms a given array using UMAP projection. If a UMAP has not yet been fit, this will fit a new UMAP on the given data.

Parameters:

array (np.ndarray) – Array to transform with UMAP dimensionality reduction.

Keyword Arguments:
  • dim (int, optional) – Number of dimensions for UMAP. Defaults to 2.

  • n_neighbors (int, optional) – Number of neighbors for UMAP algorithm. Defaults to 50.

  • min_dist (float, optional) – Minimum distance argument for UMAP algorithm. Defaults to 0.1.

  • metric (str, optional) – Metric for UMAP algorithm. Defaults to ‘cosine’.

  • **kwargs (optional) – Additional keyword arguments for the UMAP function.

label(self, meta: str, translate: Dict | None = None) None

Displays each point labeled by tile metadata (e.g. ‘predicted_class’)

Parameters:
  • meta (str) – Data column from which to assign labels.

  • translate (dict, optional) – If provided, will translate the read metadata through this dictionary.
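
The effect of a translate dictionary can be sketched in a few lines of plain Python. The helper translate_labels is hypothetical and not part of slideflow. Passing unmapped values through unchanged is a choice made for this sketch, not documented library behavior.

```python
def translate_labels(values, translate=None):
    """Map raw metadata values through an optional lookup table."""
    if translate is None:
        return list(values)
    # Values missing from the table are passed through unchanged here;
    # that fallback is a choice made for this sketch.
    return [translate.get(value, value) for value in values]


predicted_class = [0, 1, 1, 0]
labels = translate_labels(predicted_class, {0: "normal", 1: "tumor"})
```

A mapping like this turns integer class indices into readable legend entries.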

label_by_preds(self, index: int) None

Displays each point with label equal to the prediction value (linear from 0-1)

Parameters:

index (int) – Logit index.

label_by_slide(self, slide_labels: Dict | None = None) None

Labels each point with the name of the corresponding slide.

If slide_labels is provided, will use this dict to label slides.

Parameters:

slide_labels (dict, optional) – Dict mapping slide names to labels.

label_by_uncertainty(self, index: int = 0) None

Labels each point with the tile-level uncertainty, if available.

Parameters:

index (int, optional) – Uncertainty index. Defaults to 0.

load(path: str)

Load a previously saved SlideMap (UMAP and coordinates).

Loads a SlideMap previously saved with SlideMap.save().

Expects a directory with slidemap.parquet, range_clip.npz, and either umap.pkl (non-parametric models) or a folder named parametric_model.

Examples

Save a SlideMap, then load it.

slidemap.save('/directory/')
new_slidemap = sf.SlideMap.load('/directory/')
Parameters:

path (str) – Directory from which to load a previously saved UMAP.

load_coordinates(self, path: str) None

Load coordinates from parquet file.

Parameters:

path (str) – Path to parquet file (.parquet) with SlideMap coordinates.

load_umap(self, path: str) umap.UMAP

Load only a UMAP model and not slide coordinates or range_clip.npz.

Parameters:

path (str) – Path to either umap.pkl or directory with saved parametric UMAP.

plot(self, subsample: int | None = None, title: str | None = None, cmap: Dict | None = None, xlim: Tuple[float, float] = (-0.05, 1.05), ylim: Tuple[float, float] = (-0.05, 1.05), xlabel: str | None = None, ylabel: str | None = None, legend: str | None = None, ax: Axes | None = None, loc: str | None = 'center right', ncol: int | None = 1, categorical: str | bool = 'auto', legend_kwargs: Dict | None = None, **scatter_kwargs: Any) None

Plots calculated map.

Parameters:
  • subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.

  • title (str, optional) – Title for plot.

  • cmap (dict, optional) – Dict mapping labels to colors.

  • xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).

  • ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).

  • xlabel (str, optional) – Label for x axis. Defaults to None.

  • ylabel (str, optional) – Label for y axis. Defaults to None.

  • legend (str, optional) – Title for legend. Defaults to None.

  • ax (matplotlib.axes.Axes, optional) – Figure axis. If not supplied, will prepare a new figure axis.

  • loc (str, optional) – Location for legend, as defined by matplotlib.axes.Axes.legend(). Defaults to ‘center right’.

  • ncol (int, optional) – Number of columns in legend, as defined by matplotlib.axes.Axes.legend(). Defaults to 1.

  • categorical (str or bool, optional) – Specify whether labels are categorical. Determines the colormap. Defaults to ‘auto’ (will attempt to automatically determine from the labels).

  • legend_kwargs (dict, optional) – Dictionary of additional keyword arguments to the matplotlib.axes.Axes.legend() function.

  • **scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.
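
Subsampling limits how many tiles are drawn, which keeps large maps legible and quick to render. One plausible way to do this is sketched below in plain Python. The function subsample_points and its seed argument are hypothetical, and slideflow may choose its subset differently.

```python
import random


def subsample_points(points, subsample=None, seed=0):
    """Return at most `subsample` points, chosen uniformly at random."""
    if subsample is None or subsample >= len(points):
        return list(points)
    rng = random.Random(seed)  # Local generator, so results are repeatable.
    # Sample indices without replacement, then restore the original order.
    keep = sorted(rng.sample(range(len(points)), subsample))
    return [points[i] for i in keep]


points = [(i / 10, i / 10) for i in range(10)]
subset = subsample_points(points, subsample=3)
```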

plot_3d(self, z: ndarray | None = None, feature: int | None = None, subsample: int | None = None, fig: Figure | None = None) None

Saves a plot of a 3D umap, with the 3rd dimension representing values provided by argument “z”.

Parameters:
  • z (list, optional) – Values for z axis. Must supply z or feature. Defaults to None.

  • feature (int, optional) – Int, feature to plot on 3rd axis. Must supply z or feature. Defaults to None.

  • subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.

  • fig (matplotlib.figure.Figure, optional) – Figure. If not supplied, will prepare a new figure.
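
The requirement to supply either z or feature can be read as follows: the third axis comes either from explicit values or from one column of the activations. The sketch below illustrates that reading in plain Python. The helper resolve_z is hypothetical, and giving z precedence when both arguments are supplied is an assumption of the sketch.

```python
def resolve_z(activations, z=None, feature=None):
    """Choose z-axis values from explicit values or a feature column."""
    if z is None and feature is None:
        raise ValueError("Must supply either z or feature.")
    if z is not None:
        # Explicit values win in this sketch; they must match the tile count.
        if len(z) != len(activations):
            raise ValueError("z must have one value per tile.")
        return list(z)
    # Otherwise read the requested feature from every activation vector.
    return [row[feature] for row in activations]


activations = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5]]
z_from_feature = resolve_z(activations, feature=1)
z_explicit = resolve_z(activations, z=[1.0, 2.0, 3.0])
```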

save(self, path: str, dpi: int = 300, **kwargs)

Save UMAP, plot, coordinates, and normalization values to a directory.

The UMAP, plot, coordinates, and normalization values can all be loaded from this directory after saving with sf.SlideMap.load(path).

Parameters:
  • path (str) – Directory in which to save the plot and UMAP. The UMAP image will be saved with the filename “slidemap.png”.

  • dpi (int, optional) – DPI for final image. Defaults to 300.

Keyword Arguments:
  • subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.

  • title (str, optional) – Title for plot.

  • cmap (dict, optional) – Dict mapping labels to colors.

  • xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).

  • ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).

  • xlabel (str, optional) – Label for x axis. Defaults to None.

  • ylabel (str, optional) – Label for y axis. Defaults to None.

  • legend (str, optional) – Title for legend. Defaults to None.

  • **scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.

save_3d(self, filename: str, dpi: int = 300, **kwargs)

Save 3D plot of slide map.

Parameters:
  • filename (str) – File path to save the image.

  • dpi (int, optional) – DPI for final image. Defaults to 300.

Keyword Arguments:
  • z (list, optional) – Values for z axis. Must supply z or feature. Defaults to None.

  • feature (int, optional) – Int, feature to plot on 3rd axis. Must supply z or feature. Defaults to None.

  • subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.

save_plot(self, filename: str, dpi: int = 300, **kwargs)

Save plot of slide map.

Parameters:
  • filename (str) – File path to save the image.

  • dpi (int, optional) – DPI for final image. Defaults to 300.

Keyword Arguments:
  • subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.

  • title (str, optional) – Title for plot.

  • cmap (dict, optional) – Dict mapping labels to colors.

  • xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).

  • ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).

  • xlabel (str, optional) – Label for x axis. Defaults to None.

  • ylabel (str, optional) – Label for y axis. Defaults to None.

  • legend (str, optional) – Title for legend. Defaults to None.

  • **scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.

save_coordinates(self, path: str) None

Save coordinates only to parquet file.

Parameters:

path (str) – Save coordinates to this location.

save_umap(self, path: str) None

Save UMAP, coordinates, and normalization information to a directory.

Parameters:

path (str) – Save UMAP and coordinates to this directory. Coordinates will be saved in this directory with the filename slidemap.parquet. The model will be saved as umap.pkl (non-parametric) or model.pkl (parametric).

save_encoder(self, path: str) None

Save Parametric UMAP encoder only.