slideflow.SlideMap¶
slideflow.SlideMap assists with visualizing tiles and slides in two-dimensional space.
Once a model has been trained, tile-level predictions and intermediate layer activations can be calculated across an entire dataset with slideflow.DatasetFeatures.
The slideflow.SlideMap class can then perform dimensionality reduction on these dataset-wide activations, plotting tiles and slides in two-dimensional space. Visualizing the distribution and clustering of tile-level and slide-level layer activations can help reveal underlying structures in the dataset and shared visual features among classes.
The primary method of use is to first generate a slideflow.DatasetFeatures object from a trained model, then call slideflow.DatasetFeatures.map_activations(), which returns an instance of slideflow.SlideMap.
ftrs = sf.DatasetFeatures(model='/path/', ...)
slide_map = ftrs.map_activations()
Alternatively, if you would like to map slides from a dataset in two-dimensional space using pre-calculated x and y coordinates, you can use the slideflow.SlideMap.from_xy() class method. In addition to X and Y, this method requires supplying tile-level metadata in the form of a list of dicts. Each dict must contain the name of the origin slide and the tile index in the slide TFRecord.
x = np.array(...)
y = np.array(...)
slides = ['slide1', 'slide1', 'slide5', ...]
slide_map = sf.SlideMap.from_xy(x=x, y=y, slides=slides)
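The call above passes three parallel sequences, where entry i of x, y, and slides describes the same tile. A minimal sketch of checking that invariant before building a map is shown below; check_xy_inputs is a hypothetical helper written for illustration and is not part of slideflow.

```python
import numpy as np

def check_xy_inputs(x, y, slides):
    """Validate parallel x, y, and slide-name inputs (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or y.ndim != 1:
        raise ValueError("x and y must be one-dimensional")
    if not (len(x) == len(y) == len(slides)):
        raise ValueError("x, y, and slides must have the same length")
    return x, y, list(slides)

# Three tiles: two from 'slide1' and one from 'slide5'.
x, y, slides = check_xy_inputs(
    [0.1, 0.4, 0.9],
    [0.2, 0.8, 0.5],
    ["slide1", "slide1", "slide5"],
)
# The validated values could then be passed on as in the call above:
# slide_map = sf.SlideMap.from_xy(x=x, y=y, slides=slides)
```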
- class SlideMap(*, parametric_umap: bool = False)[source]¶
Two-dimensional slide map for visualization & backend for mosaic maps.
Slides are mapped in 2D either explicitly with pre-specified coordinates, or with dimensionality reduction from post-convolutional layer activations, provided from slideflow.DatasetFeatures.
Backend for mapping slides into two-dimensional space. Can use a DatasetFeatures object to map slides according to UMAP of features, or map according to pre-specified coordinates.
Can be initialized with three methods: from precalculated X/Y coordinates, from a DatasetFeatures object, or from a saved map.
- Examples
Build a SlideMap from a DatasetFeatures object
dts_ftrs = sf.DatasetFeatures(model, dataset)
slidemap = sf.SlideMap.from_features(dts_ftrs)
Build a SlideMap from prespecified coordinates
x = np.array(...)
y = np.array(...)
slides = ['slide1', 'slide1', 'slide5', ...]
slidemap = sf.SlideMap.from_xy(x=x, y=y, slides=slides)
Load a saved SlideMap
slidemap = sf.SlideMap.load('/directory/')
Methods¶
- activations(self) ndarray ¶
Return associated DatasetFeatures activations as a numpy array corresponding to the points on this SlideMap.
- build_mosaic(self, tfrecords: List[str] | None = None, **kwargs) Mosaic ¶
Build a mosaic map.
- Parameters:
tfrecords (list(str), optional) – List of tfrecord paths. If SlideMap was created using DatasetFeatures, this argument is not required.
- Keyword Arguments:
num_tiles_x (int, optional) – Mosaic map grid size. Defaults to 50.
tile_select (str, optional) – ‘first’, ‘nearest’, or ‘centroid’. Determines how to choose a tile for display on each grid space. If ‘first’, will display the first valid tile in a grid space (fastest; recommended). If ‘nearest’, will display tile nearest to center of grid space. If ‘centroid’, for each grid, will calculate which tile is nearest to centroid tile_meta. Defaults to ‘nearest’.
tile_meta (dict, optional) – Tile metadata, used for tile_select. Dictionary should have slide names as keys, mapped to list of metadata (length of list = number of tiles in slide). Defaults to None.
normalizer ((str or slideflow.norm.StainNormalizer), optional) – Normalization strategy to use on image tiles. Defaults to None.
normalizer_source (str, optional) – Stain normalization preset or path to a source image. Valid presets include ‘v1’, ‘v2’, and ‘v3’. If None, will use the default preset (‘v3’). Defaults to None.
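The difference between the ‘first’ and ‘nearest’ options of tile_select can be sketched with a simple grid assignment. This is an illustration only, not slideflow's implementation: select_tiles is a hypothetical function, coordinates are assumed to lie in [0, 1], and the grid is assumed to be square.

```python
import numpy as np

def select_tiles(x, y, num_tiles_x=50, tile_select="nearest"):
    """Pick one point index per occupied grid cell (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cell = 1.0 / num_tiles_x
    # Column/row of each point; a coordinate of exactly 1.0 goes in the last cell.
    col = np.minimum((x / cell).astype(int), num_tiles_x - 1)
    row = np.minimum((y / cell).astype(int), num_tiles_x - 1)
    best = {}
    for i in range(len(x)):
        key = (int(col[i]), int(row[i]))
        if tile_select == "first":
            # Lowest index wins, i.e. the first point seen in the cell.
            score = float(i)
        elif tile_select == "nearest":
            # Distance from the point to the center of its grid cell.
            cx = (key[0] + 0.5) * cell
            cy = (key[1] + 0.5) * cell
            score = float(np.hypot(x[i] - cx, y[i] - cy))
        else:
            raise ValueError(f"Unsupported tile_select: {tile_select}")
        if key not in best or score < best[key][0]:
            best[key] = (score, i)
    return {key: idx for key, (score, idx) in best.items()}

xs = [0.1, 0.24, 0.9]
ys = [0.1, 0.26, 0.9]
first = select_tiles(xs, ys, num_tiles_x=2, tile_select="first")
nearest = select_tiles(xs, ys, num_tiles_x=2, tile_select="nearest")
```

With a 2 x 2 grid, the first two points share a cell: ‘first’ keeps index 0, while ‘nearest’ keeps index 1 because it lies closer to the cell center at (0.25, 0.25).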
- cluster(self, n_clusters: int) None ¶
Performs K-means clustering on data and adds to metadata labels.
Clusters are saved to self.data[‘cluster’]. Requires that SlideMap was generated via DatasetFeatures.
- Examples
Perform K-means clustering and apply cluster labels.
slidemap.cluster(n_clusters=5) slidemap.plot()
- Parameters:
n_clusters (int) – Number of clusters for K means clustering.
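For readers unfamiliar with K-means, the following is a minimal NumPy sketch of the algorithm (Lloyd's iteration). It is not the routine slideflow calls, and this section does not state which array is clustered; the kmeans function, its n_iter and seed arguments, and the toy data are assumptions for illustration.

```python
import numpy as np

def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct, randomly chosen points.
    start = rng.choice(len(points), size=n_clusters, replace=False)
    centroids = points[start]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (unchanged if empty).
        new_centroids = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated groups of three points each.
data = [[0, 0], [0, 0.1], [0.1, 0], [5, 5], [5, 5.1], [5.1, 5]]
labels, centroids = kmeans(data, n_clusters=2)
```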
- neighbors(self, slide_categories: Dict | None = None, algorithm: str = 'kd_tree', method: str = 'map', pca_dim: int = 100) None ¶
Calculates neighbors among tiles in this map, assigning neighboring statistics to tile metadata ‘num_unique_neighbors’ and ‘percent_matching_categories’.
- Parameters:
slide_categories (dict, optional) – Maps slides to categories. Defaults to None. If provided, will be used to calculate ‘percent_matching_categories’ statistic.
algorithm (str, optional) – NearestNeighbor algorithm, either ‘kd_tree’, ‘ball_tree’, or ‘brute’. Defaults to ‘kd_tree’.
method (str, optional) – Either ‘map’, ‘pca’, or ‘features’. How neighbors are determined. If ‘map’, calculates neighbors based on UMAP coordinates. If ‘features’, calculates neighbors on the full feature space. If ‘pca’, reduces features into pca_dim space. Defaults to ‘map’.
pca_dim (int, optional) – Number of dimensions for the PCA reduction used when method is ‘pca’. Defaults to 100.
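The two statistics are not defined precisely in this section. The sketch below shows one plausible reading, stated here as an assumption: ‘num_unique_neighbors’ counts the distinct slides among a tile's nearest neighbors, and ‘percent_matching_categories’ is the fraction of those neighbors whose slide category matches the tile's own. The neighbor_stats function and its n_neighbors argument are hypothetical, and a brute-force search stands in for the tree-based algorithms.

```python
import numpy as np

def neighbor_stats(coords, slides, slide_categories=None, n_neighbors=5):
    """Per-tile neighbor statistics under the assumptions stated above."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    if n < 2:
        raise ValueError("Need at least two points to find neighbors")
    k = min(n_neighbors, n - 1)
    # Brute-force pairwise distances; each point is excluded from its own list.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    nearest = np.argsort(dists, axis=1)[:, :k]
    stats = []
    for i in range(n):
        neighbor_slides = [slides[j] for j in nearest[i]]
        entry = {"num_unique_neighbors": len(set(neighbor_slides))}
        if slide_categories is not None:
            own = slide_categories[slides[i]]
            matches = sum(slide_categories[s] == own for s in neighbor_slides)
            # Reported as a fraction between 0 and 1.
            entry["percent_matching_categories"] = matches / k
        stats.append(entry)
    return stats

coords = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
tile_slides = ["a", "a", "b", "c", "c", "d"]
categories = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
stats = neighbor_stats(coords, tile_slides, categories, n_neighbors=2)
```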
- umap_transform(self, array: ndarray, *, dim: int = 2, n_neighbors: int = 50, min_dist: float = 0.1, metric: str = 'cosine', **kwargs: Any) ndarray ¶
Transforms a given array using UMAP projection. If a UMAP has not yet been fit, this will fit a new UMAP on the given data.
- Parameters:
array (np.ndarray) – Array to transform with UMAP dimensionality reduction.
- Keyword Arguments:
dim (int, optional) – Number of dimensions for UMAP. Defaults to 2.
n_neighbors (int, optional) – Number of neighbors for UMAP algorithm. Defaults to 50.
min_dist (float, optional) – Minimum distance argument for UMAP algorithm. Defaults to 0.1.
metric (str, optional) – Metric for UMAP algorithm. Defaults to ‘cosine’.
**kwargs (optional) – Additional keyword arguments for the UMAP function.
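The fit-on-first-use behavior described for umap_transform can be sketched as below. To keep the example dependent only on NumPy, a linear PCA projection stands in for UMAP; the LazyReducer class is hypothetical and shows only the control flow, not slideflow's code.

```python
import numpy as np

class LazyReducer:
    """Fits on the first transform call, then reuses the fitted projection.

    PCA is used here as a stand-in for UMAP (an assumption for illustration).
    """

    def __init__(self, dim=2):
        self.dim = dim
        self.mean_ = None
        self.components_ = None

    @property
    def is_fit(self):
        return self.components_ is not None

    def fit(self, array):
        array = np.asarray(array, dtype=float)
        self.mean_ = array.mean(axis=0)
        # Rows of vt are the principal axes of the centered data.
        _, _, vt = np.linalg.svd(array - self.mean_, full_matrices=False)
        self.components_ = vt[: self.dim]
        return self

    def transform(self, array):
        array = np.asarray(array, dtype=float)
        if not self.is_fit:
            # No projection fitted yet: fit one on the given data.
            self.fit(array)
        return (array - self.mean_) @ self.components_.T

rng = np.random.default_rng(0)
reducer = LazyReducer(dim=2)
first_coords = reducer.transform(rng.normal(size=(20, 5)))   # fits, then projects
fitted_axes = reducer.components_.copy()
second_coords = reducer.transform(rng.normal(size=(8, 5)))   # reuses the fit
```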
- label(self, meta: str, translate: Dict | None = None) None ¶
Displays each point labeled by tile metadata (e.g. ‘predicted_class’).
- label_by_preds(self, index: int) None ¶
Displays each point with label equal to the prediction value (from 0-1).
- Parameters:
index (int) – Logit index.
- label_by_slide(self, slide_labels: Dict | None = None) None ¶
Displays each point as the name of the corresponding slide.
If slide_labels is provided, will use this dict to label slides.
- Parameters:
slide_labels (dict, optional) – Dict mapping slide names to labels.
- label_by_uncertainty(self, index: int = 0) None ¶
Labels each point with the tile-level uncertainty, if available.
- Parameters:
index (int, optional) – Uncertainty index. Defaults to 0.
- load(path: str)¶
Load a previously saved SlideMap (UMAP and coordinates).
Loads a SlideMap previously saved with SlideMap.save(). Expects a directory with slidemap.parquet, range_clip.npz, and either umap.pkl (non-parametric models) or a folder named parametric_model.
- Examples
Save a SlideMap, then load it.
slidemap.save('/directory/')
new_slidemap = sf.SlideMap.load('/directory/')
- Parameters:
path (str) – Directory from which to load a previously saved UMAP.
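The directory layout expected by load() can be checked before loading. The sketch below uses only the file names listed above; missing_slidemap_files is a hypothetical helper and not part of slideflow.

```python
from pathlib import Path

def missing_slidemap_files(path):
    """Return the expected entries missing from a saved-SlideMap directory."""
    root = Path(path)
    missing = [
        name
        for name in ("slidemap.parquet", "range_clip.npz")
        if not (root / name).is_file()
    ]
    # Either a pickled UMAP or a parametric model folder must be present.
    has_umap = (root / "umap.pkl").is_file()
    has_parametric = (root / "parametric_model").is_dir()
    if not (has_umap or has_parametric):
        missing.append("umap.pkl or parametric_model/")
    return missing
```

An empty return value means the directory has every entry named in the description above; it does not guarantee that the files themselves are valid.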
- load_coordinates(self, path: str) None ¶
Load coordinates from parquet file.
- Parameters:
path (str, optional) – Path to parquet file (.parquet) with SlideMap coordinates.
- load_umap(self, path: str) umap.UMAP ¶
Load only a UMAP model and not slide coordinates or range_clip.npz.
- Parameters:
path (str) – Path to either umap.pkl or directory with saved parametric UMAP.
- plot(self, subsample: int | None = None, title: str | None = None, cmap: Dict | None = None, xlim: Tuple[float, float] = (-0.05, 1.05), ylim: Tuple[float, float] = (-0.05, 1.05), xlabel: str | None = None, ylabel: str | None = None, legend: str | None = None, ax: Axes | None = None, loc: str | None = 'center right', ncol: int | None = 1, categorical: str | bool = 'auto', legend_kwargs: Dict | None = None, **scatter_kwargs: Any) None ¶
Plots calculated map.
- Parameters:
subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.
title (str, optional) – Title for plot.
cmap (dict, optional) – Dict mapping labels to colors.
xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).
ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).
xlabel (str, optional) – Label for x axis. Defaults to None.
ylabel (str, optional) – Label for y axis. Defaults to None.
legend (str, optional) – Title for legend. Defaults to None.
ax (matplotlib.axes.Axes, optional) – Figure axis. If not supplied, will prepare a new figure axis.
loc (str, optional) – Location for legend, as defined by matplotlib.axes.Axes.legend(). Defaults to ‘center right’.
ncol (int, optional) – Number of columns in legend, as defined by matplotlib.axes.Axes.legend(). Defaults to 1.
categorical (str, optional) – Specify whether labels are categorical. Determines the colormap. Defaults to ‘auto’ (will attempt to automatically determine from the labels).
legend_kwargs (dict, optional) – Dictionary of additional keyword arguments to the matplotlib.axes.Axes.legend() function.
**scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.
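The effect of the subsample argument can be sketched as choosing a subset of point indices before plotting. Random sampling without replacement is an assumption here, as this section does not say how tiles are chosen; subsample_indices and its seed argument are hypothetical.

```python
import numpy as np

def subsample_indices(n_points, subsample=None, seed=0):
    """Return sorted indices of the points to keep on the plot."""
    if subsample is None or subsample >= n_points:
        # Nothing to drop: keep every point.
        return np.arange(n_points)
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n_points, size=subsample, replace=False))

keep = subsample_indices(100, subsample=10)
```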
- plot_3d(self, z: ndarray | None = None, feature: int | None = None, subsample: int | None = None, fig: Figure | None = None) None ¶
Saves a plot of a 3D umap, with the 3rd dimension representing values provided by argument “z”.
- Parameters:
z (list, optional) – Values for z axis. Must supply z or feature. Defaults to None.
feature (int, optional) – Int, feature to plot on 3rd axis. Must supply z or feature. Defaults to None.
subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.
fig (matplotlib.figure.Figure, optional) – Figure. If not supplied, will prepare a new figure.
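The "must supply z or feature" rule can be sketched as follows. It assumes that feature indexes a column of the array returned by activations(), and that z takes precedence when both are given; neither detail is stated in this section, and resolve_z is a hypothetical helper.

```python
import numpy as np

def resolve_z(activations, z=None, feature=None):
    """Choose the values for the third axis from z or from a feature column."""
    activations = np.asarray(activations, dtype=float)
    if z is None and feature is None:
        raise ValueError("Must supply either z or feature")
    if z is not None:
        z = np.asarray(z, dtype=float)
        if len(z) != len(activations):
            raise ValueError("z must have one value per point")
        return z
    # Use one feature (column) of the activations as the z value.
    return activations[:, feature]

acts = np.arange(12, dtype=float).reshape(4, 3)  # 4 points, 3 features
z_from_feature = resolve_z(acts, feature=1)
z_explicit = resolve_z(acts, z=[9, 8, 7, 6])
```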
- save(self, path: str, dpi: int = 300, **kwargs)¶
Save UMAP, plot, coordinates, and normalization values to a directory.
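The UMAP, plot, coordinates, and normalization values can all be loaded from this directory after saving with sf.SlideMap.load(path).
- Parameters:
path (str) – Directory in which to save the UMAP, plot, coordinates, and normalization values.
dpi (int, optional) – DPI for the saved plot. Defaults to 300.
- Keyword Arguments: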
subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.
title (str, optional) – Title for plot.
cmap (dict, optional) – Dict mapping labels to colors.
xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).
ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).
xlabel (str, optional) – Label for x axis. Defaults to None.
ylabel (str, optional) – Label for y axis. Defaults to None.
legend (str, optional) – Title for legend. Defaults to None.
**scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.
- save_3d(self, filename: str, dpi: int = 300, **kwargs)¶
Save 3D plot of slide map.
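- Parameters:
filename (str) – File path at which to save the 3D plot.
dpi (int, optional) – DPI for the saved image. Defaults to 300.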
- Keyword Arguments:
- save_plot(self, filename: str, dpi: int = 300, **kwargs)¶
Save plot of slide map.
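- Parameters:
filename (str) – File path at which to save the plot.
dpi (int, optional) – DPI for the saved image. Defaults to 300.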
- Keyword Arguments:
subsample (int, optional) – Subsample to only include this many tiles on plot. Defaults to None.
title (str, optional) – Title for plot.
cmap (dict, optional) – Dict mapping labels to colors.
xlim (list, optional) – List of float indicating limit for x-axis. Defaults to (-0.05, 1.05).
ylim (list, optional) – List of float indicating limit for y-axis. Defaults to (-0.05, 1.05).
xlabel (str, optional) – Label for x axis. Defaults to None.
ylabel (str, optional) – Label for y axis. Defaults to None.
legend (str, optional) – Title for legend. Defaults to None.
**scatter_kwargs (optional) – Additional keyword arguments to the seaborn scatterplot function.
- save_coordinates(self, path: str) None ¶
Save coordinates only to parquet file.
- Parameters:
path (str, optional) – Save coordinates to this location.
- save_umap(self, path: str) None ¶
Save UMAP, coordinates, and normalization information to a directory.
- Parameters:
path (str, optional) – Save UMAP and coordinates to this directory. Coordinates will be saved in this directory with the filename slidemap.parquet. Model will be saved as umap.pkl (non-parametric) or model.pkl (parametric).