slideflow
- about(console=None) -> None
Print a summary of the slideflow version and active backends.
- Example
>>> sf.about()
╭=======================╮
│       Slideflow       │
│    Version: 3.0.0     │
│    Backend: torch     │
│ Slide Backend: cucim  │
│ https://slideflow.dev │
╰=======================╯
- Parameters:
console (rich.console.Console, optional) – Active console, if one exists. Defaults to None.
- build_feature_extractor(name: str, backend: str | None = None, **kwargs) -> BaseFeatureExtractor
Build a feature extractor.
The returned feature extractor is a callable object, which returns features (often layer activations) for either a batch of images or a slideflow.WSI object.
If generating features for a batch of images, images are expected to be in (B, W, H, C) format and non-standardized (scaled 0-255) with dtype uint8. The feature extractors perform all needed preprocessing on the fly.
If generating features for a slide, the slide is expected to be a slideflow.WSI object. The feature extractor will generate features for each tile in the slide, returning a numpy array of shape (W, H, F), where F is the number of features.
- Parameters:
name (str) – Name of the feature extractor to build. Available feature extractors are listed with slideflow.model.list_extractors().
- Keyword Arguments:
tile_px (int) – Tile size (input image size), in pixels.
**kwargs (Any) – All remaining keyword arguments are passed to the feature extractor factory function, and may be different for each extractor.
- Returns:
A callable object which accepts a batch of images (B, W, H, C) of dtype uint8 and returns a batch of features (dtype float32).
- Examples
Create an extractor that calculates post-convolutional layer activations from an imagenet-pretrained Resnet50 model.
    import slideflow as sf

    extractor = sf.build_feature_extractor(
        'resnet50_imagenet'
    )
Create an extractor that calculates ‘conv4_block4_2_relu’ activations from an imagenet-pretrained Resnet50 model.
    extractor = sf.build_feature_extractor(
        'resnet50_imagenet',
        layers='conv4_block4_2_relu'
    )
Create a pretrained “CTransPath” extractor.
extractor = sf.build_feature_extractor('ctranspath')
Use an extractor to calculate layer activations for an entire dataset.
    import slideflow as sf

    # Load a project and dataset
    P = sf.load_project(...)
    dataset = P.dataset(...)

    # Create a feature extractor
    resnet = sf.build_feature_extractor(
        'resnet50_imagenet'
    )

    # Calculate features for the entire dataset
    features = sf.DatasetFeatures(
        resnet,
        dataset=dataset
    )
Generate a map of features across a slide.
    import slideflow as sf

    # Load a slide
    wsi = sf.WSI(...)

    # Create a feature extractor
    retccl = sf.build_feature_extractor(
        'retccl',
        resize=True
    )

    # Create a feature map, a 2D array of shape
    # (W, H, F), where F is the number of features.
    features = retccl(wsi)
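The batch format described above, (B, W, H, C) with dtype uint8 and values scaled 0-255, can be prepared with a few lines of NumPy. The following is a minimal sketch rather than part of slideflow: `to_uint8_batch` is a hypothetical helper, and it assumes that any non-uint8 input holds floats scaled to [0, 1].

```python
import numpy as np


def to_uint8_batch(images):
    """Hypothetical helper (not part of slideflow).

    Assumes non-uint8 input is float data scaled to [0, 1], and returns
    a 4D uint8 batch scaled 0-255.
    """
    images = np.asarray(images)
    if images.ndim == 3:
        # A single image becomes a batch of one.
        images = images[np.newaxis, ...]
    if images.ndim != 4:
        raise ValueError(f"Expected 3 or 4 dimensions, got {images.ndim}")
    if images.dtype != np.uint8:
        # Rescale [0, 1] floats to 0-255 and cast to uint8.
        images = np.clip(np.rint(images * 255.0), 0, 255).astype(np.uint8)
    return images


# Two 299 x 299 RGB images with float values in [0, 1).
rng = np.random.default_rng(0)
float_batch = rng.random((2, 299, 299, 3), dtype=np.float32)

batch = to_uint8_batch(float_batch)
print(batch.shape, batch.dtype)  # (2, 299, 299, 3) uint8
```

Depending on the active backend, such an array may still need converting to that backend's tensor type before it is passed to an extractor. That detail is an assumption and is not covered here.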
- create_project(root: str, cfg: Dict | str | None = None, *, download: bool = False, md5: bool = False, **kwargs) -> Project
Create a project at the existing folder from a given configuration.
Supports both manual project creation via keyword arguments, and setting up a project through a specified configuration. The configuration may be a dictionary or a path to a JSON file containing a dictionary. It must have the key ‘annotations’, which holds the path to an annotations file, and may optionally have the following keys:
name: Name for the project and dataset.
rois: Path to .tar.gz file containing compressed ROIs.
slides: Path in which slides will be stored.
tiles: Path in which extracted tiles will be stored.
tfrecords: Path in which TFRecords will be stored.
    import slideflow as sf

    P = sf.create_project(
        root='path',
        annotations='file.csv',
        slides='path',
        tfrecords='path'
    )
Annotations files are copied into the created project folder.
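The configuration format described above (a dictionary, or a JSON file containing one, with a required ‘annotations’ key) might be prepared as in the sketch below. `check_project_config` is a hypothetical helper, not a slideflow function, and all paths are placeholders.

```python
import json
import os
import tempfile

# Optional keys named in the description above.
OPTIONAL_KEYS = ("name", "rois", "slides", "tiles", "tfrecords")


def check_project_config(cfg):
    """Hypothetical helper (not part of slideflow).

    Verifies that the required 'annotations' key is present and returns
    the optional keys that were supplied.
    """
    if "annotations" not in cfg:
        raise KeyError("Configuration must contain the key 'annotations'.")
    return [key for key in OPTIONAL_KEYS if key in cfg]


# Placeholder values, for illustration only.
cfg = {
    "annotations": "file.csv",
    "name": "MyProject",
    "slides": "path/to/slides",
    "tfrecords": "path/to/tfrecords",
}
print(check_project_config(cfg))  # ['name', 'slides', 'tfrecords']

# Write the dictionary to a JSON file, the second accepted form of `cfg`.
cfg_path = os.path.join(tempfile.mkdtemp(), "project_config.json")
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)

# With slideflow installed, either form could then be supplied as `cfg`:
# P = sf.create_project(root='path', cfg=cfg)
# P = sf.create_project(root='path', cfg=cfg_path)
```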
Alternatively, you can create a project using a prespecified configuration, of which there are three available:
sf.project.LungAdenoSquam()
sf.project.ThyroidBRS()
sf.project.BreastER()
When creating a project from a configuration, setting download=True will download the annotations file and slides from The Cancer Genome Atlas (TCGA).

    import slideflow as sf

    project = sf.create_project(
        root='path',
        cfg=sf.project.LungAdenoSquam(),
        download=True
    )
- Parameters:
- Keyword Arguments:
download (bool) – Download any missing slides from the Genomic Data Commons (GDC) automatically, using slide names stored in the annotations file.
md5 (bool) – Perform MD5 hash verification for all slides using the GDC (TCGA) MD5 manifest, which will be downloaded.
name (str) – Set the project name. This has higher priority than any supplied configuration, which will be ignored.
slides (str) – Set the destination folder for slides. This has higher priority than any supplied configuration, which will be ignored.
tiles (str) – Set the destination folder for tiles. This has higher priority than any supplied configuration, which will be ignored.
tfrecords (str) – Set the destination for TFRecords. This has higher priority than any supplied configuration, which will be ignored.
roi_dest (str) – Set the destination folder for ROIs.
dataset_config (str) – Path to dataset configuration JSON file for the project. Defaults to ‘./datasets.json’.
sources (list(str)) – List of dataset sources to include in project. Defaults to ‘MyProject’.
models_dir (str) – Path to directory in which to save models. Defaults to ‘./models’.
eval_dir (str) – Path to directory in which to save evaluations. Defaults to ‘./eval’.
- Returns:
slideflow.Project
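The priority rule stated for name, slides, tiles, and tfrecords (a keyword argument overrides the matching configuration value) can be pictured with plain dictionaries. This is only a sketch of the documented behavior, not slideflow's implementation; `resolve_project_settings` is a hypothetical name.

```python
def resolve_project_settings(cfg, **overrides):
    """Hypothetical helper (not part of slideflow).

    Returns a copy of `cfg` in which any supplied keyword argument
    replaces the configuration value stored under the same key.
    """
    resolved = dict(cfg)
    for key in ("name", "slides", "tiles", "tfrecords"):
        if overrides.get(key) is not None:
            resolved[key] = overrides[key]
    return resolved


# Placeholder configuration values, for illustration only.
cfg = {"annotations": "file.csv", "name": "FromConfig", "slides": "cfg/slides"}

settings = resolve_project_settings(cfg, name="FromKeyword", tiles="kw/tiles")
print(settings["name"])    # FromKeyword
print(settings["slides"])  # cfg/slides
print(settings["tiles"])   # kw/tiles
```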