Slideflow offers tools for training many types of neural networks, including:

  • Weakly supervised, tile-based models: Models trained on image tiles, with labels inherited from the parent slide.

  • Weakly supervised, multi-instance learning: Models trained on feature vectors, with labels inherited from the parent slide.

  • Strongly supervised models: Models trained on image tiles, with labels assigned by ROI.

  • Self-supervised pretraining: Contrastive pretraining with or without labels (e.g. SimCLR).

  • Generative adversarial networks: Models trained to generate synthetic images (e.g. StyleGAN2/3).

  • Segmentation models: Models trained to identify and classify tissue regions (e.g. U-Net).

In this section, we will walk through the process of training a weakly supervised, tile-based model. Strong supervision, multi-instance learning (MIL), self-supervised pretraining (SSL), generative adversarial networks (GAN), and tissue segmentation are described in other sections.
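To make the idea of label inheritance concrete, here is a minimal sketch in plain Python (it does not use Slideflow). The slide names, tile names, and labels are hypothetical and chosen only for illustration.

```python
# Hypothetical slide-level labels (in practice these come from annotations).
slide_labels = {"slide_A": "adenocarcinoma", "slide_B": "squamous"}

# Hypothetical tiles, each recorded with the slide it was extracted from.
tiles = [
    ("slide_A", "tile_0"),
    ("slide_A", "tile_1"),
    ("slide_B", "tile_0"),
]

# Weak supervision: every tile inherits the label of its parent slide,
# whether or not that particular tile shows the labeled feature.
tile_labels = [(slide, tile, slide_labels[slide]) for slide, tile in tiles]

for slide, tile, label in tile_labels:
    print(f"{slide}/{tile}: {label}")
```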

Prepare hyperparameters

The first step of training a weakly supervised model is configuring model parameters and hyperparameters with slideflow.ModelParams. ModelParams determines the model architecture, loss, preprocessing augmentations, and training hyperparameters.

import slideflow as sf

hp = sf.ModelParams(
  epochs=[1, 5],
  ...
)

See the slideflow.ModelParams API documentation for a list of available hyperparameters.


If you are using a continuous variable as an outcome measure, be sure to use a linear loss function. Linear loss functions can be viewed in slideflow.ModelParams.LinearLossDict, and all available loss functions are in slideflow.ModelParams.AllLossDict.

Training a model

Slideflow provides two methods for training models: with the high-level slideflow.Project.train() function or with the lower-level slideflow.model.Trainer. The former provides an easier interface for executing complex training tasks with a single function call, while the latter provides lower-level access for greater customizability.

Training with a Project

slideflow.Project.train() provides an easy API for executing complex training plans and organizing results in the project directory. This is the recommended way to train models in Slideflow. There are two required arguments for this function:

  • outcomes: Name (or list of names) of annotation header columns, from which to determine slide labels.

  • params: Model parameters.

The default validation plan is three-fold cross-validation, but the validation strategy can be customized via keyword arguments (val_strategy, val_k_fold, etc.) as described in the API documentation. If cross-validation is used, the model for each fold will be trained sequentially. Read more about validation strategies.
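As a rough illustration of what three-fold cross-validation means at the slide level, the sketch below splits a list of slides into three folds and uses each fold once for validation. It is plain Python under simplifying assumptions: the helper k_fold_splits and the slide names are hypothetical, and Slideflow's own splitting may additionally account for patients and outcome balance.

```python
def k_fold_splits(slides, k=3):
    """Yield (train, val) slide lists for each of k folds."""
    # Round-robin assignment of slides to folds.
    folds = [slides[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, val


slides = [f"slide_{n}" for n in range(6)]  # hypothetical slide names
splits = list(k_fold_splits(slides, k=3))

for fold, (train, val) in enumerate(splits, start=1):
    print(f"fold {fold}: train={train} val={val}")
```

Each slide appears in exactly one validation fold, so every slide is used for validation once and for training in the remaining folds.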

By default, all slides in the project will be used for training. You can restrict your training/validation data to only a subset of slides in the project with one of two methods: either by providing filters or a filtered slideflow.Dataset.

For example, you can use the filters argument to train/validate only using slides labeled as “train_and_val” in the “dataset” column with the following syntax:

results = P.train(
  ...,
  filters={"dataset": ["train_and_val"]}
)

Alternatively, you can restrict the training/validation dataset by providing a slideflow.Dataset to the dataset argument:

dataset = P.dataset(tile_px=299, tile_um=302)
dataset = dataset.filter({"dataset": ["train_and_val"]})

results = P.train(..., dataset=dataset)

In both cases, slides will be further split into training and validation sets using the specified validation settings (defaulting to three-fold cross-validation).
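The filter dictionaries used above map an annotation column to a list of accepted values. The following plain-Python sketch shows that selection logic on a hypothetical annotations table; apply_filters is an illustrative helper, not a Slideflow function.

```python
def apply_filters(annotations, filters):
    """Keep rows whose value in every filtered column is among the allowed values."""
    return [
        row for row in annotations
        if all(row.get(col) in allowed for col, allowed in filters.items())
    ]


# Hypothetical annotations, one row per slide.
annotations = [
    {"slide": "slide_A", "dataset": "train_and_val"},
    {"slide": "slide_B", "dataset": "test"},
    {"slide": "slide_C", "dataset": "train_and_val"},
]

selected = apply_filters(annotations, {"dataset": ["train_and_val"]})
print([row["slide"] for row in selected])
```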

For more granular control over the validation dataset used, you can supply a slideflow.Dataset to the val_dataset argument. Doing so will cause the rest of the validation keyword arguments to be ignored.

dataset = P.dataset(tile_px=299, tile_um=302)
train_dataset = dataset.filter({"dataset": ["train"]})
val_dataset = dataset.filter({"dataset": ["val"]})

results = P.train(..., dataset=train_dataset, val_dataset=val_dataset)

Performance metrics - including accuracy, loss, etc. - are returned as a dictionary and saved in results_log.csv in both the project directory and model directory. Additional data, including ROCs and scatter plots, are saved in the model directories. Pandas DataFrames containing tile-, slide-, and patient-level predictions are also saved in the model directory.
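To illustrate how tile-level predictions relate to slide- and patient-level predictions, the sketch below averages tile predictions within each slide and within each patient. This is an assumption made for illustration: the aggregation Slideflow actually applies may differ, and the helper average_by_group and all names and values are hypothetical.

```python
from collections import defaultdict


def average_by_group(tile_preds, group_of):
    """Average tile-level predictions within each group (slide or patient)."""
    grouped = defaultdict(list)
    for slide, pred in tile_preds:
        grouped[group_of(slide)].append(pred)
    return {group: sum(preds) / len(preds) for group, preds in grouped.items()}


# Hypothetical tile-level predictions as (slide, predicted probability).
tile_preds = [("slide_A", 0.9), ("slide_A", 0.7), ("slide_B", 0.2), ("slide_C", 0.4)]
slide_to_patient = {"slide_A": "patient_1", "slide_B": "patient_2", "slide_C": "patient_2"}

slide_level = average_by_group(tile_preds, lambda slide: slide)
patient_level = average_by_group(tile_preds, slide_to_patient.get)

print(slide_level)
print(patient_level)
```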

At each designated epoch, models are saved in their own folders. Each model directory will include a copy of its hyperparameters in a params.json file, and a copy of its training/validation slide manifest in slide.log.

Using a Trainer

You can also train models outside the context of a project by using slideflow.model.Trainer. This lower-level interface provides greater flexibility for customization and allows models to be trained without requiring a Project to be set up. It lacks several convenience features afforded by using slideflow.Project.train(), however, such as cross-validation, logging, and label preparation for easy multi-outcome support.

For this training approach, start by building a trainer with slideflow.model.build_trainer(), which requires:

  • hp: slideflow.ModelParams object.

  • outdir: Directory in which to save models and checkpoints.

  • labels: Dictionary mapping slide names to outcome labels.

slideflow.Dataset provides a .labels() function that can generate this required labels dictionary.
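For intuition about the shape of this dictionary, the plain-Python sketch below builds a slide-to-label mapping from a hypothetical annotations table, assuming categorical outcomes are encoded as integer indices into a list of unique labels. The helper build_labels is illustrative only; in practice, use the .labels() function.

```python
def build_labels(annotations, header):
    """Return ({slide: class index}, sorted list of unique outcome values)."""
    unique_labels = sorted({row[header] for row in annotations})
    index = {value: i for i, value in enumerate(unique_labels)}
    labels = {row["slide"]: index[row[header]] for row in annotations}
    return labels, unique_labels


# Hypothetical annotations, one row per slide.
annotations = [
    {"slide": "slide_A", "tumor_type": "adenocarcinoma"},
    {"slide": "slide_B", "tumor_type": "squamous"},
    {"slide": "slide_C", "tumor_type": "adenocarcinoma"},
]

labels, unique_labels = build_labels(annotations, "tumor_type")
print(labels)
print(unique_labels)
```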

# Prepare dataset and labels
dataset = P.dataset(tile_px=299, tile_um=302)
labels, unique_labels = dataset.labels('tumor_type')

# Split into training/validation
train_dataset = dataset.filter({"dataset": ["train"]})
val_dataset = dataset.filter({"dataset": ["val"]})

# Determine model parameters
hp = sf.ModelParams(tile_px=299, tile_um=302, ...)

# Prepare a Trainer
trainer = sf.model.build_trainer(hp=hp, outdir='/path/to/outdir', labels=labels)

Use slideflow.model.Trainer.train() to train a model using your specified training and validation datasets.

# Train a model
trainer.train(train_dataset, val_dataset)

{
  "epochs": {
    "epoch3": {
      "train_metrics": {
        "loss": 0.497,
        "accuracy": 0.806,
        "val_loss": 0.719,
        "val_accuracy": 0.778
      },
      "val_metrics": {
        "loss": 0.727,
        "accuracy": 0.770
      },
      "tile": {
        "Outcome 0": [...]
      },
      "slide": {
        "Outcome 0": [...]
      },
      "patient": {
        "Outcome 0": [...]
      }
    }
  }
}

Read more about the Trainer class and available keyword arguments in the API documentation.

Multiple outcomes

Slideflow supports both categorical and continuous outcomes, as well as training to single or multiple outcomes at once. To train with multiple outcomes simultaneously, simply pass multiple annotation headers to the outcomes argument of slideflow.Project.train().

Time-to-event outcomes

Models can also be trained to a time-to-event outcome using Cox Proportional Hazards (CPH) and negative log likelihood loss. For CPH models, use 'negative_log_likelihood' loss and set outcomes to the annotation column indicating event time. Specify the event type (0 or 1) by passing the event type annotation column to the argument input_header. If you are using multiple clinical inputs, the first header passed to input_header must be the event type. CPH models are not compatible with multiple outcomes.


CPH models are currently only available with the Tensorflow backend. PyTorch support for CPH outcomes is in development.
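For readers unfamiliar with the CPH objective, the sketch below computes the textbook negative log partial likelihood in plain Python. It is not Slideflow's implementation of 'negative_log_likelihood', which may differ in normalization, batching, and tie handling; the function name and the example values are hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood for a Cox model (no special tie handling)."""
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored samples contribute only through risk sets
        # Risk set: every sample still under observation at time t_i.
        risk_set = [math.exp(h) for h, t in zip(risk_scores, times) if t >= t_i]
        total += risk_scores[i] - math.log(sum(risk_set))
    return -total


times = [1.0, 2.0]   # event times
events = [1, 1]      # event type: 1 = event observed, 0 = censored

# Assigning the higher risk score to the earlier event gives the lower loss.
good = cox_neg_log_partial_likelihood([1.0, 0.0], times, events)
bad = cox_neg_log_partial_likelihood([0.0, 1.0], times, events)
print(good < bad)
```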

Multimodal models

In addition to training using image data, clinical data can also be provided as model input by passing annotation column headers to the argument input_header. This input is concatenated at the post-convolutional layer, prior to any configured hidden layers.

If desired, models can also be trained with clinical input data alone, without images, by using the hyperparameter argument drop_images=True.
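Conceptually, the clinical inputs are appended to the image features before the hidden layers. The plain-Python sketch below illustrates only that concatenation step; build_model_input and the feature values are hypothetical and do not reflect Slideflow's internal code.

```python
def build_model_input(image_features, clinical_inputs, drop_images=False):
    """Concatenate image features with clinical inputs, or use clinical inputs alone."""
    if drop_images:
        return list(clinical_inputs)
    return list(image_features) + list(clinical_inputs)


image_features = [0.12, 0.80, 0.33]   # hypothetical post-convolutional features
clinical_inputs = [1.0, 0.0, 63.0]    # hypothetical encoded clinical variables

print(build_model_input(image_features, clinical_inputs))
print(build_model_input(image_features, clinical_inputs, drop_images=True))
```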

Hyperparameter optimization

Slideflow includes several tools for assisting with hyperparameter optimization, as described in the next sections.

Testing multiple combinations

You can easily test a series of hyperparameter combinations by passing a list of ModelParams objects to the params argument of slideflow.Project.train().

hp1 = sf.ModelParams(..., batch_size=32)
hp2 = sf.ModelParams(..., batch_size=64)

P.train(..., params=[hp1, hp2])

Grid-search sweep

You can also prepare a grid-search sweep, testing every combination across a series of hyperparameter ranges. Use slideflow.Project.create_hp_sweep(), which will calculate and save the sweep configuration to a JSON file. For example, the following would configure a sweep with only two combinations; the first with a learning rate of 0.001, and the second with a learning rate of 0.0001:

P.create_hp_sweep(filename='sweep.json', learning_rate=[0.001, 0.0001], ...)

The sweep is then executed by passing the JSON path to the params argument of slideflow.Project.train():

P.train(params='sweep.json', ...)
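The way a grid-search sweep expands hyperparameter ranges can be sketched with itertools.product. This is an illustration in plain Python, not Slideflow's sweep code: grid_sweep is a hypothetical helper, and it assumes that list values are ranges to sweep while scalar values are held fixed.

```python
from itertools import product


def grid_sweep(**ranges):
    """Expand hyperparameter ranges into a list of every combination."""
    names = list(ranges)
    # Treat scalars as single-value ranges so they stay fixed across the sweep.
    value_lists = [v if isinstance(v, list) else [v] for v in ranges.values()]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


sweep = grid_sweep(
    learning_rate=[0.001, 0.0001],
    batch_size=[32, 64],
    model="xception",
)

for config in sweep:
    print(config)
```

Two learning rates and two batch sizes yield four combinations, which shows how quickly a sweep grows as ranges are added.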

Bayesian optimization

You can also perform Bayesian hyperparameter optimization using SMAC3, which uses a configuration space to determine the types and ranges of hyperparameters to search.

Slideflow provides several functions to assist with building these configuration spaces. slideflow.util.create_search_space() allows you to define a range to search for each hyperparameter via keyword arguments:

import slideflow as sf

config_space = sf.util.create_search_space(
    normalizer=['macenko', 'reinhard', 'none'],
    dropout=(0.1, 0.5),
    learning_rate=(1e-5, 1e-4)
)

slideflow.util.broad_search_space() and slideflow.util.shallow_search_space() provide preconfigured search spaces that will search a broad and narrow range of hyperparameters, respectively. You can also customize a preconfigured search space using keyword arguments. For example, to do a broad search but disable L1 searching:

import slideflow as sf

config_space = sf.util.broad_search_space(l1=None)

See the linked API documentation for each function for more details about the respective search spaces.

Once the search space is determined, you can perform the hyperparameter optimization by simply replacing slideflow.Project.train() with slideflow.Project.smac_search(), providing the configuration space to the argument smac_configspace. By default, SMAC3 will optimize the tile-level AUROC, but the optimization metric can be customized with the keyword argument smac_metric.

# Base hyperparameters
hp = sf.ModelParams(tile_px=299, ...)

# Configuration space to optimize
config_space = sf.util.shallow_search_space()

# Run the Bayesian optimization
best_config, history = P.smac_search(..., params=hp, smac_configspace=config_space)
print(history)

    dropout        l1        l2    metric
0  0.126269  0.306857  0.183902  0.271778
1  0.315987  0.014661  0.413443  0.283289
2  0.123149  0.311893  0.184439  0.250339
3  0.250000  0.250000  0.250000  0.247641
4  0.208070  0.018481  0.121243  0.257633

slideflow.Project.smac_search() returns the best configuration and a history of models trained during the search. This history is a Pandas DataFrame with hyperparameters for columns, and a “metric” column with the optimization metric result for each trained model. The run history is also saved in CSV format in the associated model folder.

See the API documentation for available customization via keyword arguments.

Customizing model or loss

Slideflow supports dozens of model architectures, but you can also train with a custom architecture, as demonstrated in Tutorial 3: Using a custom architecture.

Similarly, you can also train with a custom loss function by supplying a dictionary to the loss argument in ModelParams, with the keys type (which must be either 'categorical', 'linear', or 'cph') and fn (a callable loss function).

For Tensorflow/Keras, the loss function must accept arguments y_true, y_pred. For linear losses, y_true may need to be cast to tf.float32. An example custom linear loss is given below:

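# Custom Tensorflow loss
import tensorflow as tf

def custom_linear_loss(y_true, y_pred):
  # Cast labels to float so they can be subtracted from the predictions
  y_true = tf.cast(y_true, tf.float32)
  squared_difference = tf.square(y_true - y_pred)
  # Mean squared error over the last axis
  return tf.reduce_mean(squared_difference, axis=-1)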

For PyTorch, the function supplied as fn must return a nested loss function that accepts the arguments output, target. An example linear loss is given below:

# Custom PyTorch loss
import torch

def custom_linear_loss():
  # Return the function that computes the loss
  def loss_fn(output, target):
    return torch.mean((target - output) ** 2)
  return loss_fn

In both cases, the loss function is applied as follows:

hp = sf.ModelParams(..., loss={'type': 'linear', 'fn': custom_linear_loss})

Using multiple GPUs

Slideflow can perform distributed training if multiple GPUs are available. Enable distributed training by passing the argument multi_gpu=True, which will allow Slideflow to use all available (and visible) GPUs.

Training without TFRecords

It is also possible to train deep learning models directly from slides, without first generating TFRecords. This may be advantageous for rapidly prototyping models on a large dataset, or when tuning the tile size for a dataset.

Use the argument from_wsi=True in either the slideflow.Project.train() or slideflow.model.Trainer.train() functions. Image tiles will be dynamically extracted from slides during training, and background will be automatically removed via Otsu’s thresholding.


Using the cuCIM backend will greatly improve performance when training without TFRecords.
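For reference, Otsu's method picks the threshold that maximizes the between-class variance of a pixel histogram. The sketch below is a plain-Python version operating on 8-bit grayscale values. How Slideflow applies the threshold (for example, which color channel it uses) is not described here, so treat the function and the sample pixel values as illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (classes: <= t and > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue  # no pixels at or below t yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # all pixels are at or below t
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Hypothetical pixels: dark tissue (around 60) and bright background (around 230).
pixels = [55, 60, 65] * 20 + [225, 230, 235] * 20
threshold = otsu_threshold(pixels)
tissue = [p for p in pixels if p <= threshold]
print(threshold, len(tissue))
```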

Monitoring performance


During training, progress can be monitored using Tensorflow's bundled Tensorboard package by passing the argument use_tensorboard=True. This functionality is disabled by default because of a recent bug in Tensorflow. To monitor training with Tensorboard, execute:

$ tensorboard --logdir=/path/to/model/directory

… and open http://localhost:6006 in your web browser.

Experiments can be automatically logged with Neptune. To enable logging, first locate your Neptune API token and workspace ID, and configure the environment variables NEPTUNE_API_TOKEN and NEPTUNE_WORKSPACE.
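Before loading a project with Neptune logging enabled, it can help to confirm that both variables are visible to the Python process. The check below is a small sketch using only the standard library; missing_neptune_vars is a hypothetical helper, not part of Slideflow or Neptune.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_WORKSPACE")


def missing_neptune_vars(env=os.environ):
    """Return the names of required Neptune variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Neptune environment variables are set.")
```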

With the environment variables set, Neptune logs are enabled by passing use_neptune=True to sf.load_project.

P = sf.load_project('/project/path', use_neptune=True)