Active Semi-Supervised Learning training strategy

In this demo we show how to apply an active semi-supervised learning (ASSL) strategy to the training of a deep neural network for wound segmentation on smartphone images. We use generic folders for the images and the related masks, which mimic as closely as possible the original Deepskin dataset. The proposed code is therefore ready to use with your custom dataset, and it can be easily adapted to other segmentation tasks.

The model used for the image segmentation is a classical EfficientNet-b3 architecture, which is already implemented in the deepskin Python package. A detailed description of the model architecture can be found in the work of Curti et al. [1].

We used the TensorFlow library for the model implementation, so make sure that all the required packages are installed before running this script.

First of all, we need to import the required libraries and define the common variables and paths used by the segmentation model. The script below can be used for all the ASSL training rounds, with minimal changes to the global variables. Pay attention to the code comments for the correct usage of the script in different rounds, and to the folder tree in which the data (images and masks) are stored. In this demo we assume the following folder tree:

data/
├── deepskin_images_round0
├── deepskin_masks_round0
├── deepskin_images
├── deepskin_masks
├── validation_images_pred_round0
└── validation_masks_pred_round0

where:

* deepskin_images contains the entire set of available images in the dataset;
* deepskin_masks contains the entire set of validated masks in the dataset;
* deepskin_images_round0 contains the images to use during the current round (round 0) of ASSL training;
* deepskin_masks_round0 contains the masks to use during the current round (round 0) of ASSL training;
* validation_images_pred_round0 will be filled with the images with the predictions overlaid, for the ASSL validation;
* validation_masks_pred_round0 will be filled with the predictions of the trained model.
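Since the training pairs each image with its mask, it can be useful to verify the folder tree before starting a round. The following is a minimal sketch, assuming that an image and its mask share the same file name apart from the extension; the helper unmatched_stems is hypothetical and not part of the deepskin package.

```python
from pathlib import Path


def unmatched_stems(image_dir, mask_dir):
    """Compare two folders by file name (extension excluded).

    Returns a pair of sorted lists: the names of the images without a
    mask and the names of the masks without an image.
    """
    images = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    masks = {p.stem for p in Path(mask_dir).iterdir() if p.is_file()}
    return sorted(images - masks), sorted(masks - images)
```

Calling unmatched_stems('./data/deepskin_images_round0', './data/deepskin_masks_round0') should return two empty lists when the round folders are consistent.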

In [1]:

import os

# define the current round number
ASSL_ROUND = 0
# define the batch-size to use during the training
BATCH = 8
# define the directory in which the whole DB of images are stored
ALL_IMAGE_FOLDER = './data/deepskin_images'
# define the directory in which the whole DB of (validated!!) masks are stored
ALL_MASKS_FOLDER = './data/deepskin_masks'
# define the directory in which the images are stored
TRAIN_IMAGE_FOLDER = f'./data/deepskin_images_round{ASSL_ROUND:d}'
# define the directory in which the masks are stored
TRAIN_MASKS_FOLDER = f'./data/deepskin_masks_round{ASSL_ROUND:d}'
# define the directory in which the predictions will be saved for the validation
PRED_IMAGE_FOLDER = f'./data/validation_images_pred_round{ASSL_ROUND:d}'
# define the directory in which the predictions will be saved
PRED_MASKS_FOLDER = f'./data/validation_masks_pred_round{ASSL_ROUND:d}'

# create the prediction folders if needed
os.makedirs(PRED_IMAGE_FOLDER, exist_ok=True)
os.makedirs(PRED_MASKS_FOLDER, exist_ok=True)

# define the output weight file for the best model checkpoint
OUT_WEIGHT_FILE = f'./checkpoints/model_round{ASSL_ROUND:d}.h5'
# create the checkpoint folder if needed
os.makedirs(os.path.dirname(OUT_WEIGHT_FILE), exist_ok=True)

To monitor the progress of the ASSL strategy, we can look at the statistics of the data used in the current round against the whole dataset. In each ASSL round we split the available round data into a training set (90%) and a validation set (10%): the training set is used to tune the model parameters, while the validation set is used to monitor the model performance on a set of independent images. For the sake of completeness, we log all the relevant statistics for the round evaluation as follows:

In [14]:

from glob import glob

all_files = glob(f'{ALL_IMAGE_FOLDER}/*')
print(f'All available images: {len(all_files):d}')
round_files = glob(f'{TRAIN_IMAGE_FOLDER}/*')
round_perc = len(round_files)/len(all_files)*100
print(f'Images used at round {ASSL_ROUND:d}: {len(round_files):d} ({round_perc:.3f}%)')
train_perc = round(len(round_files)*.9)
test_perc = round(len(round_files)*.1)
print(f'Images used as training at round {ASSL_ROUND:d}: {train_perc:d} (90%)')
print(f'Images used as test at round {ASSL_ROUND:d}: {test_perc:d} (10%)')

All available images: 6225
Images used at round 0: 4936 (79.293%)
Images used as training at round 0: 4442 (90%)
Images used as test at round 0: 494 (10%)
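The log above rounds both counts, while a data loader may truncate the validation share instead of rounding it. The sketch below shows the resulting counts under that assumption; split_counts is a hypothetical helper, and the truncation rule should be checked against the number of files actually reported by the data loader.

```python
def split_counts(n_files, validation_split=0.1):
    """Return (n_train, n_val), assuming the validation share is truncated."""
    n_val = int(validation_split * n_files)
    return n_files - n_val, n_val


print(split_counts(4936))
```

With the 4936 images of round 0 this gives (4443, 493), i.e. one image of difference with respect to the rounded values printed above.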

Since we also want to perform a data-augmentation step during the training, we will use the TensorFlow APIs to manage the images and the desired transformations.

In [ ]:

import tensorflow as tf

# define the data-augmentation parameters
augmentation_params = {
    'rotation_range':360,       # all possible rotations
    'width_shift_range':0.0,    # avoid width shift
    'height_shift_range':0.0,   # avoid height shift
    'fill_mode':'reflect',      # use reflection to fill the augmented image
    'shear_range':0.,           # avoid shear
    'zoom_range':0.,            # avoid zoom
    'horizontal_flip':True,     # perform horizontal flip of the image
    'vertical_flip':True,       # perform vertical flip of the image
    'cval':0.,                  # constant fill value (used only when fill_mode is 'constant')
    'validation_split':0.1,     # set the validation set as the 10% of the entire set of data
}

# define the data augmentation models for images and masks
# NOTE: both images and masks must be rescaled into [0, 1] range for
# the correct use of the segmentation model!
image_augmentation = tf.keras.preprocessing.image.ImageDataGenerator(
    **augmentation_params,
    rescale = 1./255
)
masks_augmentation = tf.keras.preprocessing.image.ImageDataGenerator(
    **augmentation_params,
    rescale = 1./255
)

Now we need to define the data-loader strategy for the images/masks. Using the global information about the folder tree and the data-augmentation models, we can use the TensorFlow APIs as follows.

NOTE: In the following snippet we assume a model for a semantic (multi-class) segmentation task (the new version of the deepskin segmentation model), so the color_mode of the mask files is set to rgb. In this context, each mask channel is associated with a different label. The model implemented in the deepskin package provides the semantic masks as (wound, body, background), but the following snippet works with any other channel order!

In [ ]:

# we fixed the dimensionality of the input as 256x256
IMG_SIZE = 256

# define the training parameters for the data loader
train_params = {
    'target_size':(IMG_SIZE, IMG_SIZE),  # resize shape of the input
    'class_mode':'input',  # yield (x, x) pairs, i.e. no class labels are required
    'batch_size':BATCH,    # set the batch size
    'shuffle':True,        # enable shuffling of the data
    'seed':42,             # fix the random seed for the reproducibility
}

# define the data loader for the training data (aka images and masks)
train_image_generator = image_augmentation.flow_from_directory(
    directory=TRAIN_IMAGE_FOLDER,  # set the folder of the images
    **train_params,                # set the training parameters
    color_mode='rgb',              # the images are in RGB fmt
    classes=[''],                  # there are no classes
    subset='training'              # this subset is the training one (aka the 90% of the data)
)
train_masks_generator = masks_augmentation.flow_from_directory(
    directory=TRAIN_MASKS_FOLDER,  # set the folder of the masks
    **train_params,                # set the training parameters
    color_mode='rgb',              # the masks are in multi-channel format
    classes=[''],                  # there are no classes
    subset='training'              # this subset is the training one (aka the 90% of the data)
)

# define the data loader for the validation data (aka images and masks)
val_image_generator = image_augmentation.flow_from_directory(
    directory=TRAIN_IMAGE_FOLDER,  # the validation images belong to the same folder as the training ones
    **train_params,                # set the training parameters
    color_mode='rgb',              # the images are in RGB fmt
    classes=[''],                  # there are no classes
    subset='validation'            # this subset is the validation one (aka the 10% of the data)
)
val_masks_generator = masks_augmentation.flow_from_directory(
    directory=TRAIN_MASKS_FOLDER,  # the validation masks belong to the same folder as the training ones
    **train_params,                # set the training parameters
    color_mode='rgb',              # the masks are in multi-channel format
    classes=[''],                  # there are no classes
    subset='validation'            # this subset is the validation one (aka the 10% of the data)
)

# NOTE: all the data (training and validation) belong to the same directory
# and the internal subdivision is guaranteed by the subset keyword of the
# tensorflow function.

# Since we want to combine images and masks into a series of pairs, we
# can use a pre-processing on the data loader generator to obtain the
# correct input for our model

from operator import itemgetter

# create training pairs
train_generator = zip(map(itemgetter(0), (train_image_generator)),
                      map(itemgetter(0), (train_masks_generator))
                     )
# create validation pairs
validation_generator = zip(map(itemgetter(0), (val_image_generator)),
                           map(itemgetter(0), (val_masks_generator))
                          )
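
# The pairing above relies on the image and mask loaders applying the same
# random transformation to an image and to its mask, which is why both
# loaders share the same seed. The following toy example illustrates the idea
# with plain NumPy: it is NOT the internal logic of the TensorFlow loaders,
# and random_flip is a hypothetical helper introduced only for this sketch.

```python
import numpy as np


def random_flip(array, rng):
    """Randomly flip a (h, w, c) array along its width and/or height."""
    if rng.random() < 0.5:
        array = array[:, ::-1]
    if rng.random() < 0.5:
        array = array[::-1, :]
    return array


# toy image and the mask derived from it
image = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
mask = (image > 20).astype(np.float32)

# two independent random generators initialized with the same seed
# produce the same sequence of flips, so the pair stays aligned
aug_image = random_flip(image, np.random.default_rng(42))
aug_mask = random_flip(mask, np.random.default_rng(42))

print(np.array_equal(aug_mask, (aug_image > 20).astype(np.float32)))
```

If the two seeds differ, the augmented mask is no longer guaranteed to match the augmented image.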

Now we can build the model by setting the missing training parameters, i.e. the loss function and the optimization strategy. Since the model architecture is already defined in the deepskin package, we can directly import it and set the training parameters.

NOTE: Since we are inside an ASSL training round, the model weights must be initialized as random at each round!

The loss function and the metrics used in the original Deepskin model are combinations of standard functions. For their implementation we used the code provided by the segmentation_models package (ref. here). With this library, we define the loss function as the sum of the Dice loss and the Binary Focal Loss. The model performance is monitored using the standard IoU score and the F-score.
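To make the combination explicit, the following is a plain NumPy sketch of the textbook definitions of the two terms. It is not the code of the segmentation_models package: the values alpha=0.25 and gamma=2 and the smoothing term eps are assumptions of this sketch, and the function names are hypothetical.

```python
import numpy as np


def dice_loss(gt, pr, eps=1e-7):
    """Dice loss, i.e. 1 - Dice score, between ground truth and prediction."""
    tp = np.sum(gt * pr)
    fp = np.sum(pr) - tp
    fn = np.sum(gt) - tp
    return float(1.0 - (2.0 * tp + eps) / (2.0 * tp + fp + fn + eps))


def binary_focal_loss(gt, pr, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss averaged over all the pixels."""
    # avoid log(0) on saturated predictions
    pr = np.clip(pr, eps, 1.0 - eps)
    pos = -alpha * gt * (1.0 - pr) ** gamma * np.log(pr)
    neg = -(1.0 - alpha) * (1.0 - gt) * pr ** gamma * np.log(1.0 - pr)
    return float(np.mean(pos + neg))


def combined_loss(gt, pr):
    """Sum of the two terms, with a unit weight on the focal loss."""
    return dice_loss(gt, pr) + 1.0 * binary_focal_loss(gt, pr)
```

The Dice term measures the global overlap between the masks, while the focal term works pixel by pixel and gives more weight to the pixels that are classified with low confidence.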

In [3]:

from deepskin import deepskin_model
import segmentation_models as sm
import tensorflow as tf

# define the model architecture
model = deepskin_model(verbose=False)

# define the model optimizer
optimizer = tf.keras.optimizers.Adam(
    learning_rate=1e-5,
    beta_1=0.9, beta_2=0.999,
    epsilon=1e-7,
    amsgrad=False,
    name='Adam'
)

# define the loss function
loss = sm.losses.DiceLoss() + (1 * sm.losses.BinaryFocalLoss())

# define the metric functions for the model evaluation along
# the training epochs
iou_score = sm.metrics.IOUScore(threshold=0.5)
fscore = sm.metrics.FScore(threshold=0.5)

# set the training parameters
model.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=[iou_score, fscore],
    run_eagerly=False,
)
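
# The two metrics set above binarize the prediction with a threshold of 0.5
# before comparing it with the ground truth. The following NumPy sketch uses
# the standard definitions of the scores: it is not the segmentation_models
# implementation, and the helper name and the eps term are assumptions.

```python
import numpy as np


def iou_and_f1(gt, pr, threshold=0.5, eps=1e-7):
    """Return the (IoU, F1) scores of a thresholded prediction."""
    pr = (np.asarray(pr) > threshold).astype(np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    tp = np.sum(gt * pr)   # pixels correctly predicted as foreground
    fp = np.sum(pr) - tp   # background pixels predicted as foreground
    fn = np.sum(gt) - tp   # foreground pixels missed by the prediction
    iou = (tp + eps) / (tp + fp + fn + eps)
    f1 = (2.0 * tp + eps) / (2.0 * tp + fp + fn + eps)
    return float(iou), float(f1)


print(iou_and_f1([1, 1, 0, 0], [0.9, 0.2, 0.8, 0.1]))
```

In the example only one of the two foreground pixels is recovered and one background pixel is wrongly selected, so the IoU is about 1/3 and the F1-score is about 1/2.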

When all the model parameters are fixed, we can start the training step, enabling the utilities provided by TensorFlow.

In [ ]:

# fit model
history = model.fit(
    x=train_generator, y=None, # define the input data generators
    batch_size=BATCH,          # set the batch size
    epochs=100,                # set the maximum number of epochs to perform
    steps_per_epoch=train_image_generator.n // BATCH, # define the number of steps for each
                                                      # epoch according to the data generator
    callbacks=[
                                                      # define the callback for the model checkpoint
                                                      # setting the output file in which save the best results
                                                      # given by the minimum of loss obtained
        tf.keras.callbacks.ModelCheckpoint(
            OUT_WEIGHT_FILE,
            save_weights_only=True,
            save_best_only=True,
            mode='min'
        ),
                                                      # define the callback for the reduction of learning rate
                                                      # when a plateau of performances is achieved
        tf.keras.callbacks.ReduceLROnPlateau(),
                                                      # define the callback for the early stopping of the training
                                                      # if there are no improvements in the validation loss for 10
                                                      # epochs (see the patience parameter below)
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            min_delta=1e-4, patience=10,
            verbose=True,
            mode='auto', baseline=None,
            restore_best_weights=True
        ),
    ],
    validation_data=validation_generator,             # set the data validation generator
    validation_steps=val_image_generator.n // BATCH,  # define the number of steps for each
                                                      # validation according to the data generator
    initial_epoch=0,                                  # set the initial epoch counter
    validation_freq=1,                                # enable the validation at each epoch
    max_queue_size=10,                                # queue of data to use
    workers=1,                                        # number of threads to use
    use_multiprocessing=False,                        # disable multi-processing
    shuffle=True,                                     # enable the shuffling of the data at each epoch
    verbose=1                                         # set verbosity level of the training
)
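
# The interplay between patience and min_delta in the early stopping can be
# emulated on a plain list of validation losses. This is a simplified sketch
# of the stopping rule, NOT the TensorFlow implementation; the function name
# and its return values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=10, min_delta=1e-4):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    stop_epoch is None if the training reaches the last epoch.
    """
    best = float('inf')
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # the loss improved by more than min_delta: reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            # no relevant improvement: count the epoch
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


print(early_stopping_epoch([1.0, 0.9, 0.95, 0.96, 0.97], patience=2))
```

In the example the best loss is found at epoch 1 and the training stops at epoch 3, i.e. after 2 epochs without improvements.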

At the end of the training, the model will have achieved the best performance of the current ASSL round. An important step for the monitoring of the performance is the visualization of the obtained results, expressed in terms of metrics and loss along the training epochs.

In [ ]:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.patches as mpatches

# define the legends for the plots
fig1_lbl = [ mpatches.Patch(facecolor='blue', label='Train Loss', edgecolor='k', linewidth=2),
             mpatches.Patch(facecolor='orange', label='Val Loss', edgecolor='k', linewidth=2)
           ]

fig2_lbl = [ mpatches.Patch(facecolor='blue', label='IoU train score', edgecolor='k', linewidth=2),
             mpatches.Patch(facecolor='orange', label='IoU val score', edgecolor='k', linewidth=2)
           ]

fig3_lbl = [ mpatches.Patch(facecolor='blue', label='F1-score train', edgecolor='k', linewidth=2),
             mpatches.Patch(facecolor='orange', label='F1-score val', edgecolor='k', linewidth=2)
           ]

epochs = np.arange(len(history.history['loss']))

with sns.plotting_context('paper', font_scale=2):
    fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3, figsize=(30, 8))
    loss = sns.lineplot(x=epochs, y=history.history['loss'],
                        markers=True, dashes=False,
                        ax=ax1)
    val_loss = sns.lineplot(x=epochs, y=history.history['val_loss'],
                            markers=True, dashes=False,
                            ax=ax1)
    ax1.set_ylabel('Mask Loss values')
    sns.despine(ax=ax1, offset=10, top=True, right=True, bottom=False, left=False)
    ax1.legend(handles=fig1_lbl, loc='upper right')


    loss = sns.lineplot(x=epochs, y=history.history['iou_score'],
                        markers=True, dashes=False,
                        ax=ax2)
    val_loss = sns.lineplot(x=epochs, y=history.history['val_iou_score'],
                            markers=True, dashes=False,
                            ax=ax2)
    ax2.set_ylabel('IoU score')
    sns.despine(ax=ax2, offset=10, top=True, right=True, bottom=False, left=False)
    ax2.legend(handles=fig2_lbl, loc='best')

    loss = sns.lineplot(x=epochs, y=history.history['f1-score'],
                        markers=True, dashes=False,
                        ax=ax3)
    val_loss = sns.lineplot(x=epochs, y=history.history['val_f1-score'],
                            markers=True, dashes=False,
                            ax=ax3)
    ax3.set_ylabel('F1-score')
    sns.despine(ax=ax3, offset=10, top=True, right=True, bottom=False, left=False)
    ax3.legend(handles=fig3_lbl, loc='best')

Now we need to take care of the model predictions on the new data, i.e. the data which belong to the whole dataset. Since we restore the best model parameters at the end of the training epochs, we can directly apply the model to the new images, sampled from the dataset folder.

NOTE: since each image needs some pre-processing steps, we need to manually apply the required sequence of instructions before feeding it to the model.

The correct management of the ASSL training strategy requires the validation of the entire set of images in the dataset at each round of training. Therefore, if you want to check the effectiveness of the ASSL training, the list of files for the model prediction must be collected from the ALL_IMAGE_FOLDER folder. Conversely, if you want to use the ASSL training strategy just to speed up your data annotation, you can skip the re-labeling of the already validated images and focus only on the remaining ones. In the code below, this second option can be enabled by uncommenting the first lines.

In [ ]:

import cv2
import numpy as np

files = glob(f'{ALL_IMAGE_FOLDER}/*')
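## Uncomment these lines for the data labeling feature
## NOTE: the files are compared by name, since the two folders differ
#done = set(map(os.path.basename, glob(f'{TRAIN_IMAGE_FOLDER}/*')))
#files = [f for f in files if os.path.basename(f) not in done]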
print(f'{len(files):d} files in the global DB')

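# get the model input shape
_, h, w, c = model.input.shape
# define the threshold used to binarize the predicted masks
# NOTE: 0.5 is an assumed value (the same threshold set in the
# training metrics); tune it according to your needs
tol = 0.5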

# loop along the available images
for i, f in enumerate(files):
    # log progress
    print(
        f'\rFiles {i + 1:d}/{len(files):d}',
        flush=True,
        end=''
    )

    # get the image name
    name = os.path.basename(f)
    # remove the extension to be sure that
    # the predicted images will be saved as png
    name, _ = os.path.splitext(name)

    # load the image
    bgr = cv2.imread(f)
    # convert the image in RGB fmt
    rgb = bgr[..., ::-1]
    # resize the image to the model input shape
    resized = cv2.resize(
        rgb,
        dsize=(w, h),  # NOTE: OpenCV expects (width, height)
        interpolation=cv2.INTER_CUBIC
    )
    # convert the image into floating-point values
    resized = np.float32(resized)
    # normalize the image into [0, 1] range
    resized *= 1. / 255
    # extend the dimensionality of the input array
    # to the [batch, h, w, c] format
    resized = resized.reshape(1, *resized.shape)

    # apply the model to get the prediction
    pred = model.predict(resized)
    # remove useless dimensions from the image
    pred = np.squeeze(pred)
    # filter the mask output to binary format
    pred = np.where(pred > tol, 255, 0)
    # convert the mask into uint8 fmt
    pred = np.uint8(pred)

    # resize the output mask to the same
    # shape of the original image, with an
    # appropriated interpolation algorithm
    pred = cv2.resize(
        pred,
        dsize=(bgr.shape[1], bgr.shape[0]),
        interpolation=cv2.INTER_NEAREST_EXACT
    )

    # define a canvas on which overlay the predicted mask
    # initialized as a copy of the original image
    canvas = bgr.copy()
    # determine the mask contours for the wound
    cnt, _ = cv2.findContours(
        pred[..., 0],            # we are assuming that the first channel
                                 # will be related to the wound area
        cv2.RETR_TREE,
        cv2.CHAIN_APPROX_SIMPLE
    )
    # draw the contours on the canvas
    # as yellow lines
    canvas = cv2.drawContours(
        canvas,
        cnt,
        -1,
        (0, 255, 255), # color contours in BGR fmt
        3              # linewidth of the contour
    )

    # save the predicted mask
    cv2.imwrite(f'{PRED_MASKS_FOLDER}/{name}.png', pred)

    # save the canvas image for the ASSL validation
    cv2.imwrite(f'{PRED_IMAGE_FOLDER}/{name}.png', canvas)
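
# The array manipulations applied before and after the model prediction can
# be isolated in two small functions, which are easier to test than the full
# loop. This is a NumPy-only sketch: the function names are hypothetical and
# the 0.5 threshold is an assumed value.

```python
import numpy as np


def to_model_input(rgb):
    """Convert a (h, w, c) uint8 image into a (1, h, w, c) float32 batch in [0, 1]."""
    x = np.asarray(rgb, dtype=np.float32) / 255.0
    return x[np.newaxis, ...]


def to_mask(pred, tol=0.5):
    """Convert a model output in [0, 1] into a binary uint8 mask in {0, 255}."""
    pred = np.squeeze(pred)
    return np.where(pred > tol, 255, 0).astype(np.uint8)


batch = to_model_input(np.full((2, 2, 3), 255, dtype=np.uint8))
print(batch.shape, batch.dtype)
```

The resize steps are left out on purpose, since they depend on OpenCV and on the shape of the original image.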

At the end of this step, all the new predictions obtained in this round of ASSL training are ready for the manual evaluation by the experts. The PRED_MASKS_FOLDER directory contains the masks generated by the model at this round, while PRED_IMAGE_FOLDER contains the original images with the predictions overlaid. Therefore, we are ready for the active learning evaluation, which can be easily performed using the active_learning_validator scripts available here. For the correct usage of the validator, we need to move the PRED_IMAGE_FOLDER directory to the root of the validator project. Using the web interface we can scroll through the list of images, labelling the correctness of each prediction with a simple yes or no.

After the manual validation of the predictions, the active_learning_validator software produces a response file related to this round of ASSL training. In the next code we assume that the response file has already been downloaded and renamed as response_round0.csv. Using the information stored in this file, we will copy the validated masks, along with the corresponding images, to the training folders of the next round. In this way the dataset of available samples grows and is ready for the next round of ASSL training.

In [ ]:

import pandas as pd

RESPONSE_FILE = f'./response_round{ASSL_ROUND:d}.csv'

# load the response outcomes
response = pd.read_csv(RESPONSE_FILE, sep=',', header=0)
# select only the valid items
valid = response.query('response == "yes"')
# get the corresponding filenames
valid = set(valid['Filename'].values)

# log the information about this round of validation
valid_perc = len(valid)/len(all_files)*100
print(f'Correct image validated at round {ASSL_ROUND:d}: {len(valid):d} ({valid_perc:.3f}%)')

# evaluate the improvement of the round
up = len(valid)/len(all_files) - len(round_files)/len(all_files)
improvement = '+{:.3f}%'.format(up*100) if up > 0. else '{:.3f}%'.format(up*100)
print(f'Improvement obtained at round {ASSL_ROUND:d}: {improvement}')
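
# The same selection can be checked on a tiny in-memory example using only
# the standard library. The column names (Filename, response) follow the
# code above, while the file names in the example and the two helper
# functions are hypothetical.

```python
import csv
import io


def validated_names(csv_text):
    """Return the set of file names labelled as 'yes' in the response file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row['Filename'] for row in reader if row['response'] == 'yes'}


def coverage_gain(n_valid, n_round, n_total):
    """Difference (in %) between the validated and the training coverage."""
    return (n_valid - n_round) / n_total * 100


example = 'Filename,response\nimg_001.png,yes\nimg_002.png,no\nimg_003.png,yes\n'
print(sorted(validated_names(example)))
```

A positive gain means that the experts accepted more predictions than the number of samples used for the training of the round.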

According to the validated images/masks, we can re-sample the training dataset for the next round by copying the data.

In [ ]:

import shutil

# declare the image folder for the next round of ASSL training
NEXT_TRAIN_IMAGE_FOLDER = f'./data/deepskin_images_round{ASSL_ROUND + 1:d}'
# generate the folder
os.makedirs(NEXT_TRAIN_IMAGE_FOLDER, exist_ok=False)

# loop along the valid indexes
for name in valid:
    # build the corresponding filename from the ALL_IMAGE_FOLDER
    src = f'{ALL_IMAGE_FOLDER}/{name}'
    # create the new destination filename for the copy
    dst = f'{NEXT_TRAIN_IMAGE_FOLDER}/{name}'
    # copy the file from the whole dataset to the next training set
    shutil.copyfile(src, dst)

# declare the masks folder for the next round of ASSL training
NEXT_TRAIN_MASKS_FOLDER = f'./data/deepskin_masks_round{ASSL_ROUND + 1:d}'
# generate the folder
os.makedirs(NEXT_TRAIN_MASKS_FOLDER, exist_ok=False)

# loop along the valid indexes
for name in valid:
    # build the corresponding filename from the PRED_MASKS_FOLDER
    src = f'{PRED_MASKS_FOLDER}/{name}'
    # create the new destination filename for the copy
    dst = f'{NEXT_TRAIN_MASKS_FOLDER}/{name}'
    # copy the file from the predicted masks to the next training set
    shutil.copyfile(src, dst)
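
# The copies above assume that the names stored in the response file exist
# with the same extension in both the source folders. If the original images
# and the predicted masks (always saved as png) have different extensions,
# the files can be matched by name only. The following is a sketch under
# this assumption; copy_by_stem is a hypothetical helper.

```python
import shutil
from pathlib import Path


def copy_by_stem(names, src_dir, dst_dir):
    """Copy the files of src_dir whose name (extension excluded) is in names.

    Returns the sorted list of names without a matching file in src_dir.
    """
    wanted = {Path(name).stem for name in names}
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = set()
    for src in Path(src_dir).iterdir():
        if src.is_file() and src.stem in wanted:
            shutil.copyfile(src, dst / src.name)
            copied.add(src.stem)
    return sorted(wanted - copied)
```

A non-empty return value points out the validated samples which are missing from the source folder, before they silently disappear from the next round.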

At the end of this step, the folder tree should look like:

data/
├── deepskin_images_round0
├── deepskin_images_round1
├── deepskin_masks_round0
├── deepskin_masks_round1
├── deepskin_images
├── deepskin_masks
├── validation_images_pred_round0
└── validation_masks_pred_round0

where:

* deepskin_images contains the entire set of available images in the dataset;
* deepskin_masks contains the entire set of validated masks in the dataset;
* deepskin_images_round0 contains the images used during the current round (round 0) of ASSL training;
* deepskin_masks_round0 contains the masks used during the current round (round 0) of ASSL training;
* deepskin_images_round1 contains the images to use during the next round (round 1) of ASSL training;
* deepskin_masks_round1 contains the masks to use during the next round (round 1) of ASSL training;
* validation_images_pred_round0 contains the images with the predictions overlaid, used for the ASSL validation;
* validation_masks_pred_round0 contains the predictions of the trained model.

The entire code can be re-run setting ASSL_ROUND to 1 to perform the next round of training and validation.
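Since only the round number changes between two consecutive runs, the round-dependent paths can be collected in a single function. The following sketch follows the naming scheme of the folder tree above; the function round_paths and the keys of the returned dictionary are hypothetical.

```python
def round_paths(assl_round, root='./data'):
    """Return the round-dependent folders of the ASSL workflow."""
    return {
        'train_images': f'{root}/deepskin_images_round{assl_round:d}',
        'train_masks': f'{root}/deepskin_masks_round{assl_round:d}',
        'pred_images': f'{root}/validation_images_pred_round{assl_round:d}',
        'pred_masks': f'{root}/validation_masks_pred_round{assl_round:d}',
    }


print(round_paths(1)['train_images'])
```

For round 1 the training images are read from ./data/deepskin_images_round1, i.e. the folder filled at the end of round 0.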