
Medical Image Segmentation With UNET

The main goal of this project is to segment medical images using U-Net, an advanced convolutional neural network architecture. The project contributes to disease diagnosis, treatment planning, and surgical guidance by precisely identifying anatomical structures and pathological anomalies within MRI, CT, and X-ray scans.

It supports research, enables individualized treatment plans, and assists image-guided therapies through automated segmentation and quantitative analysis. The project also gives medical professionals a deeper grasp of intricate medical imaging data and disease processes, making it a valuable educational tool. Ultimately, it helps improve patient outcomes through earlier disease detection, more precise therapies, and continuous care.

Explanation of All the Code

Step:1

The Python code given is for Google Colab, a cloud-based platform for Python programming. A user can easily access files in the Colab environment through Google Drive. After logging in, the user's Google Drive is available at '/content/drive' in the Colab environment. The line drive.mount('/content/drive') starts the mounting procedure, which is useful when working with files or datasets kept on Google Drive.
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
Installs the necessary packages; the libraries for numerical operations, image processing, machine learning, and visualization are imported in the next step.
!pip install tensorflow
!pip install keras
!pip install utils

Step:2

Import necessary libraries
import numpy as np
from tensorflow.keras.utils import Sequence
import cv2
import tensorflow as tf
import pickle
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import sklearn
from sklearn.cluster import KMeans
from tensorflow.keras.layers import *
from tensorflow.keras import models
from tensorflow.keras.callbacks import *
import glob2
from sklearn.utils import shuffle
import matplotlib.pyplot as plt
from tensorflow.keras.metrics import MeanIoU
Defines utility functions cvtColor and func.

Here are the definitions of two helper functions. The first function, 'cvtColor', zeroes every channel value below 255, which effectively binarizes the mask colors. The second function, 'func', applies 'cvtColor' to every pixel in an image. A small illustrative check follows the definitions.
# Zero out any channel value below 255 (binarizes the mask colors)
def cvtColor(x):
    x[x < 255] = 0
    return x

# Apply cvtColor to each pixel in the image
def func(img):
    d = list(map(lambda x: cvtColor(x), img.reshape(-1, 3)))
    return np.array(d).reshape(*img.shape[:-1], 3)
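To make the effect concrete, here is a small illustrative check; the 2x2 pixel values below are made up for demonstration:
# Illustrative input (hypothetical values): any channel value below 255
# is zeroed, so only channels that are exactly 255 survive.
sample = np.array([[[255, 255, 255], [255, 0, 0]],
                   [[128, 128, 128], [255, 255, 255]]], dtype=np.uint8)
print(func(sample))
# [[[255 255 255] [255   0   0]]
#  [[  0   0   0] [255 255 255]]]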
Defines a custom data generator class DataGenerator, inherited from Sequence.

The DataGenerator class, defined by this code, creates batches of data to train a model. Here's a quick rundown of each section:
  • The class constructor sets up a number of arguments, including the data filenames, input and batch sizes, shuffle and random-seed options, color mode, encoding dictionary, and an optional preprocessing function.
  • The 'processing' method encodes the mask using the supplied dictionary.
  • The '__len__' method computes the number of batches per epoch from the batch and dataset sizes.
  • The '__getitem__' method creates a single batch of data by selecting a subset of filenames based on the batch index and calling '__data_generation' to load and preprocess the images and masks.
  • The 'on_epoch_end' method rebuilds the indexes after every epoch and, if desired, shuffles them.
  • The '__data_generation' method loads and preprocesses the images and masks for the current batch. It resizes them, handles the different color modes (HSV, RGB, and grayscale), applies the optional preprocessing function, and normalizes the pixel values.
A hypothetical usage sketch follows the class definition below.
class DataGenerator(Sequence):
    def __init__(self, all_filenames, input_size=(128, 128), batch_size=8, shuffle=True, seed=123, encode: dict = None, color_mode='hsv', function=None) -> None:
        super(DataGenerator, self).__init__()
        # Check that the encoding dictionary is provided
        assert encode is not None, 'Not empty !'
        # Check that the color mode is valid
        assert color_mode in ('hsv', 'rgb', 'gray')
        # Initialize instance variables
        self.all_filenames = all_filenames
        self.input_size = input_size
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.color_mode = color_mode
        self.encode = encode
        self.function = function
        # Set random seed for shuffling
        np.random.seed(seed)
        # Shuffle the data at the start
        self.on_epoch_end()

    def processing(self, mask):
        # Encode each mask pixel based on the provided dictionary
        d = list(map(lambda x: self.encode[tuple(x)], mask.reshape(-1, 3)))
        return np.array(d).reshape(*self.input_size, 1)

    def __len__(self):
        # Calculate the number of batches per epoch
        return int(np.floor(len(self.all_filenames) / self.batch_size))

    def __getitem__(self, index):
        # Generate one batch of data
        indexes = self.indexes[index * self.batch_size : (index + 1) * self.batch_size]
        all_filenames_temp = [self.all_filenames[k] for k in indexes]
        X, Y = self.__data_generation(all_filenames_temp)
        return X, Y

    def on_epoch_end(self):
        # Update indexes after each epoch
        self.indexes = np.arange(len(self.all_filenames))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, all_filenames_temp):
        # Generates data containing batch_size samples
        # Initialize arrays for images and masks
        batch = len(all_filenames_temp)
        if self.color_mode == 'gray':
            X = np.empty(shape=(batch, *self.input_size, 1))
        else:
            X = np.empty(shape=(batch, *self.input_size, 3))
        Y = np.empty(shape=(batch, *self.input_size, 1))
        # Iterate over the (image, mask) filename pairs in the current batch
        for i, (fn, label_fn) in enumerate(all_filenames_temp):
            # Load and preprocess the image
            img = cv2.imread(fn)
            if self.color_mode == 'hsv':
                img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
            elif self.color_mode == 'rgb':
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            elif self.color_mode == 'gray':
                img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                img = tf.expand_dims(img, axis=2)
            img = tf.image.resize(img, self.input_size, method='nearest')
            img = tf.cast(img, tf.float32)
            img /= 255.
            # Load and preprocess the mask (read in color so BGR2RGB is valid)
            mask = cv2.imread(label_fn)
            mask = cv2.cvtColor(mask, cv2.COLOR_BGR2RGB)
            mask = tf.image.resize(mask, self.input_size, method='nearest')
            mask = np.array(mask)
            if self.function:
                mask = self.function(mask)
            mask = self.processing(mask)
            mask = tf.cast(mask, tf.float32)
            # Assign the image and mask to the batch arrays
            X[i,] = img
            Y[i,] = mask
        return X, Y
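For orientation, instantiation might look like the sketch below. The file paths and the two-color encoding dictionary are placeholders rather than values from the project, and the masks are assumed to contain only those two colors.
# Hypothetical (image, mask) path pairs and a toy encode dictionary
pairs = [('images/img_0.png', 'masks/img_0.png'),
         ('images/img_1.png', 'masks/img_1.png')]
toy_encode = {(0, 0, 0): 0, (255, 255, 255): 1}
gen = DataGenerator(pairs, input_size=(128, 128), batch_size=2,
                    encode=toy_encode, color_mode='rgb')
X, Y = gen[0]            # first batch
print(X.shape, Y.shape)  # (2, 128, 128, 3) (2, 128, 128, 1)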

Step:3

Converts pixel-wise labels to unique integers and saves the mapping in a pickle file.

The 'encode_label' function flattens the mask into a list of pixel tuples, collects the unique labels, builds an encoder dictionary, and saves the dictionary to a pickle file so the mapping can be reused later.
def encode_label(mask):
    # input shape: (batch, rows, cols, channels)
    # Initialize an empty list to store labels
    label = []
    # Iterate over each pixel in the mask
    for i in mask.reshape(-1, 3):
        # Convert each pixel to a tuple and append it to the label list
        label.append(tuple(i))
    # Convert the list of tuples to a set to get the unique labels
    label = set(label)
    # Create an encoder dictionary: keys are unique labels, values are their indices
    encoder = dict((j, i) for i, j in enumerate(label))  # key is tuple
    # Save the encoder dictionary to a pickle file
    with open('label.pickle', 'wb') as handle:
        pickle.dump(encoder, handle, protocol=pickle.HIGHEST_PROTOCOL)
    # Return the encoder dictionary
    return encoder
# Print the function reference (not calling the function)
print(encode_label)
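A toy mask makes the mapping visible; the colors and shape below are illustrative:
# Illustrative 1x2x2 'batch' with two colors; the resulting dictionary
# maps each unique color to an index (ordering depends on set iteration)
toy_mask = np.array([[[[0, 0, 0], [255, 255, 255]],
                      [[255, 255, 255], [0, 0, 0]]]], dtype=np.uint8)
encoder = encode_label(toy_mask)
print(encoder)  # e.g. {(0, 0, 0): 0, (255, 255, 255): 1}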
Decodes model predictions back to pixel-wise labels using the saved mapping.

The function converts predicted class scores into label indices via argmax, maps each index back to a pixel value using the saved mapping, and reshapes the result into an image with 3 channels.
def decode_label(predict, label):
    # Convert predicted scores to label indices using argmax along the channel axis
    predict = np.argmax(predict, axis=3)
    # Map label indices back to label values using the provided dictionary
    d = list(map(lambda x: label[int(x)], predict.reshape(-1, 1)))
    # Reshape the decoded labels into an image shape with 3 channels
    img = np.array(d).reshape(*predict.shape, 3)
    # Return the decoded image
    return img
# Print the function reference (not calling the function)
print(decode_label)
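Note that decode_label expects the inverse mapping (index to color), so the dictionary from encode_label must be inverted first. A minimal round trip with a toy mapping and a fake prediction might look like this:
# Toy mapping and its inverse (decode_label needs index -> color)
encoder = {(0, 0, 0): 0, (255, 255, 255): 1}
decoder = {v: k for k, v in encoder.items()}
# Fake softmax output for a single 2x2 image with 2 classes
predict = np.eye(2)[np.array([[[0, 1], [1, 0]]])]  # shape (1, 2, 2, 2)
print(decode_label(predict, decoder).shape)        # (1, 2, 2, 3)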
Given file paths for training and validation data, prepares instances of the DataGenerator.

This function loads and resizes a random subset of masks, builds the label dictionary with 'encode_label', and creates a DataGenerator for the training data; if validation filenames are supplied, it creates a second generator for them. Both generators are returned.
def DataLoader(all_train_filename, all_mask, all_valid_filename=None, input_size=(128, 128), batch_size=4, shuffle=True, seed=123, color_mode='hsv', function=None):
    # Randomly select a subset of masks for encoding labels
    mask_folder = sklearn.utils.shuffle(all_mask, random_state=47)[:16]
    # Load and resize the masks
    mask = [tf.image.resize(cv2.cvtColor(cv2.imread(img), cv2.COLOR_BGR2RGB), input_size, method='nearest') for img in mask_folder]
    mask = np.array(mask)
    # Apply the preprocessing function to the masks if provided
    if function:
        mask = function(mask)
    # Encode the masks to create the label dictionary
    encode = encode_label(mask)
    # Create a DataGenerator for the training data
    train = DataGenerator(all_train_filename, input_size, batch_size, shuffle, seed, encode, color_mode, function)
    # If validation filenames are provided, create a DataGenerator for them too
    if all_valid_filename is None:
        return train, None
    else:
        valid = DataGenerator(all_valid_filename, input_size, batch_size, shuffle, seed, encode, color_mode, function)
        return train, valid
# Print the function reference (not calling the function)
print(DataLoader)
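With glob2 already imported, wiring the loader up could look like the sketch below; the directory layout and glob patterns are hypothetical and should be adjusted to the actual dataset (here no validation set is passed, so the second return value is None):
# Hypothetical dataset layout; adjust the patterns to the real folders
images = sorted(glob2.glob('data/train/images/*.png'))
masks = sorted(glob2.glob('data/train/masks/*.png'))
pairs = list(zip(images, masks))
train_gen, valid_gen = DataLoader(pairs, masks, input_size=(128, 128),
                                  batch_size=4, color_mode='rgb', function=func)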

Step:4

Defines a downsampling block in the U-Net model.

The function creates a downsampling block for a convolutional neural network, consisting of two convolutional layers with the specified number of filters, batch normalization, and LeakyReLU activation. It optionally applies MaxPooling2D with a stride of (2, 2), returning both the pooled output and the pre-pooling feature map for the skip connection.
def down_block(x, filters, use_maxpool=True):
    # Apply two convolutional layers with the specified filters and LeakyReLU activation
    x = Conv2D(filters, 3, padding='same')(x)
    x = BatchNormalization()(x)
    x = LeakyReLU()(x)
    x = Conv2D(filters, 3, padding='same')(x)
    x = BatchNormalization()(x)
    x = LeakyReLU()(x)
    # Optionally apply MaxPooling2D with a stride of (2, 2), returning the
    # pooled output together with the pre-pooling feature map (skip connection)
    if use_maxpool:
        return MaxPooling2D(strides=(2, 2))(x), x
    else:
        return x
# Print the function reference (not calling the function)
print(down_block)
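A quick, illustrative shape check confirms that pooling halves the spatial dimensions while the skip tensor keeps full resolution:
# Illustrative shape check on a dummy input tensor
inp = Input(shape=(128, 128, 3))
pooled, skip = down_block(inp, 64)
print(pooled.shape, skip.shape)  # (None, 64, 64, 64) (None, 128, 128, 64)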
Defines an upsampling block in the U-Net model.

The function defines the upsampling block: it upsamples the input feature map, concatenates it with the corresponding feature map from the contracting path, applies two convolutional layers with batch normalization and LeakyReLU activation, and returns the output feature map.
def up_block(x, y, filters):
    # Upsample the input feature map
    x = UpSampling2D()(x)
    # Concatenate the upsampled feature map with the corresponding feature map from the contracting path
    x = Concatenate(axis=3)([x, y])
    # Apply two convolutional layers with the specified filters and LeakyReLU activation
    x = Conv2D(filters, 3, padding='same')(x)
    x = BatchNormalization()(x)
    x = LeakyReLU()(x)
    x = Conv2D(filters, 3, padding='same')(x)
    x = BatchNormalization()(x)
    x = LeakyReLU()(x)
    # Return the output feature map
    return x
# Print the function reference (not calling the function)
print(up_block)
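Mirroring the check above, the expanding-path block restores the skip connection's resolution (shapes shown are illustrative):
# Illustrative shape check: upsample the pooled tensor and merge the skip
inp = Input(shape=(128, 128, 3))
pooled, skip = down_block(inp, 64)
merged = up_block(pooled, skip, 64)
print(merged.shape)  # (None, 128, 128, 64)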
Defines the U-Net model architecture using TensorFlow/Keras.

The U-Net model architecture for semantic segmentation is defined by this function. It consists of an encoding path that uses 'down_block' for downsampling and a decoding path that uses 'up_block' for upsampling together with skip connections from the encoding path. A softmax activation generates the final per-pixel class probabilities, and the assembled model is returned.
def Unet(input_size=(128, 128, 3), *, classes, dropout):
    # Define the number of filters for each downsampling level
    filters = [64, 128, 256, 512, 1024]
    # Input layer
    input = Input(shape=input_size)
    # Encoding path
    # Apply down_block at each level and store the skip connections
    x, temp1 = down_block(input, filters[0])
    x, temp2 = down_block(x, filters[1])
    x, temp3 = down_block(x, filters[2])
    x, temp4 = down_block(x, filters[3])
    x = down_block(x, filters[4], use_maxpool=False)  # bottleneck: no max pooling
    # Decoding path
    # Apply up_block at each level using the stored skip connections
    x = up_block(x, temp4, filters[3])
    x = up_block(x, temp3, filters[2])
    x = up_block(x, temp2, filters[1])
    x = up_block(x, temp1, filters[0])
    # Apply dropout
    x = Dropout(dropout)(x)
    # Output layer: per-pixel class probabilities
    output = Conv2D(classes, 1, activation='softmax')(x)
    # Define and summarize the model
    model = models.Model(input, output, name='unet')
    model.summary()
    # Return the model
    return model
# Print the function reference (not calling the function)
print(Unet)
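Building and compiling the network could look like this; the class count, dropout rate, loss, and optimizer are illustrative choices rather than values fixed by the project. Since the generator produces integer-encoded masks, sparse categorical cross-entropy is a natural fit:
# Illustrative: 2 segmentation classes and a 0.3 dropout rate
model = Unet(input_size=(128, 128, 3), classes=2, dropout=0.3)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_gen, validation_data=valid_gen, epochs=50)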

Step:5

Defines a class m_iou for calculating Mean Intersection over Union (IoU) metrics.

Two methods are implemented in this class:
  • mean_iou: uses Keras's MeanIoU metric to compute the mean intersection over union (IoU) across all classes.
  • miou_class: computes the IoU for each class separately and prints the results.
The classes parameter specifies the number of classes in the segmentation task. The class is intended for semantic segmentation, where every pixel is assigned a class label.
from tensorflow.keras.metrics import MeanIoU
import numpy as np

class m_iou():
    def __init__(self, classes: int) -> None:
        # Initialize the number of classes
        self.classes = classes

    def mean_iou(self, y_true, y_pred):
        # Compute the mean IoU metric using Keras's MeanIoU
        y_pred = np.argmax(y_pred, axis=3)
        miou_keras = MeanIoU(num_classes=self.classes)
        miou_keras.update_state(y_true, y_pred)
        return miou_keras.result().numpy()

    def miou_class(self, y_true, y_pred):
        # Compute the IoU for each class from the confusion matrix
        y_pred = np.argmax(y_pred, axis=3)
        miou_keras = MeanIoU(num_classes=self.classes)
        miou_keras.update_state(y_true, y_pred)
        values = np.array(miou_keras.get_weights()).reshape(self.classes, self.classes)
        for i in range(self.classes):
            class_iou = values[i, i] / (sum(values[i, :]) + sum(values[:, i]) - values[i, i])
            print(f'IoU for class {str(i + 1)} is: {class_iou}')
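Usage is straightforward; the arrays below are toy values for illustration:
# Toy example: 2 classes, one 2x2 'image'
metric = m_iou(classes=2)
y_true = np.array([[[0, 1], [1, 0]]])             # integer labels, (1, 2, 2)
y_pred = np.eye(2)[np.array([[[0, 1], [0, 0]]])]  # one-hot scores, (1, 2, 2, 2)
print(metric.mean_iou(y_true, y_pred))            # scalar mean IoU
metric.miou_class(y_true, y_pred)                 # prints per-class IoU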
Plots training and validation loss, accuracy, and mean IoU over epochs.

This function plots a model's training history. If validation data is available, it plots training and validation loss, accuracy, and mean IoU over the epochs; otherwise it plots only the training metrics. The function adapts to whichever metrics the history contains. A minimal sketch is given below.
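The original plotting cell is not reproduced here, so the following is a minimal sketch matching the description. It assumes a standard Keras History object and metric keys such as 'loss', 'accuracy', and 'mean_iou' (with 'val_' counterparts when validation data is used); the actual key names depend on how the model was compiled.
def plot_history(history):
    # Plot each tracked metric, adding the validation curve when present
    metrics = ['loss', 'accuracy', 'mean_iou']  # assumed metric keys
    plt.figure(figsize=(15, 4))
    for n, metric in enumerate(metrics):
        if metric not in history.history:
            continue
        plt.subplot(1, len(metrics), n + 1)
        plt.plot(history.history[metric], label='train')
        val_key = 'val_' + metric
        if val_key in history.history:
            plt.plot(history.history[val_key], label='validation')
        plt.xlabel('epoch')
        plt.title(metric)
        plt.legend()
    plt.show()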