
Vegetable Classification with Parallel CNN Model

The Vegetable Classification project shows how CNNs can sort vegetables efficiently. As industries like agriculture and food retail grow, automating vegetable identification becomes crucial. This project provides a guide to building both custom and parallel CNN models. These models help improve classification accuracy, reduce manual effort, and enhance quality control.


By learning these methods, you can solve harder classification problems in many areas. The project shows how machine learning helps with smart farming, inventory, and sustainability. It highlights how automating vegetable sorting can benefit real-world tasks.

Project Overview

This project aims to build a parallel CNN model that classifies vegetable types. The model analyzes vegetable images and finds key features that set each type apart. It uses two branches of convolutional layers. These branches merge into a dense network for classification. We also trained a traditional CNN model to compare its performance with that of the parallel CNN. We check the results to ensure vegetables are sorted correctly into their categories.


This tutorial covers data preparation, model evaluation, and visualization. It also includes troubleshooting steps. Along with vegetable sorting, it teaches key machine learning ideas. These include data cleaning, augmentation, and model improvement. These concepts apply to broader classification tasks as well.


Prerequisites

Before starting, you should know the basics of Python and machine learning. You should also understand deep learning concepts. Specifically, knowledge of convolutional neural networks (CNNs) will be essential. Familiarity with TensorFlow and Keras will make the implementation easier to follow. Experience with datasets and processing image data using Pandas and NumPy is useful.


You need access to Google Colab or Jupyter Notebook to run the code and handle computations. Additionally, basic knowledge of metrics like accuracy, precision, recall, and AUC is helpful for measuring the model's performance.


Approach

In this project, we use a machine-learning method with parallel CNN models. The traditional CNN model is our benchmark. The parallel CNN uses multiple branches of layers for deeper feature extraction. We train both models on the same vegetable image dataset. We focus on testing how well the models generalize using validation and test sets.


We use image preprocessing methods like resizing, normalization, and data augmentation. These steps help with training and prevent overfitting. We also use callbacks like early stopping and learning rate reduction. This improves training time and performance. We show the results with a confusion matrix, classification report, and training metrics.


Workflow and Methodology

The overall workflow of this project includes:

  • Data Collection: Gathering images of different vegetable types for classification.
  • Data Preprocessing: Resizing, normalizing, and augmenting images to prepare for training.
  • Model Design: Building both the custom CNN model and the parallel CNN model.
  • Training: Training the models with training data and evaluating them using validation data.
  • Evaluation: Testing the models on unseen test data to check classification accuracy.
  • Visualization: Plotting metrics like accuracy, precision, recall, AUC, and confusion matrix.
  • Optimization: We use callbacks like early stopping and learning rate reduction. These callbacks help improve the model.

The methodology involves:

  • Data Preprocessing: We resize and normalize raw images. This converts them into tensors for CNN input. This ensures the model processes consistent image sizes.

  • CNN Architecture: We design convolutional layers for traditional and parallel CNN models. These layers extract important features for classification.

  • Metrics: We evaluate model performance using accuracy, precision, recall, and AUC. This ensures a thorough performance check.

  • Callbacks: We use callbacks like early stopping and learning rate reduction. These help prevent overfitting and improve training efficiency.


Data Collection

The dataset contains images of vegetables such as tomato, cucumber, bean, bitter gourd, brinjal, broccoli, cabbage, capsicum, carrot, cauliflower, papaya, potato, pumpkin, and radish.

We divided the dataset into three parts:

  • Training set
  • Validation set
  • Test set

The images are stored in separate directories for easy model training and evaluation. You can upload the dataset to Google Drive and mount it in Google Colab for quick access during the project.
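
Since image_dataset_from_directory infers labels from folder names, each split needs one subfolder per vegetable class. The layout below is a hedged sketch of how the Vegetable Images folder might be organized; the class folder names shown are illustrative examples based on the categories listed above, not a verified listing of the actual dataset.

Vegetable Images/
├── train/
│   ├── Bean/
│   ├── Broccoli/
│   ├── Tomato/
│   └── ... (one folder per vegetable class)
├── validation/
│   └── ... (same class folders)
└── test/
    └── ... (same class folders)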


Data Preparation

The images are first resized to 224x224 pixels to ensure uniformity in the dataset. Next, we apply normalization to scale pixel values between 0 and 1, which speeds up model training. Additionally, we use data augmentation methods like flipping, rotation, and zooming. This creates more diverse training data. This process reduces overfitting and improves the model's ability to generalize.


Data Preparation Workflow

  • Image Resizing: We make sure all images are 224x224 pixels. This keeps the dimensions uniform.
  • Normalization: We scale pixel values from 0-255 to 0-1. This improves training efficiency.
  • Augmentation: We use data augmentation methods like flipping, rotation, and zooming to create more diverse training data (a minimal sketch follows this list).
  • Dataset Splitting: We split the dataset into training, validation, and test sets. This helps us develop and evaluate the model properly.
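
The preprocessing code later in this tutorial handles resizing and rescaling, but it does not show an explicit augmentation step. The snippet below is a minimal sketch of how flipping, rotation, and zooming could be added with Keras preprocessing layers; the specific layers and factors are illustrative assumptions, not code from the original notebook.

# Hedged sketch: optional augmentation layers (illustrative values, not from the original project)
augmentation_layers = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),   # random horizontal flips
    tf.keras.layers.RandomRotation(0.1),        # rotate by up to 10% of a full turn
    tf.keras.layers.RandomZoom(0.1)             # zoom in or out by up to 10%
])
# One way to apply it to the training pipeline only (labels pass through unchanged):
# train_data = train_data.map(lambda x, y: (augmentation_layers(x, training=True), y))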

Code Explanation

STEP 1:

You can mount your Google Drive in a Google Colab notebook with this piece of code. This makes it easy to view files saved in Google Drive. In Colab, you can change and analyze data. You can also train models.

from google.colab import drive
drive.mount('/content/drive')

Import the necessary packages.

This block of code sets up the necessary tools and layers to build a CNN model. It trains and evaluates the model for image classification using TensorFlow and Keras.

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from tensorflow.keras.metrics import SparseCategoricalAccuracy
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.layers import Dropout, BatchNormalization
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import classification_report
import tensorflow.keras.layers as layers
from tensorflow.keras.layers import Input, GaussianNoise
from tensorflow.keras.models import Model
from tensorflow.keras.metrics import Precision, Recall, AUC
from sklearn.metrics import confusion_matrix
import seaborn as sns
import numpy as np
import pandas as pd
import random
import time
import math
import matplotlib.pyplot as plt
import plotly.express as px
%matplotlib inline

Check GPU availability

This code checks for available GPUs and sets them to use memory dynamically. It then returns the number of GPUs detected.

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
len(gpus)

STEP 2:

Data processing

This block of code initializes variables for a TensorFlow data pipeline. It sets the batch size and image dimensions. It also defines a random seed and the path to the dataset. Additionally, it enables automatic tuning for better data-loading performance.

batch_size = 32
img_height = 224
img_width = 224
seed = 42
PATH = "/content/drive/MyDrive/Vegetable Images"
AUTOTUNE = tf.data.experimental.AUTOTUNE

This code creates a training dataset by loading images from `{PATH}/train`. It resizes the images to specific dimensions. The `image_dataset_from_directory` function batches and shuffles the dataset. A seed is used for randomization.

train_data = tf.keras.utils.image_dataset_from_directory(
    f"{PATH}/train",
    seed=seed,
    image_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True
)

This code creates a testing dataset by loading and resizing images from `{PATH}/test`. The `image_dataset_from_directory` function resizes images and batches them. It keeps the original order (shuffle=False) for consistent evaluation.

test_data = tf.keras.utils.image_dataset_from_directory(
  f"{PATH}/test",
  seed=seed,
  image_size=(img_height, img_width),
  batch_size=batch_size,
  shuffle=False
)

This code creates a validation dataset. It loads and resizes images from the directory `{PATH}/validation`. Furthermore, the image_dataset_from_directory function resizes and batches the images. It shuffles the images (shuffle=True) using a seed. This ensures a random order for robust model evaluation.

val_data = tf.keras.utils.image_dataset_from_directory(
  f"{PATH}/validation",
  seed=seed,
  image_size=(img_height, img_width),
  batch_size=batch_size,
  shuffle=True
)

Show class names

This code gets and shows the list of class names (labels) in the training dataset (`train_data`). Additionally, these class names match the different vegetable categories in your dataset.

class_names = train_data.class_names
class_names

Counting Unique Classes in the Dataset

This code calculates the total number of unique classes in the dataset. It does this by determining the length of the class_names list. Then, it displays the result. This gives you the number of different vegetable types that the model will classify.

num_classes = len(class_names)
num_classes

STEP 3:

Data Analysis and Visualization

Plotting distributions

This plot_label_distribution function visualizes label distribution in a dataset using Plotly. It combines all labels into a single array, maps them to class names, and creates a DataFrame. Then, it uses Plotly's px.histogram to create a count plot. Each label is shown with distinct colors. It adds titles and labels to the plot. Finally, it displays the plot.

def plot_label_distribution(dataset, class_names, num_classes):
    labels = np.concatenate([batch[1] for batch in dataset], axis=0)
    df = pd.DataFrame(labels, columns=['Labels'])
    df['Labels'] = df['Labels'].map({i: class_names[i] for i in range(num_classes)})
    # Using Plotly to create a count plot with different colors for different labels
    fig = px.histogram(df, x='Labels', color='Labels', title='Label Distribution',
                       labels={'Labels': 'Label'},
                       category_orders={"Labels": class_names},
                       color_discrete_sequence=px.colors.qualitative.Safe)
    fig.update_layout(xaxis_title='Labels', yaxis_title='Count',
                      title_x=0.5, showlegend=False)
    fig.show()

Show plot of train data distribution

This block of code calls the plot_label_distribution function. The function creates a visual plot to show the distribution of labels. It shows how many images belong to each vegetable class in the train_data dataset. The plot helps you check if the dataset is balanced. It also highlights any classes with significantly more or fewer images. This information is important for model training.

plot_label_distribution(train_data, class_names, num_classes)

Show plot of validation data distribution

This code creates a visual plot of label distribution for the validation dataset. It uses the `plot_label_distribution` function. It shows the distribution of images across vegetable classes in the validation set. This helps ensure that the validation data is balanced. It also checks if the validation data is representative of the training data.

plot_label_distribution(val_data, class_names, num_classes)

Show plot of test data distribution

This code creates a visual plot of label distribution for the test dataset. It uses the `plot_label_distribution` function. It shows how images are spread across vegetable classes in the test set. This allows you to assess the balance of the test data. It helps check if the test data is suitable for evaluating the model's performance.

plot_label_distribution(test_data, class_names, num_classes)

STEP 4:

Prepare dataset for training and evaluation

This block of code preprocesses train_data, test_data, and val_data by unbatching them first. For `train_data` and `val_data`, it caches, shuffles, and re-batches them with a set batch size, then prefetches batches to improve performance. `test_data` is unbatched and re-batched without caching or shuffling, and it is prefetched as well, so data loads efficiently during training and evaluation.

train_data = train_data.unbatch().cache().shuffle(2000).batch(batch_size, drop_remainder=True).prefetch(buffer_size=AUTOTUNE)
test_data = test_data.unbatch().batch(batch_size, drop_remainder=True).prefetch(buffer_size=AUTOTUNE)
val_data = val_data.unbatch().cache().shuffle(1000).batch(batch_size, drop_remainder=True).prefetch(buffer_size=AUTOTUNE)

Plotting example images

The function `closestDivisors` finds the pair of divisors of a number n that are closest to each other. It starts from the square root of n and decrements until it finds a divisor, then returns that divisor and its complement. For example, with a batch size of 32 it returns (4, 8), which the next function uses as the plotting grid.

def closestDivisors(n):
    a = round(math.sqrt(n))
    while n%a > 0: a -= 1
    return a, n//a

The `plot_images` function shows a batch of images from a dataset. It defaults to `train_data`. It calculates the grid layout by finding the closest divisors of the batch size for rows and columns. Then, it displays the images in a grid. It sets each subplot title to the corresponding class name. It hides the axes to create a cleaner view.

def plot_images(data=train_data):
    n_rows, n_cols = closestDivisors(batch_size)
    plt.figure(figsize=(n_cols*2, int(n_rows*1.8)))
    for images, labels in data.take(1).cache():  # take(1) grabs one batch from the dataset
        for i in range(n_rows*n_cols):
            ax = plt.subplot(n_rows, n_cols, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))
            plt.title(class_names[labels[i]])
            plt.axis("off")
plot_images()
plot_images(data=test_data)
plot_images(data=val_data)

STEP 5:

Choosing an AI Model

When selecting an AI model, consider the complexity of the classification task. For this project, convolutional neural networks (CNNs) are ideal because they excel at image recognition tasks. Additionally, CNNs are highly effective at capturing patterns and features in images, which makes them perfect for classifying vegetables.


We used both a traditional CNN and a parallel CNN model to compare their performance. The traditional CNN processes features through a single path, while the parallel CNN splits into two branches to capture deeper, more complex patterns. This approach enhances classification accuracy, especially when distinguishing between visually similar vegetables.


By carefully evaluating both models, we ensured the selected architecture aligned with the project's goal of efficient and accurate vegetable classification.


Build Custom CNN model (Prepare layers)

This block of code defines a preprocessing step to resize the images. It also normalizes pixel values to a 0-1 range. This is done before feeding the images into the model.

preprocessing_layers = tf.keras.Sequential([
  layers.Resizing(img_width, img_height),
  layers.Rescaling(1./255)
])

This code defines a sequential CNN model. The model is called `model_1`. It is used for image classification with TensorFlow and Keras. Moreover, it consists of multiple convolutional layers with different filter sizes and activations. These layers are interspersed with max pooling and batch normalization layers. Max pooling reduces the dimensions, and batch normalization normalizes activations. A flattening layer converts the 2D feature maps into a 1D vector. This is followed by fully connected layers with ReLU activations. A dropout layer is added for regularization. Finally, a softmax output layer classifies the images into num_classes categories.

model_1 = Sequential()
model_1.add(preprocessing_layers)
model_1.add(Conv2D(32, (3, 3), padding="same", activation="relu"))
model_1.add(Conv2D(80, (1, 1), padding="same", activation="relu"))
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Conv2D(40, (5, 5), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Conv2D(88, (3, 3), padding="same", activation="relu"))
model_1.add(Conv2D(80, (3, 3), padding="same", activation="relu"))
model_1.add(Conv2D(124, (5, 5), padding="same", activation="relu"))
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Conv2D(44, (5, 5), padding="same", activation="relu"))
model_1.add(Conv2D(32, (5, 5), padding="same", activation="relu"))
model_1.add(Conv2D(88, (5, 5), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Conv2D(92, (3, 3), padding="same", activation="relu"))
model_1.add(Conv2D(92, (5, 5), padding="same", activation="relu"))
model_1.add(MaxPooling2D((2, 2)))
model_1.add(Flatten())
model_1.add(Dense(680, activation='relu'))
model_1.add(Dense(552, activation='relu'))
model_1.add(Dropout(0.25))
model_1.add(Dense(num_classes, activation='softmax'))

This code defines two callbacks to improve neural network training and prevent overfitting. The EarlyStopping callback monitors validation loss (val_loss) during training. It stops training if the loss doesn't improve for 5 epochs, starting from the 3rd epoch, and restores the best weights after stopping. The `ReduceLROnPlateau` callback multiplies the learning rate by a factor of 0.15 if `val_loss` does not improve by at least 0.0004 (min_delta) for 2 consecutive epochs. The minimum learning rate is capped at 1e-10.

early_stopping=EarlyStopping(monitor='val_loss', patience=5,start_from_epoch=3,restore_best_weights=True)
reduce_lr=ReduceLROnPlateau(monitor='val_loss', factor=0.15, patience=2, min_lr=1e-10, min_delta=0.0004, mode='min')

This block of code defines custom metrics for precision, recall, and AUC. These metrics are designed for sparse categorical predictions. It subclasses TensorFlow's metrics and overrides the update_state method. The method converts predicted probabilities to class indices using tf.argmax. It compiles `model_1` with custom metrics. It uses the Adam optimizer and sparse categorical cross-entropy loss. Finally, it trains the model for 40 epochs on the training data. It uses validation data. It also includes callbacks for early stopping and learning rate reduction. The training history is saved in history_model_1.

# Custom metrics for sparse categorical predictions
class SparseCategoricalPrecision(tf.keras.metrics.Precision):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
class SparseCategoricalRecall(tf.keras.metrics.Recall):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
class SparseCategoricalAUC(tf.keras.metrics.AUC):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
# Compile the model with the updated metrics
model_1.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                metrics=[SparseCategoricalAccuracy(),
                         SparseCategoricalPrecision(),
                         SparseCategoricalRecall(),
                         SparseCategoricalAUC()])
epochs = 40
history_model_1 = model_1.fit(
       train_data,
       epochs=epochs,
       validation_data=val_data,
       batch_size=batch_size,
       callbacks=[early_stopping, reduce_lr],
       verbose=1)

This code shows a summary of 'model_1', giving an overview of its architecture. It includes the layers, output shapes, and the number of parameters for each layer and the whole model. Additionally, it offers a detailed breakdown of the model's structure.

model_1.summary()

This code creates a visual of 'model_1', showing the input and output shapes for each layer. The image resolution is set by `dpi=75`.

tf.keras.utils.plot_model(model_1, show_shapes=True, dpi=75)

Build Parallel CNN Model

Normalization layer: This code creates a normalization layer that rescales input pixels. It divides pixel values by 255 to normalize them from 0-255 to 0-1.

normalization_layer = tf.keras.layers.Rescaling(scale=1./255)

This code defines an input layer with shape (img_width, img_height, 3). It adds Gaussian noise with a 0.1 standard deviation. This helps regularize the model and make it more robust.

input_layer = Input(shape=(img_width, img_height, 3))
gaussian_noise = GaussianNoise(0.1)(input_layer)

First branch

This code defines the first branch of a neural network. It begins with a normalization layer. Then it adds convolutional layers with ReLU activations, batch normalization, and max pooling. These layers reduce the spatial size and increase feature depth. They extract important features from the input.

branch_1 = normalization_layer(gaussian_noise)
branch_1 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_1)
branch_1 = BatchNormalization()(branch_1)
branch_1 = MaxPooling2D((2, 2))(branch_1)
branch_1 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_1)
branch_1 = BatchNormalization()(branch_1)
branch_1 = MaxPooling2D((2, 2))(branch_1)
branch_1 = Conv2D(64, (3,3), padding='same', activation="relu")(branch_1)
branch_1 = BatchNormalization()(branch_1)
branch_1 = MaxPooling2D((2, 2))(branch_1)
branch_1 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_1)
branch_1 = BatchNormalization()(branch_1)
branch_1 = MaxPooling2D((2, 2))(branch_1)
branch_1 = Conv2D(16, (3,3), padding='same', activation="relu")(branch_1)
branch_1 = BatchNormalization()(branch_1)
branch_1 = MaxPooling2D((2, 2))(branch_1)

Second Branch

This block of code defines the second branch of the network, mirroring the first. It applies the normalization layer to the same noise-injected input, then stacks convolutional layers with ReLU activation, batch normalization, and max pooling. These layers gradually extract features from the input data.

branch_2 = normalization_layer(gaussian_noise)
branch_2 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_2)
branch_2 = BatchNormalization()(branch_2)
branch_2 = MaxPooling2D((2, 2))(branch_2)
branch_2 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_2)
branch_2 = BatchNormalization()(branch_2)
branch_2 = MaxPooling2D((2, 2))(branch_2)
branch_2 = Conv2D(64, (3,3), padding='same', activation="relu")(branch_2)
branch_2 = BatchNormalization()(branch_2)
branch_2 = MaxPooling2D((2, 2))(branch_2)
branch_2 = Conv2D(32, (3,3), padding='same', activation="relu")(branch_2)
branch_2 = BatchNormalization()(branch_2)
branch_2 = MaxPooling2D((2, 2))(branch_2)
branch_2 = Conv2D(16, (3,3), padding='same', activation="relu")(branch_2)
branch_2 = BatchNormalization()(branch_2)
branch_2 = MaxPooling2D((2, 2))(branch_2)

Merging layers

This code combines the outputs of the two branches along axis=1 (the height dimension of the feature maps). It then flattens the merged output into a 1D vector, preparing it for the dense layers that follow.

merged = tf.keras.layers.concatenate([branch_1, branch_2], axis=1)
merged = Flatten()(merged)

Output layers

This code defines the output layers of the neural network. It starts with two dense layers using ReLU activations. Then, a dropout layer with a 0.4 rate is applied for regularization. Finally, a dense layer with softmax activation outputs class probabilities for `num_classes` classification.

output_layers = Dense(128, activation='relu')(merged)
output_layers = Dense(64, activation='relu')(output_layers)
output_layers = Dropout(0.4)(output_layers)
output_layers = Dense(num_classes, activation='softmax')(output_layers)

Creating model instance

This code creates a Keras model, `model_2`. It sets `input_layer` as the input. It sets `output_layers` as the output. It connects all the layers into a complete neural network.

model_2 = tf.keras.Model(input_layer, output_layers)

Compiling the model

This code defines custom metrics for sparse categorical predictions. It subclasses TensorFlow's Precision, Recall, and AUC metrics. The `update_state` method is overridden to convert predicted probabilities to class indices using `tf.argmax`. It then compiles model_2 with the Adam optimizer at a learning rate of 1e-4 and sparse categorical cross-entropy loss. Custom metrics for accuracy, precision, recall, and AUC track performance during training.

class SparseCategoricalPrecision(tf.keras.metrics.Precision):
    def __init__(self, name='sparse_categorical_precision', **kwargs):
        super(SparseCategoricalPrecision, self).__init__(name=name, **kwargs)
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
class SparseCategoricalRecall(tf.keras.metrics.Recall):
    def __init__(self, name='sparse_categorical_recall', **kwargs):
        super(SparseCategoricalRecall, self).__init__(name=name, **kwargs)
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
class SparseCategoricalAUC(tf.keras.metrics.AUC):
    def __init__(self, name='sparse_categorical_auc', **kwargs):
        super(SparseCategoricalAUC, self).__init__(name=name, **kwargs)
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        return super().update_state(y_true, y_pred, sample_weight)
# Compile the model with metrics
model_2.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                metrics=[tf.keras.metrics.SparseCategoricalAccuracy(),
                         SparseCategoricalPrecision(),
                         SparseCategoricalRecall(),
                         SparseCategoricalAUC()])

This block of code displays a summary of model_2. It provides details about the model's architecture. Each layer's type and output shape are included. It also shows the number of parameters for each layer. The total number of parameters is displayed as well.

model_2.summary()

This block of code defines two callbacks for training model_2. The first callback is EarlyStopping. It stops training if the validation loss doesn't improve for 5 epochs and restores the best weights. The second callback is ReduceLROnPlateau. It multiplies the learning rate by a factor of 0.1 if validation loss doesn't improve for 2 consecutive epochs. The minimum learning rate is set to 1e-10, and the improvement threshold is 0.0004.

early_stopping= EarlyStopping(monitor='val_loss', patience=5,start_from_epoch=3,restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, min_lr=1e-10, min_delta=0.0004, mode='min')

This code trains `model_2` for up to 30 epochs using `train_data` and `val_data`. It uses a set batch size and callbacks like early stopping and reduced learning rate. Training stops if validation loss doesn't improve. The learning rate adjusts as needed. Progress is displayed during training.

epochs = 30
history_model_2 = model_2.fit(
  train_data,
  epochs = epochs,
  validation_data = val_data,
  batch_size = batch_size,
  callbacks = [early_stopping, reduce_lr],
  verbose = 1
)

This code generates a visual of `model_2`, showing the input and output shapes for each layer. The resolution is set to `dpi=75`.

tf.keras.utils.plot_model(model_2, show_shapes=True, dpi=75)

STEP 6:

Plotting Metrics

The `plot_metrics` function shows training and validation metrics. It arranges them in a 2x3 grid. It then evaluates the model on a test dataset. The function prints the test metrics and estimates the number of misclassified images, assuming a test set of 3,000 images. Finally, it returns the test metrics for further analysis.

def plot_metrics(history, model, test_data):
    # Extracting metrics from the history object
    acc = history.history['sparse_categorical_accuracy']
    val_acc = history.history['val_sparse_categorical_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    precision = history.history['sparse_categorical_precision']
    val_precision = history.history['val_sparse_categorical_precision']
    recall = history.history['sparse_categorical_recall']
    val_recall = history.history['val_sparse_categorical_recall']
    auc = history.history['sparse_categorical_auc']
    val_auc = history.history['val_sparse_categorical_auc']
    # Plotting accuracy, loss, precision, recall, and AUC
    plt.figure(figsize=(20, 12))
    plt.subplot(2, 3, 1)
    plt.plot(acc, label='Training Accuracy')
    plt.plot(val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')
    plt.subplot(2, 3, 2)
    plt.plot(loss, label='Training Loss')
    plt.plot(val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.subplot(2, 3, 3)
    plt.plot(precision, label='Training Precision')
    plt.plot(val_precision, label='Validation Precision')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Precision')
    plt.subplot(2, 3, 4)
    plt.plot(recall, label='Training Recall')
    plt.plot(val_recall, label='Validation Recall')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Recall')
    plt.subplot(2, 3, 5)
    plt.plot(auc, label='Training AUC')
    plt.plot(val_auc, label='Validation AUC')
    plt.legend(loc='lower right')
    plt.title('Training and Validation AUC')
    plt.tight_layout()
    plt.show()
    # Evaluating the model on the test dataset
    test_loss, test_accuracy, test_precision, test_recall, test_auc = model.evaluate(test_data)
    print(f"Test Accuracy: {test_accuracy}")
    print(f"Test Precision: {test_precision}")
    print(f"Test Recall: {test_recall}")
    print(f"Test AUC: {test_auc}")
    print(f"Test Loss: {test_loss}")
    print(f"Number of misclassified images in test dataset: {int((1 - test_accuracy) * 3000)} of 3000")
    return test_accuracy, test_loss, test_precision, test_recall, test_auc

This code calls the `plot_metrics` function to visualize `model_1`'s training history. It evaluates `model_1` on the test data and stores the test metrics in variables.

test_accuracy, test_loss, test_precision, test_recall, test_auc = plot_metrics(history_model_1, model_1, test_data)

This code defines the `get_actual_predicted_labels` function. It gets actual labels from an unbatched dataset. It generates predictions using a model. It stacks the actual labels. It converts predicted probabilities to class indices with `tf.argmax`. Finally, it returns both the actual and predicted labels.

def get_actual_predicted_labels(dataset, model):
    actual = [labels for _, labels in dataset.unbatch()]
    predicted = model.predict(dataset)
    actual = tf.stack(actual, axis=0)
    predicted = tf.concat(predicted, axis=0)
    predicted = tf.argmax(predicted, axis=1)
    return actual, predicted
actual, predicted = get_actual_predicted_labels(test_data, model_1)

The `plot_confusion_matrix` function creates a confusion matrix and visualizes it using Seaborn's heatmap. It adds titles, labels, and class names, adjusting the size and font for readability.

def plot_confusion_matrix(actual, predicted, ds_type):
    cm = tf.math.confusion_matrix(actual, predicted, num_classes=num_classes)
    ax = sns.heatmap(cm, annot=True, fmt='g')
    sns.set(rc={'figure.figsize':(15, 15)})
    sns.set(font_scale=1.4)
    ax.set_title('Confusion matrix of object recognition for ' + ds_type)
    ax.set_xlabel('Predicted Object')
    ax.set_ylabel('Actual Object')
    plt.xticks(rotation=90)
    plt.yticks(rotation=0)
    ax.xaxis.set_ticklabels(class_names)
    ax.yaxis.set_ticklabels(class_names)

Plot Confusion Matrix for Custom CNN Model

This code plots the confusion matrix for the custom CNN model (model_1) on the test dataset. The heatmap shows how often each actual vegetable class is predicted as each other class, which makes systematic confusions easy to spot.

plot_confusion_matrix(actual, predicted, 'test dataset')

This code visualizes the training history of the parallel CNN model (model_2) with `plot_metrics` and evaluates it on the test data. It then collects the actual and predicted labels for model_2 so its confusion matrix can be plotted.

test_accuracy, test_loss, test_precision, test_recall, test_auc = plot_metrics(history_model_2, model_2, test_data)
actual, predicted = get_actual_predicted_labels(test_data, model_2)

Plot Confusion Matrix for Parallel CNN Model

plot_confusion_matrix(actual, predicted, 'test dataset')

This block of code evaluates the combined predictions of model_1 and model_2 on a validation dataset by averaging their predicted probabilities and comparing the combined predictions to the true labels. It then generates and prints a detailed classification report, including precision, recall, and F1-score metrics.

# Collect true labels and predictions
true_labels = []
predictions = []
for x, y in val_data:
    # Get predictions from both models
    preds_1 = model_1.predict(x)
    preds_2 = model_2.predict(x)
    # Combine predictions (average the probabilities)
    combined_preds = (preds_1 + preds_2) / 2
    # Append true labels and combined predictions
    true_labels.extend(y)
    predictions.extend(np.argmax(combined_preds, axis=1))
# Convert lists to arrays
true_labels = np.array(true_labels)
predictions = np.array(predictions)
# Print classification report
print(classification_report(true_labels, predictions, digits=4))

This code calculates precision and recall for each class. It applies to multi-class classification problems. It uses a confusion matrix for this calculation. True positives (TP) are taken from the diagonal of the matrix. It then derives false positives (FP) and false negatives (FN). These values are used to calculate precision and recall for each class.

def calculate_classification_metrics(y_actual, y_pred):
    cm = tf.math.confusion_matrix(y_actual, y_pred)
    tp = np.diag(cm) # Diagonal represents true positives
    precision = dict()
    recall = dict()
    for i in range(len(class_names)):
        col = cm[:, i]
        fp = np.sum(col) - tp[i] # Sum of column minus true positive is false positive
        row = cm[i, :]
        fn = np.sum(row) - tp[i] # Sum of row minus true positive is false negative
        precision[class_names[i]] = tp[i] / (tp[i] + fp) # Precision
        recall[class_names[i]] = tp[i] / (tp[i] + fn) # Recall
    return precision, recall

This block of code calculates the precision and recall for each class in the test dataset. It uses the calculate_classification_metrics function with the actual and predicted labels. Then, it prints out the resulting precision and recall values for the model.

precision, recall = calculate_classification_metrics(actual, predicted) # Test dataset
print(f"Model precision: {precision}")
print(f"Model recall: {recall}")

STEP 7:

Visualizing Predictions

This block of code iterates through the test_data dataset, storing each batch in a list. It then extracts individual images and labels from these batches. The images are appended to the test_images list, and the labels are added to the test_labels list. Finally, it converts these lists into numpy arrays.

test_images = []
test_labels = []
batches = []
for batch in test_data.as_numpy_iterator():
    batches.append(batch)
for next_batch in batches:
    for image in next_batch[0]:
        test_images.append(image)
    for label in next_batch[1]:
        test_labels.append(label)
test_images = np.asarray(test_images)
test_labels = np.asarray(test_labels)
y_pred = np.array(model_1.predict(test_images))
y_true = np.array(test_labels)

This code turns predicted probabilities (y_pred) into class labels. It does this by finding the index of the maximum probability for each prediction. The index corresponds to the predicted class label.

pred_max = []
for x in y_pred:
    pred_max.append(np.argmax(x))
y_pred = pred_max

This code goes through test images, showing their predicted and true labels. It appends incorrectly classified images to false_class. The correctly classified images are added to true_class. Both the predicted and true labels are stored with each image.

false_class = list()
true_class = list()
for i in range(len(test_images)):
    if y_pred[i] != y_true[i]:
        false_class.append((test_images[i],y_pred[i],y_true[i]))
    else:
        true_class.append((test_images[i],y_pred[i],y_true[i]))

The `plot_predictions` function displays a grid of images from a list of (image, predicted label, true label) tuples. It shuffles the list, then plots each image with its predicted and true class names as the subplot title.

def plot_predictions(labels, cols=5, rows=4):
    random.shuffle(labels)
    fig = plt.figure(figsize=(3*cols, 10))
    for i in range(1, cols*rows +1):
        fig.add_subplot(rows, cols, i)
        plt.title(f'Predicted: {class_names[labels[i][1]]}\nTrue: {class_names[labels[i][2]]}')
        plt.axis("off")
        plt.imshow(labels[i][0].astype("uint8"))
    plt.tight_layout()

Visualizing False Predictions

This code visualizes the misclassified images (false_class) in a grid layout. It arranges the images in 4 columns. Each image displays both the predicted and true labels.

plot_predictions(false_class, cols=4)

Visualizing True Predictions

This code visualizes the correctly classified images (true_class) in a grid layout. It uses 4 columns to display the images. Each image shows both the predicted and true labels.

plot_predictions(true_class, cols=4)

Project Conclusion

In conclusion, the vegetable classification project shows how CNNs automate sorting. This automation makes work more efficient. It reduces manual tasks. It also ensures consistent results in agriculture, food retail, and supply chains. Using parallel CNN models enhances feature extraction, leading to more accurate classification outcomes.




Challenges and Troubleshooting

Several challenges came up during the project, particularly with overfitting and imbalanced datasets. The main issue was ensuring that the model generalized well to new data instead of just the training set. To prevent overfitting, we used data augmentation, early stopping, and dropout layers. Balancing the dataset was also key since some classes had more images. This imbalance caused poor performance for less-represented classes.
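
The notebook itself does not include a separate balancing step, so the sketch below is only an illustration of one common remedy: weighting the loss by inverse class frequency and passing the weights to fit. The variable names reuse objects defined earlier in the tutorial, but this approach is an assumption, not part of the original code.

# Hedged sketch: estimate class weights from the training labels (inverse class frequency)
labels = np.concatenate([y.numpy() for _, y in train_data], axis=0)
counts = np.bincount(labels, minlength=num_classes)
class_weight = {i: counts.sum() / (num_classes * max(c, 1)) for i, c in enumerate(counts)}
# These weights could then be supplied to training, e.g.:
# model_1.fit(train_data, epochs=epochs, validation_data=val_data,
#             class_weight=class_weight, callbacks=[early_stopping, reduce_lr])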


We also faced the challenge of ensuring efficient GPU usage in Google Colab. Configuring memory growth for the GPUs helped avoid errors and ensured smooth training. We improved the model's performance by adjusting the learning rate during training, which helped optimize learning on the validation and test sets. Finally, confusion matrices and classification reports helped us find where the model struggled, which was mostly with vegetables that look alike. These insights helped refine our data preprocessing methods and improve overall model accuracy.




FAQ

  1. What is the primary purpose of this vegetable classification project?

    • Answer: The project aims to automate vegetable classification using a Parallel CNN model. This increases efficiency and improves quality control in the agriculture and retail sectors.

  2. Why is data augmentation used in this project?

    • Answer: Data augmentation enhances the model’s generalization by creating varied training samples. This prevents overfitting and improves the model’s performance on unseen data.

  3. How does a parallel CNN model improve accuracy compared to a traditional CNN?

    • Answer: The parallel CNN extracts features from two paths, improving feature recognition. This results in higher classification accuracy compared to single-path CNN models.

  4. What specific challenges did you encounter during the training process?

    • Answer: Overfitting and imbalanced data were significant challenges during training. Early stopping, dropout, and data augmentation resolved these issues.

  5. How did changing the learning rate improve performance on validation and test sets?

    • Answer: Dynamic learning rate reduction improved the model’s convergence and fine-tuning. This led to better validation and test performance without overfitting.
