Automatic Eye Cataract Detection Using YOLOv8
Cataracts are a leading cause of vision impairment worldwide, affecting millions of people every year. Early detection and timely intervention can significantly improve the quality of life for those at risk. However, manual detection methods can be time-consuming and prone to human error. To address this challenge, we present the Automatic Eye Cataract Detection system. This project leverages advanced computer vision techniques and the YOLOv8 model to automate the detection of cataracts from eye images, providing an efficient, accurate, and scalable solution. By integrating this technology into healthcare, we can facilitate early diagnosis and help reduce the burden of cataract-related blindness.
Explanation of All the Code
STEP 1:
Mount Google Drive
Mount your Google Drive to access and save datasets, models, and other resources.
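In a Colab notebook, this step is a single call (the mount point below is the Colab default):

```python
# Colab-only: mount Google Drive so datasets and trained weights persist
# across sessions. Prompts for authorization on the first run.
from google.colab import drive

drive.mount("/content/drive")
```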
Installing Necessary Packages
Installs the roboflow, gradio, and ultralytics Python packages. roboflow is used for managing and downloading datasets, ultralytics provides the YOLO models, and gradio is used later to build the web interface.
Downloading the Dataset from Roboflow
- Initializes a connection to Roboflow using your API key.
- Accesses a specific project in the workspace called "Kataract Object Detection."
- Downloads the third version of this dataset configured for YOLOv8.
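A minimal sketch of the install-and-download step. The workspace and project slugs below are placeholders; copy the exact snippet Roboflow shows on your dataset's download page:

```python
# Install first: pip install roboflow ultralytics gradio
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")  # placeholder: your Roboflow API key
project = rf.workspace().project("kataract-object-detection")  # placeholder slug
dataset = project.version(3).download("yolov8")  # version 3, YOLOv8 format

print(dataset.location)  # folder containing data.yaml and the image splits
```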
STEP 2:
Import Necessary Libraries
Imports the ultralytics library and runs a system check to ensure all dependencies are correctly set up for training the model.
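Assuming ultralytics is installed, the check is a single call:

```python
import ultralytics

# Prints the Ultralytics version plus Python, PyTorch, CUDA, and memory info,
# so you can confirm the GPU runtime is active before training.
ultralytics.checks()
```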
Train the YOLOv8 Model
- Trains the YOLOv8 model using the YOLOv8n (nano) pre-trained model.
- The dataset configuration is provided via the data.yaml file, and the training runs for 100 epochs with an image size of 640x640 pixels.
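A sketch of the training call; the data.yaml path below is an example and should point at the folder Roboflow downloaded:

```python
from ultralytics import YOLO

# Start from the pre-trained nano checkpoint and fine-tune on the cataract data.
model = YOLO("yolov8n.pt")
model.train(
    data="path/to/data.yaml",  # example path: the dataset config from Roboflow
    epochs=100,
    imgsz=640,
)
```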
Visualizing Training Results
After training, visualize results such as the confusion matrix, the results summary, and sample training batches.
- This step is useful for visualizing and assessing the performance of your model.
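Ultralytics writes these plots to disk during training. A sketch assuming the default output folder (it becomes train2, train3, ... on reruns):

```python
from IPython.display import Image, display

# Default Ultralytics output folder; adjust if you trained more than once.
RUN_DIR = "runs/detect/train"

for name in ("confusion_matrix.png", "results.png", "train_batch0.jpg"):
    display(Image(filename=f"{RUN_DIR}/{name}"))
```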
STEP 3:
Model Validation
Loads the best model (based on training) from the specified directory.
Runs validation on the model using the validation set and collects various metrics:
- map: Mean Average Precision averaged over IoU thresholds 0.50-0.95.
- map50: Mean Average Precision at IoU threshold 0.50.
- map75: Mean Average Precision at IoU threshold 0.75.
- maps: List of per-class mAPs.
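These metrics can be read from the object returned by model.val(); a sketch assuming the default training output path:

```python
from ultralytics import YOLO

# Load the best checkpoint saved during training (default path shown).
model = YOLO("runs/detect/train/weights/best.pt")
metrics = model.val()  # evaluates on the val split defined in data.yaml

print("mAP50-95:", metrics.box.map)    # averaged over IoU 0.50-0.95
print("mAP50:   ", metrics.box.map50)  # at IoU threshold 0.50
print("mAP75:   ", metrics.box.map75)  # at IoU threshold 0.75
print("per-class mAP:", metrics.box.maps)
```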
Visualizing Validation Results
- Opens and resizes images generated during validation, such as the confusion matrix and labeled batch images.
- These images help you visualize how well the model is performing on the validation set.
Display a validation batch with labels to visualize the model's accuracy on unseen data.
STEP 4:
Inference for Image Prediction
- Loads a trained model and uses it to predict objects in a specific image.
- Saves the prediction and then loads and displays the predicted image using matplotlib
Run the trained model on a test image to detect cataracts.
Display Predicted Image
Load and display the predicted image to visualize the model's output.
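A sketch of the prediction-and-display step; the image path is a placeholder:

```python
import matplotlib.pyplot as plt
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")
results = model.predict(source="path/to/test_eye.jpg", save=True)  # placeholder path

# predict(save=True) writes the annotated image under runs/detect/predict/;
# results[0].plot() returns the same annotation as a BGR NumPy array.
annotated = results[0].plot()
plt.imshow(annotated[..., ::-1])  # flip BGR -> RGB for matplotlib
plt.axis("off")
plt.show()
```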
STEP 5:
Set Up Gradio Interface for Image and Video Processing
Loading the YOLOv8 Model
Loads the trained YOLOv8 model from the specified file path. This model will be used to detect objects in both images and videos.
Defining the predict_images Function
Input:
- List of file paths to the images.
Process:
- For each image, the function loads the image, converts it to a NumPy array, and then converts it from RGB to BGR (since OpenCV works with BGR format).
- The image is passed to the YOLO model for object detection.
- The function loops through the detected objects (boxes), extracting the bounding box coordinates, the class label, and the confidence score.
- Bounding boxes and labels are drawn on the image using OpenCV.
- The image is then converted back to RGB and added to the output list.
Output:
- A list of processed images with drawn bounding boxes and labels.
Defining the predict_videos Function
Input:
- File path to the video.
Process:
- Opens the video file using OpenCV and initializes a list to store processed frames.
- A new video writer is set up to save the input video (for display purposes).
- Each frame of the video is read and passed to the YOLO model for object detection.
- Bounding boxes and labels are drawn on each frame for the detected objects.
- The processed frames are saved into a new video file.
Output:
- The path to the processed video file.
Defining the process_files Function
- Combines the predict_images and predict_videos functions to handle both images and video:
Input:
- List of image files and a video file.
Process:
- Calls the respective functions to process the images and video.
Configures the Gradio interface:
Inputs:
- Multiple image files (gr.File) and a video file (gr.Video) can be uploaded by the user.
Outputs:
- A gallery of processed images, the original video, and the processed video are displayed to the user.
This code provides a comprehensive solution for detecting objects in both images and videos using a trained YOLOv8 model, all within an easy-to-use web interface.
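An illustrative wiring of the interface described above. The two placeholder functions stand in for the real predict_images and predict_videos implementations, and the labels are assumptions:

```python
import gradio as gr

def predict_images(image_paths):
    return []  # placeholder: returns annotated RGB images

def predict_videos(video_path):
    return video_path  # placeholder: returns the path to the annotated video

def process_files(images, video):
    processed = predict_images([f.name for f in images]) if images else []
    out_video = predict_videos(video) if video else None
    return processed, video, out_video

demo = gr.Interface(
    fn=process_files,
    inputs=[gr.File(file_count="multiple", label="Eye images"),
            gr.Video(label="Eye video")],
    outputs=[gr.Gallery(label="Processed images"),
             gr.Video(label="Original video"),
             gr.Video(label="Processed video")],
    title="Automatic Eye Cataract Detection",
)
demo.launch()  # pass share=True in Colab to get a public link
```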