Unmasking the Predictions: Understanding Cataract Detection through Explainable AI

Rutwik Patel
9 min read · May 19, 2023


Welcome to this blog on Explainable AI, where we unravel the mysteries behind the decision-making process of artificial intelligence. As AI continues to permeate various aspects of our lives, there is a growing need for transparency and understanding in the way these algorithms arrive at their conclusions. In this blog, we will delve into the concept of explainability in AI, exploring its importance, its challenges, and the methods used to make AI systems more interpretable, transparent, and explainable. We will then apply explainable AI techniques to eye images to predict cataracts.

What is Explainable AI?

Explainable AI refers to methods and techniques in the application of artificial intelligence (AI) technology such that the results of the solution can be understood by human experts.

It contrasts with the concept of the “black box” in machine learning, where even the model’s designers cannot explain why the AI arrived at a specific decision. XAI can be seen as an implementation of the social right to explanation [1].

In simple words, explainable AI helps the end user of a system understand which factors were most important in producing the output. For example, suppose we are predicting whether an image contains a dog. By highlighting the region of the image where a dog is present, explainable AI can show which part of the image contributed most to the prediction.

Why do we need Explainable AI?

The use of neural networks for prediction and classification has grown rapidly, yet we don’t truly understand their decision-making processes, and creating an interpretable version of a model becomes more challenging as the model grows more complex. Here are a few reasons why explainability is important in AI, particularly with neural networks:

  1. Trust and accountability: As artificial intelligence (AI) technologies become more integrated into crucial sectors such as healthcare, banking, and autonomous cars, it is critical to develop trust and ensure accountability. Users, regulators, and stakeholders must comprehend how AI makes judgements, particularly when human lives or sensitive data are at stake. Explainability aids in the development of trust by providing insights into the thinking behind AI outputs.
  2. Human-AI collaboration: In scenarios where AI is used as a decision support tool, explainability becomes essential for effective human-AI collaboration. By understanding the reasoning behind AI-generated recommendations or predictions, humans can make informed decisions and have a more active role in the decision-making process.
  3. Compliance with regulations: Many industries have started implementing regulations and guidelines concerning AI systems. These regulations often require organizations to provide explanations for AI-driven decisions, especially in areas like finance (e.g., credit scoring) and healthcare (e.g., medical diagnoses). Explainable AI methods enable compliance with such regulations by providing interpretable and justifiable outputs.

Explainable AI (XAI) techniques: when should you use what?

LIME (Local Interpretable Model-Agnostic Explanations) and Grad-CAM (Gradient-weighted Class Activation Mapping) are two popular techniques for explaining the predictions of machine learning models, especially in the context of image classification tasks. While they serve similar purposes, they have distinct use cases and applications. Here’s when you should consider using LIME or Grad-CAM:

LIME: LIME is a versatile method that can be applied to various types of models, including both linear and non-linear models. It provides local interpretability by approximating the decision boundary of the model around a specific instance. LIME is useful when you need to understand the factors contributing to a prediction at the individual instance level. It generates explanations in the form of “perturbed” instances and their corresponding predictions, highlighting the importance of different features. LIME is beneficial when the model is complex and lacks inherent interpretability.

Example use cases for LIME:

  • Explaining the classification of an image by highlighting the regions that influenced the prediction.
  • Understanding the features driving a text classification model’s decision for a specific document.
  • Interpreting the factors behind a machine learning model’s recommendation for an individual user.
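To make the LIME workflow above concrete, here is a minimal sketch on tabular data using the lime package and scikit-learn. The dataset and classifier are purely illustrative (not the cataract model discussed later); the point is that LIME only needs a prediction function, so any supervised model can be plugged in.

```python
# Illustrative LIME sketch: train any "black-box" classifier, hand LIME its
# predict_proba, and inspect a local explanation for one instance.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
X, y = data.data, data.target

# Any model works -- LIME never looks inside it, only at its predictions.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    training_data=X,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# LIME perturbs the instance, queries the model on the perturbations,
# and fits a simple local surrogate whose weights rank the features.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(explanation.as_list())  # top features and their local weights
```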

Grad-CAM: Grad-CAM, on the other hand, focuses specifically on visualizing the regions of an image that contribute most significantly to a specific class prediction. It uses the gradient information flowing into the last convolutional layer of a neural network to highlight the important regions. Grad-CAM is especially effective for models based on convolutional neural networks (CNNs) commonly used in image classification tasks. It provides a heatmap overlay on the input image, indicating the regions of interest.

Example use cases for Grad-CAM:

  • Understanding which parts of an image were crucial for a CNN-based model to classify it as a specific object or class.
  • Explaining the regions in a medical image that influenced a CNN’s prediction for a disease.
  • Visualizing the areas of focus in an image that contributed to a model’s decision in tasks like object detection or segmentation.

In summary, LIME is a more general-purpose technique that can be applied to various models and domains, providing local interpretability at the individual instance level. Grad-CAM, on the other hand, is specifically designed for CNN-based models and excels at visualizing important regions within images for classification tasks. Consider LIME when you need broader interpretability, and choose Grad-CAM when you want to understand the visual attention of a CNN model.

Explaining Cataract Detection with Explainable AI

The field of medicine is where Explainable AI can be put to its best use. Imagine you are creating a neural network (or any other black-box model) to help predict cataracts given a patient’s records (in this case, an eye image).

You obtain a respectable accuracy and a strong positive predictive value after training and testing your model. When you present it to a doctor, they agree that it appears to be a powerful model.

But if the doctor (or the model) is unable to answer the straightforward question “Why did the model predict that this person is more likely to develop a cataract?”, they will be hesitant to use it.

For the doctor who wants to understand how the model works to help them improve their service, this lack of transparency is a concern. The patient’s desire for a specific justification for this prediction presents a challenge as well.

Is it ethical to explain to a patient that they are more likely to develop an illness if your only justification is that “the black-box told me so”? Science and patient empathy both are equally important in the field of health care.

Algorithms used for explaining cataract detection with Explainable AI:

a. Classification Algorithm: VGG19

A VGGNet, or Visual Geometry Group network, is a convolutional neural network with many operating layers. The VGGNet CNN model is trained on the ImageNet dataset, which consists of roughly 15 million labelled high-resolution images. Since it has already been trained, it offers greater accuracy and requires less processing time than building a deep learning model from scratch. VGG-19 achieves 97.47% accuracy when classifying cataract images. The pre-trained VGG-19 model was chosen because we were unable to use a large dataset of our own.

The network receives a fixed-size (224 × 224) RGB image as input, giving a matrix of shape (224, 224, 3). The only preprocessing is subtracting the mean RGB value, computed over the entire training set, from each pixel. Kernels of size (3 × 3) with a stride of 1 pixel are used, allowing them to capture the whole notion of the image, and spatial padding is applied to preserve the image’s spatial resolution. Max pooling is carried out over 2 × 2 pixel windows with a stride of 2. The rectified linear unit (ReLU) is then used to introduce non-linearity and accelerate computation, improving classification. The convolutional blocks are followed by three fully connected layers: the first two have 4096 channels each and the third has 1000 channels for the 1000-way ILSVRC classification, with a final softmax layer producing the class probabilities.
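Below is a hedged sketch of how a pre-trained VGG-19 might be adapted for binary cataract classification with TensorFlow/Keras. The directory paths, head architecture, and hyperparameters are illustrative assumptions, not the exact configuration behind the 97.47% result above.

```python
# Minimal transfer-learning sketch: pre-trained VGG-19 features + a new binary head.
import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input
from tensorflow.keras import layers, models

IMG_SIZE = (224, 224)  # VGG-19 expects a fixed-size 224 x 224 RGB input

# Load ImageNet weights but drop the original 1000-way classifier head.
base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional features, train only the new head

# New classification head for the binary cataract / normal task.
x = layers.Flatten()(base.output)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = models.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical directory layout: data/train/{cataract,normal} and data/val/{...}
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32, label_mode="binary")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/val", image_size=IMG_SIZE, batch_size=32, label_mode="binary")

# Apply the same mean-RGB subtraction VGG was originally trained with.
train_ds = train_ds.map(lambda im, lab: (preprocess_input(im), lab))
val_ds = val_ds.map(lambda im, lab: (preprocess_input(im), lab))

model.fit(train_ds, validation_data=val_ds, epochs=5)
```

Building the head directly on base.output (rather than wrapping the base as a single layer) keeps the model “flat”, which makes it easy to reach the last convolutional layer later for Grad-CAM.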

b. XAI: Lime

LIME stands for Local Interpretable Model-Agnostic Explanations. Because of its model agnosticism, LIME can explain any supervised learning model by treating it as a “black box.” Therefore, practically every model currently in use can be supported by LIME. The explanations LIME offers are local: they are accurate in the region immediately surrounding the observation or sample being explained. LIME is one of the most popular XAI techniques, although it currently supports only supervised machine learning and deep learning models. It has a substantial user base and a powerful open-source API available in both R and Python; its GitHub repository has about 8K stars and 2K forks.
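As a rough illustration of how LIME can be pointed at the cataract classifier, here is a sketch using lime_image. It assumes the fine-tuned Keras model (model) from the previous snippet and a raw fundus image already resized to 224 × 224; the wrapper predict_fn and variable names are illustrative.

```python
# LIME on a single fundus image: perturb superpixels, query the model,
# and highlight the superpixels that most supported the top prediction.
import numpy as np
import matplotlib.pyplot as plt
from lime import lime_image
from skimage.segmentation import mark_boundaries
from tensorflow.keras.applications.vgg19 import preprocess_input

def predict_fn(images):
    """LIME passes batches of perturbed images; return per-class probabilities."""
    probs = model.predict(preprocess_input(np.array(images, dtype=np.float32)))
    # Convert the single sigmoid output into [P(normal), P(cataract)] columns.
    return np.hstack([1.0 - probs, probs])

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image.astype("double"),   # one raw 224 x 224 x 3 fundus image
    predict_fn,
    top_labels=2,
    hide_color=0,
    num_samples=1000,         # number of perturbed samples LIME generates
)

# Keep only the superpixels that pushed the prediction towards the top label.
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
plt.imshow(mark_boundaries(temp / 255.0, mask))
plt.axis("off")
plt.show()
```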

c. XAI: Grad-CAM

By leveraging the gradients of the target concept flowing into the final convolutional layer, gradient-weighted class activation mapping (Grad-CAM) builds a coarse localization map that highlights the areas of the image that are essential for predicting that concept.

The method is more accurate and adaptable than older techniques, and although it is complex, the outcome is fortunately easy to read. Typically, we start with an image as input and create a model that is cut off at the layer for which we want to create a Grad-CAM heat map, attaching the fully connected layers for prediction. The model is then applied to the input, and the output of the chosen layer and the loss are collected. Next, the gradient of the loss is calculated with respect to the output of the chosen layer. Finally, we reduce, resize, and rescale the gradient components that contribute to the prediction so that the heat map can be overlaid on the original image.
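Here is a minimal Grad-CAM sketch in the same assumed Keras setting, targeting VGG-19’s last convolutional layer (block5_conv4). The model and image names follow the earlier illustrative snippets; this is a generic Grad-CAM implementation, not the exact code behind the figures below.

```python
# Grad-CAM: gradient of the predicted score w.r.t. the last conv feature maps,
# averaged into per-channel weights, then combined into a normalised heat-map.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.applications.vgg19 import preprocess_input

def grad_cam_heatmap(model, img_array, conv_layer_name="block5_conv4"):
    """Return a Grad-CAM heat-map for one preprocessed batch of shape (1, 224, 224, 3)."""
    # Model that maps the input image to (last conv feature maps, final score).
    grad_model = tf.keras.models.Model(
        model.inputs, [model.get_layer(conv_layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_maps, score = grad_model(img_array)
        score = score[:, 0]  # sigmoid "cataract" score

    # Gradient of the score w.r.t. the conv feature maps, averaged per channel.
    grads = tape.gradient(score, conv_maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))

    # Weighted sum of the feature maps, then ReLU and normalisation to [0, 1].
    heatmap = tf.reduce_sum(conv_maps[0] * weights, axis=-1)
    heatmap = tf.maximum(heatmap, 0) / (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy()

# Usage with the earlier illustrative names: `model` (fine-tuned VGG-19)
# and `image` (a raw 224 x 224 x 3 fundus image).
img_array = preprocess_input(np.expand_dims(image.astype("float32"), axis=0))
heatmap = grad_cam_heatmap(model, img_array)

# Resize the coarse heat-map to the image size and overlay it.
heatmap = tf.image.resize(heatmap[..., None], (224, 224)).numpy().squeeze()
plt.imshow(image.astype("uint8"))
plt.imshow(heatmap, cmap="jet", alpha=0.4)  # semi-transparent overlay
plt.axis("off")
plt.show()
```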

Here is the System Architecture of cataract detection with explainable AI:

Assuming we have a fundus image of an eye, we first preprocess the image to match our black-box prediction model, which in our case is VGG-19, a pretrained model trained on millions of images. After the model classifies whether the image shows a cataract, the black-box model is interrogated using LIME and Grad-CAM, the explainable AI techniques that explain why the black-box model (VGG-19) produced that output by generating a heat map and highlighting the region of the eye that contributed to the cataract prediction.
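Tying the pieces together, a sketch of the overall pipeline might look like the following. It reuses the illustrative helpers defined above (model, predict_fn, grad_cam_heatmap), and the image path is hypothetical.

```python
# End-to-end wiring of the architecture: preprocess -> predict -> explain.
import numpy as np
import tensorflow as tf
from lime import lime_image
from tensorflow.keras.applications.vgg19 import preprocess_input

# 1. Preprocess the fundus image exactly as the black-box model expects.
raw = tf.keras.utils.load_img("fundus_example.jpg", target_size=(224, 224))
image = tf.keras.utils.img_to_array(raw)

# 2. Black-box prediction with the fine-tuned VGG-19.
prob = float(model.predict(preprocess_input(image[None, ...]))[0, 0])
print("Predicted probability of cataract:", prob)

# 3. Interrogate the same model with both explainers.
lime_explanation = lime_image.LimeImageExplainer().explain_instance(
    image.astype("double"), predict_fn, top_labels=2, hide_color=0, num_samples=1000)
gradcam_map = grad_cam_heatmap(model, preprocess_input(image[None, ...]))
```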

Here is the output using LIME:

The green highlighted area is the region that contributed the most to the prediction of whether the eye has a cataract.

Here are some outputs of the Grad-CAM model:

The highlighted part in the above images is the region that contributed the most to the prediction of whether the eye has a cataract.

Expert Evaluation

Expert doctor evaluation is critical when working with XAI, particularly in the medical field, because it provides a means to validate and verify the AI model’s output. XAI approaches can provide useful insights into how an AI model makes predictions, but they are not perfect and require specialized domain knowledge to interpret. Our system generates heat maps to illustrate the output of VGG-19, but most of the time, only medical professionals can comprehend and analyze them. As a result, only professionals can determine the accuracy of the output, and this expert validation is required because no system is 100% accurate, and even a single erroneous decision in this particular scenario can have serious consequences.

Expert doctors in the medical field have years of training and experience diagnosing illnesses and analyzing imaging results. They are well-versed in the subtleties of various diseases and can provide valuable feedback on the accuracy and dependability of an AI model’s predictions. Expert doctor evaluation can assist in identifying mistakes or biases in the AI model’s predictions that XAI techniques may have overlooked. They can also assist in identifying circumstances when the AI model’s predictions are true but require additional investigation or testing. Furthermore, expert doctor evaluation can aid in the development of confidence and adoption of AI models in the medical arena. Doctors are frequently the end-users of AI systems in healthcare, and their buy-in and confidence are important for the systems’ effective implementation.

[1] Explainable Artificial Intelligence, Wikipedia
