3D Model Construction for a target class using XAI
Team
- e/18/354, K.K.D.R.Tharaka, e18354@eng.pdn.ac.lk
- e/18/318, S.A.P.Sandunika, e18318@eng.pdn.ac.lk
- e/18/022, D.I.Amarasinghe, e18022@eng.pdn.ac.lk
Table of Contents
- Introduction
- Image Classification Model
- SHAP (SHapley Additive exPlanations)
- 3D reconstruction using Point E
- Links
Introduction
The aim of this project is to create a general 3D skeleton of a target object, using explainable AI (XAI) to identify the unique, genuinely relevant features that a classification model relies on to recognize that object.
Problem Domain
- Machine learning classification models give little insight into how they classify images or what criteria they use.
- These models may learn irrelevant features that can affect their decision-making, potentially leading to inaccuracies.
Project scope
Here we aim to uncover the underlying decision-making criteria and features using explainable methods and to reconstruct them in 3D for better understanding.
Image Classification Model
As the image classification model, we’re using MobileNet with TensorFlow’s Keras API. MobileNet is a deep learning model architecture designed for efficient image classification on mobile and embedded devices. It was developed by Google and is widely used due to its small size and fast inference speed.
The MobileNet architecture employs depthwise separable convolutions, which significantly reduce the number of parameters and computations required compared to traditional convolutional neural networks (CNNs). This reduction allows MobileNet to achieve a good balance between accuracy and model size.
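The sketch below shows how MobileNet can be loaded and run on a single image through TensorFlow’s Keras API; the file name `test_image.jpg` is only a placeholder, and the ImageNet weights stand in for whatever weights the project finally uses.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import mobilenet

# Load MobileNet with ImageNet weights (fine-tuning on the target class is not shown here).
model = mobilenet.MobileNet(weights="imagenet")

# Load a single image and resize it to the 224x224 input MobileNet expects.
img = tf.keras.preprocessing.image.load_img("test_image.jpg", target_size=(224, 224))
x = mobilenet.preprocess_input(
    np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0)
)

# Predict and decode the top-3 ImageNet labels.
preds = model.predict(x)
print(mobilenet.decode_predictions(preds, top=3)[0])
```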
Advantages
- Efficient in terms of model size and computational complexity
- Has a small model size, making it easier to deploy on devices with limited storage capacity
- Enables faster inference times
- Achieves competitive accuracy levels
- Provides a strong foundation for transfer learning
SHAP (SHapley Additive exPlanations)
SHAP (SHapley Additive exPlanations) is an advanced method for explaining the predictions of machine learning models. It is based on cooperative game theory and aims to assign an importance value to each input feature or variable in a model, indicating how much that feature contributes to the model’s prediction for a particular instance. It aids in understanding the factors driving the model’s decisions, enhances interpretability, and enables trust in the model’s outputs, making it valuable in real-world applications.
Key features
- Can be applied to any black-box model (e.g., MobileNet)
- Provides explanations at the individual instance level
- Offers a measure of global feature importance
- Improves interpretability and trust in the model’s outputs
- Provides various visualization techniques to represent the explanations
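A minimal sketch of how a SHAP partition explainer with an image masker can be attached to the MobileNet model; the inpainting masker, `max_evals`, and batch size are illustrative values rather than project settings.

```python
import numpy as np
import shap
import tensorflow as tf
from tensorflow.keras.applications import mobilenet

model = mobilenet.MobileNet(weights="imagenet")

# Wrap the model so SHAP can call it on raw (unpreprocessed) image batches.
def predict_fn(images):
    return model(mobilenet.preprocess_input(images.copy()))

# Load the image to explain as a (1, 224, 224, 3) float array in the original pixel range.
img = tf.keras.preprocessing.image.load_img("test_image.jpg", target_size=(224, 224))
images = np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0)

# Hide parts of the image by inpainting and attribute the prediction to the
# regions whose removal changes the model output the most.
masker = shap.maskers.Image("inpaint_telea", images[0].shape)
explainer = shap.Explainer(predict_fn, masker)
shap_values = explainer(
    images,
    max_evals=500,
    batch_size=50,
    outputs=shap.Explanation.argsort.flip[:1],  # explain only the top predicted class
)

# Visualize which pixels pushed the prediction towards the target class.
shap.image_plot(shap_values)
```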
3D reconstruction using Point E
Typical 3D reconstruction models take a long time to produce a 3D model; state-of-the-art methods typically require multiple GPU-hours to produce a single sample.
Here we use the Point·E method, introduced in the research paper “Point·E: A System for Generating 3D Point Clouds from Complex Prompts” by Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin and Mark Chen. This model uses an alternative approach to 3D object generation that produces 3D models in only 1-2 minutes on a single GPU.
Their method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image.
Compared with state-of-the-art methods, Point·E falls short in sample quality but is much faster at sampling. Since our application needs to sample thousands of images, sampling speed matters more than quality, so the trade-off is acceptable.
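A condensed sketch of image-conditioned point cloud sampling, following the example notebooks in the openai/point-e repository; the checkpoint names and module paths should be checked against the installed version, and `masked_image.png` is a placeholder for the SHAP-preprocessed input.

```python
import torch
from PIL import Image

from point_e.diffusion.configs import DIFFUSION_CONFIGS, diffusion_from_config
from point_e.diffusion.sampler import PointCloudSampler
from point_e.models.configs import MODEL_CONFIGS, model_from_config
from point_e.models.download import load_checkpoint

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Image-conditioned base model plus an upsampler, as in the Point-E example notebooks.
base_model = model_from_config(MODEL_CONFIGS["base40M"], device)
base_model.eval()
base_model.load_state_dict(load_checkpoint("base40M", device))
base_diffusion = diffusion_from_config(DIFFUSION_CONFIGS["base40M"])

upsampler_model = model_from_config(MODEL_CONFIGS["upsample"], device)
upsampler_model.eval()
upsampler_model.load_state_dict(load_checkpoint("upsample", device))
upsampler_diffusion = diffusion_from_config(DIFFUSION_CONFIGS["upsample"])

sampler = PointCloudSampler(
    device=device,
    models=[base_model, upsampler_model],
    diffusions=[base_diffusion, upsampler_diffusion],
    num_points=[1024, 4096 - 1024],
    aux_channels=["R", "G", "B"],
    guidance_scale=[3.0, 3.0],
)

# Condition the first diffusion model on the (SHAP-masked) 2D image.
img = Image.open("masked_image.png")
samples = None
for batch in sampler.sample_batch_progressive(batch_size=1, model_kwargs=dict(images=[img])):
    samples = batch

point_cloud = sampler.output_to_point_clouds(samples)[0]
```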
Image classification results
Result: the model correctly classifies the image as a car
SHAP output
3D reconstruction results
Process & Progression
Machine Learning Section
- Created and trained an image classification model using MobileNet
- Outputs of the image classification model were explained using SHAP
- Classified images using the trained ML model for testing
- Removed the irrelevant pixels of an image using SHAP (see the sketch below)
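A sketch of how the pixel-removal step could look, assuming a per-pixel SHAP map for the predicted class is already available; the keep ratio and the black fill value are illustrative choices, not project constants.

```python
import numpy as np

def remove_irrelevant_pixels(image, shap_map, keep_ratio=0.2, fill_value=0.0):
    """Keep only the pixels whose SHAP attribution for the target class is in the
    top `keep_ratio` fraction and replace everything else with `fill_value`.

    image:    (H, W, 3) array in the original pixel range
    shap_map: (H, W) array of per-pixel SHAP values for the predicted class
              (e.g. the channel-summed output of the explainer above)
    """
    # Rank pixels by attribution and pick the cut-off value.
    threshold = np.quantile(shap_map, 1.0 - keep_ratio)
    mask = shap_map >= threshold

    # Blank out everything the explainer deems irrelevant to the prediction.
    masked = np.full_like(image, fill_value)
    masked[mask] = image[mask]
    return masked
```

The resulting masked image is what gets passed on to the 3D reconstruction pipeline.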
2D to 3D Reconstruction Section
- Identified a suitable method, based on Point·E point cloud techniques, for reconstructing a 3D model from a given 2D image
- Using that method, created 3D models for the preprocessed images
Software Engineering Section
- Created a dashboard with the following features:
  - Image classification
  - Removing the irrelevant pixels of a given picture and outputting its 3D object (see the sketch below)
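A hypothetical sketch of what the classification endpoint behind the dashboard could look like, written here with Flask; the framework, route name, and response fields are assumptions for illustration, not a description of the actual implementation.

```python
# Hypothetical Flask endpoint; route name, payload, and response fields are illustrative.
import io

import numpy as np
from flask import Flask, jsonify, request
from PIL import Image
from tensorflow.keras.applications import mobilenet

app = Flask(__name__)
model = mobilenet.MobileNet(weights="imagenet")

@app.route("/classify", methods=["POST"])
def classify():
    # Read the uploaded image, resize it to MobileNet's input size, and classify it.
    upload = request.files["image"]
    img = Image.open(io.BytesIO(upload.read())).convert("RGB").resize((224, 224))
    x = mobilenet.preprocess_input(np.asarray(img, dtype=np.float32)[None, ...])
    _, label, score = mobilenet.decode_predictions(model.predict(x), top=1)[0][0]
    return jsonify({"label": label, "confidence": float(score)})

if __name__ == "__main__":
    app.run(port=5000)
```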
Testing
- Machine learning model: manual testing
- Dashboard: endpoint testing using Postman (see the sketch below)
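For reference, the same request that Postman sends to the hypothetical `/classify` endpoint sketched above can be reproduced in Python, which is convenient for scripted regression checks.

```python
# Mirrors the Postman request against the hypothetical /classify endpoint sketched above.
import requests

with open("test_image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:5000/classify",
        files={"image": ("test_image.jpg", f, "image/jpeg")},
    )

assert response.status_code == 200
print(response.json())  # expected keys: "label" and "confidence"
```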