Although machine learning has been applied in many fields, trust is still an issue when it comes to proposing it to stakeholders as a business solution. Deep learning is considered to be a black box, as a very high level of expertise is required to validate a model and find its mistakes. In addition, the lack of insight increases the time needed to test and release models.
PwC Italy sponsored the work of engineer Luke Phairatt, who explored the interpretability of machine learning models by applying Local Interpretable Model-Agnostic Explanations (LIME) to a baseline deep learning emotion classifier (Xception), with a Kaggle data set as the case study. The project was supported by Luke’s mentor Simone Scardapane, researcher at La Sapienza University, and by Sébastien Bratières, Faculty Director of the AI Programme.
This work explores a solution based on Local Interpretable Model-Agnostic Explanations (LIME), which visually explains the reasons behind a model’s predictions. It uses emotion classification as a case study to determine the feasibility of interpreting black box models. The study demonstrates this by applying the interpreter (LIME) to the emotion classifier in order to highlight the key facial features of each emotion type. By observing the results, it is possible to judge how well the interpretation method and the tool perform, based on how closely the highlighted facial features match the human interpretation.
Machine learning has increasingly been applied in sectors such as healthcare, public security, ATM alert systems, entertainment, market research and autonomous driving, but there is still a long road ahead when it comes to achieving greater accuracy.
To improve these machine learning models, it is important to understand which features are used to make the predictions, making sure that the reasoning logic is aligned with the human interpretation.
Data set. The emotion classifier for this project uses a convolutional neural network based on the Xception architecture. It was chosen as a baseline model because it provides reasonable accuracy and leaves room for further insights that could be explored using the interpreter. Before the interpreter was applied to the model, the model was re-trained with full images to confirm the expected accuracy.
Interpreting Black Box Models. Many models were not designed to be interpretable. Approaches to explaining a black box model aim to extract information from the trained model in order to justify the outcomes of its predictions without knowing how the model works in detail. The literature offers various solutions and research publications that suggest different techniques for interpreting deep learning models. LIME was identified as the best tool for this project, as it keeps the interpretation process separate from the model implementation. The LIME approach does not require any specific model implementation, so the emotion classifier could be used as-is. In addition, LIME supports deep neural network models for image classification.
Explaining the LIME approach. LIME is a method that uses interpretable models to explain the individual predictions of any black box model. The core of the approach is to fit a simple interpretable model (Ridge or K-LASSO) locally around a given prediction and to use it to measure how much each input feature contributes to that prediction.
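To make the idea of a local surrogate more concrete, here is a minimal sketch in Python. It is not the code used in the project and not the LIME library itself: the function `explain_locally`, the toy black box, the kernel width and the ridge penalty are all assumptions chosen for illustration. The sketch perturbs an instance by switching interpretable features (for images, typically superpixels) on and off, weights each perturbation by its proximity to the original, and fits a weighted ridge regression whose coefficients act as feature importances.

```python
import numpy as np

def explain_locally(predict_fn, num_features, num_samples=500,
                    kernel_width=0.25, alpha=1.0, seed=0):
    """Fit a weighted ridge surrogate around one instance (illustrative only).

    predict_fn maps a binary mask (1 = feature kept, 0 = feature hidden)
    to the black box probability of the class being explained.
    """
    rng = np.random.default_rng(seed)
    # Random on/off perturbations of the interpretable features.
    masks = rng.integers(0, 2, size=(num_samples, num_features))
    masks[0, :] = 1  # keep the unperturbed instance in the sample
    preds = np.array([predict_fn(m) for m in masks], dtype=float)

    # Perturbations that hide fewer features count as closer to the original.
    distances = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # Weighted ridge regression in closed form, with an unpenalised intercept.
    X = np.hstack([np.ones((num_samples, 1)), masks])
    penalty = alpha * np.eye(num_features + 1)
    penalty[0, 0] = 0.0
    lhs = X.T @ (weights[:, None] * X) + penalty
    rhs = X.T @ (weights * preds)
    coef = np.linalg.solve(lhs, rhs)
    return coef[1:]  # one importance score per feature

def toy_black_box(mask):
    # Hypothetical stand-in for a classifier: features 2 and 5 drive the score.
    return 0.1 + 0.5 * mask[2] + 0.3 * mask[5]

importances = explain_locally(toy_black_box, num_features=8)
top_two = np.argsort(importances)[::-1][:2]
```

In this toy setting the two largest coefficients should point at the features that actually drive the score, which is the behaviour a local explanation is expected to show.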
Data selection. To capture essential interpretations without losing generality in the data selection process, ‘Softmax probability’ and ‘Confusion matrix’ were used as the criteria to pick relevant cases from within the data set.
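The write-up does not describe exactly how the two criteria were combined, so the following is only one plausible sketch. It assumes the classifier’s softmax outputs are available as an array `probs` of shape (samples, classes); the helper names `confusion_matrix` and `select_cases` and the toy data are hypothetical. The idea is to surface confidently wrong predictions, barely correct ones, and the samples behind the most frequently confused pair of classes.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def select_cases(probs, y_true, top_k=2):
    """Pick samples worth explaining (assumed selection scheme)."""
    y_true = np.asarray(y_true)
    y_pred = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    cm = confusion_matrix(y_true, y_pred, probs.shape[1])

    # Most frequently confused (true, predicted) pair, ignoring the diagonal.
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)
    true_cls, pred_cls = np.unravel_index(off_diag.argmax(), off_diag.shape)

    wrong = np.flatnonzero(y_pred != y_true)
    right = np.flatnonzero(y_pred == y_true)
    return {
        # Misclassified samples, most confident first.
        "confident_errors": wrong[np.argsort(-confidence[wrong])][:top_k],
        # Correctly classified samples, least confident first.
        "uncertain_hits": right[np.argsort(confidence[right])][:top_k],
        "most_confused_pair": (int(true_cls), int(pred_cls)),
        "confused_samples": np.flatnonzero(
            (y_true == true_cls) & (y_pred == pred_cls)),
    }

# Toy softmax outputs for six samples and three classes.
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
    [0.20, 0.60, 0.20],
    [0.10, 0.20, 0.70],
    [0.55, 0.30, 0.15],
])
y_true = [0, 2, 0, 2, 2, 1]
selected = select_cases(probs, y_true)
```

Each of the selected indices can then be passed to the interpreter, so that the explanations concentrate on the cases most likely to reveal a fault.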
Interpreting the emotion classifier. LIME was applied as it would be by a human analyst trying to gain insights into a classifier’s labelling. The number of features shown in each visualisation was limited so that only the key features were highlighted. To understand how the emotion classifier arrives at its predictions, the project used the interpreter (LIME) to validate and compare the features associated with every emotion. The goal is to fully understand how the classifier picks up the signs of all emotion types.
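Limiting the number of features in a visualisation can be sketched as follows. This is an illustration under assumptions, not the project’s code or the LIME library’s API: `top_feature_mask` is a hypothetical helper, `segments` is assumed to be a superpixel map of the image, and `weights` is assumed to hold the surrogate model’s weight for each superpixel.

```python
import numpy as np

def top_feature_mask(segments, weights, num_features=5, positive_only=True):
    """Boolean mask covering the superpixels with the largest weights.

    segments: integer array (H, W) assigning each pixel to a superpixel id.
    weights: dict mapping superpixel id to its surrogate-model weight.
    """
    items = [(sid, w) for sid, w in weights.items()
             if w > 0 or not positive_only]
    # Rank by absolute weight so strong negative evidence can be shown too.
    items.sort(key=lambda item: abs(item[1]), reverse=True)
    keep = [sid for sid, _ in items[:num_features]]
    return np.isin(segments, keep)

# Toy 4x4 "image" split into four 2x2 superpixels with ids 0 to 3.
segments = np.repeat(np.repeat(np.array([[0, 1], [2, 3]]), 2, axis=0), 2, axis=1)
weights = {0: 0.40, 1: -0.30, 2: 0.05, 3: 0.20}  # made-up weights
mask = top_feature_mask(segments, weights, num_features=2)
```

Overlaying such a mask on a face image would show only the regions that support the predicted emotion most strongly, which keeps the comparison with human interpretation manageable.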
Interpreting the model was identified as an important step towards gaining trust when deploying ‘black box’ models in the real world. As demonstrated in this project, LIME provides good insights into the emotion classifier, helping us to understand the logic behind a prediction and to identify faults introduced when training the classifier. Improvements were made and the level of trust in the model increased. In addition, a deeply hidden issue, the incorrect association of a dark tone with a ‘sad’ emotion, was discovered very quickly using this interpretation approach. This could save a lot of time and resources in terms of testing and finding mistakes in real-world applications.
This project focuses on emotion classification within seven classes. It raises the question of how well the approach performs when faced with a larger data set and more classes (e.g. recognition of 500 different objects). Would a local fidelity approach (as in LIME) still reflect the correct interpretation of the global model? This area was not explored in this project and remains open for future research. With a small number of classes (e.g. seven emotions), the technique using softmax probability and a confusion matrix is adequate for selecting some interesting samples to investigate. However, with a hundred classes or more, data selection would need to go further in order to capture all the model’s potential faults. This issue could also be addressed in future research.