Explainable AI (Part I): Explanations and Opportunities

As the ML / AI field matures and sophisticated models are deployed across new (and critical) industries, new challenges arise. One of the most common is the perception that these models behave as black boxes, i.e. it is difficult to understand how they work and the reasoning behind their predictions. This is coupled with growing efforts to detect and remove bias (e.g. discovering that a model relies on discriminatory features in its predictions). Both issues lead to rising mistrust in ML / AI systems. The field that tackles these problems is called explainable AI, or xAI for short. In this blog post I will walk you through the motivations behind the field.

Image source: Sicara.ai (https://www.sicara.ai/blog/2019-07-31-tf-explain-interpretability-tensorflow)

Words of caution / Hype

Before getting started, I want to offer a few words of caution. While the “black-boxiness” of machine learning is not a new topic and has been an issue since the first algorithms were created, the new wave of explainability methods is a relatively recent development. Because of this, many of the methods have not been thoroughly tested in different environments, and academic work on the topic is still scarce. You should therefore be very careful when using these methods in critical production systems, and study them thoroughly rather than applying them blindly.

Unfortunately, xAI is also caught up in the hype and buzzword storm that afflicts the AI field in general. This hype can obscure both the value and the dangers of these methods, and makes it harder to select the best one.

What is xAI?

As a first step, let’s define xAI. Although the field distinguishes between the two terms, for the sake of simplicity we will use the words interpretability and explainability interchangeably. The best definition I have found is this one:

Interpretability is the degree to which a human can understand the cause of a decision (Miller, 2017).

We can focus on two specific words in this definition – human and decision. This shows that the most important idea of the field is to help humans understand machine learning systems.

Why do we need it?

In the computing world before the advent of machine learning, machines executed decisions in a very strict, explicitly programmed fashion. This made the results of such programs much easier to understand, since all you had to do was read the source code. Nowadays, however, even experienced data scientists may struggle to explain the rationale behind their models’ predictions, and the process can seem like magic: provide the data, add a target for the prediction (decision) and get the result, with nothing in between to hint at the decision-making process. This issue is illustrated in the drawing below:

Image source: Interpretable Machine Learning (C. Molnar)

Another very important motivating factor for the development of xAI is the growing maturity of production-grade ML systems in critical industries. Around 2009, most deployed machine learning systems lived within the products of tech-first companies (e.g. Google, YouTube), where a false prediction would result in a wrong recommendation being shown to the user. Nowadays these algorithms are deployed in sectors such as the military, healthcare and finance. Predictions in these newer AI industries can have far-reaching and dramatic consequences for the lives of many people, so it is imperative that we know how these systems make their decisions.

Image source: DARPA (https://www.darpa.mil/program/explainable-artificial-intelligence).
DoD stands for Department of Defense.

Regulation is a related factor, with laws such as the GDPR and the “right to explanation”. A data scientist who deploys a machine learning model to production can be legally obliged to explain how it makes decisions if those decisions can have a large impact on people.

xAI can also benefit the end users of applications, since systems that explain themselves earn more trust. Take an AI-powered healthcare application. What often happens is that the technical team reports the model performance to the domain experts (the application users, in this case). The engineers report that the model achieves 95% accuracy in predicting whether a patient has a certain disease. Most of the time the healthcare practitioners are skeptical of such results, saying that it is simply not possible to be that accurate. If we then use a method such as Local Interpretable Model-agnostic Explanations (LIME) to explain why a certain patient is classified as having the flu, trust in the system should improve, because the doctors can check whether the model follows logic similar to their own when providing a diagnosis. This scenario is illustrated below, where the explanation shows that symptoms such as sneezing and headache support the flu prediction, while the absence of fatigue counts as evidence against it.

Image source: Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. https://doi.org/10.18653/v1/N16-3020
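
To make the idea behind LIME a bit more concrete, here is a minimal sketch in plain Python. It is not the LIME library and not the authors' implementation: the black_box function, its numbers, the three symptom features and the kernel (an exponential decay over the number of flipped symptoms) are all assumptions made up for illustration. The sketch perturbs one patient's symptoms, asks the opaque model for a prediction on each variant, and fits a proximity-weighted linear model whose coefficients act as the explanation.

```python
import math
from itertools import product

FEATURES = ["sneeze", "headache", "fatigue"]


def black_box(sneeze, headache, fatigue):
    """Hypothetical opaque model returning a made-up flu probability."""
    return 0.2 + 0.3 * sneeze + 0.2 * headache + 0.2 * fatigue + 0.1 * sneeze * fatigue


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def explain(instance, kernel_width=1.0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rows, targets, weights = [], [], []
    # With three binary symptoms we can enumerate every perturbation
    # instead of sampling at random.
    for z in product([0, 1], repeat=len(instance)):
        # z[j] == 1 keeps the patient's value, z[j] == 0 flips it.
        sample = [v if keep else 1 - v for v, keep in zip(instance, z)]
        flipped = len(z) - sum(z)
        rows.append([1.0, *z])
        targets.append(black_box(*sample))
        # Assumed kernel: variants closer to the patient count more.
        weights.append(math.exp(-flipped / kernel_width))
    n = len(rows[0])
    # Normal equations of weighted least squares: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(n)]
            for i in range(n)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(n)]
    beta = solve(xtwx, xtwy)
    return dict(zip(FEATURES, beta[1:]))  # drop the intercept


patient = [1, 1, 0]  # sneezing, headache, no fatigue
explanation = explain(patient)
print(f"black-box flu probability: {black_box(*patient):.2f}")
for name, weight in explanation.items():
    print(f"{name:>8}: {weight:+.3f}")
```

Each weight describes how the patient's actual value for that symptom moves the prediction: a positive weight pushes towards flu, a negative one pushes away from it. With the made-up numbers above, sneezing and headache come out positive, while the absence of fatigue comes out negative. The real LIME method samples perturbations at random and adds feature selection on top, so treat this only as an illustration of the principle.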

We have now covered the fundamentals of xAI and the opportunities it offers. In upcoming posts, we will share the methods and associated software tools available for xAI.
