We tend to request explanations mostly when a phenomenon disagrees with our expectations; for example, when an ML model behaves contrary to what we envisaged and outputs an unexpected prediction.
- Trustworthiness / Reliability / Robustness / Causality: no silly mistakes & socially acceptable
- Fairness: does not discriminate & is not biased
- New knowledge: aids in scientific discovery
- Legislation: does not break the law
- Debugging / Auditing: identify modelling errors and mistakes
- Human–AI co-operation: help humans complete tasks
- Safety / Security: abuse transparency to steal a (proprietary) model
- Manipulation: use transparency to game a system, e.g., credit scoring
Copy machine study by Langer, Blank, and Chanowitz (1978): people queuing for a photocopier were far more likely to let a stranger cut in when the request included any reason at all, even a vacuous one ("because I need to make copies"), suggesting that the mere presence of an explanation, not its content, often drives acceptance.
Ante-hoc (interpretable-by-design) does not necessarily entail explainable or human-understandable.
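A minimal sketch of the ante-hoc versus post-hoc distinction, assuming scikit-learn is available; the dataset, model choices, and fidelity measure are illustrative, not prescriptive:

```python
# Ante-hoc: a model that is transparent by construction, versus
# post-hoc: an interpretable surrogate fitted to an opaque model.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Ante-hoc: a shallow tree is transparent by construction, although,
# as noted above, transparency alone does not guarantee that a human
# will actually understand the explanation.
ante_hoc = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(ante_hoc))

# Post-hoc: approximate an opaque model with an interpretable surrogate.
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=2).fit(X, black_box.predict(X))

# Fidelity measures how faithfully the surrogate mimics the black box
# (not how accurate either model is with respect to the ground truth).
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity to the black box: {fidelity:.2%}")
```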
Explanations also vary along several dimensions:

- original domain vs. transformed domain
- contrasts and differences vs. causal mechanisms
- static artefact vs. interactive (explanatory) protocol
| | global | cohort | local |
|---|---|---|---|
| data | a set of data | a subset of data | an instance |
| model | model space | model subspace | a point in model space |
| prediction | a set of predictions | a subset of predictions | an individual prediction |
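The three scopes can be made concrete with a small sketch, again assuming scikit-learn; permutation importance for the global and cohort scopes and a crude single-instance perturbation for the local scope are illustrative choices, not canonical operationalisations:

```python
# The same model explained at three scopes. The dataset, cohort
# definition, and perturbation size are all illustrative assumptions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Global: feature importance estimated over the whole dataset.
glob = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Cohort: the same procedure restricted to a subset of the data.
mask = X[:, 0] > np.median(X[:, 0])  # an arbitrary illustrative cohort
coh = permutation_importance(model, X[mask], y[mask],
                             n_repeats=10, random_state=0)

# Local: sensitivity of a single prediction to perturbing each feature
# in turn; a crude stand-in for a proper local attribution method.
x = X[0]
base = model.predict_proba([x])[0]
local = []
for i in range(X.shape[1]):
    x_pert = x.copy()
    x_pert[i] += 0.5  # illustrative perturbation
    local.append(abs(model.predict_proba([x_pert])[0] - base).sum())

print("global:", glob.importances_mean)
print("cohort:", coh.importances_mean)
print("local:", np.round(local, 3))
```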
Focused on a single class (technically limited):

- implicit context: Why \(A\)? (…and not anything else, i.e., \(B \cup C \cup \ldots\))
- explicit context: Why \(A\) and not \(B\)?
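One way to make the explicit-context question concrete is a toy counterfactual search: find the smallest single-feature change that turns the prediction from class \(A\) into a user-chosen foil \(B\). This is a hypothetical illustration, not an established method; the model, data, step grid, and foil choice are all assumptions:

```python
# Toy answer to "Why A and not B?": the smallest single-feature change
# that flips the prediction to the foil class B.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0]                      # an instance predicted as class A
a = model.predict([x])[0]
b = 1                         # the user-chosen foil class B

best = None
steps = sorted(np.linspace(-3, 3, 61), key=abs)  # smallest changes first
for i in range(X.shape[1]):
    for step in steps:
        x_cf = x.copy()
        x_cf[i] += step
        if model.predict([x_cf])[0] == b:
            if best is None or abs(step) < abs(best[1]):
                best = (i, step)
            break  # smallest |step| for this feature found; try the next

if best is not None:
    print(f"Why {a} and not {b}? Changing feature {best[0]} "
          f"by {best[1]:+.2f} yields class {b}.")
else:
    print(f"No single-feature change of this size yields class {b}.")
```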
Multi-class explainability (Sokol and Flach 2020b)
If 🌧️, then \(A\); else if ☀️ & 🥶, then \(B\); else if ☀️ & 🥵, then \(C\).
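Rendered as code, the rule list above might look as follows; the boolean weather features are hypothetical stand-ins for the emoji conditions:

```python
# The multi-class rule list as an executable sketch: one rule list
# covers every class at once, so the explanation of any prediction
# is simply the rule that fired.
def predict(rainy: bool, sunny: bool, cold: bool, hot: bool) -> str:
    if rainy:
        return "A"
    if sunny and cold:
        return "B"
    if sunny and hot:  # the slide treats this case as the final "else"
        return "C"
    raise ValueError("case not covered by the rule list")

print(predict(rainy=False, sunny=True, cold=True, hot=False))  # -> "B"
```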
Explainability requires effort: a generic eXplainable Artificial Intelligence (XAI) process is beyond our reach at the moment.
XAI Taxonomy spanning social and technical desiderata:
Functional • Operational • Usability • Safety • Validation
(Sokol and Flach, 2020. Explainability Fact Sheets: A Framework for Systematic Assessment of Explainable Approaches)
Framework for black-box explainers
(Henin and Le Métayer, 2019. Towards a generic framework for black-box explanations of algorithmic decision systems)