We tend to request explanations mostly when a phenomenon disagrees with our expectations; for example, when an ML model behaves contrary to what we envisaged and outputs an unexpected prediction.
- Trustworthiness / Reliability / Robustness / Causality: no silly mistakes & socially acceptable
- Fairness: does not discriminate & is not biased
- New knowledge: aids in scientific discovery
- Legislation: does not break the law
- Debugging / Auditing: identify modelling errors and mistakes
- Human–AI co-operation: help humans complete tasks
- Safety / Security: abuse transparency to steal a (proprietary) model
- Manipulation: use transparency to game a system, e.g., credit scoring
Copy machine study by Langer, Blank, and Chanowitz (1978): people queuing for a photocopier were far more likely to let a stranger cut in when the request included any reason at all, even a vacuous one ("because I need to make copies"), suggesting that the mere presence of an explanation, not its content, often drives acceptance.
Ante-hoc (interpretable-by-design) does not necessarily entail explainable or human-understandable.
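A minimal sketch of the ante-hoc versus post-hoc distinction, assuming scikit-learn is available; the dataset, model choices, and fidelity measure are illustrative, not prescriptive:

```python
# Ante-hoc: a model that is transparent by construction, versus
# post-hoc: an interpretable surrogate fitted to an opaque model.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Ante-hoc: a shallow tree is transparent by construction, although,
# as noted above, transparency alone does not guarantee that a human
# will actually understand the explanation.
ante_hoc = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(ante_hoc))

# Post-hoc: approximate an opaque model with an interpretable surrogate.
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
surrogate = DecisionTreeClassifier(max_depth=2).fit(X, black_box.predict(X))

# Fidelity measures how faithfully the surrogate mimics the black box
# (not how accurate either model is with respect to the ground truth).
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity to the black box: {fidelity:.2%}")
```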
Explanations also vary along several dimensions:

- original domain vs. transformed domain
- contrasts and differences vs. causal mechanisms
- static artefact vs. interactive (explanatory) protocol
| | global | cohort | local |
|---|---|---|---|
| data | a set of data | a subset of data | an instance |
| model | model space | model subspace | a point in model space |
| prediction | a set of predictions | a subset of predictions | an individual prediction |
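The three scopes can be made concrete with a small sketch, again assuming scikit-learn; permutation importance for the global and cohort scopes and a crude single-instance perturbation for the local scope are illustrative choices, not canonical operationalisations:

```python
# The same model explained at three scopes. The dataset, cohort
# definition, and perturbation size are all illustrative assumptions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Global: feature importance estimated over the whole dataset.
glob = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Cohort: the same procedure restricted to a subset of the data.
mask = X[:, 0] > np.median(X[:, 0])  # an arbitrary illustrative cohort
coh = permutation_importance(model, X[mask], y[mask],
                             n_repeats=10, random_state=0)

# Local: sensitivity of a single prediction to perturbing each feature
# in turn; a crude stand-in for a proper local attribution method.
x = X[0]
base = model.predict_proba([x])[0]
local = []
for i in range(X.shape[1]):
    x_pert = x.copy()
    x_pert[i] += 0.5  # illustrative perturbation
    local.append(abs(model.predict_proba([x_pert])[0] - base).sum())

print("global:", glob.importances_mean)
print("cohort:", coh.importances_mean)
print("local:", np.round(local, 3))
```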
Focused on a single class (technically limited):

- implicit context: Why \(A\)? (…and not anything else, i.e., \(B \cup C \cup \ldots\))
- explicit context: Why \(A\) and not \(B\)?
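One way to make the explicit-context question concrete is a toy counterfactual search: find the smallest single-feature change that turns the prediction from class \(A\) into a user-chosen foil \(B\). This is a hypothetical illustration, not an established method; the model, data, step grid, and foil choice are all assumptions:

```python
# Toy answer to "Why A and not B?": the smallest single-feature change
# that flips the prediction to the foil class B.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0]                      # an instance predicted as class A
a = model.predict([x])[0]
b = 1                         # the user-chosen foil class B

best = None
steps = sorted(np.linspace(-3, 3, 61), key=abs)  # smallest changes first
for i in range(X.shape[1]):
    for step in steps:
        x_cf = x.copy()
        x_cf[i] += step
        if model.predict([x_cf])[0] == b:
            if best is None or abs(step) < abs(best[1]):
                best = (i, step)
            break  # smallest |step| for this feature found; try the next

if best is not None:
    print(f"Why {a} and not {b}? Changing feature {best[0]} "
          f"by {best[1]:+.2f} yields class {b}.")
else:
    print(f"No single-feature change of this size yields class {b}.")
```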
Multi-class explainability (Sokol and Flach 2020b)
If 🌧️, then \(A\); else if ☀️ & 🥶, then \(B\); else if ☀️ & 🥵, then \(C\).
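Rendered as code, the rule list above might look as follows; the boolean weather features are hypothetical stand-ins for the emoji conditions:

```python
# The multi-class rule list as an executable sketch: one rule list
# covers every class at once, so the explanation of any prediction
# is simply the rule that fired.
def predict(rainy: bool, sunny: bool, cold: bool, hot: bool) -> str:
    if rainy:
        return "A"
    if sunny and cold:
        return "B"
    if sunny and hot:  # the slide treats this case as the final "else"
        return "C"
    raise ValueError("case not covered by the rule list")

print(predict(rainy=False, sunny=True, cold=True, hot=False))  # -> "B"
```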
Explainability requires effort: a generic eXplainable Artificial Intelligence (XAI) process is beyond our reach at the moment.
XAI Taxonomy spanning social and technical desiderata:
Functional • Operational • Usability • Safety • Validation
(Sokol and Flach, 2020. Explainability Fact Sheets: A Framework for Systematic Assessment of Explainable Approaches)
Framework for black-box explainers
(Henin and Le Métayer, 2019. Towards a generic framework for black-box explanations of algorithmic decision systems)