A system or automated process whose internal workings are opaque to the observer – its operation may only be traced by analysing its behaviour through its inputs and outputs
Sources of opaqueness:
Spectrum of opaqueness determined by the context (audience, purpose, etc.)
Interpretability is the degree to which a human can understand the cause of a decision
Explanation is an answer to a “Why?” question
Explanations should answer “Why?” and “Why-should?” questions until such questions can no longer be asked
Explanations “giv[e] a reason for a prediction” and answer “how a system arrives at its prediction”
Justifications “put an explanation in a context” and convey “why we should believe that the prediction is correct”
Transparency is a passive characteristic of a model that allows humans to make sense of it on different levels
Explainability is an active characteristic of a model that is achieved through actions and procedures employed (by the model) to clarify its functioning for a certain audience
Interpretability is the degree to which a human can consistently predict the model’s result
Transparency is the ability of a human to comprehend the (ante-hoc) mechanism employed by a predictive model on three levels
Marr’s three-level hierarchy of understanding information processing devices
Understanding why birds fly cannot be achieved by only studying their feathers:
In order to understand bird flight, we have to understand aerodynamics; only then do the structure of feathers and the different shapes of birds’ wings make sense.
Fidelity-based understanding
Mental models within the completeness–soundness landscape
circular or tautological definitions
dictionary definitions
hierarchical and ontological definitions
component-based – pairings between keywords and technical components or properties
\[ \texttt{Explainability} \; = \; \underbrace{ \texttt{Reasoning} \left( \texttt{Transparency} \; | \; \texttt{Background Knowledge} \right)}_{\textit{understanding}} \]
Explainability → explainee walking away with understanding
A continuous spectrum rather than a binary property
\[ f(\mathbf{x}) = 0.2 \;\; + \;\; 0.25 \times x_1 \;\; + \;\; 0.7 \times x_4 \;\; - \;\; 0.2 \times x_5 \;\; - \;\; 0.9 \times x_7 \]
\[ \mathbf{x} = (0.4, \ldots, 1, \frac{1}{2}, \ldots, \frac{1}{3}) \]
\[ f(\mathbf{x}) = 0.2 \;\; \underbrace{+0.1}_{x_1} \;\; \underbrace{+0.7}_{x_4} \;\; \underbrace{-0.1}_{x_5} \;\; \underbrace{-0.3}_{x_7} \;\; = \;\; 0.6 \]
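The per-feature breakdown above can be reproduced with a minimal Python sketch. The names `INTERCEPT`, `WEIGHTS`, `contributions` and `predict` are illustrative assumptions, not part of the original material. Only the four features that appear in \(f\) are listed, so the elided entries of \(\mathbf{x}\) stay elided.

```python
# Intercept and weights of the linear model f, keyed by feature index.
# (Names are hypothetical; the numbers are taken from the formula above.)
INTERCEPT = 0.2
WEIGHTS = {1: 0.25, 4: 0.7, 5: -0.2, 7: -0.9}

# Only the feature values that appear in f are needed; the elided
# entries of x are deliberately left out rather than guessed.
x = {1: 0.4, 4: 1.0, 5: 1 / 2, 7: 1 / 3}


def contributions(weights, instance):
    """Return each feature's additive contribution: weight * value."""
    return {i: w * instance[i] for i, w in weights.items()}


def predict(intercept, weights, instance):
    """Return the prediction as the intercept plus all contributions."""
    return intercept + sum(contributions(weights, instance).values())


contrib = contributions(WEIGHTS, x)
prediction = predict(INTERCEPT, WEIGHTS, x)

for i, c in contrib.items():
    print(f"x_{i}: {c:+.2f}")
print(f"f(x) = {prediction:.2f}")
```

Because the model is additive, each contribution can be read on its own: a transparent model of this kind supplies the raw material for an explanation, and the explainee's reasoning and background knowledge do the rest.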