subinium / Deep-Papers

Deep Learning Paper Simple Review + Helpful Article

Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey


Taxonomies & Organization


Scope (Section 4)

  • Whether the method tries to understand an individual (local) instance,
  • or the model as a whole (global); a minimal sketch contrasting the two follows below
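
To make the scope distinction concrete, here is a minimal sketch. The toy linear model and the weight-times-value attribution are assumptions for illustration only; the point is that a local explanation scores the features of one prediction, while a global one summarizes the model over a whole dataset.

```python
import numpy as np

# Toy linear "model" and a deliberately simple attribution (weight * feature value).
# Both are assumptions for illustration; the survey does not prescribe a method.
rng = np.random.default_rng(0)
w = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(100, 3))

def local_explanation(x):
    """Local scope: feature contributions for ONE instance's prediction."""
    return w * x

def global_explanation(X):
    """Global scope: average absolute contribution over the whole dataset,
    i.e. a summary of the model's overall behavior."""
    return np.abs(w * X).mean(axis=0)

print("local :", local_explanation(X[0]))
print("global:", global_explanation(X))
```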

Methodology (Section 5)

  • backpropagation-based
  • perturbation-based
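
A minimal PyTorch sketch of the methodological split (the toy model and the two attribution routines are illustrative choices, not anything the survey prescribes): backpropagation-based methods read gradients of the output with respect to the input, while perturbation-based methods measure how the output changes when a feature is occluded.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))  # toy model
x = torch.randn(1, 4)

# Backpropagation-based: attribute via gradients of the output w.r.t. the input
# (plain saliency here; Integrated Gradients, LRP, etc. follow the same idea).
x_grad = x.clone().requires_grad_(True)
model(x_grad).sum().backward()
saliency = x_grad.grad.abs().squeeze(0)

# Perturbation-based: attribute via the output change when a feature is removed
# (zeroed out here; occlusion, LIME, SHAP sampling follow the same idea).
with torch.no_grad():
    base = model(x).item()
    occlusion = torch.empty(4)
    for i in range(4):
        x_pert = x.clone()
        x_pert[0, i] = 0.0               # "remove" feature i
        occlusion[i] = abs(base - model(x_pert).item())

print("gradient-based attribution    :", saliency)
print("perturbation-based attribution:", occlusion)
```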

Usage (Section 6)

  • model-intrinsic (model-specific)
  • post-hoc (model-agnostic)
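
A minimal scikit-learn sketch of the usage split (the particular models and the permutation-importance explainer are illustrative choices, not the survey's): a model-intrinsic explanation is read directly off the model's own parameters, while a post-hoc, model-agnostic explainer only needs the fitted model's predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Model-intrinsic (model-specific): the explanation comes from the model's own
# structure -- here, logistic-regression coefficients.
intrinsic = LogisticRegression(max_iter=1000).fit(X, y)
print("intrinsic explanation (coefficients):", intrinsic.coef_[0])

# Post-hoc (model-agnostic): the explainer only queries predictions, so it works
# for any fitted model, including ones whose internals are opaque.
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
post_hoc = permutation_importance(black_box, X, y, n_repeats=10, random_state=0)
print("post-hoc explanation (permutation importance):", post_hoc.importances_mean)
```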

Definition


  • DEF1 : Interpretability is a desirable quality or feature of an algorithm which provides enough expressive data to understand how the algorithm works.

    • If something is interpretable, it is possible to find its meaning or possible to find a particular meaning in it.
  • DEF2 : Interpretation is a simplified representation of a complex domain, such as outputs generated by a machine learning model, to meaningful concepts which are human-understandable and reasonable.

  • DEF3 : An explanation is additional meta information, generated by an external algorithm or by the machine learning model itself, to describe the feature importance or relevance of an input instance towards a particular output classification.

  • DEF4 : For a deep learning model f, if the model parameters θ and the model architecture information are known, the model is considered a white-box.

  • DEF5 : A deep learning model f is considered a black-box if the model parameters and network architectures are hidden from the end-user.

  • DEF6 : A deep learning model is considered transparent if it is expressive enough to be human-understandable. Here, transparency can be a property of the algorithm itself or achieved through external means such as model decomposition or simulations.

  • DEF7 : Trustability of deep learning models is a measure of the confidence of humans, as end-users, in the intended working of a given model in dynamic real-world environments.

  • DEF8 : Bias in deep learning algorithms indicates the disproportionate weight, prejudice, favor, or inclination of the learnt model towards subsets of data due to both inherent biases in human data collection and deficiencies in the learning algorithm.

  • DEF9 : Fairness in deep learning is the quality of a learnt model in providing impartial and just decisions without favoring any populations in the input data.

Goal

Let's review all of the milestone papers mentioned in this survey!


  1. Identity or Invariance: Identical data instances must produce identical attributions or explanations.
  2. Stability: Data instances belonging to the same class c must generate comparable explanations g.
  3. Consistency: Data instances with a change in all but one feature must generate explanations which magnify the change.
  4. Separability: Data instances from different populations must have dissimilar explanations.
  5. Similarity: Data instances, regardless of class differences, closer to each other, should generate similar explanations.
  6. Implementation Constraints: Time and compute requirements of the explanation algorithm should be minimal.
  7. Bias Detection: Inherent bias in data instances should be detectable from the testing set. Similarity and separability measures help achieve this.
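
Several of these properties can be checked empirically. The sketch below tests the identity principle and reports a similarity-style distance for nearby inputs; the toy model and the plain gradient-saliency explainer are assumptions for illustration, not part of the survey.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

def explain(x):
    """Illustrative explainer: plain gradient saliency for one instance."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.detach().squeeze(0)

# Identity / invariance: identical data instances -> identical explanations.
x = torch.randn(1, 4)
assert torch.allclose(explain(x), explain(x.clone())), "identity violated"

# Similarity: nearby instances should receive similar explanations
# (reported here as a distance, without a pass/fail threshold).
x_near = x + 1e-3 * torch.randn(1, 4)
print("explanation distance for nearby inputs:",
      torch.norm(explain(x) - explain(x_near)).item())
```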

Evaluation Schemes

  • System Causability Scale
  • Benchmarking Attribution Methods
  • Faithfulness and Monotonicity
  • Human-grounded Evaluation Benchmark
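
Of these, faithfulness is the easiest to sketch: features are removed in turn and the resulting prediction drops are correlated with the attribution scores. The snippet below is only an illustration under assumptions (toy PyTorch model, occlusion by zeroing, correlation as the score); it is not the exact protocol of any of the benchmarks listed above.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(6, 12), nn.ReLU(), nn.Linear(12, 1))
x = torch.randn(1, 6)

# Attribution to evaluate (plain gradient saliency, purely illustrative).
x_grad = x.clone().requires_grad_(True)
model(x_grad).sum().backward()
attribution = x_grad.grad.squeeze(0)

# Faithfulness: correlate each feature's attribution with the prediction drop
# observed when that feature is removed (zeroed). A faithful explanation gives
# a high correlation; monotonicity additionally asks the drops to be ordered.
with torch.no_grad():
    base = model(x).item()
    drops = []
    for i in range(x.shape[1]):
        x_pert = x.clone()
        x_pert[0, i] = 0.0
        drops.append(base - model(x_pert).item())

faithfulness = np.corrcoef(attribution.numpy(), np.array(drops))[0, 1]
print("faithfulness (correlation):", faithfulness)
```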