subinium / Deep-Papers

Deep Learning Paper Simple Review + Helpful Article

Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey


Taxonomies & Organization


Scope (Section 4)

  • Whether the method tries to understand an individual (local) instance,
  • or the model as a whole (global); a minimal sketch contrasting the two follows below
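
To make the scope distinction concrete, here is a minimal sketch. The toy linear model and the weight-times-value attribution are assumptions for illustration only; the point is that a local explanation scores the features of one prediction, while a global one summarizes the model over a whole dataset.

```python
import numpy as np

# Toy linear "model" and a deliberately simple attribution (weight * feature value).
# Both are assumptions for illustration; the survey does not prescribe a method.
rng = np.random.default_rng(0)
w = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(100, 3))

def local_explanation(x):
    """Local scope: feature contributions for ONE instance's prediction."""
    return w * x

def global_explanation(X):
    """Global scope: average absolute contribution over the whole dataset,
    i.e. a summary of the model's overall behavior."""
    return np.abs(w * X).mean(axis=0)

print("local :", local_explanation(X[0]))
print("global:", global_explanation(X))
```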

Methodology (Section 5)

  • backpropagation-based
  • perturbation-based
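
A minimal PyTorch sketch of the methodological split (the toy model and the two attribution routines are illustrative choices, not anything the survey prescribes): backpropagation-based methods read gradients of the output with respect to the input, while perturbation-based methods measure how the output changes when a feature is occluded.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))  # toy model
x = torch.randn(1, 4)

# Backpropagation-based: attribute via gradients of the output w.r.t. the input
# (plain saliency here; Integrated Gradients, LRP, etc. follow the same idea).
x_grad = x.clone().requires_grad_(True)
model(x_grad).sum().backward()
saliency = x_grad.grad.abs().squeeze(0)

# Perturbation-based: attribute via the output change when a feature is removed
# (zeroed out here; occlusion, LIME, SHAP sampling follow the same idea).
with torch.no_grad():
    base = model(x).item()
    occlusion = torch.empty(4)
    for i in range(4):
        x_pert = x.clone()
        x_pert[0, i] = 0.0               # "remove" feature i
        occlusion[i] = abs(base - model(x_pert).item())

print("gradient-based attribution    :", saliency)
print("perturbation-based attribution:", occlusion)
```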

Usage (Section 6)

  • model-intrinsic (model-specific)
  • post-hoc (model-agnostic)
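
A minimal scikit-learn sketch of the usage split (the particular models and the permutation-importance explainer are illustrative choices, not the survey's): a model-intrinsic explanation is read directly off the model's own parameters, while a post-hoc, model-agnostic explainer only needs the fitted model's predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Model-intrinsic (model-specific): the explanation comes from the model's own
# structure -- here, logistic-regression coefficients.
intrinsic = LogisticRegression(max_iter=1000).fit(X, y)
print("intrinsic explanation (coefficients):", intrinsic.coef_[0])

# Post-hoc (model-agnostic): the explainer only queries predictions, so it works
# for any fitted model, including ones whose internals are opaque.
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
post_hoc = permutation_importance(black_box, X, y, n_repeats=10, random_state=0)
print("post-hoc explanation (permutation importance):", post_hoc.importances_mean)
```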

Definition


  • DEF1 : Interpretability is a desirable quality or feature of an algorithm which provides enough expressive data to understand how the algorithm works.

    • If something is interpretable, it is possible to find its meaning or possible to find a particular meaning in it.
  • DEF2 : Interpretation is a simplified representation of a complex domain, such as outputs generated by a machine learning model, to meaningful concepts which are human-understandable and reasonable.

  • DEF3 : An explanation is additional meta information, generated by an external algorithm or by the machine learning model itself, to describe the feature importance or relevance of an input instance towards a particular output classification.

  • DEF4 : For a deep learning model f, if the model parameters θ and the model architecture information are known, the model is considered a white-box.

  • DEF5 : A deep learning model f is considered a black-box if the model parameters and network architectures are hidden from the end-user.

  • DEF6 : A deep learning model is considered transparent if it is expressive enough to be human-understandable. Here, transparency can be a property of the algorithm itself or achieved through external means such as model decomposition or simulations.

  • DEF7 : Trustability of deep learning models is a measure of the confidence of humans, as end-users, in the intended working of a given model in dynamic real-world environments.

  • DEF8 : Bias in deep learning algorithms indicates the disproportionate weight, prejudice, favor, or inclination of the learnt model towards subsets of data due to both inherent biases in human data collection and deficiencies in the learning algorithm.

  • DEF9 : Fairness in deep learning is the quality of a learnt model in providing impartial and just decisions without favoring any populations in the input data.

Goal

Let's review all of the milestone papers mentioned in this survey!


  1. Identity or Invariance: Identical data instances must produce identical attributions or explanations.
  2. Stability: Data instances belonging to the same class c must generate comparable explanations g.
  3. Consistency: Data instances with a change in all but one feature must generate explanations which magnify the change.
  4. Separability: Data instances from different populations must have dissimilar explanations.
  5. Similarity: Data instances, regardless of class differences, closer to each other, should generate similar explanations.
  6. Implementation Constraints: Time and compute requirements of the explanation algorithm should be minimal.
  7. Bias Detection: Inherent bias in data instances should be detectable from the testing set. Similarity and separability measures help achieve this.
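
Several of these properties can be checked empirically. The sketch below tests the identity principle and reports a similarity-style distance for nearby inputs; the toy model and the plain gradient-saliency explainer are assumptions for illustration, not part of the survey.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

def explain(x):
    """Illustrative explainer: plain gradient saliency for one instance."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.detach().squeeze(0)

# Identity / invariance: identical data instances -> identical explanations.
x = torch.randn(1, 4)
assert torch.allclose(explain(x), explain(x.clone())), "identity violated"

# Similarity: nearby instances should receive similar explanations
# (reported here as a distance, without a pass/fail threshold).
x_near = x + 1e-3 * torch.randn(1, 4)
print("explanation distance for nearby inputs:",
      torch.norm(explain(x) - explain(x_near)).item())
```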

Evaluation Schemes

  • System Causability Scale
  • Benchmarking Attribution Methods
  • Faithfulness and Monotonicity
  • Human-grounded Evaluation Benchmark
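
Of these, faithfulness is the easiest to sketch: features are removed in turn and the resulting prediction drops are correlated with the attribution scores. The snippet below is only an illustration under assumptions (toy PyTorch model, occlusion by zeroing, correlation as the score); it is not the exact protocol of any of the benchmarks listed above.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(6, 12), nn.ReLU(), nn.Linear(12, 1))
x = torch.randn(1, 6)

# Attribution to evaluate (plain gradient saliency, purely illustrative).
x_grad = x.clone().requires_grad_(True)
model(x_grad).sum().backward()
attribution = x_grad.grad.squeeze(0)

# Faithfulness: correlate each feature's attribution with the prediction drop
# observed when that feature is removed (zeroed). A faithful explanation gives
# a high correlation; monotonicity additionally asks the drops to be ordered.
with torch.no_grad():
    base = model(x).item()
    drops = []
    for i in range(x.shape[1]):
        x_pert = x.clone()
        x_pert[0, i] = 0.0
        drops.append(base - model(x_pert).item())

faithfulness = np.corrcoef(attribution.numpy(), np.array(drops))[0, 1]
print("faithfulness (correlation):", faithfulness)
```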