tensorflow / model-analysis

Model analysis tools for TensorFlow

Multiclass confusion matrix / binarized metrics need class names, not just class IDs

schmidt-jake opened this issue

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary (PyPI)
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:
from tensorflow_model_analysis import EvalConfig
from tensorflow_model_analysis.metrics import default_multi_class_classification_specs
from google.protobuf.json_format import ParseDict

classes = ['class_1', 'class_2', ...]

eval_config = {
    'model_specs': [
        {
            'name': 'rig_state',
            'model_type': 'tf_keras',
            'signature_name': 'serve_raw',
            'label_key': ...,
            'example_weight_key': 'sample_weight'
        }
    ],
    'metrics_specs': [
        {
            'metrics': [
                {
                    'class_name': 'MultiClassConfusionMatrixPlot',
                    'config': '"thresholds": [0.5]'
                },
                {'class_name': 'ExampleCount'},
                {'class_name': 'WeightedExampleCount'},
                {'class_name': 'SparseCategoricalAccuracy'},
            ],
        },
        {
            'binarize': {'class_ids': {'values': list(range(len(classes)))}},
            'metrics': [
                {'class_name': 'AUC'},
                {'class_name': 'CalibrationPlot'},
                {'class_name': 'BinaryAccuracy'},
                {'class_name': 'MeanPrediction'}
            ]
        }
    ],
    'slicing_specs': [...]
}
eval_config: EvalConfig = ParseDict(eval_config, EvalConfig())

Describe the problem

Multiclass confusion matrices and binarized metrics should support class names, not just class IDs. Something like 'binarize': {'classes': [{'id': _id, 'name': name} for _id, name in enumerate(classes)]}. As it stands, the integer class IDs are meaningless to the data scientists and business stakeholders looking at the TFMA visualizations.
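In the meantime, the id-to-name mapping can be applied outside of TFMA when preparing reports. A minimal sketch, not a TFMA API; the class names and metric values below are illustrative stand-ins:

```python
# Stopgap sketch: build a class-ID -> class-name mapping from the label list
# and relabel per-class results before showing them to stakeholders.
classes = ['class_1', 'class_2', 'class_3']  # stand-in for the real label vocab

id_to_name = dict(enumerate(classes))

# Hypothetical per-class AUC values keyed by binarized class ID.
per_class_auc = {0: 0.91, 1: 0.84, 2: 0.77}
named_auc = {id_to_name[i]: v for i, v in per_class_auc.items()}
print(named_auc)  # {'class_1': 0.91, 'class_2': 0.84, 'class_3': 0.77}
```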

We are looking into this but don't yet have a clear solution. We would like to get the class-ID-to-name mappings via the label vocab, but we don't always have access to the vocab, so we are currently looking into getting the APIs we need.

Typically the vocab is computed/known in an upstream step... would it be the worst idea to update the EvalConfig proto to have a field for vocab?
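If the upstream step materializes the vocab as a plain text file with one label per line, where the line index is the class ID (a common TF vocab-file convention, assumed here), populating such a field would just mean reading the file back. A sketch that writes a throwaway vocab file and reads it into an ordered class list:

```python
import os
import tempfile

# Illustrative only: write a vocab file the way an upstream step might,
# one class name per line, with the line index serving as the class ID.
vocab_path = os.path.join(tempfile.mkdtemp(), 'label_vocab.txt')
with open(vocab_path, 'w') as f:
    f.write('class_1\nclass_2\nclass_3\n')

# Read it back into an ordered class list for use in the eval config.
with open(vocab_path) as f:
    classes = [line.rstrip('\n') for line in f]

print(classes)  # ['class_1', 'class_2', 'class_3']
```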

@mdreves any thoughts about this suggestion?

The idea has been floated internally a few times and we are still considering it, but the preference is to find something that is bundled with the model so that the config is shared across components.