tensorflow / model-analysis

Model analysis tools for TensorFlow

Multiclass confusion matrix / binarized metrics need class names, not just class IDs

schmidt-jake opened this issue

System information

  • Have I written custom code (as opposed to using a stock example script
    provided in TensorFlow Model Analysis): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
  • TensorFlow Model Analysis installed from (source or binary): binary (PyPI)
  • TensorFlow Model Analysis version (use command below): 0.22.2
  • Python version: 3.6.9
  • Jupyter Notebook version: 6.0.3
  • Exact command to reproduce:
from tensorflow_model_analysis import EvalConfig
from tensorflow_model_analysis.metrics import default_multi_class_classification_specs
from google.protobuf.json_format import ParseDict

classes = ['class_1', 'class_2', ...]

eval_config = {
    'model_specs': [
        {
            'name': 'rig_state',
            'model_type': 'tf_keras',
            'signature_name': 'serve_raw',
            'label_key': ...,
            'example_weight_key': 'sample_weight'
        }
    ],
    'metrics_specs': [
        {
            'metrics': [
                {
                    'class_name': 'MultiClassConfusionMatrixPlot',
                    'config': '"thresholds": [0.5]'
                },
                {'class_name': 'ExampleCount'},
                {'class_name': 'WeightedExampleCount'},
                {'class_name': 'SparseCategoricalAccuracy'},
            ],
        },
        {
            'binarize': {'class_ids': {'values': list(range(len(classes)))}},
            'metrics': [
                {'class_name': 'AUC'},
                {'class_name': 'CalibrationPlot'},
                {'class_name': 'BinaryAccuracy'},
                {'class_name': 'MeanPrediction'}
            ]
        }
    ],
    'slicing_specs': [...]
}
eval_config: EvalConfig = ParseDict(eval_config, EvalConfig())

Describe the problem

Multiclass confusion matrices and binarized metrics should support class names, not just class IDs. Something like 'binarize': {'classes': [{'id': _id, 'name': name} for _id, name in enumerate(classes)]}. As it stands, the integer class IDs are meaningless to the data scientists and business stakeholders looking at the TFMA visualizations.
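In the meantime, the id-to-name mapping can be applied outside of TFMA when preparing reports. A minimal sketch, not a TFMA API; the class names and metric values below are illustrative stand-ins:

```python
# Stopgap sketch: build a class-ID -> class-name mapping from the label list
# and relabel per-class results before showing them to stakeholders.
classes = ['class_1', 'class_2', 'class_3']  # stand-in for the real label vocab

id_to_name = dict(enumerate(classes))

# Hypothetical per-class AUC values keyed by binarized class ID.
per_class_auc = {0: 0.91, 1: 0.84, 2: 0.77}
named_auc = {id_to_name[i]: v for i, v in per_class_auc.items()}
print(named_auc)  # {'class_1': 0.91, 'class_2': 0.84, 'class_3': 0.77}
```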

We are looking into this but don't yet have a clear solution. We would like to get the class-ID-to-name mappings via the label vocab, but we don't always have access to the vocab, so we are currently looking into getting the APIs we need.

Typically the vocab is computed/known in an upstream step... would it be the worst idea to update the EvalConfig proto to have a field for vocab?
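If the upstream step materializes the vocab as a plain text file with one label per line, where the line index is the class ID (a common TF vocab-file convention, assumed here), populating such a field would just mean reading the file back. A sketch that writes a throwaway vocab file and reads it into an ordered class list:

```python
import os
import tempfile

# Illustrative only: write a vocab file the way an upstream step might,
# one class name per line, with the line index serving as the class ID.
vocab_path = os.path.join(tempfile.mkdtemp(), 'label_vocab.txt')
with open(vocab_path, 'w') as f:
    f.write('class_1\nclass_2\nclass_3\n')

# Read it back into an ordered class list for use in the eval config.
with open(vocab_path) as f:
    classes = [line.rstrip('\n') for line in f]

print(classes)  # ['class_1', 'class_2', 'class_3']
```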

@mdreves any thoughts about this suggestion?

The idea has been floated internally a few times and we are still considering it, but the preference is to find something that is bundled with the model so that the config is shared across components.