YisenWang / symmetric_cross_entropy_for_noisy_labels

Code for ICCV2019 "Symmetric Cross Entropy for Robust Learning with Noisy Labels"

Home Page: https://arxiv.org/abs/1908.06112

How do you create asymmetric label noise on CIFAR-100?

XinshaoAmosWang opened this issue

import os

import numpy as np

np.random.seed(123)

NUM_CLASSES = {'mnist': 10, 'svhn': 10, 'cifar-10': 10, 'cifar-100': 100}

noise_ratio = 40
n = noise_ratio / 100.0

nb_subclasses = 5

def build_for_cifar100(size, noise):
    """ random flip between two random classes.
    """
    assert (noise >= 0.) and (noise <= 1.)

    P = np.eye(size)
    cls1, cls2 = np.random.choice(range(size), size=2, replace=False)
    P[cls1, cls2] = noise
    P[cls2, cls1] = noise
    P[cls1, cls1] = 1.0 - noise
    P[cls2, cls2] = 1.0 - noise

    # assert_array_almost_equal(P.sum(axis=1), 1, 1)
    return P

P = build_for_cifar100(nb_subclasses, n)

print(np.matrix(P))

[[1.  0.  0.  0.  0. ]
 [0.  0.6 0.  0.4 0. ]
 [0.  0.  1.  0.  0. ]
 [0.  0.4 0.  0.6 0. ]
 [0.  0.  0.  0.  1. ]]

I do not understand this noise-transition matrix.

Thanks.
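
For readers with the same question, one common way to read such a matrix is row by row: entry P[i, j] is taken as the probability that a sample with clean label i ends up with noisy label j. The sketch below illustrates that reading; it is an assumption about the convention, not code from the repository. `apply_transition` is a hypothetical helper, and the matrix is copied from the printed output above.

```python
import numpy as np

def apply_transition(labels, P, rng):
    """Draw one noisy label per clean label, using row P[y] as the
    distribution over noisy labels (assumed row convention)."""
    labels = np.asarray(labels)
    noisy = np.empty_like(labels)
    for i, y in enumerate(labels):
        noisy[i] = rng.choice(P.shape[0], p=P[y])
    return noisy

# Matrix copied from the printed output: only classes 1 and 3 are noisy.
P = np.eye(5)
P[1, 1] = P[3, 3] = 0.6
P[1, 3] = P[3, 1] = 0.4

rng = np.random.default_rng(0)
clean = np.repeat(np.arange(5), 1000)   # 1000 samples per class
noisy = apply_transition(clean, P, rng)
flipped = noisy != clean

for c in range(5):
    # Fraction of class-c samples whose label changed
    print(c, flipped[clean == c].mean())
```

Under this reading, classes 0, 2 and 4 keep their labels, while roughly 40% of class 1 becomes class 3 and roughly 40% of class 3 becomes class 1.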

As the paper states, "for CIFAR-100, the 100 classes are grouped into 20 super-classes with each has 5 sub-classes, then flipping between two randomly selected sub-classes within each super-class."
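
The quoted procedure could be sketched for all 100 classes as a block-diagonal matrix, with one 5x5 block per super-class. This is an illustration, not the authors' code: `build_block_matrix` is a hypothetical name, and the sketch assumes that super-class s owns the contiguous class indices s*5 to s*5+4. The real CIFAR-100 fine-label indices are not ordered that way, so an actual experiment would need a mapping from fine labels to super-classes first.

```python
import numpy as np

def build_block_matrix(n_super, n_sub, noise, rng):
    """Pick two sub-classes inside every super-class and flip between them."""
    assert 0.0 <= noise <= 1.0
    size = n_super * n_sub
    P = np.eye(size)
    for s in range(n_super):
        # Assumption: super-class s owns indices [s*n_sub, (s+1)*n_sub)
        start = s * n_sub
        a, b = start + rng.choice(n_sub, size=2, replace=False)
        P[a, a] = P[b, b] = 1.0 - noise
        P[a, b] = P[b, a] = noise
    return P

rng = np.random.default_rng(123)
P = build_block_matrix(n_super=20, n_sub=5, noise=0.4, rng=rng)

print(P.shape)                          # size of the full matrix
print(np.allclose(P.sum(axis=1), 1.0))  # every row is a distribution
print(int((np.diag(P) < 1.0).sum()))    # number of classes that receive noise
```

With these settings, two of the five sub-classes in each super-class are noisy, so 40 of the 100 classes have flipped labels and the other 60 stay clean.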

Thanks. I would like to confirm this once more.

"Flipping between two randomly selected sub-classes within each super-class." This means that, within the 5 sub-classes of a super-class, flipping happens only between two of them, right?

When the noise rate is 40%:
In other words, given the training examples of class A, 40% of its labels are flipped, and all of them go to one fixed other class, e.g., B, rather than being spread uniformly over the other 4 sub-classes (10% each).

In summary, given all training examples of class A:
Option 1: 40% of A's labels are flipped, all to one fixed different class;

Option 2: 40% of A's labels are flipped, spread uniformly over the other 4 sub-classes, i.e., 10% to each.

You used Option 1, right?

I would like to compare with your method fairly, so the choice of noise-generation option matters. Thanks.
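
To make the two options concrete, the row of the transition matrix for class A could be written as below. This is only a sketch of the question as asked; `option1_row` and `option2_row` are hypothetical names, and the thread does not confirm which option the authors used. The example uses A = 1 and B = 3 within a group of 5 sub-classes.

```python
import numpy as np

def option1_row(size, a, b, noise):
    """Option 1: all flipped labels of class a go to one fixed class b."""
    row = np.zeros(size)
    row[a] = 1.0 - noise
    row[b] = noise
    return row

def option2_row(size, a, noise):
    """Option 2: flipped labels of class a are spread uniformly
    over the other size - 1 classes."""
    row = np.full(size, noise / (size - 1))
    row[a] = 1.0 - noise
    return row

row1 = option1_row(5, a=1, b=3, noise=0.4)
row2 = option2_row(5, a=1, noise=0.4)
print(row1)
print(row2)
```

Row 1 of the matrix printed earlier in this issue, [0, 0.6, 0, 0.4, 0], has the Option 1 form.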