How do you create asymmetric label noise on CIFAR-100?

Question

XinshaoAmosWang opened this issue 5 years ago · comments

amos_xwang commented 5 years ago

import os

import numpy as np

np.random.seed(123)

NUM_CLASSES = {'mnist': 10, 'svhn': 10, 'cifar-10': 10, 'cifar-100': 100}

noise_ratio = 40
n = noise_ratio / 100.0

nb_subclasses = 5

def build_for_cifar100(size, noise):
    """Random flip between two random classes."""
    assert 0.0 <= noise <= 1.0

    # Start from the identity: by default every class keeps its own label.
    P = np.eye(size)
    # Pick two distinct classes; they exchange labels with probability `noise`.
    cls1, cls2 = np.random.choice(range(size), size=2, replace=False)
    P[cls1, cls2] = noise
    P[cls2, cls1] = noise
    P[cls1, cls1] = 1.0 - noise
    P[cls2, cls2] = 1.0 - noise

    # assert_array_almost_equal(P.sum(axis=1), 1, 1)
    return P

P = build_for_cifar100(nb_subclasses, n)

print(np.matrix(P))

[[1.  0.  0.  0.  0. ]
 [0.  0.6 0.  0.4 0. ]
 [0.  0.  1.  0.  0. ]
 [0.  0.4 0.  0.6 0. ]
 [0.  0.  0.  0.  1. ]]

I do not understand this noise-transition matrix.

Thanks.

Yisen Wang · Answer 1 · Tue Jan 07 2020 17:15:28 GMT+0800 (China Standard Time)

As the paper states: "for CIFAR-100, the 100 classes are grouped into 20 super-classes with each has 5 sub-classes, then flipping between two randomly selected sub-classes within each super-class."
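For readers following along, here is a minimal sketch of how the quoted procedure could be assembled into a full 100x100 transition matrix. It is an illustration, not the authors' code: the helper names `pair_flip_block` and `build_cifar100_transition` are hypothetical, and the sketch assumes that fine labels 5k to 5k+4 belong to super-class k. The real CIFAR-100 fine-label indices may not be ordered that way, in which case an explicit sub-class to super-class mapping would be needed.

```python
import numpy as np

def pair_flip_block(size, noise, rng):
    """size x size block in which two random classes swap labels with probability `noise`."""
    P = np.eye(size)
    a, b = rng.choice(size, size=2, replace=False)
    P[a, a] = P[b, b] = 1.0 - noise
    P[a, b] = P[b, a] = noise
    return P

def build_cifar100_transition(noise, n_super=20, n_sub=5, seed=0):
    """Block-diagonal matrix: one independent pair-flip block per super-class.

    Assumption: classes [s * n_sub, (s + 1) * n_sub) form super-class s.
    """
    rng = np.random.default_rng(seed)
    n = n_super * n_sub
    P = np.zeros((n, n))
    for s in range(n_super):
        lo, hi = s * n_sub, (s + 1) * n_sub
        P[lo:hi, lo:hi] = pair_flip_block(n_sub, noise, rng)
    return P

P_full = build_cifar100_transition(0.4)
```

Under these assumptions only 2 of the 5 sub-classes in each super-class are noisy, so 40 of the 100 rows of `P_full` differ from the identity.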

amos_xwang · Answer 2 · Tue Jan 07 2020 19:13:56 GMT+0800 (China Standard Time)

Thanks. I would like to confirm the information again.

"Flipping between two randomly selected sub-classes within each super-class." This means that, within the 5 sub-classes, the flipping is done between two different classes, right?

When the noise rate is 40%: given the training examples of class A, 40% of its labels are all flipped to one fixed different class, e.g., B, instead of being spread uniformly over the other 4 sub-classes (10% each).

In summary, given all training examples of class A:
Option 1: 40% of A's labels are all flipped to one fixed different class;

Option 2: 40% of A's labels are flipped uniformly to the other 4 sub-classes, i.e., 10% to each.

You used Option 1, right?

I would like to compare with your method fairly, so the noise-generation option matters. Thanks.
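To make the two options concrete, the sketch below builds the transition-matrix row of class A under each one for a 5-class super-class at 40% noise. The function names `option1_row` and `option2_row` are hypothetical and used only for illustration.

```python
import numpy as np

def option1_row(n_classes, src, dst, noise):
    """Option 1: all of the noise mass goes to one fixed class `dst`."""
    row = np.zeros(n_classes)
    row[src] = 1.0 - noise
    row[dst] = noise
    return row

def option2_row(n_classes, src, noise):
    """Option 2: the noise mass is split evenly over the other classes."""
    row = np.full(n_classes, noise / (n_classes - 1))
    row[src] = 1.0 - noise
    return row

row1 = option1_row(5, src=0, dst=1, noise=0.4)  # A=0 flips only to B=1
row2 = option2_row(5, src=0, noise=0.4)         # A=0 flips to classes 1..4
```

Both rows keep 60% of A's labels clean; they differ only in where the remaining 40% goes.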

Yisen Wang · Answer 3 · Tue Jan 07 2020 19:23:07 GMT+0800 (China Standard Time)

Yes. Asymmetric noise is like Option 1, but the two classes A and B are randomly selected.
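As a rough illustration of what this means for the labels themselves, the sketch below samples noisy labels from a transition matrix by treating row c as the distribution of the noisy label for clean class c. The function `corrupt_labels` is a hypothetical helper, not taken from the paper's code, and the 5x5 matrix simply reproduces the one printed in the question (classes 1 and 3 flip to each other at 40%).

```python
import numpy as np

def corrupt_labels(labels, P, seed=0):
    """Draw a noisy label for every sample from row P[clean_label]."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    noisy = labels.copy()
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Row c of P is the categorical distribution over noisy labels.
        noisy[idx] = rng.choice(P.shape[0], size=idx.size, p=P[c])
    return noisy

# Transition matrix from the question: classes 1 and 3 swap with probability 0.4.
P = np.eye(5)
P[1, 1] = P[3, 3] = 0.6
P[1, 3] = P[3, 1] = 0.4

clean = np.repeat(np.arange(5), 1000)  # 1000 samples per class
noisy = corrupt_labels(clean, P)
```

With this matrix, classes 0, 2, and 4 stay clean, while roughly 40% of class 1 becomes class 3 and vice versa.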
