nshaud / DeepNetsForEO

Deep networks for Earth Observation

run the code with ISPRS Vaihingen data

oneOfThePeople opened this issue

I downloaded the data.
When I create the label LMDB, I get an error because the code assumes the labels are 1-dimensional, while mine are 3-dimensional.
Did you do something to the original labels?
I used this path for the labels: "~/ISPRS/Vaihingen/ISPRS_semantic_labeling_Vaihingen/gts_for_participants",
so I changed the code that reads the labels into the LMDB.
But of course I then have a problem in the loss layer, because again the net expects 1-dimensional labels, not 3-dimensional.
Do you have a solution for me?
Thank you!

Yes, indeed, we transformed the RGB-encoded labels into numerical labels (0, 1, 2, ..., 5). Instead of

[[[255,255,255],[255,255,255],...,[0,255,255]],
 [[255,255,255],[255,255,255],...,[0,255,255]],
 ...
 [[0,255,255],[0,0,255],...,[0,0,255]]]

we want :

[[0, 0, ..., 2],
 [0, 0, ..., 2],
 ...
 [2, 1, ..., 1]]

You can use this kind of script to do the conversion:

def pixel_to_label(pixel):
    label = None
    # Code for RGB values to label :
    r, g, b = pixel
    if r == 255 and g == 255 and b == 255:
        label = 0 # Impervious surfaces (white)
    elif r == 0 and g == 0 and b == 255:
        label = 1 # Buildings (dark blue)
    elif r == 0 and g == 255 and b == 255:
        label = 2 # Low vegetation (light blue)
    elif r == 0 and g == 255 and b == 0:
        label = 3 # Tree (green)
    elif r == 255 and g == 255 and b == 0:
        label = 4 # Car (yellow)
    elif r == 255 and g == 0 and b == 0:
        label = 5 # Clutter (red)
    if label is None:
        raise ValueError("Unknown RGB value: {}".format(pixel))
    return label

import numpy as np
from skimage import io


labels_rgb = io.imread('/path/to/your/labels.tif')
# Single-channel label matrix with the same spatial dimensions as the RGB ground truth
matrix_label = np.zeros(labels_rgb.shape[:2], dtype='uint8')
for x in range(labels_rgb.shape[0]):
    for y in range(labels_rgb.shape[1]):
        matrix_label[x,y] = pixel_to_label(labels_rgb[x,y])

Thanks, I used another piece of code:

import numpy as np


def convert_from_color(arr_3d):
    arr_2d = np.zeros((arr_3d.shape[0], arr_3d.shape[1]), dtype=np.uint8)
    palette = {(255, 255, 255): 0,  # Impervious surfaces (white)
               (0, 0, 255): 1,      # Buildings (dark blue)
               (0, 255, 255): 2,    # Low vegetation (light blue)
               (0, 255, 0): 3,      # Tree (green)
               (255, 255, 0): 4,    # Car (yellow)
               (255, 0, 0): 5,      # Clutter (red)
               (0, 0, 0): 6}        # Unclassified (black)

    for c, i in palette.items():
        m = np.all(arr_3d == np.array(c).reshape(1, 1, 3), axis=2)
        arr_2d[m] = i

    return arr_2d

The code is taken from train-DeepLab.
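
For illustration, a minimal sanity-check sketch of how that function could be used, assuming convert_from_color is in scope and '/path/to/your/labels.tif' points at one of the RGB ground truth tiles (the path is a placeholder):

import numpy as np
from skimage import io

labels_rgb = io.imread('/path/to/your/labels.tif')
labels_numeric = convert_from_color(labels_rgb)

# The result should be a 2D array of class indices instead of an RGB image.
print(labels_numeric.shape)        # (height, width) rather than (height, width, 3)
print(np.unique(labels_numeric))   # class indices present in the tile, e.g. [0 1 2 3 4 5]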

First, great job nshaud! Congrats!
Hi both,
I have the same problem, but I don't really understand at which point I have to make this conversion.
Could you help me?

Thanks in advance

@jorgenaya : Caffe expects numerical labels when performing the prediction (e.g. 1 for "buildings", 2 for "low vegetation", 3 for "trees", etc.). However, in the ISPRS Vaihingen dataset, the provided ground truth is RGB-encoded, i.e. it is a color image with [0,0,255] for "buildings" (blue), [0,255,0] for "trees" (green), etc.

Therefore, before using the scripts for image extraction and LMDB creation, you should use a conversion function to transform the color-encoded ground truth into the label matrix. I will add some info to the README to hopefully clear things up.
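
As a rough sketch of that workflow, something like the following could convert every ground truth tile before running the extraction/LMDB scripts. It reuses convert_from_color from above; the output folder name and the assumption that the downstream scripts can read single-channel label images are mine, not the repository's actual layout:

import glob
import os

from skimage import io

# Ground truth folder from the ISPRS Vaihingen download (path used earlier in this thread)
gt_dir = os.path.expanduser('~/ISPRS/Vaihingen/ISPRS_semantic_labeling_Vaihingen/gts_for_participants')
# Hypothetical output folder for the converted, single-channel label maps
out_dir = os.path.expanduser('~/ISPRS/Vaihingen/gts_numeric')
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)

for tif in glob.glob(os.path.join(gt_dir, '*.tif')):
    labels_rgb = io.imread(tif)
    # convert_from_color (defined above) maps each RGB color to its class index
    labels_numeric = convert_from_color(labels_rgb)
    out_path = os.path.join(out_dir, os.path.basename(tif).replace('.tif', '.png'))
    io.imsave(out_path, labels_numeric)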

Everything works! Thanks