# NYUd2 Toolkit
Here, we provide simple pre-processing tools for the NYUd v2 dataset, since the dataset's authors only release the original dumped data collected by a Kinect. To apply monocular depth estimation to the NYUd v2 dataset, we must generate the RGB images and dense depth maps ourselves; the procedure is as follows.
## Requirements

The code has been tested on Ubuntu 16.04 LTS with MATLAB 2015b and Python 2.7.
## Dataset preparation

- Download the raw data of the NYUd v2 dataset, which is more than 400 GB, so please make sure you have enough disk space available. Then extract it into the directory `nyud_raw_data`. At the same time, download the `Toolbox` from the same URL above and extract it.
- The dataset is divided into 590 folders, one for each `scene` being filmed, such as `living_room_0012`. Each scene folder is structured as follows:
  ```
  /
  ../bedroom_0001/
  ../bedroom_0001/a-1294886363.011060-3164794231.dump
  ../bedroom_0001/a-1294886363.016801-3164794231.dump
  ...
  ../bedroom_0001/d-1294886362.665769-3143255701.pgm
  ../bedroom_0001/d-1294886362.793814-3151264321.pgm
  ...
  ../bedroom_0001/r-1294886362.238178-3118787619.ppm
  ../bedroom_0001/r-1294886362.814111-3152792506.ppm
  ```
- Files that begin with the prefix `a-` are the accelerometer dumps. Files that begin with the prefixes `r-` and `d-` are the frames from the RGB and depth cameras, respectively. You can use the `get_synched_frames.m` function in the Toolbox to find the matching relationship between an RGB image and a depth map, as shown in the sketch below.
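For illustration, here is a minimal sketch of that synchronization step, modeled on the Toolbox's demo scripts; the struct field names follow the Toolbox and may vary between versions:

```matlab
% Pair each depth frame with its nearest RGB frame in one scene.
addpath('toolbox');                        % path to the extracted NYUd v2 Toolbox
sceneDir = 'nyud_raw_data/bedroom_0001';   % one raw scene directory

frameList = get_synched_frames(sceneDir);  % struct array of matched frame filenames

% Read the first synchronized RGB/depth pair.
imgRgb      = imread(fullfile(sceneDir, frameList(1).rawRgbFilename));
imgDepthRaw = swapbytes(imread(fullfile(sceneDir, frameList(1).rawDepthFilename)));
```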
## Generate the RGB and dense depth maps
- Put the script `process_raw.m` and the `Toolbox` into the directory `nyud_raw_data` mentioned above.
- Modify `savePath` and `stride`: `savePath` is the output path and `stride` controls the number of output files. The default value of `stride` is `1`, which saves all images (see the sketch after this list).
- Since processing takes a long time, open MATLAB under `tmux` and run the script `process_raw.m`.
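As an illustration, the two settings look roughly like this (the exact variable layout inside `process_raw.m` may differ):

```matlab
% Near the top of process_raw.m:
savePath = '/data/nyud_v2_processed';  % output directory for the RGB/depth pairs
stride   = 1;                          % 1 saves every frame; e.g. 10 saves every 10th
```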
Sample results are as follows: an RGB image with a resolution of 480 × 640, and a dense depth map with the same resolution as the RGB image.
- Tip: For better training, I save the dense depth map of each RGB image in 16-bit format, so the depth values lie between `0` and `65535`, as defined by:
  ```matlab
  % Scale metric depth (assumed double, 0-10 m) to the full 16-bit range and save as PNG.
  imgDepth = imgDepth / 10.0 * 65535.0;
  imgDepth = uint16(imgDepth);
  imwrite(imgDepth, outDepthFilename, 'png', 'bitdepth', 16);
  ```
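  At training time, metric depth can be recovered by inverting this mapping (assuming the 10 m maximum range implied by the scaling above):

  ```matlab
  % Read a 16-bit depth PNG back into meters (assumes a 10 m maximum range).
  imgDepth16  = imread(outDepthFilename);            % uint16, values 0..65535
  depthMeters = double(imgDepth16) / 65535.0 * 10.0;
  ```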
- You can also save them in 8-bit format, which limits the depth values to between `0` and `255`; just change the script above, e.g. as in the sketch below.
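  A minimal sketch of the 8-bit variant (same 10 m range assumption; note the coarser quantization):

  ```matlab
  % 8-bit variant: depth values quantized to 0..255.
  imgDepth = uint8(imgDepth / 10.0 * 255.0);
  imwrite(imgDepth, outDepthFilename, 'png');
  ```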
## Generate the NYUd v2 thin dataset

In general, we can also generate a thin dataset with 1449 images in total: 795 for training and 654 for testing.
- First, download the thin dataset, integrated into a single `.mat` file, from the same URL above (named `Labeled dataset (~2.8 GB)`), then extract it to get `nyu_depth_v2_labeled.mat`, which contains:
  ```
  accelData: [1449×4 single]
  depths: [480×640×1449 single]
  images: [480×640×3×1449 uint8]
  instances: [480×640×1449 uint8]
  labels: [480×640×1449 uint16]
  names: {894×1 cell}
  namesToIds: [894×1 containers.Map]
  rawDepthFilenames: {1449×1 cell}
  rawDepths: [480×640×1449 single]
  rawRgbFilenames: {1449×1 cell}
  sceneTypes: {1449×1 cell}
  scenes: {1449×1 cell}
  ```
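  As a sketch, the file can be inspected and individual frames extracted in MATLAB like this (loading the full ~2.8 GB file needs sufficient RAM):

  ```matlab
  data = load('nyu_depth_v2_labeled.mat');

  rgb   = data.images(:, :, :, 1);  % first RGB frame, 480x640x3 uint8
  depth = data.depths(:, :, 1);     % in-painted dense depth in meters, 480x640 single
  imwrite(rgb, 'rgb_0001.png');     % RGB frames can be saved directly
  ```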
- Run `save16bitdepth.m` to save the 16-bit dense depth maps of the 1449 images, while the RGB images can be obtained by saving them directly from the `images` attribute of `nyu_depth_v2_labeled.mat` (as in the sketch above).
- Run `nyud_split.py` to split the 1449 images into `train` and `test` subsets for practical use. Change the variables referring to `PATH` as you wish; a rough equivalent is sketched below.
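For reference, the official 795/654 split is distributed as `splits.mat` on the same download page. A minimal MATLAB sketch of applying it follows; this is not the repo's `nyud_split.py`, and it assumes a hypothetical flat layout where the 1449 frames were saved as `all/0001.png` ... `all/1449.png`:

```matlab
% Sketch: split the 1449 saved frames using the official splits.mat.
load('splits.mat');  % provides trainNdxs (795x1) and testNdxs (654x1)

mkdir('train'); mkdir('test');
for i = 1:numel(trainNdxs)
    name = sprintf('%04d.png', trainNdxs(i));   % hypothetical file naming
    copyfile(fullfile('all', name), fullfile('train', name));
end
for i = 1:numel(testNdxs)
    name = sprintf('%04d.png', testNdxs(i));
    copyfile(fullfile('all', name), fullfile('test', name));
end
```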