SpectacularAI / HybVIO

HybVIO visual-inertial odometry and SLAM system

Home Page: https://arxiv.org/abs/2106.11857


Recreating Online Arxiv Paper Results for TUM-VI

ArmandB opened this issue

Love the paper, thank you so much for putting it and the code out there!

When I was trying to recreate the paper results, I noticed that my EuRoC results match but my TUM-VI results do not. Looking at the paper, I found:
Table B2's online-stereo "Normal SLAM" row has RMSE values identical to the postprocessed results in Table 4.

I suspect this is just a typo, although I could be wrong here.

Cheers and all the best!

Hello!

Good point, thank you for reporting. I think it's not really a typo: in Table 4 we may have actually compared our online SLAM result against the postprocessed results of the other methods, because our postprocessing did not work well with that dataset. This is OK in the sense that online methods are a subset of postprocessed methods (those with zero post-processing), but we should have stated this more clearly in the paper. (Also FYI @pekkaran.)

Attached the tables here for reference:

[screenshots of Table 4 and Table B2 from the paper]

Thanks for your fast response!

That's good to know, because I'm running TUM-VI with the parameters "-maxSuccessfulVisualUpdates=20 -useStereo -useSlam -timer=true" (matching the Normal SLAM parameters in the paper) and am getting RMSEs that differ from what the paper reported:
Room1 - "RMSE": 0.02413978985557401
Room2 - "RMSE": 0.027370991450391104
Room3 - "RMSE": 0.014600381138423596
Room4 - "RMSE": 0.017374768510435266
Room5 - "RMSE": 0.02477042526419998
Room6 - "RMSE": 0.022124688783236424
I also ran with the TUM default settings from compare_to_others.py (-maxSuccessfulVisualUpdates=5), and those results didn't match either.
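
For anyone else trying this, here is a rough sketch of how the per-sequence runs could be scripted. Only the VIO flags come from this thread; the binary path, the `-i` input convention, and the dataset folder names are my assumptions and may need adjusting for your checkout.

```python
# Sketch only: scripting the TUM-VI room runs with the flags discussed
# above. Paths and dataset names are assumptions, not necessarily the
# repo's exact layout; the flags themselves are the ones quoted in this
# thread (with -maxSuccessfulVisualUpdates left at its TUM default of 5).
import subprocess

HYBVIO_BIN = "./build/main"  # assumption: location of the built HybVIO binary
FLAGS = ["-useStereo", "-useSlam", "-timer=true"]

for room in range(1, 7):
    dataset = f"../data/benchmark/tum-room{room}"  # assumption: converted data layout
    subprocess.run([HYBVIO_BIN, f"-i={dataset}", *FLAGS], check=True)
```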

My fork is based on commit 3a46cd6 instead of e325353, so that could be part of the issue. I didn't see any changes to the source code between those two commits, but there could be something I'm missing. I can also try a --postprocess run and see how the error compares. I would have done it before, but there was a Python error when running with --postprocess that I need to sort through first. I'll report what the issue ends up being if I figure it out.

Hello.

Omitting -maxSuccessfulVisualUpdates=20 and using the default 5 is correct for the TUM data.

Did you obtain your RMSE numbers using the --outputDir argument of compute_paper_results.py? The metric computation is somewhat complex to reproduce manually with the vio_benchmark tools and/or the HybVIO main binary. For example, vio_benchmark implements multiple metrics, and the default one isn't the same as the one used in the paper (SE3 RMSE, set for TUM here).
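
For readers unfamiliar with the metric: SE3 RMSE means the estimated trajectory is first rigidly aligned to ground truth (rotation and translation, no scale) and the RMSE of the position errors is computed afterwards. A minimal illustration of that idea, not the exact vio_benchmark implementation:

```python
# Minimal sketch of SE3-aligned trajectory RMSE (Kabsch/Umeyama-style
# rigid alignment without scale). Not the vio_benchmark code.
import numpy as np

def se3_aligned_rmse(estimated, ground_truth):
    """Both inputs: time-synchronized (N, 3) arrays of positions."""
    mu_e = estimated.mean(axis=0)
    mu_g = ground_truth.mean(axis=0)
    # Optimal rotation from the SVD of the cross-covariance matrix.
    H = (estimated - mu_e).T @ (ground_truth - mu_g)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections (determinant -1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_g - R @ mu_e
    aligned = estimated @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - ground_truth) ** 2, axis=1))))
```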

Looking at the TUM data conversion script, one thing I notice is that it uses the lower-resolution version by default. Changing this line to use 1024 might improve the results (or do the opposite), although I'm rather sure we tested reproducing the paper results, including downloading the data from scratch with the provided scripts.
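
If it helps, TUM-VI publishes each sequence at two camera resolutions and encodes the resolution in the dataset name, so the switch is essentially a string change. Illustration only; the actual variable in the conversion script may look different:

```python
# Illustration of the resolution switch (not the repo's actual code).
# TUM-VI archive names encode resolution and bit depth, e.g.
# dataset-room1_512_16 (512x512) vs dataset-room1_1024_16 (1024x1024).
RESOLUTION = "1024"  # the conversion script defaults to the 512 variant
sequences = [f"dataset-room{i}_{RESOLUTION}_16" for i in range(1, 7)]
```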

Finally, we sometimes had difficulty reproducing the exact results across different machines. For example, the test data is stored as videos, which are decompressed using the system installation of FFmpeg (via OpenCV functions). A different FFmpeg version or other operating-system differences could make the decompressed images slightly different, causing a butterfly effect in the VIO results. However, if you were able to produce the exact same numbers for EuRoC as we did (up to the precision we reported), then it is plausible you can do the same with the TUM data.
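
A quick way to test whether two machines decode the videos identically is to hash every decoded frame and compare the digests; differing digests would confirm the FFmpeg-related divergence described above. A small sketch, not part of the repo:

```python
# Hash all decoded frames of a video to compare decoding across machines.
import hashlib
import cv2  # decodes via the system FFmpeg build

def video_digest(path):
    cap = cv2.VideoCapture(path)
    h = hashlib.sha256()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h.update(frame.tobytes())
    cap.release()
    return h.hexdigest()

# Compare the printed digest across machines; "data.mp4" is a placeholder.
print(video_digest("data.mp4"))
```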

The commit you used for your fork should indeed produce the same results.

Your intuition was correct. Using the 1024x1024 resolution TUM-VI images and omitting -maxSuccessfulVisualUpdates=20 allowed me to recreate the online results to the precision that you reported. Thank you guys so much!!! I will change the title of this issue and also put some of my system/FFmpeg information below in case it helps posterity:

ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100