xinzhuma / monodle

Delving into Localization Errors for Monocular 3D Object Detection, CVPR'2021

Code & Result on nuScenes

Treemann opened this issue · comments

Hi, I have reproduced your result on KITTI.
I can't find your submission on the nuScenes leaderboard, so I wonder if you have done experiments on the nuScenes dataset. If yes, could you share your accuracy? Will you release the code for nuScenes?

Thanks for your excellent work~

We didn't submit our results to the nuScenes leaderboard, although we did conduct some experiments on the validation set. The mAP is about 0.3, and the code will be released.

When is the code expected to be released? Thanks for sharing.

I need some time to reorganize the code. Unfortunately, with CVPR/ECCV coming up, I have no time to do this right now, so the release may take three or four months.

Got it.

Hi @xinzhuma, I modified your code and trained the model on the nuScenes dataset. Without much hyperparameter tuning, I got mAP=0.26 using your KITTI model structure and training parameters. I am tuning the parameters to see if I can get better accuracy.
I wonder what configuration (model structure, training parameters, etc.) and tricks (such as TTA or model ensembling) you used to get mAP=0.3. I hope you can give some advice~

@Treemann No TTA or model ensemble tricks. A single model can achieve about 0.3. The hyperparameters are different, including the learning rate, batch size, score thresholds, etc. Besides, some instances are invisible in a specific camera view, and you need to remove them.
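One way to read the remark about invisible instances is as a per-camera filter applied to the annotations before training. The thread does not say which criterion was used, so the rule below (object center in front of the camera and projecting inside the image) is an assumption, as are the function names, the `center_cam` field, and the example intrinsics.

```python
from typing import Dict, List, Sequence


def center_in_view(center_cam: Sequence[float], fx: float, fy: float,
                   cx: float, cy: float, width: int, height: int,
                   min_depth: float = 0.1) -> bool:
    """Return True if a 3D center (camera coordinates) projects inside the image."""
    x, y, z = center_cam
    if z <= min_depth:  # behind the camera or too close to project reliably
        return False
    u = fx * x / z + cx  # pinhole projection, horizontal pixel coordinate
    v = fy * y / z + cy  # pinhole projection, vertical pixel coordinate
    return 0.0 <= u < width and 0.0 <= v < height


def keep_visible(objects: List[Dict], **camera) -> List[Dict]:
    """Drop annotations whose center is not visible in this camera view."""
    return [obj for obj in objects if center_in_view(obj["center_cam"], **camera)]


# Hypothetical intrinsics for a 1600x900 image (not real calibration values).
camera = dict(fx=1260.0, fy=1260.0, cx=800.0, cy=450.0, width=1600, height=900)
objects = [
    {"name": "car", "center_cam": (0.0, 1.0, 10.0)},         # in front, inside image
    {"name": "truck", "center_cam": (20.0, 1.0, 10.0)},      # projects outside image
    {"name": "pedestrian", "center_cam": (0.0, 1.0, -5.0)},  # behind the camera
]
visible = keep_visible(objects, **camera)
```

The nuScenes annotations also carry a per-instance visibility level, which could serve as an alternative or additional filter.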

@xinzhuma So the model structure is the same as the one trained on KITTI?

model:
  type: 'centernet3d'
  backbone: 'dla34'
  neck: 'DLAUp'

I'll continue to tune the hyperparameters, thanks for your reply~

@Treemann Yes, we only modified the number of channels in the heatmap branch to detect all ten classes. We didn't predict velocity or attributes.
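The change described here amounts to widening the classification heatmap from the KITTI class count to ten. A minimal sketch, assuming the heatmap head has one output channel per class: the names below are the ten nuScenes detection classes, while the three-class KITTI tuple, the channel order, and the helper names are illustrative assumptions rather than code from the repository.

```python
# Assumed three-class KITTI setup, for comparison only.
KITTI_CLASSES = ("Pedestrian", "Car", "Cyclist")

# The ten nuScenes detection classes; the order here is an arbitrary choice.
NUSC_CLASSES = (
    "car", "truck", "bus", "trailer", "construction_vehicle",
    "pedestrian", "motorcycle", "bicycle", "traffic_cone", "barrier",
)


def heatmap_channels(class_names):
    """One heatmap output channel per detectable class."""
    return len(class_names)


def class_to_channel(class_names):
    """Map each class name to the heatmap channel that encodes it."""
    return {name: idx for idx, name in enumerate(class_names)}


print(heatmap_channels(KITTI_CLASSES), "->", heatmap_channels(NUSC_CLASSES))
```

Velocity and attribute outputs are left out, matching the reply above.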

Hi @xinzhuma,
I don't predict velocity and attributes either, and I get 0.276 now.
One more question, the image size of nuScenes is 1600x900 and the training is too slow, so I resize the input to 800x448. I wonder which input size & epochs you adopted.

Thanks for your patience.
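Resizing 1600x900 to 800x448 is not exactly aspect-preserving (half of 900 is 450, and 448 is the nearest multiple of 32 below it), so the horizontal and vertical scale factors differ slightly. A minimal sketch of how the pinhole intrinsics would change under a plain resize: the thread does not describe how the repository handles this, so the function name and the example calibration values are assumptions.

```python
def resize_intrinsics(fx, fy, cx, cy, src_size, dst_size):
    """Scale pinhole intrinsics for an image resized from src_size to dst_size.

    Sizes are (width, height). Focal lengths and principal point scale
    independently along each axis.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    return fx * sx, fy * sy, cx * sx, cy * sy


# Hypothetical calibration for a 1600x900 image (not real nuScenes values).
fx, fy, cx, cy = resize_intrinsics(1200.0, 1200.0, 800.0, 450.0,
                                   src_size=(1600, 900), dst_size=(800, 448))
print(fx, fy, cx, cy)
```

Because the two scale factors are not equal, using a single factor of 0.5 for both axes would introduce a small vertical error in any projection-based depth or location computation.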

800x448 is okay; we also adopt this setting. You can set the confidence threshold to 0.1, which is a better setting for nuScenes. We train the models with batch size 192 and lr=0.0005 for 140 epochs.
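The settings quoted above can be collected in one place and the confidence threshold applied as a post-processing step. This is only a sketch: the numeric values come from the reply, but the dictionary keys, the detection format, and the function name are hypothetical and do not correspond to the repository's actual config schema.

```python
# Values taken from the thread; key names are hypothetical.
NUSC_SETTINGS = {
    "resolution": (800, 448),   # input width, height
    "batch_size": 192,
    "lr": 0.0005,
    "max_epoch": 140,
    "score_threshold": 0.1,
}


def filter_by_score(detections, threshold):
    """Keep detections whose confidence score reaches the threshold."""
    return [det for det in detections if det["score"] >= threshold]


# Toy detections to illustrate the effect of the 0.1 threshold.
detections = [
    {"name": "car", "score": 0.62},
    {"name": "bicycle", "score": 0.12},
    {"name": "barrier", "score": 0.04},
]
kept = filter_by_score(detections, NUSC_SETTINGS["score_threshold"])
```

A low threshold keeps more low-confidence boxes, which tends to matter for an average-precision metric that integrates over the whole recall range.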

OK, I'll try with your setting.
(: Training for 140 epochs takes a long time compared with FCOS3D, which is trained for 12 or 24 epochs~