alexfrom0815 / Online-3D-BPP-PCT

Code implementation of "Learning Efficient Online 3D Bin Packing on Packing Configuration Trees". We propose to enhance the practical applicability of the online 3D Bin Packing Problem (BPP) by learning on a hierarchical packing configuration tree, which lets the deep reinforcement learning (DRL) model handle practical constraints easily and perform well even with a continuous solution space.

mask_logits in AttentionModel is set to False by default

caiqi opened this issue

Thanks for the awesome work and for sharing the code! mask_logits is set to False by default, so the leaf logits are normalized without removing the masked leaf nodes. Is this intended, or is it a typo?

Thank you so much for your feedback. Leaf logits are deliberately normalized without removing the masked leaf nodes, because we found that policy training is unstable when all leaf nodes are invalid and their logits are set to '-inf'. Instead, the masked leaf nodes are removed when calculating the action probabilities (lines 139 - 144).
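
For readers following the thread, here is a minimal pure-Python sketch of the two orderings being discussed. It is not the repository's code: the helper names (softmax, mask_then_softmax, softmax_then_mask) and the eps value are hypothetical, chosen only to illustrate why setting every logit to '-inf' before normalizing can break, while masking the probabilities afterwards stays finite.

```python
import math


def softmax(logits):
    """Softmax over a list of floats, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_then_softmax(logits, valid):
    """Hypothetical 'mask_logits=True' style: invalid logits become -inf first."""
    masked = [x if v else -math.inf for x, v in zip(logits, valid)]
    return softmax(masked)


def softmax_then_mask(logits, valid, eps=1e-20):
    """Hypothetical 'mask_logits=False' style: normalize all logits,
    then zero out invalid entries and renormalize (eps avoids 0/0)."""
    probs = softmax(logits)
    kept = [p if v else 0.0 for p, v in zip(probs, valid)]
    total = sum(kept)
    return [p / (total + eps) for p in kept]


logits = [1.0, 2.0, 3.0]

# Some leaves valid: both orderings give the same distribution.
partial = [True, False, True]
print(mask_then_softmax(logits, partial))
print(softmax_then_mask(logits, partial))

# No valid leaf: masking the logits first produces NaN (-inf minus -inf),
# while masking after the softmax yields finite zeros.
none_valid = [False, False, False]
print(mask_then_softmax(logits, none_valid))
print(softmax_then_mask(logits, none_valid))
```

When at least one leaf is valid, the two functions agree up to the tiny eps term, so the difference only shows up in the degenerate all-invalid case described in the reply (and in how gradients flow through the masked entries during training).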