In Mahjong game prediction, the order of state['current_hand'] appears to influence the result of eval_step. What could be the reason?
jacy opened this issue · comments
jacy commented
In Mahjong game prediction, the order of state['current_hand'] appears to influence the result of eval_step. What could be the reason?
jacy commented
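Found the root cause: in the Mahjong extract_state function, raw_legal_actions and legal_actions don't match. legal_actions is the unique (deduplicated) list of the player's hand, but raw_legal_actions is the full list of the player's hand, duplicates included. The two lists can therefore differ in length and in position, so which entries line up depends on the order of the hand.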
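For anyone hitting the same thing, here is a minimal standalone sketch of the mismatch. It does not use the library's code: the tile names and the helpers `build_actions` and `pair_by_index` are made up for illustration, and the index-based pairing is only an assumption about how a consumer of the two lists might read them.

```python
from collections import OrderedDict

def build_actions(hand):
    # Toy stand-in for the two lists described above (hypothetical, not the library's code).
    raw_legal_actions = list(hand)                    # full hand, duplicates kept
    legal_actions = list(OrderedDict.fromkeys(hand))  # unique tiles, first-seen order
    return raw_legal_actions, legal_actions

def pair_by_index(raw_legal_actions, legal_actions):
    # Pair entries by position, as code that assumes the lists are aligned would do.
    return [(raw_legal_actions[i], legal_actions[i]) for i in range(len(legal_actions))]

# Hand with the duplicate tiles next to each other.
hand = ["bamboo-1", "bamboo-1", "dots-3", "dots-5"]
raw, legal = build_actions(hand)
mismatched = [(r, l) for r, l in pair_by_index(raw, legal) if r != l]

# Same tiles, different order: the duplicate now comes last.
reordered = ["dots-3", "bamboo-1", "dots-5", "bamboo-1"]
raw2, legal2 = build_actions(reordered)
mismatched2 = [(r, l) for r, l in pair_by_index(raw2, legal2) if r != l]

print(len(raw), len(legal))  # the two lists differ in length
print(mismatched)            # positions where the lists disagree
print(mismatched2)           # empty for the reordered hand
```

With the first ordering the lists disagree from the second position on, while the reordered hand happens to line up. That matches the symptom of the result changing with the order of the hand.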