datamllab / rlcard

Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.

Home Page: http://www.rlcard.org

State Representation of Limit holdem / Leduc

DavidRSeWell opened this issue · comments

Hello,
It seems that the player to act, i.e. SB or BB, is not taken into account in the state representation. Without it, I don't see how different states can be differentiated. For example, suppose the SB calls, the BB checks back, and they see a flop; the BB then checks. How would the state be differentiated for the SB and the BB here? Both positions see the same number of chips in the pot. The NFSP paper (https://arxiv.org/pdf/1603.01121.pdf), for example, uses the player as part of the state. Am I missing something?

If I am not mistaken, the current implementation for DQN or NFSP is very basic: it just learns which combinations of player cards and community cards are good, and how much to bet based on that. Anything else, such as previous bets from other players in previous hands, is not considered.

I am currently working on an NLH bot. If you are interested, you can PM me on Slack, where I also replied to your message.

Ok, I see. That makes sense. I didn't realize that was the intent. Thanks, I will talk to you on Slack.

@befeltingu Thanks for the feedback. As @alexx-ftw mentioned, the current state features are just a basic example of how features can be designed with the rlcard package. I expect that the performance can be improved significantly with better state and action features. One possible direction is to follow Figure 3 of the AlphaHoldem paper.

In RLCard, we have carefully designed state/action features for the game of DouDizhu, and we observe that they can reach human-level performance. So I expect better features could also boost the performance of Hold'em games. The feature designs for DouDizhu may be helpful here as well (see Table 4 of the DouZero paper). We can discuss more on Slack if you are interested.

@daochenzha Ok, great, thanks for the feedback. Somehow I had not seen the AlphaHoldem paper; I will take a look. I have been doing some experiments with a Kuhn poker game (which was easy to build with this framework) to keep things really simple. In that case, just adding the player position was enough to fully represent any state, which I think would also be the case for Leduc. Anyway, thanks for the references. I will reach out on Slack once I get to more complex scenarios.
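For anyone reading along, the position fix discussed in this thread can be sketched as a small feature wrapper. This is a minimal illustration, not rlcard's actual API: the function name `add_position_feature`, the 36-dimensional base observation length, and the SB = 0 / BB = 1 index assignment are all assumptions made for the example.

```python
import numpy as np

def add_position_feature(obs: np.ndarray, player_id: int,
                         num_players: int = 2) -> np.ndarray:
    """Append a one-hot encoding of the acting player's position
    (here SB = 0, BB = 1) to an existing observation vector."""
    position = np.zeros(num_players, dtype=obs.dtype)
    position[player_id] = 1.0
    return np.concatenate([obs, position])

# Two positions that would otherwise share an identical base observation
# (same cards, same chips in the pot) become distinguishable.
base_obs = np.zeros(36, dtype=np.float32)  # assumed Leduc-style obs length
sb_obs = add_position_feature(base_obs, player_id=0)
bb_obs = add_position_feature(base_obs, player_id=1)
print(sb_obs.shape)                     # (38,)
print(np.array_equal(sb_obs, bb_obs))   # False
```

In practice this kind of augmentation could be applied inside a custom environment's state-extraction step, so the agent's network sees the position bits alongside the card and chip features.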