SebLague / Chess-Challenge

Create your own tiny chess bot!

Home Page: https://www.youtube.com/watch?v=Ne40a5LkK6A

Poor nodes/s performance

jongdetim opened this issue

With a very simple evaluation function that only considers piece values plus bonus scores for piece positions, I'm getting poor performance. I've also implemented move ordering and transposition tables, both of which affect speed (move ordering makes it slower, the transposition tables make it faster).

I get around 150,000 nodes per second on a 2017 iMac (3.4 GHz Intel Core i5). Without move sorting, it does about 210,000 nodes/s. It could just be a language limitation, as C# isn't the fastest language, but it still shouldn't be that bad. It's not uncommon to get 2 million+ nodes per second in C++. I'm wondering what nodes/s others are getting!

1200-1500 kN/s for mate search only.
Material+basic mobility+move ordering+mate drops it to ~400 kN/s
I have a 2009 AMD Phenom II X4 955

Perhaps I packed my piece position value tables too tightly, but it does save a lot of tokens. I'll have to check the unpack function with a profiler. I currently have it packed like this:

    // every 32 bits is a row. every 64-bit int here is 2 rows
    static ulong[] piecePositionValueTable = {
        0x00000000050A0AEC, 0x05FBF60000000014, 0x05050A190A0A141E, 0x3232323200000000, // pawns
        0xCED8E2E2D8EC0005, 0xE2050A0FE2000F14, 0xE2050F14E2000A0F, 0xD8EC0000CED8E2E2, // knights
        0xECF6F6F6F6050000, 0xF60A0A0AF6000A0A, 0xF605050AF600050A, 0xF6000000ECF6F6F6, // bishops
        0x00000005FB000000, 0xFB000000FB000000, 0xFB000000FB000000, 0x050A0A0A00000000, // rooks
        0xECF6F6FBF6000000, 0xF605050500000505, 0xFB000505F6000505, 0xF6000000ECF6F6FB, // queens
        0x141E0A0014140000, 0xF6ECECECECE2E2D8, 0xE2D8D8CEE2D8D8CE, 0xE2D8D8CEE2D8D8CE  // kings
    };

    int GetPositionScore(int pieceType, int index) =>
        (sbyte)((piecePositionValueTable[pieceType * 4 + index / 16] >> (8 * (7 - (index % 8 < 4 ? index % 8 : 7 - index % 8) + index % 16 / 8 * 4))) & 0xFF);
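
One way to test whether the unpack expression is the bottleneck is to run it once per square at startup and read a plain array during eval. The following is only a sketch of that idea: the class name, `BuildUnpacked`, and `Lookup` are hypothetical, it assumes `pieceType` runs from 0 to 5 in the table's row order, and it costs extra tokens, so it is worth measuring before committing to it.

```csharp
using System;

static class PositionTables
{
    // Packed data copied from the post above: each ulong holds two half-rows.
    static readonly ulong[] piecePositionValueTable = {
        0x00000000050A0AEC, 0x05FBF60000000014, 0x05050A190A0A141E, 0x3232323200000000, // pawns
        0xCED8E2E2D8EC0005, 0xE2050A0FE2000F14, 0xE2050F14E2000A0F, 0xD8EC0000CED8E2E2, // knights
        0xECF6F6F6F6050000, 0xF60A0A0AF6000A0A, 0xF605050AF600050A, 0xF6000000ECF6F6F6, // bishops
        0x00000005FB000000, 0xFB000000FB000000, 0xFB000000FB000000, 0x050A0A0A00000000, // rooks
        0xECF6F6FBF6000000, 0xF605050500000505, 0xFB000505F6000505, 0xF6000000ECF6F6FB, // queens
        0x141E0A0014140000, 0xF6ECECECECE2E2D8, 0xE2D8D8CEE2D8D8CE, 0xE2D8D8CEE2D8D8CE  // kings
    };

    // The unpack expression from the post, unchanged.
    public static int GetPositionScore(int pieceType, int index) =>
        (sbyte)((piecePositionValueTable[pieceType * 4 + index / 16] >> (8 * (7 - (index % 8 < 4 ? index % 8 : 7 - index % 8) + index % 16 / 8 * 4))) & 0xFF);

    // Hypothetical cache: 6 piece types x 64 squares, filled once at startup.
    // Declared after the packed table so the static initializers run in order.
    static readonly int[] unpacked = BuildUnpacked();

    static int[] BuildUnpacked()
    {
        var table = new int[6 * 64];
        for (int pieceType = 0; pieceType < 6; pieceType++)
            for (int index = 0; index < 64; index++)
                table[pieceType * 64 + index] = GetPositionScore(pieceType, index);
        return table;
    }

    // Plain array read, intended for the hot evaluation loop.
    public static int Lookup(int pieceType, int index) => unpacked[pieceType * 64 + index];
}
```

A side note on the original expression: for the second row of each pair the shift count reaches 64 or more. C# only uses the low six bits of the count when shifting a `ulong`, so the expression still selects the intended byte from the low 32 bits.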
    

I was also planning to stop calling this for the entire board during eval. Instead, I could pass the running score along and, for each move, only add the difference between the values of the start and target squares. If this is indeed a bottleneck, that should help.
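
The incremental idea can be sketched roughly as follows. This is a minimal illustration, not the challenge API: `IncrementalEval`, `QuietMoveDelta`, and the flat `table` layout (64 scores per piece type) are assumptions, and it only covers quiet moves. Captures, promotions, castling, and mirroring the square index for the other side would each need their own term.

```csharp
static class IncrementalEval
{
    // Hypothetical helper: change in positional score when a piece of the
    // given type moves from one square to another (quiet moves only).
    // `table` is assumed to hold 64 scores per piece type.
    public static int QuietMoveDelta(int[] table, int pieceType, int from, int to) =>
        table[pieceType * 64 + to] - table[pieceType * 64 + from];
}
```

A search would then carry the score down the tree, for example `childScore = parentScore + IncrementalEval.QuietMoveDelta(table, pieceType, from, to)`, instead of re-summing all 64 squares at every node.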

You might also try to run in release mode instead of the default debug mode, to see if it makes a difference.

I'm running the project from the CLI, not Visual Studio. Will dotnet run -c Release behave differently from a plain dotnet run?

Yes, a plain dotnet run defaults to the Debug configuration.
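
If it is unclear which configuration a benchmark actually ran under, one option is to ask the running assembly whether JIT optimizations were disabled, which is what a Debug build normally does. A small sketch, where `BuildCheck` is a hypothetical helper name:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class BuildCheck
{
    // True when the assembly was compiled with JIT optimizations disabled,
    // as is normally the case for a Debug build. Returns false when the
    // attribute is absent.
    public static bool IsJitOptimizerDisabled(Assembly assembly) =>
        assembly.GetCustomAttribute<DebuggableAttribute>()?.IsJITOptimizerDisabled ?? false;
}
```

Printing `BuildCheck.IsJitOptimizerDisabled(Assembly.GetExecutingAssembly())` next to the nodes/s figure makes it harder to compare Debug and Release numbers by accident.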

Running in release mode should make a pretty big difference (it’s about 4x for me). The core engine still has a lot of room for optimization though, so you’re never going to get the kind of nps you might expect from more serious engines.

Thank you, running in release mode gives me a whopping ~3.5x speedup!