bood / go-test

Scenario based strength test tool for Go programs

Add OGS 12435854 huhoho 2d vs RoyalZero 9d 8c67ecdc

Mardak opened this issue

https://online-go.com/game/12435854

In this game, RZ's win rate drops from 60% after move 148. Analyzing move 150 with 32000 visits using the net that was played (8c67ecdc-120), the now-best net (5d6d9c5b-121), and the current best 256x20 net (1629cbfd-V5) suggests that the live eval's 48% win rate for playing K2 was too optimistic: all three nets put K2 at roughly 41-43%.

weight: 8c67ecdc, visit: 32000
  L2 ->   11132 (V: 43.14%) (N:  1.53%) PV: L2 L4 K5 H2 K2 P5 P4 P6 Q7 O3 N6 L6 J7 K6 O7 J5 B10 L12 C11 D11 G17 M11 H10 O12 S15 S16 T16 T17 T15 S18 F18
  Q1 ->    6024 (V: 49.93%) (N:  0.03%) PV: Q1 R2 P1 R1 L4 P7 O8 L2 M2 H2 K2 L3 K5 K4 K6 J5 L6 J7 L1 K1 J1 H1 M1 F1 B10
  K2 ->    4656 (V: 42.85%) (N: 10.96%) PV: K2 L3 Q1 R2 P1 S2 L4 L2 M2 H2 L1 H1 E1 K1 K4 J3 D1 A5 K5 J5
  H2 ->    3323 (V: 43.02%) (N:  3.22%) PV: H2 L4 L3 P6 P4 O8 P7 Q6 R6 Q7 R2 S2 R3 S3 O3 R1 P1 Q2 S15 S16
  L4 ->    2985 (V: 40.35%) (N: 64.84%) PV: L4 L2 M2 H2 K2 L3 L1 H1 E1 K1 K5 K4 K6 J5 L6 J7
  R2 ->    1495 (V: 43.06%) (N:  1.12%) PV: R2 S2 L4 L2 M2 H2 K2 L3 L1 H1 R3 S3 E1 K1 K4 J3

weight: 5d6d9c5b, visit: 32000
  L4 ->    7914 (V: 40.18%) (N: 64.82%) PV: L4 L2 R2 R3 M2 P5 P4 P6 R6 O3 S2 P1 T3 T4 N4 R1 S1 T2 Q1 Q2 O1 S3 K2
  K2 ->    6567 (V: 41.13%) (N: 10.26%) PV: K2 L3 Q1 R2 P1 S2 L4 L2 M2 H2 K5 K4 K6 J5 L6 J7 L1 K1 J1 H1 K1 F1 B10
  R2 ->    3778 (V: 41.31%) (N:  1.16%) PV: R2 S2 R3 S3 L4 L2 M2 H2 K5 K4 K2 L3 L6 K6 L1 P7 O8 R6 Q6 Q7 P5 P9 S7
  L2 ->    3741 (V: 41.29%) (N:  1.71%) PV: L2 L4 K5 P5 P4 P6 R6 O3 H2 J5 G17 F18 G16 H17 S15 S16 T16 T17 T15 S18 B10 C11 A6
  R3 ->    3531 (V: 41.34%) (N:  0.28%) PV: R3 S3 L4 L2 R2 S2 M2 H2 K2 L3 K5 J5 L1 K1 J1 H1 K6 J7 M1 F1 S15 S16
  H2 ->    3456 (V: 41.24%) (N:  2.83%) PV: H2 L4 K5 P7 O8 J5 K6 L6 J7 M5 N6 N4 P4 O3 R3 S3 R2 M2 S2 R6 Q6 Q7 S7 P5 T6 L8 S15 S16

weight: 1629cbfd, visit: 32000
  R2 ->    7510 (V: 42.65%) (N:  2.40%) PV: R2 S2 Q1 R3 P1 S1 L4 L2 M2 H2 K2 L3 L1 K1 K5 K4 J1 H1 K6 J5 L6 J7
  L4 ->    5970 (V: 41.24%) (N: 60.58%) PV: L4 L2 R2 S2 R3 S3 Q1 Q2 K2 L3 H2 M2 O3 N1 N4 O1 P4 R1 G17 F18 G16 H17 S15
  P1 ->    4986 (V: 42.69%) (N:  0.04%) PV: P1 Q1 L4 L2 M2 H2 K2 L3 K5 J5 L1 H1 E1 K1 J1 J3 M1 K1 G17
  H2 ->    4062 (V: 42.45%) (N:  6.85%) PV: H2 L4 L3 L6 J3 P6 P4 O3 N6 P5 Q7 J7 K5 K4 K2 J5 G17 F18 G16 H17 S15
  K2 ->    3912 (V: 42.36%) (N:  8.97%) PV: K2 L3 R2 S2 Q1 R3 P1 S1 L4 L2 M2 H2 L1 K1 K5 K4 J1 H1 K6 J5 L6 J7
  R3 ->    2884 (V: 42.68%) (N:  0.11%) PV: R3 S3 L4 L2 R2 S2 Q1 Q2 K2 L3 H2 M2 O3 N1 N4 O1 P4 R1 G17 F18 G16 H17 S15
…
  L2 ->      23 (V: 32.23%) (N:  1.68%) PV: L2 L4 K5 H2 K2 J5 G17

I'm not really sure what the right play is here. Or maybe there is no definitive right or wrong play at this point? Notably, the 256x20 net didn't think much of L2, the first network's top choice for move 150: it gave L2 only 23 visits at V: 32.23%.
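To compare candidates across dumps like the ones above, the lines can be parsed into records. This is only a rough sketch: it assumes `V` is the evaluated win rate and `N` is the policy prior (as in Leela Zero-style output), the names `Candidate` and `parse_candidates` are made up for illustration, and the PVs in the sample are truncated for brevity.

```python
import re
from typing import List, NamedTuple


class Candidate(NamedTuple):
    move: str
    visits: int
    value: float  # "V:" field, percent (assumed: evaluated win rate)
    prior: float  # "N:" field, percent (assumed: policy prior)
    pv: List[str]


# Matches lines such as "  L2 ->   11132 (V: 43.14%) (N:  1.53%) PV: L2 L4 K5"
LINE_RE = re.compile(
    r"^\s*(\w+)\s*->\s*(\d+)\s*\(V:\s*([\d.]+)%\)\s*\(N:\s*([\d.]+)%\)\s*PV:\s*(.*)$"
)


def parse_candidates(text: str) -> List[Candidate]:
    """Return one Candidate per analysis line; other lines are skipped."""
    out = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            out.append(
                Candidate(
                    move=m.group(1),
                    visits=int(m.group(2)),
                    value=float(m.group(3)),
                    prior=float(m.group(4)),
                    pv=m.group(5).split(),
                )
            )
    return out


# Move 150, weight 8c67ecdc, PVs truncated to three moves for brevity.
SAMPLE = """\
weight: 8c67ecdc, visit: 32000
  L2 ->   11132 (V: 43.14%) (N:  1.53%) PV: L2 L4 K5
  Q1 ->    6024 (V: 49.93%) (N:  0.03%) PV: Q1 R2 P1
  K2 ->    4656 (V: 42.85%) (N: 10.96%) PV: K2 L3 Q1
  L4 ->    2985 (V: 40.35%) (N: 64.84%) PV: L4 L2 M2
"""

cands = parse_candidates(SAMPLE)
most_visited = max(cands, key=lambda c: c.visits)
highest_value = max(cands, key=lambda c: c.value)
highest_prior = max(cands, key=lambda c: c.prior)

print("most visited: ", most_visited.move, most_visited.visits)
print("highest value:", highest_value.move, highest_value.value)
print("highest prior:", highest_prior.move, highest_prior.prior)
```

For this network the three criteria pick three different moves (L2, Q1, L4), which fits the impression that there is no clear-cut best play at move 150.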

As a sanity check, move 148 E5 is rated as generally good by all 3 networks, although not quite at the 60% win rate from the live game:

weight: 8c67ecdc, visit: 32000
 B10 ->   13554 (V: 51.08%) (N:  0.26%) PV: B10 L12 C11 D11 C12 G16 M11 E5 D12 E14 F15 G14 E12 D14 E11
  A6 ->    7745 (V: 50.79%) (N:  0.20%) PV: A6 H2 K3 E5 A4 C1 A2 L12 B10 M11 H10 C11 G17 F18 G16 H17 S15 O12 S17
  E5 ->    3387 (V: 50.49%) (N:  7.20%) PV: E5 K3 K2 L3 L4 L2 M2 H2 L1 H1 E1 K1 J1 J3 D1 A5
  F7 ->    2702 (V: 50.75%) (N:  0.80%) PV: F7 K3 L4 L2 M2 H2 K2 L3 K5 J5 L1 K1 J1 H1 E1 J3 D1 A5 S15
  L6 ->    2086 (V: 47.85%) (N: 43.65%) PV: L6 E5 G4 H2 K3 E7 A6 G7 A4 C1 A2 B3 A3 E1 B10 B9 C9 C10 B11

weight: 5d6d9c5b, visit: 32000
  A6 ->   15661 (V: 48.35%) (N:  0.26%) PV: A6 H2 K3 E5 A4 C1 F7 F6 B10 L12 A2 M11 H10 C11 G17 F18 G16 H17 S15 S16 T16 T17
 B10 ->    3480 (V: 48.33%) (N:  0.36%) PV: B10 L12 M11 H2 K3 E5 C11 D11 S15 S16 T16 T17 T15 S18 G17 F18 G16
  L6 ->    3126 (V: 46.37%) (N: 43.14%) PV: L6 E5 G4 H2 K3 E7 A6 G8 H8 G7 K6 J7 A4 C1 A2
  E5 ->    2988 (V: 48.02%) (N:  6.89%) PV: E5 K3 L4 L2 M2 H2 K2 L3 L1 H1 E1 K1 J1 J3 D1 A5 S15
  J7 ->    2463 (V: 48.18%) (N:  2.86%) PV: J7 E5 G4 H2 K3 E7 K4 K5 F7 D8 D9 C8 C9 B8 E8 B9
  F7 ->    1521 (V: 48.26%) (N:  0.75%) PV: F7 K3 R2 S2 L4 L2 M2 H2 K2 L3 L1 K1 R3 S3 J1 H1 E1 J3

weight: 1629cbfd, visit: 32000
  F7 ->   13345 (V: 52.09%) (N:  0.26%) PV: F7 L6 G17 F18 G16 H17 J7 K3 H2 P7 O8 R6 Q6 P5 N6 O7 Q7 P4 L2
 G17 ->    4557 (V: 52.07%) (N:  0.52%) PV: G17 L12 M11 H2 K3 E5 B10 C11 S15 S16 T16 T17 T15 S18 H17 H18 J17 J18 K17
 B10 ->    4011 (V: 52.08%) (N:  0.12%) PV: B10 L12 M11 H2 K3 E5 G17 M12 N13 N12 O13 C10 B9 B11 S15 S16 T16 T17 T15
  A6 ->    3738 (V: 52.08%) (N:  0.02%) PV: A6 L12 A4 C1 M11 H2 K3 E5 A2 M12 E1 A3 B3 B17 B18 A3 R2 S2 B3
  E5 ->    2792 (V: 51.97%) (N:  2.25%) PV: E5 K3 K2 L4 L3 P7 O8 P5 P4 R6 P6 Q6 P5 Q7 O7 J3 R2 R3 S2 R8 S8 S7 S9 T2

All 3 networks expect E5 K3 at move 148 (8c67ecdc and 1629cbfd continue with K2, while 5d6d9c5b continues with L4). Compare the win rates for E5 at move 148 (50.49%, 48.02%, 51.97%) with those for K2 at move 150 above, after black plays K3 (42.85%, 41.13%, 42.36%). Even when the expected variation is played, the win rate drops by roughly 7 to 10 percentage points for each network, so it seems like the value head is off… ?
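The arithmetic behind that comparison, as a small sketch using only the E5 (move 148) and K2 (move 150) values quoted from the dumps above; the variable names are just for illustration:

```python
# Win rate (percent) for E5 at move 148 and for K2 at move 150, per network.
before = {"8c67ecdc": 50.49, "5d6d9c5b": 48.02, "1629cbfd": 51.97}
after = {"8c67ecdc": 42.85, "5d6d9c5b": 41.13, "1629cbfd": 42.36}

# Drop in percentage points along the expected line E5 K3 K2.
drops = {net: round(before[net] - after[net], 2) for net in before}
mean_drop = sum(drops.values()) / len(drops)

for net, drop in drops.items():
    print(f"{net}: {before[net]:.2f}% -> {after[net]:.2f}% (drop {drop:.2f} points)")
print(f"mean drop: {mean_drop:.2f} points")
```

The individual drops come out at 7.64, 6.89 and 9.61 points, so the average is about 8 points.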

Still need to figure out why there is a sudden drop. There must be a blind spot involved somewhere. Perhaps it thought the black group could be killed when it actually can't be?

No idea how to test this right now.