Some tests of "Hybrid" weights on Leela's 5x64 network

pangafu opened this issue 6 years ago · comments

Since we discussed "Hybrid weights" in #814 earlier, I ran some tests on the 5x64 network and got some interesting results, so I am opening a new thread to post them here.

I remember that the learning rate was very low near the end of the 5B network's training, so I downloaded several 5B networks: the last 5B king c83, the second king 35d, and some random weights from after c83 (because I don't have the win rates of the 5B networks). The network list is below:
c83e1b6e0ffbf8e684f2d8f6261853f14c553b29ee0e70ff6c34e87d28009c43
35df1f93351d7edea1f1251dbcff6131a18dc9b9c25d634558d747a55e6920e4
87973d8fe2db18599d136c47f1f54634a22fea1876f2283a8330fde13b5bf1aa
74ca7a1b11a7841a7f195f2b87dd55fa80dc82fb56887bad9c69e32f44717b4d
6fd1a91be8ed13b8cfc4886a51e77a2738705261c6ede9a4f117d8c4073faec8
5b2b40ea018492b26da33966f78abfe0caa35f98167c26a743340cc7e7232204

Then I wrote a match program that automatically "hybridizes" weights and plays matches (https://github.com/pangafu/Hybrid_LeelaZero). At 200 playouts, the strongest weight I found was 879-6fd-c83_1-0.8-0.8.txt. The match results are:
879-6fd-c83_1-0.8-0.8.txt vs 35df1f93351d7edea1f1251dbcff6131a18dc9b9c25d634558d747a55e6920e4.txt :27(67.5%) : 13(32.5%)
879-6fd-c83_1-0.8-0.8.txt vs c83e1b6e0ffbf8e684f2d8f6261853f14c553b29ee0e70ff6c34e87d28009c43.txt :23(57.5%) : 17(42.5%)
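
For readers who have not followed #814, the "hybrid" step can be pictured as an element-wise weighted average of the weight files. The sketch below is only an illustration and is not taken from the Hybrid_LeelaZero repository. It assumes a text weight file whose first line is a version number and whose later lines are space-separated floats, that all networks share the same architecture, and that the `1-0.8-0.8` in the file name are mixing coefficients normalized by their sum. The function name `blend_weights` is hypothetical.

```python
def blend_weights(networks, coeffs):
    """Element-wise weighted average of several weight files.

    networks: list of weight files, each given as a list of text lines
              (assumed layout: version line, then rows of floats).
    coeffs:   one mixing coefficient per network; dividing by their sum
              is an assumption, not a documented behaviour of the tool.
    """
    if len(networks) != len(coeffs):
        raise ValueError("need exactly one coefficient per network")
    if len({len(net) for net in networks}) != 1:
        raise ValueError("networks must have the same number of lines")

    total = sum(coeffs)
    blended = [networks[0][0].strip()]  # keep the version line unchanged
    for rows in zip(*(net[1:] for net in networks)):
        parsed = [[float(v) for v in row.split()] for row in rows]
        if len({len(p) for p in parsed}) != 1:
            raise ValueError("rows differ in length: architectures do not match")
        mixed = [
            sum(c * vals[i] for c, vals in zip(coeffs, parsed)) / total
            for i in range(len(parsed[0]))
        ]
        blended.append(" ".join(f"{v:.8g}" for v in mixed))
    return blended


# Toy example with tiny stand-in "networks" instead of real weight files.
net_a = ["1", "1.0 2.0", "0.5"]
net_b = ["1", "3.0 4.0", "1.5"]
hybrid = blend_weights([net_a, net_b], [1.0, 1.0])
```

With real files, each network would be read with `open(path).read().splitlines()` and the result written back with `"\n".join(hybrid)`.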

Then I ran the match at 1,600 playouts. The results are:
879-6fd-c83_1-0.8-0.8.txt vs 35df1f93351d7edea1f1251dbcff6131a18dc9b9c25d634558d747a55e6920e4.txt :42(70.0%) : 18(30.0%)
879-6fd-c83_1-0.8-0.8.txt vs c83e1b6e0ffbf8e684f2d8f6261853f14c553b29ee0e70ff6c34e87d28009c43.txt :39(65.0%) : 21(35.0%)

Finally, I ran the match at 16,000 playouts. The result is:
879-6fd-c83_1-0.8-0.8.txt vs c83e1b6e0ffbf8e684f2d8f6261853f14c553b29ee0e70ff6c34e87d28009c43.txt :39(65.0%) : 21(35.0%)
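
Because the matches are short (40 or 60 games), it helps to check how likely each score would be if both networks were equally strong. The sketch below computes a one-sided binomial tail probability. It is an added sanity check, not part of the match program, and it assumes independent games with no draws. The function name `one_sided_p_value` is hypothetical.

```python
from math import comb


def one_sided_p_value(wins, games):
    """Probability of at least `wins` wins in `games` games between
    two equally strong players (each game a fair coin flip)."""
    favourable = sum(comb(games, k) for k in range(wins, games + 1))
    return favourable / 2 ** games


# Scores of the hybrid weight reported above.
results = {
    "200 playouts vs 35d": (27, 40),
    "200 playouts vs c83": (23, 40),
    "1,600 playouts vs 35d": (42, 60),
    "1,600 playouts vs c83": (39, 60),
    "16,000 playouts vs c83": (39, 60),
}
for label, (wins, games) in results.items():
    print(f"{label}: {wins}/{games}, p = {one_sided_p_value(wins, games):.4f}")
```

A small value means the score would be unlikely between equal networks. By this measure, the 23:17 result at 200 playouts is weak evidence on its own, while the 60-game results are considerably harder to explain by chance.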

The complete match log can be downloaded here:
match.log

The 879-6fd-c83_1-0.8-0.8 weight can be downloaded here:
879-6fd-c83_1-0.8-0.8.zip

The 16,000-playout match can be downloaded here:
H879vsC83_PO16000.zip

Some interesting results:

"Hybrid" can strengthen the network even at a very low learning rate, so the gain may not be a learning rate issue.
"Hybrid" also wins at high playouts, so the gain may not come from reducing network noise.
A three-network "Hybrid" seems stronger than a two-network "Hybrid".
In my opinion, "Hybrid" may be equivalent to ensembling several networks and averaging their outputs, which makes the network's predictions more accurate.
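
The comparison between averaging weights and averaging outputs can be made concrete with a toy example. For a purely linear unit the two are exactly the same. Once a nonlinearity such as ReLU is applied they can differ, so for a real network the equivalence would hold only approximately, presumably better when the averaged networks are close to each other, as consecutive low-learning-rate snapshots are. The code below is a sketch with made-up numbers, not a model of the actual 5x64 network.

```python
def dot(weights, inputs):
    """Output of a single linear unit."""
    return sum(w * x for w, x in zip(weights, inputs))


def relu(value):
    return max(0.0, value)


x = [1.0, 1.0]        # made-up input
w_a = [2.0, -1.0]     # made-up weights of "network A"
w_b = [-3.0, 1.0]     # made-up weights of "network B"
w_avg = [(a + b) / 2 for a, b in zip(w_a, w_b)]

# Linear unit: averaging weights equals averaging outputs.
linear_from_avg_weights = dot(w_avg, x)
linear_avg_of_outputs = (dot(w_a, x) + dot(w_b, x)) / 2

# With ReLU the two quantities are no longer guaranteed to match.
relu_from_avg_weights = relu(dot(w_avg, x))
relu_avg_of_outputs = (relu(dot(w_a, x)) + relu(dot(w_b, x))) / 2

print(linear_from_avg_weights, linear_avg_of_outputs)
print(relu_from_avg_weights, relu_avg_of_outputs)
```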