Read SPSA from File #1926
I made this a draft btw since it requires the equivalent to be merged in Stockfish at somewhat the same time.
basically saying that these are likely just random walks which happen to hit a local optimum. Maybe @vdbergh would like to voice his opinion on this or something :)
I doubt SPSA tunes are computationally expensive unless the parameter count is pushed ridiculously high; it's one of the cheapest tuning algorithms. The success rate cannot be measured until we try it, and it's generally not in the spirit of the project to reject trying something on theoretical grounds; otherwise 80% of fishtest patches would not be approved. The theoretical perspective also varies. I believe the most important factor for an SPSA tune is the Elo across the parameter interval. At least for the 3072 L1 biases (which would already exceed the cutechess limit), there are no issues here. I believe large-scale SPSA tunes can work on the principle of parameter continuity, i.e. each parameter converging on average towards its optimum, even if it doesn't fully converge. I would also argue this change is good even without allowing for large SPSA tunes, since it reduces cutechess and OS dependence. There would be no change in behaviour if a transition to fastchess occurs, or when switching from Linux to Windows.
This would be all nice if this wasn't the case: |
Another one is about to pass (Jinx) |
I think it's about many variables approaching a decent value, which suggests that the parameters are continuous: most parameters getting even halfway to their optimal value is sufficient for the tests to pass. One might think that measuring two quantized points is different from trying to optimize, but when we run SPSA the assumption is that the values are not quantized, or else the whole theory falls apart. If we are at x1 and the optimum is c1, then we would expect (x1 + c1) / 2 to be better than x1. If that is not the case collectively, there is no point in SPSA. That would suggest that the optimal value for a single param, if tuned alone, can be in a superposition of optimal values [7, 1, -9], at which point we shouldn't use SPSA at all.
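To make the continuity argument concrete, here is a minimal toy SPSA loop. It is only a sketch under loud assumptions: the objective is a smooth, noise-free quadratic, the gains `a` and `c` are fixed, and none of this is fishtest's actual implementation (which works from noisy game results). The names `spsa_minimize` and `make_quadratic` are made up for illustration.

```python
import random

def spsa_minimize(loss, theta, a=0.1, c=0.1, iterations=200, seed=0):
    """Toy SPSA: perturb all parameters at once, two loss evaluations per step."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        # Rademacher perturbation: every parameter moves by +c or -c simultaneously.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate for parameter i is diff / (2 * c * delta_i).
        theta = [t - a * diff / (2.0 * c * d) for t, d in zip(theta, delta)]
    return theta

def make_quadratic(optimum):
    # Assumption: a smooth objective with a single optimum per parameter.
    return lambda x: sum((xi - oi) ** 2 for xi, oi in zip(x, optimum))

optimum = [4.0, -2.0, 9.0]   # hypothetical optimum, chosen for illustration
start = [0.0, 0.0, 0.0]
loss = make_quadratic(optimum)

tuned = spsa_minimize(loss, start)
midpoint = [(s + o) / 2 for s, o in zip(start, optimum)]

print("start   :", loss(start))
print("midpoint:", loss(midpoint))
print("tuned   :", loss(tuned))
```

On a continuous objective like this one, the halfway point is already better than the start, which is the property the comment above says SPSA relies on. A quantized or multi-modal objective would not give that guarantee.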
Lc0 has already started experiments with large scale SPSA tunes. zz4032 of the leela discord writes:
Note this is the person that does all SPRT testing on leela side, so the testing setup is likely to be correct also.
They run into no issues tuning |
This PR would be extremely beneficial IMO. The ability to read tune results from a file is something that has been lacking for quite a long time, and it has no downsides. It also doesn't make sense to reject the patch based on the theoretical assumption that all large-scale tunes are inherently "random walks". At least in my search tuning attempts with 60-70 parameters, SPSA works perfectly for my needs, and the observed success rate is way higher than 50%. Viren's passed L1-3072 net SPSA had even more parameters. I believe we shouldn't deter people from trying such large-scale tunes on an unproven theoretical basis alone; as other devs have said, this is against the spirit of the project.
Created by setting output weights (256) and biases (8) of the previous main net nn-ddcfb9224cdb.nnue to values found around 12k / 120k spsa games at 120+1.2

This used modified fishtest dev workers to construct .nnue files from spsa params, then load them with EvalFile when running tests:
https://github.com/linrock/fishtest/tree/spsa-file-modified-nnue/worker

Inspired by researching loading spsa params from files:
official-stockfish/fishtest#1926

Scripts for modifying nnue files and preparing params:
https://github.com/linrock/nnue-pytorch/tree/no-gpu-modify-nnue

spsa params:
weights: [-127, 127], c_end = 6
biases: [-8192, 8192], c_end = 64

Example of reading output weights and biases from the previous main net using nnue-pytorch and printing spsa params in a format compatible with fishtest:

```
import features
from serialize import NNUEReader

feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-ddcfb9224cdb.nnue", "rb") as f:
    model = NNUEReader(f, feature_set).model

c_end_weights = 6
c_end_biases = 64

for i in range(8):
    for j in range(32):
        value = round(int(model.layer_stacks.output.weight[i, j] * 600 * 16) / 127)
        print(f"oW[{i}][{j}],{value},-127,127,{c_end_weights},0.0020")

for i in range(8):
    value = int(model.layer_stacks.output.bias[i] * 600 * 16)
    print(f"oB[{i}],{value},-8192,8192,{c_end_biases},0.0020")
```

For more info on spsa tuning params in nets:
official-stockfish#5149
official-stockfish#5254

Passed STC:
https://tests.stockfishchess.org/tests/view/66894d64e59d990b103f8a37
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 32000 W: 8443 L: 8137 D: 15420
Ptnml(0-2): 80, 3627, 8309, 3875, 109

Passed LTC:
https://tests.stockfishchess.org/tests/view/6689668ce59d990b103f8b8b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 172176 W: 43822 L: 43225 D: 85129
Ptnml(0-2): 97, 18821, 47633, 19462, 75

bench 993416
Created by setting output weights (256) and biases (8) of the previous main net nn-ddcfb9224cdb.nnue to values found around 12k / 120k spsa games at 120+1.2

This used modified fishtest dev workers to construct .nnue files from spsa params, then load them with EvalFile when running tests:
https://github.com/linrock/fishtest/tree/spsa-file-modified-nnue/worker

Inspired by researching loading spsa params from files:
official-stockfish/fishtest#1926

Scripts for modifying nnue files and preparing params:
https://github.com/linrock/nnue-pytorch/tree/no-gpu-modify-nnue

spsa params:
weights: [-127, 127], c_end = 6
biases: [-8192, 8192], c_end = 64

Example of reading output weights and biases from the previous main net using nnue-pytorch and printing spsa params in a format compatible with fishtest:

```
import features
from serialize import NNUEReader

feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-ddcfb9224cdb.nnue", "rb") as f:
    model = NNUEReader(f, feature_set).model

c_end_weights = 6
c_end_biases = 64

for i in range(8):
    for j in range(32):
        value = round(int(model.layer_stacks.output.weight[i, j] * 600 * 16) / 127)
        print(f"oW[{i}][{j}],{value},-127,127,{c_end_weights},0.0020")

for i in range(8):
    value = int(model.layer_stacks.output.bias[i] * 600 * 16)
    print(f"oB[{i}],{value},-8192,8192,{c_end_biases},0.0020")
```

For more info on spsa tuning params in nets:
official-stockfish#5149
official-stockfish#5254

Passed STC:
https://tests.stockfishchess.org/tests/view/66894d64e59d990b103f8a37
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 32000 W: 8443 L: 8137 D: 15420
Ptnml(0-2): 80, 3627, 8309, 3875, 109

Passed LTC:
https://tests.stockfishchess.org/tests/view/6689668ce59d990b103f8b8b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 172176 W: 43822 L: 43225 D: 85129
Ptnml(0-2): 97, 18821, 47633, 19462, 75

closes official-stockfish#5459

bench 1120091
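The script in the commit message prints one comma-separated line per parameter, e.g. `oB[0],<value>,-8192,8192,64,0.0020`. As a rough sketch of how such lines could be read back, assuming the six fields are name, start value, min, max, c_end and a final rate field (called `r_end` here as an assumption; the commit message does not name it), something like the following would do. `SpsaParam` and `parse_spsa_line` are hypothetical helpers, and the sample values are invented.

```python
from dataclasses import dataclass

@dataclass
class SpsaParam:
    name: str
    value: float
    minimum: float
    maximum: float
    c_end: float
    r_end: float  # assumption: the trailing 0.0020 field

def parse_spsa_line(line):
    # Split from the right so the name is whatever precedes the last five fields.
    name, value, minimum, maximum, c_end, r_end = line.strip().rsplit(",", 5)
    param = SpsaParam(name, float(value), float(minimum),
                      float(maximum), float(c_end), float(r_end))
    # Reject start values that fall outside the declared interval.
    if not param.minimum <= param.value <= param.maximum:
        raise ValueError(f"{name}: start value {value} outside [{minimum}, {maximum}]")
    return param

# Invented sample values in the same shape as the script's output.
lines = [
    "oW[0][0],-12,-127,127,6,0.0020",
    "oB[0],1500,-8192,8192,64,0.0020",
]
params = [parse_spsa_line(line) for line in lines]
for p in params:
    print(p.name, p.value, p.c_end)
```

Validating the interval at parse time is a design choice of this sketch: with hundreds or thousands of generated lines, a bad start value is easier to catch before the tune is submitted.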
So any progress on this one? |
This worked for ~1k params when I tried it. However, I started getting errors at around 2k params of the same type (L2 weights): something about connections stalling. |
Connections stalling means the engine is not responding quickly enough to the game manager, i.e. a timeout. I'm assuming SF might not scale linearly with the number of options/parameters it is given? |
I think that's 'just' the prefactor (obviously we should fix it), and something scales quadratically in the number of options. |
This is with the fast tolower function.
perfect quadratic fit. |
Sometimes it's good if we also read the code: https://github.com/official-stockfish/Stockfish/blob/master/src/tune.h#L70 As an explanation, when we set an option its on_tune function is called, which again calls Since I think fishtest always sends all options (to be confirmed) you can just add |
So the PR is just fine, and the changes are to be made on the SF side? |
This is to allow for large tunes; it skips the dependence on cutechess and Windows command-line restrictions for a large number of params. Should work with https://github.com/official-stockfish/Stockfish/compare/master...Disservin:Stockfish:read-tune-from-file?expand=1
Well there is no real timeout issue since the tuner just needs to enable the update on last or we change the default in stockfish. |
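The quadratic scaling discussed above can be reproduced with a toy model. This is a sketch, not Stockfish code: `ToyTuner`, `read_options` and `set_option` are hypothetical names, and the only assumption taken from the thread is that setting one option triggers a re-read of all options unless an "update on last" switch defers the re-read to the final option.

```python
class ToyTuner:
    def __init__(self, names, update_on_last):
        self.names = list(names)
        self.update_on_last = update_on_last
        self.reads = 0  # counts individual option reads

    def read_options(self):
        # A re-read touches every registered option once.
        self.reads += len(self.names)

    def set_option(self, name):
        # Without the switch, every single set triggers a full re-read;
        # with it, only the last registered option does.
        if not self.update_on_last or name == self.names[-1]:
            self.read_options()

def total_reads(n, update_on_last):
    tuner = ToyTuner([f"p{i}" for i in range(n)], update_on_last)
    for name in tuner.names:
        tuner.set_option(name)
    return tuner.reads

for n in (1000, 2000):
    print(n, total_reads(n, update_on_last=False), total_reads(n, update_on_last=True))
```

Doubling the parameter count from 1000 to 2000 quadruples the work in the first column and only doubles it in the second, which is consistent with the "works at ~1k, stalls at ~2k" observation and with the quadratic fit.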
This is to allow for large tunes; it skips the dependence on cutechess and Windows command-line restrictions for a large number of params.
Needs https://github.com/official-stockfish/Stockfish/compare/master...Disservin:Stockfish:read-tune-from-file?expand=1
fixes #1792
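To get a feel for the command-line restriction mentioned in the description, here is a back-of-the-envelope estimate. It assumes parameters are passed as cutechess-cli style `option.NAME=VALUE` arguments, uses invented names `b[0]` to `b[3071]` for the 3072 L1 biases mentioned earlier in the thread, and compares against the documented Windows limits of 8191 characters for cmd.exe and 32767 for `CreateProcess`. The helper names are hypothetical.

```python
CMD_EXE_LIMIT = 8191           # cmd.exe command-line length limit
CREATE_PROCESS_LIMIT = 32767   # CreateProcess lpCommandLine length limit

def option_args(params):
    # One argument per tuned parameter, in option.NAME=VALUE form.
    return [f"option.{name}={value}" for name, value in params.items()]

def command_line_length(args):
    # Arguments are assumed to be joined by single spaces.
    return len(" ".join(args))

# Invented parameter names; the values are placeholders.
params = {f"b[{i}]": 0 for i in range(3072)}
length = command_line_length(option_args(params))

print("characters needed:", length)
print("fits cmd.exe:", length <= CMD_EXE_LIMIT)
print("fits CreateProcess:", length <= CREATE_PROCESS_LIMIT)
```

Even with single-digit values and short names, the options alone overshoot both limits, before the engine path and the rest of the match settings are counted. Reading the values from a file sidesteps the limit entirely.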