Read SPSA from File #1926

Closed

@peregrineshahin
Contributor

peregrineshahin commented Apr 5, 2024

This allows for large tunes and avoids the cutechess and Windows command-line restrictions on large numbers of params.

Needs https://github.com/official-stockfish/Stockfish/compare/master...Disservin:Stockfish:read-tune-from-file?expand=1

fixes #1792

Disservin marked this pull request as draft April 6, 2024 17:54

@Disservin
Member

Disservin commented Apr 6, 2024

I made this a draft btw, since it requires the equivalent to be merged in Stockfish at roughly the same time.
@vondele had some concerns about how computationally expensive large tunes are and about their success rate when tuning many parameters:

"btw, I personally think this is not a good feature. Even with the current limitation, parameter tunes with a lot of params are taxing for fishtest.
I have also not seen convincing evidence that parameter tunes with 10-100s of parameters really work." (from discord)

Basically saying that these are likely just random walks which happen to hit a local optimum. Maybe @vdbergh would like to voice his opinion on this :)

@Viren6
Contributor

Viren6 commented Apr 7, 2024

I doubt SPSA tunes are computationally expensive unless parameter counts are pushed ridiculously high; it's one of the cheapest tuning algorithms. The success rate cannot be measured until we try it. Generally, it's not in the spirit of the project to reject trying something on theoretical grounds; otherwise 80% of fishtest patches would not be approved.

The theoretical perspective also varies. I believe the most important factor for an SPSA tune is the Elo across the parameter interval. At least for the 3072 L1 biases (which would already exceed the cutechess limit), there are no issues here. I believe large-scale SPSA tunes can work on the principle of parameter continuity, i.e. by each parameter converging on average towards its optimum, even if it doesn't fully converge.

I would also argue this change is good even without allowing for large SPSA tunes, since it reduces cutechess and OS dependence. There will be no change in behaviour if a transition to fastchess occurs, or when switching from Linux to Windows.

@Vizvezdenec

This would be all nice if this wasn't the case:
https://github.com/official-stockfish/Stockfish/commits?author=XInTheDark
Sorry, I fail to believe in the slightest that these are "random walks", since it has an effectiveness rate of > 50%.

@peregrineshahin
Contributor Author

This would be all nice if this wasn't the case: https://github.com/official-stockfish/Stockfish/commits?author=XInTheDark Sorry, I fail to believe in the slightest that these are "random walks", since it has an effectiveness rate of > 50%.

Another one is about to pass (Jinx)
https://tests.stockfishchess.org/tests/view/66115eacbfeb43334bf7eddd

@peregrineshahin
Contributor Author

peregrineshahin commented Apr 8, 2024

I think it's about many variables approaching a decent value, which suggests that the parameters are continuous, meaning that most parameters getting even halfway to their optimal value is sufficient for the tests to pass.

One might think that comparing two quantized points is different from trying to optimize, because when we do SPSA the assumption is that the values are not quantized, or else the whole theory falls apart: if we are at x1 and the optimum is c1, then you would expect (x1 + c1) / 2 to be better than x1. If that is not the case collectively, then there is no point in SPSA.

That would suggest that the optimal value for a single param, if tuned alone, can be in a superposition of optimal values [7, 1, -9], at which point we shouldn't try using SPSA at all.

@Viren6
Contributor

Viren6 commented Apr 8, 2024

Lc0 has already started experiments with large scale SPSA tunes. zz4032 of the leela discord writes:

I've been testing around with my own SPSA-like tuning on the old and small 744204 network. Specifically the policy values. Got +15 Elo which is not bad for a saturated net at the end of its run. ( Selfplay at 800 nodes.)

Note this is the person who does all the SPRT testing on the Leela side, so the testing setup is likely to be correct as well.
They are even using an inefficient SPSA-like method, presumably because it's easier to implement:

Not exactly SPSA:
Script takes the 147456 policy parameters
Assigns value+delta or value-delta (sign is random) to each parameter
Runs an iteration of X games
Saves match Elo for each parameter (and whether it was value+delta or value-delta)
after Y iterations network is finally saved with parameter value + or - delta based on which has higher Elo on average in all iterations
Basically running matches with all parameters randomly shifted and finding what was best on average for each parameter.
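
The scheme quoted above can be sketched roughly as follows. This is a minimal illustration, not zz4032's script: the function name `sign_averaging_tune`, the `toy_score` objective (a noise-free stand-in for "match Elo over X games") and the `targets` list are all assumptions made up for the example.

```python
import random


def sign_averaging_tune(values, delta, score, iterations, rng):
    """Shift every parameter by +delta or -delta at random each iteration,
    record the score under the sign each parameter had, and finally keep
    the sign that scored better on average for that parameter."""
    n = len(values)
    # Per parameter: [sum of scores with -delta, sum of scores with +delta].
    sums = [[0.0, 0.0] for _ in range(n)]
    counts = [[0, 0] for _ in range(n)]
    for _ in range(iterations):
        signs = [rng.randrange(2) for _ in range(n)]  # 0 means -delta, 1 means +delta
        trial = [v + (delta if s else -delta) for v, s in zip(values, signs)]
        result = score(trial)  # hypothetical stand-in for a match result
        for i, s in enumerate(signs):
            sums[i][s] += result
            counts[i][s] += 1
    tuned = []
    for i, v in enumerate(values):
        means = [
            sums[i][s] / counts[i][s] if counts[i][s] else float("-inf")
            for s in (0, 1)
        ]
        tuned.append(v + delta if means[1] >= means[0] else v - delta)
    return tuned


# Toy objective (assumption): the score is higher the closer each
# parameter is to a hidden target value.
targets = [3, -3, 3, -3, 3, -3, 3, -3]


def toy_score(params):
    return -sum((p - t) ** 2 for p, t in zip(params, targets))


tuned = sign_averaging_tune([0] * 8, 1, toy_score, 2000, random.Random(0))
print(tuned)
```

Each parameter only ever moves by one `delta` per run, which is one reason the quoted description calls it "not exactly SPSA"; a real run would replace `toy_score` with noisy game results.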

They run into no issues tuning 147456 parameters all at once. This should be a SF original invention, it would be unfortunate if we lost that due to delays in merging this. Other devs even from other projects can clearly see the potential.

@XInTheDark

This PR would be extremely beneficial IMO. Ability to read tune results from file is something that has been lacking for quite a long time, with no downsides.

It also doesn't make sense to reject the patch based on the theoretical assumption that all large-scale tunes are inherently "random walks". At least in my search tuning attempts with 60-70 parameters, SPSA works perfectly for my needs, and the observed success rate is way higher than 50%. Viren's passed L1-3072 net SPSA had even more parameters. I believe we shouldn't deter people from trying such large-scale tunes on an unproven theoretical basis alone; as other devs have said, this is against the spirit of the project.

linrock added a commit to linrock/Stockfish that referenced this pull request Jul 6, 2024
Created by setting output weights (256) and biases (8) of the previous main net
nn-ddcfb9224cdb.nnue to values found around 12k / 120k spsa games at 120+1.2

This used modified fishtest dev workers to construct .nnue files from
spsa params, then load them with EvalFile when running tests:
https://github.com/linrock/fishtest/tree/spsa-file-modified-nnue/worker

Inspired by researching loading spsa params from files:
official-stockfish/fishtest#1926

Scripts for modifying nnue files and preparing params:
https://github.com/linrock/nnue-pytorch/tree/no-gpu-modify-nnue

spsa params:
  weights: [-127, 127], c_end = 6
  biases: [-8192, 8192], c_end = 64

Example of reading output weights and biases from the previous main net using
nnue-pytorch and printing spsa params in a format compatible with fishtest:

```
import features
from serialize import NNUEReader

feature_set = features.get_feature_set_from_name("HalfKAv2_hm")
with open("nn-ddcfb9224cdb.nnue", "rb") as f:
    model = NNUEReader(f, feature_set).model

c_end_weights = 6
c_end_biases = 64

for i in range(8):
    for j in range(32):
        value = round(int(model.layer_stacks.output.weight[i, j] * 600 * 16) / 127)
        print(f"oW[{i}][{j}],{value},-127,127,{c_end_weights},0.0020")

for i in range(8):
    value = int(model.layer_stacks.output.bias[i] * 600 * 16)
    print(f"oB[{i}],{value},-8192,8192,{c_end_biases},0.0020")
```
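
Going the other way, a parameter file made of lines like the ones printed above could be read back with something like the sketch below. The field names (`start`, `minimum`, `maximum`, `c_end`, `r_end`) are an assumed reading of the six comma-separated columns, and the sample values `12` and `-300` are made up for illustration; this is not the parser used by fishtest or Stockfish.

```python
from dataclasses import dataclass


@dataclass
class SpsaParam:
    name: str
    start: float
    minimum: float
    maximum: float
    c_end: float
    r_end: float


def parse_spsa_params(text):
    """Parse lines of the form name,start,min,max,c_end,r_end (assumed layout)."""
    params = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 6:
            raise ValueError(f"line {line_no}: expected 6 fields, got {len(fields)}")
        start, minimum, maximum, c_end, r_end = (float(f) for f in fields[1:])
        if not minimum <= start <= maximum:
            raise ValueError(f"line {line_no}: start value outside [min, max]")
        params.append(SpsaParam(fields[0], start, minimum, maximum, c_end, r_end))
    return params


# Sample lines in the same shape as the script output; the values are invented.
sample = "oW[0][0],12,-127,127,6,0.0020\noB[0],-300,-8192,8192,64,0.0020\n"
params = parse_spsa_params(sample)
for p in params:
    print(p.name, p.start, p.c_end)
```

Reading from a file rather than a command line means the number of lines is not bounded by argument-length limits, which is the motivation given in the PR description.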

For more info on spsa tuning params in nets:
official-stockfish#5149
official-stockfish#5254

Passed STC:
https://tests.stockfishchess.org/tests/view/66894d64e59d990b103f8a37
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 32000 W: 8443 L: 8137 D: 15420
Ptnml(0-2): 80, 3627, 8309, 3875, 109

Passed LTC:
https://tests.stockfishchess.org/tests/view/6689668ce59d990b103f8b8b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 172176 W: 43822 L: 43225 D: 85129
Ptnml(0-2): 97, 18821, 47633, 19462, 75

bench 993416
vondele pushed a commit to vondele/Stockfish that referenced this pull request Jul 9, 2024

closes official-stockfish#5459

bench 1120091
yl25946 pushed a commit to yl25946/Stockfish that referenced this pull request Jul 9, 2024
@Vizvezdenec

So any progress on this one?
It worked even for leela 140000 params nets...

@linrock
Contributor

linrock commented Jul 19, 2024

This worked for ~1k params when I tried it. However, I started getting errors at around 2k params of the same type (L2 weights). Something about connections stalling.

@vondele
Member

vondele commented Jul 19, 2024

Connections stalling means the engine is not responding quickly enough to the game manager (a timeout).

I'm assuming SF might not scale linearly with the number of options / parameters it is given?

@Disservin
Member

Disservin commented Sep 7, 2024

I'm assuming SF might not scale linearly with the number of options / parameters it is given?

Okay stockfish scales terribly.

Setting 9k options:

  1. Master

Time taken: 84.40199494361877s

Okay, tolower seems to eat cycles; for now let's get rid of the CaseInsensitiveLess for the map.

  2. Removed CaseInsensitiveLess: already a lot faster.

Time taken: 16.95056438446045s

Better, but still quite slow.

  3. Change map to unordered_map

Time taken: 4.023869276046753s

Acceptable timing, I guess.


A tolower function which looks like this already gets it down to

Time taken: 18.308947801589966s

char fast_tolower(char c) {
    if (c >= 'A' && c <= 'Z')
        return c + ('a' - 'A');
    return c;
}

Probably we can also choose a different implementation for the std::map?

Note: This might be obvious, but I currently don't know the reason: Setting the first 2077 options was almost instantaneous across all test cases, but beyond that, it became very slow.

@vondele
Member

vondele commented Sep 8, 2024

I think that's 'just' the prefactor (obviously we should fix it), and something scales quadratically in the number of options.

@Disservin
Member

Disservin commented Sep 8, 2024

I think that's 'just' the prefactor (obviously we should fix it), and something scales quadratically in the number of options.

This is with the fast tolower function.


1000, 0.00825762748718261
2000, 0.0150268077850341
2500, 0.3237342834472656
3000, 0.7774295806884766
4000, 2.26151657104492
5000, 4.2387611865997314
6000, 6.699602127075195
7000, 10.013187408447266
8000, 13.900388479232788
16000, 70.52726817131042

@vondele
Member

vondele commented Sep 8, 2024

perfect quadratic fit.
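
One quick way to eyeball the growth rate from the posted timings is the local log-log slope between consecutive samples: a slope near 1 means roughly linear growth, and a slope of 2 or more means at least quadratic growth over that interval. The sketch below only re-uses the ten (options, seconds) pairs posted above; the helper name `local_exponents` is made up, and this is an illustration rather than a proper fit.

```python
import math

# (number of options, seconds) as posted in the thread.
timings = [
    (1000, 0.00825762748718261),
    (2000, 0.0150268077850341),
    (2500, 0.3237342834472656),
    (3000, 0.7774295806884766),
    (4000, 2.26151657104492),
    (5000, 4.2387611865997314),
    (6000, 6.699602127075195),
    (7000, 10.013187408447266),
    (8000, 13.900388479232788),
    (16000, 70.52726817131042),
]


def local_exponents(points):
    """Return (n0, n1, slope) for each consecutive pair, where slope is
    log(t1 / t0) / log(n1 / n0), i.e. the exponent k if t grew like n**k."""
    result = []
    for (n0, t0), (n1, t1) in zip(points, points[1:]):
        result.append((n0, n1, math.log(t1 / t0) / math.log(n1 / n0)))
    return result


exponents = local_exponents(timings)
for n0, n1, slope in exponents:
    print(f"{n0:>6} -> {n1:>6}: local exponent {slope:.2f}")
```

Doubling from 8000 to 16000 options multiplies the time by roughly five, so the slope on that interval sits a little above 2, while the first interval (1000 to 2000) stays below 1. That matches the earlier observation that roughly the first 2000 options are set almost instantly before the slow behaviour begins.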

@Disservin
Member

Disservin commented Sep 8, 2024

Sometimes it's good if we also read the code.. https://github.com/official-stockfish/Stockfish/blob/master/src/tune.h#L70

As an explanation: when we set an option, its on_tune function is called, which in turn calls Tune::read_options, which loops over all tune variables again.

Since I think fishtest always sends all options (to be confirmed), you can just add UPDATE_ON_LAST(); and it will only update once the last option is set.
16k options take 0.15139484405517578s with that.
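
To make the cost difference concrete, here is a toy model of the behaviour described above. It is not Stockfish code: `TuneModel`, `set_option` and the `reads` counter are hypothetical names, and only the idea (every option set triggers a full re-read, unless updating is deferred to the last option) is taken from the explanation.

```python
class TuneModel:
    """Toy model: counts how many per-variable reads happen while setting options."""

    def __init__(self, names, update_on_last=False):
        self.names = list(names)
        self.values = {name: 0 for name in self.names}
        self.update_on_last = update_on_last
        self.reads = 0  # total number of variables visited by read_options

    def read_options(self):
        # Loops over all tune variables, as the explanation above describes.
        for name in self.names:
            _ = self.values[name]
            self.reads += 1

    def set_option(self, name, value):
        self.values[name] = value
        # Without update-on-last, every single set triggers a full re-read.
        if not self.update_on_last or name == self.names[-1]:
            self.read_options()


n = 1000
names = [f"p{i}" for i in range(n)]

eager = TuneModel(names)
lazy = TuneModel(names, update_on_last=True)
for model in (eager, lazy):
    for i, name in enumerate(names):
        model.set_option(name, i)

print("reads without update-on-last:", eager.reads)
print("reads with update-on-last:   ", lazy.reads)
```

In this model the eager variant performs n reads for each of the n options, while the deferred variant performs a single pass, which is the quadratic-versus-linear gap seen in the timings. The deferred variant relies on the last option always being sent, which the comment above marks as still to be confirmed.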

@peregrineshahin
Contributor Author

So the PR is just fine, and the changes are to be made on the SF side?

This allows for large tunes and avoids the cutechess and Windows command-line restrictions on large numbers of params.

Should work with https://github.com/official-stockfish/Stockfish/compare/master...Disservin:Stockfish:read-tune-from-file?expand=1
@Disservin
Member

Disservin commented Sep 9, 2024

Well, there is no real timeout issue, since the tuner just needs to enable update-on-last, or we change the default in Stockfish.

peregrineshahin closed this by deleting the head repository Oct 2, 2024
Successfully merging this pull request may close these issues.

prevent windows workers from running spsa with large number of params