To all CCC engine authors,
Due to the nature of the CCC events running on 250 threads, I would recommend the following to everyone. My reasons will follow below.
1. As soon as your engine knows it is processing a UCI "go ..." command, set the start time for your internal timers. Do not wait until you have allocated all your data, or until you have spawned all your helper threads (see the sketch just after this list).
2. Consider using a threading solution where your helper threads sit dormant until there is work to do, instead of having to allocate, start, run, and terminate your helpers on every search. There is non-negligible overhead associated with this. But please be careful that your helper threads are not slamming the CPU when they are SUPPOSED to be sleeping.
3. For CCC, TCEC, and your own sanity, always report a final "info ..." line right before you report the best move.
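For [1], here is a minimal sketch of the intended ordering. The now_ms() helper and handle_go() are hypothetical names, not taken from any particular engine, and the sketch assumes a POSIX clock_gettime():

    #include <stdint.h>
    #include <time.h>

    // Hypothetical helper: monotonic wall-clock time in milliseconds.
    static int64_t now_ms(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
    }

    static int64_t search_start_ms; // start-of-search timestamp

    // Hypothetical UCI handler: stamp the clock first, then do everything else.
    void handle_go(const char *line) {
        search_start_ms = now_ms();  // [1]: before parsing limits, allocating data,
                                     // or waking/spawning any helper threads
        // ... only now parse the go limits, set up search data, wake the
        // (already created) helper threads, and start the main search thread ...
        (void)line;
    }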
It should be noted that not all of these things are done in Ethereal, so I'm not suggesting anyone look at Ethereal as an example of how to do it.
The reason for [1] is that, for some engines in CCC, the Cutechess logs show a discrepancy between the engine's understanding of the elapsed time (as per the final "info ... time" output) and Cutechess's own timestamps between sending the initial go command and receiving that info line. For some engines this gap looks to be only a few milliseconds (Torch, Stockfish, Arasan, Revenge, Akimbo, Igel, to name a few). But others show excessive gaps, sometimes multiple hundreds of milliseconds (BlackMarlin, Equisetum, Minic, Willow, to name a few).
The reason for [2] is related to [1]. Generally speaking, it should not take much time to start up your search once you see the "go" command. I imagine most of that time, for the engines with excessive gaps, stems from having to allocate and start all threads every time. This chips away at your clock a bit, but it can also cause some weird bugs.
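For [2], one common pattern is a persistent pool where each helper blocks on a condition variable between searches, so nothing is created or destroyed per "go" and idle helpers use no CPU. A minimal sketch using POSIX threads; the names are illustrative, not any specific engine's code:

    #include <pthread.h>
    #include <stdbool.h>

    typedef struct {
        pthread_t       handle;
        pthread_mutex_t mutex;
        pthread_cond_t  cond;
        bool searching;  // set by the main thread when work is available
        bool quit;       // set once at engine shutdown
    } Worker;

    // Helper thread body: sleeps on the condition variable while idle,
    // so it does not burn CPU between searches.
    static void *worker_loop(void *arg) {
        Worker *w = arg;
        for (;;) {
            pthread_mutex_lock(&w->mutex);
            while (!w->searching && !w->quit)
                pthread_cond_wait(&w->cond, &w->mutex);  // dormant, not spinning
            bool quit = w->quit;
            pthread_mutex_unlock(&w->mutex);
            if (quit) break;
            // ... run this helper's part of the search here ...
            pthread_mutex_lock(&w->mutex);
            w->searching = false;
            pthread_mutex_unlock(&w->mutex);
        }
        return NULL;
    }

    // Called once at startup (or when the Threads option changes), not per "go".
    void worker_init(Worker *w) {
        pthread_mutex_init(&w->mutex, NULL);
        pthread_cond_init(&w->cond, NULL);
        w->searching = false;
        w->quit = false;
        pthread_create(&w->handle, NULL, worker_loop, w);
    }

    // Called on every "go": cheap, just flips a flag and signals the sleeper.
    void worker_wake(Worker *w) {
        pthread_mutex_lock(&w->mutex);
        w->searching = true;
        pthread_cond_signal(&w->cond);
        pthread_mutex_unlock(&w->mutex);
    }

The main thread would typically also wait on the same mutex/condition pair (or a per-worker "done" flag) before reporting the bestmove, so it knows every helper has gone back to sleep.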
Let's take an imaginary engine, Weiss. Weiss' search startup looks like this:

    for (int i = 1; i < num_threads; i++)
        create_and_start_search_thread();  // spawn and launch each helper thread
    start_search_thread(0);                // then the main thread begins its own search

Weiss' main thread will create and start all of the helper threads before starting its own search. Let's say that this takes 500ms. Now let's imagine that Weiss was told it only had 520ms on the clock. As soon as Weiss finishes depth 1, it will check the clock, realize that it is about to flag, and stop searching. This might lead to playing a move searched only to depth 1, although this is avoided if you employ the concept of thread voting (sketched below).
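Thread voting means the reported move comes from comparing every thread's result rather than trusting the main thread alone, so a helper that completed depth 12 outvotes a main thread that only finished depth 1. Real engines use more elaborate weighted votes; this is just a sketch of the simplest selection rule, with a hypothetical ThreadResult struct:

    // Hypothetical per-thread summary, filled in when each thread stops searching.
    typedef struct {
        int completed_depth;  // deepest fully completed iteration
        int best_score;       // score of that iteration's best move
        int best_move;        // encoded best move
    } ThreadResult;

    // Prefer the deepest result, breaking ties by score. A simplified rule,
    // not any particular engine's exact voting scheme.
    int select_best_move(const ThreadResult *r, int num_threads) {
        int best = 0;
        for (int i = 1; i < num_threads; i++)
            if (r[i].completed_depth >  r[best].completed_depth ||
               (r[i].completed_depth == r[best].completed_depth &&
                r[i].best_score      >  r[best].best_score))
                best = i;
        return r[best].best_move;
    }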
The reason for [3] is mostly that it is just good practice. For most engines, most of the time, [3] probably seems like a guarantee: you finish a depth, you report it, you check the clock, you decide to stop, you print the bestmove. But imagine the following scenarios instead, where you can suddenly get a large gap in time between the last info line and the bestmove report.
Common Case:
- You are using only 1 search thread.
- You finish depth 14, report the info, and you decide you want to continue searching.
- While you are searching, you realize you are getting too close to flagging, or whatever concept your engine has of "max time" to spend.
- You abort your depth 15 search in the middle.
- You don't print anything, since you aborted the search.
- You report the bestmove.
In Torch, I make sure that if the search ended in some weird way...
- Decided to stop in the middle of a search due to time
- Decided to stop in the middle of a search due to another thread saying we're done
- Stopped in the middle of a search due to hitting the "go nodes" limit that we were sent
- Etc.

...then I make sure to report another info line.
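One simple way to make [3] hold on every exit path is to funnel all of them through a single reporting routine, so the final info line and the bestmove are always printed back to back no matter why the search stopped. A sketch; move_to_uci() stands in for whatever move-formatting helper the engine already has:

    #include <stdio.h>

    const char *move_to_uci(int move);  // placeholder for the engine's own formatter

    // Called from every path that ends a search: clean iteration finish, time
    // abort mid-depth, "go nodes" limit hit, or another thread declaring us done.
    void finish_search(int best_move, int depth, int score,
                       long nodes, long elapsed_ms) {
        // Final info line, reflecting the engine's own view of time and nodes.
        printf("info depth %d score cp %d nodes %ld time %ld nps %ld\n",
               depth, score, nodes, elapsed_ms,
               elapsed_ms > 0 ? nodes * 1000 / elapsed_ms : 0L);
        printf("bestmove %s\n", move_to_uci(best_move));
        fflush(stdout);
    }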
…180)
This rules out the first bullet point in
AndyGrant/Ethereal#214, where Stash would previously
start the timer from the main search thread after starting all the worker
threads. We now instead start the timer from the UCI thread, after ensuring that any previous search has completed.
As a side-effect, this should allow us to handle stricter time constraints and
machine loads without burning the clock (most notably on noobpwnftw's machines
on Grantnet and the 256-core server at CCC), so the Move Overhead default value
is reduced back to its historical 30ms. No timeouts were observed during testing
for now, but I might adjust the overhead later if I start observing time losses
again.
Passed non-regression STC:
Elo | 4.92 +- 4.39 (95%)
SPRT | 8.0+0.08s Threads=1 Hash=16MB
LLR | 2.95 (-2.94, 2.94) [-4.00, 1.00]
Games | N: 6494 W: 1272 L: 1180 D: 4042
Penta | [56, 593, 1872, 655, 71]
http://chess.grantnet.us/test/37474/
Bench: 3,844,164