Proposal for a “Delta Compression + Neural Network” Approach in Nine Men’s Morris #4
Hello! These are interesting ideas! One potential difficulty that I can see is that if there is a time limit for the Alpha-Beta search, then it's not deterministic: even if it finds an optimal move in a certain game state on your machine, it might not find an optimal move in the same game state on a user's machine. To circumvent this problem, you could use a node limit instead of a time limit: during searches, keep track of the number of nodes explored in the game tree, and stop the iterative deepening when this number reaches a limit. This has the drawback that the program will spend different amounts of time on different computers, but it might be worth it. You might be able to set the limit low enough that it plays reasonably fast on most computers.

Another difficulty could be finding a representative sample of important game states: if you play games with your program, it might not wander into many game states that human users or other programs would wander into. One thing you could do is to play games with many different engine configurations, and maybe even with different tweaks to the evaluation function.

In win/loss positions, will you define optimality taking into account the number of moves to the end of the game (depth-to-win)? If yes, then the Alpha-Beta search will have a tougher time finding an optimal move: finding the quickest way to win is often harder than just finding any win. This means that not taking depth-to-win into account could make the delta database much smaller. However, not taking depth-to-win into account (i.e., just going with any winning move) can cause a program to get into a cycle where every position of the cycle is a winning position, but the program never actually wins because it keeps cycling. A middle ground between ignoring depth-to-win entirely and taking it into account at every win/loss position could be to pick just one position of a cycle of winning positions that the Alpha-Beta runs into, and include only that one position in the delta database, so that the cycle is broken. This would prevent the program from running around in such cycles instead of winning, while keeping the database much smaller than accounting for depth-to-win in all win/loss positions.

I'm curious to see how your research turns out! It will be interesting to see statistics about how often the Alpha-Beta search can find an optimal move.

One more thing: if the delta database fits in RAM, you could use it not just to look up the value of the game state that is on the board (i.e., the root of a potential Alpha-Beta search), but to look up every game state that you encounter during the Alpha-Beta search, and then not explore a state further if it's present in the delta database as a win/loss. (If it's present in the database only as a draw, it's less useful for the Alpha-Beta search; in that case it's just a very loose bound on the result the search would return: somewhere between the best loss and the worst win.) It seems hard to estimate how big of a boost this would give to the Alpha-Beta search, but it seems like an interesting experiment at the very least.
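To make the node-limit and in-search lookup ideas above concrete, here is a minimal Python sketch. The `Position` interface (`hash`, `is_terminal`, `evaluate`, `generate_moves`, `apply`), the budget numbers, and the `DELTA_DB` contents are hypothetical placeholders, not part of any existing engine.

```python
INF = 10**9          # assumed score bound
NODE_LIMIT = 2_000_000  # assumed node budget; tune so slow machines stay responsive
MAX_DEPTH = 48       # assumed cap for iterative deepening
DELTA_DB = {}        # hash(position) -> exact score from the side to move's view

class Budget:
    """Counts nodes so the search stops at the same point on every machine."""
    def __init__(self, limit):
        self.limit = limit
        self.nodes = 0

def alpha_beta(pos, depth, alpha, beta, budget):
    """Fail-soft negamax alpha-beta with a node budget and a delta-database probe."""
    budget.nodes += 1
    key = pos.hash()
    if key in DELTA_DB:                       # exact win/loss known: no search needed
        return DELTA_DB[key]
    if depth == 0 or pos.is_terminal() or budget.nodes >= budget.limit:
        return pos.evaluate()                 # static evaluation at the frontier
    best = -INF
    for move in pos.generate_moves():
        score = -alpha_beta(pos.apply(move), depth - 1, -beta, -alpha, budget)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                             # beta cutoff
    return best

def iterative_deepening(pos):
    """Deepen until the shared node budget runs out; the result is deterministic."""
    budget = Budget(NODE_LIMIT)
    best_score = None
    for depth in range(1, MAX_DEPTH + 1):
        if budget.nodes >= budget.limit:
            break
        best_score = alpha_beta(pos, depth, -INF, INF, budget)
    return best_score
```

Fixing the budget (and any tie-breaking rules) keeps the chosen move reproducible across machines, which is what makes the mismatch analysis in the proposal below meaningful.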
Dear Mr. Gévay and Mr. Danner,
I hope this message finds you well. I have been greatly inspired by your work on Nine Men’s Morris, Morabaraba, and Lasker Morris, particularly regarding extended and ultrastrong solutions. Building on the foundations you established, I am developing a preliminary research plan to integrate perfect database knowledge with traditional Alpha-Beta search in a more streamlined way, ultimately aiming to train a lightweight neural network for efficient, near-optimal play.
Below is the detailed design of my intended approach. While none of these steps have been implemented yet, I would be grateful for any guidance or suggestions you might share.
1. Overall Goal
Create a “delta-compressed” subset of the tablebase that captures only those states in which Alpha-Beta search fails to replicate the optimal solution. Because the search should already play correctly in the large majority of states, this compressed database should be far smaller than the full tablebase while still supporting near-perfect decision-making.
2. Methodology in Detail
2.1 Segmenting the Database
Phased or Subset Handling:
Sampling Strategies:
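Since the details under these bullets are not spelled out here, the following is only one possible reading, as a hedged Python sketch: walk the tablebase bucket by bucket (segmented by piece counts, the way Morris tablebases are usually organized) and uniformly sample a fraction of each bucket. The `tablebase.iter_positions` query is a hypothetical interface.

```python
import random
from itertools import product

def sampled_states(tablebase, fraction=0.01, seed=0):
    """Yield a reproducible uniform sample from each material bucket (3..9 men per side)."""
    rng = random.Random(seed)
    for white_men, black_men in product(range(3, 10), repeat=2):
        for pos in tablebase.iter_positions(white_men=white_men, black_men=black_men):
            if rng.random() < fraction:
                yield pos
```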
2.2 Alpha-Beta Search Validation
Running Alpha-Beta:
Mismatch Detection:
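Reusing the placeholder search from the first sketch, the mismatch check could be as simple as comparing the move the node-limited search would play against the set of database-optimal moves; `tablebase.optimal_moves` is again a hypothetical query.

```python
def best_move_by_search(pos):
    """Move the node-limited search would actually play (placeholder interface as above)."""
    budget = Budget(NODE_LIMIT)
    best_move, best_score = None, -INF
    for move in pos.generate_moves():
        score = -alpha_beta(pos.apply(move), MAX_DEPTH - 1, -INF, INF, budget)
        if score > best_score:
            best_move, best_score = move, score
    return best_move

def find_mismatches(samples, tablebase):
    """Collect the states where the searched move is not among the tablebase-optimal moves."""
    return [pos for pos in samples
            if best_move_by_search(pos) not in tablebase.optimal_moves(pos)]
```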
2.3 Building the “Delta” Compressed Database
Core Principle:
Iterative Construction:
Use in Gameplay:
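A hedged sketch of how the delta database might be built from those mismatches and consulted at play time, with the plain search as the fallback. Storing one optimal move per state (rather than the exact value) is only one possible design choice, and the serialization is illustrative.

```python
import pickle

def build_delta_db(mismatches, tablebase, path="delta.db"):
    """Keep only the mismatching states, each mapped to one database-optimal move."""
    delta = {pos.hash(): tablebase.optimal_moves(pos)[0] for pos in mismatches}
    with open(path, "wb") as f:
        pickle.dump(delta, f)
    return delta

def choose_move(pos, delta):
    """Consult the delta database first; fall back to the node-limited search."""
    stored = delta.get(pos.hash())
    return stored if stored is not None else best_move_by_search(pos)
```

Whether to store moves or values is a trade-off: storing values lets the table also be probed inside the search (as in the comment above), while storing moves makes the play-time lookup trivial.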
2.4 Neural Network Training
Data Sources:
Network Structure:
Training Workflow:
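One possible shape for the lightweight network, assuming a three-class win/draw/loss target derived from the perfect database; the 24-square encoding, layer sizes, and training hyperparameters are placeholders for illustration (PyTorch).

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def encode(pos):
    """Encode a board as 24 squares x (own man, opponent man) plus two men-in-hand counts."""
    planes = torch.zeros(24 * 2 + 2)   # 50 features, matching the input layer below
    # ... fill from pos; omitted here because the position interface is hypothetical
    return planes

class ValueNet(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(50, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),          # logits for loss / draw / win
        )

    def forward(self, x):
        return self.net(x)

def train(features, labels, epochs=10):
    """features: float tensor [N, 50]; labels: long tensor [N] with classes 0/1/2."""
    data = DataLoader(TensorDataset(features, labels), batch_size=256, shuffle=True)
    model = ValueNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model
```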
3. Potential Refinements
4. Invitation for Your Guidance
As we have not yet begun implementation, your expertise could help us refine the initial strategy and prioritize our efforts. We welcome any broad or specific suggestions, such as:
We would be honored to incorporate any insights or recommendations you might share.
Thank you for taking the time to review this proposal. Your research has been a significant inspiration, and I look forward to any thoughts or advice you may be willing to offer.
Warm regards,
Calcitem