
Royal Game of Ur Environment #1115

Open
Alian3785 opened this issue Nov 11, 2023 · 4 comments
Alian3785 commented Nov 11, 2023

I would like to request a Royal Game of Ur environment for Pgx. The Royal Game of Ur is a simple race game with chance and perfect information, but it has some distinct features.

First, it may be the oldest known board game: https://en.wikipedia.org/wiki/Royal_Game_of_Ur
Second, despite its relative simplicity (a 15-square board with 7 possible actions), it is not solved and there are no superhuman bots for it, though bots of various strengths exist and can serve as benchmarks.
Third, it's hard to estimate its difficulty for RL algorithms, but it seems it may occupy the spot between very simple games (Connect Four) and big ones (chess, backgammon), while being a real game and not a simplified version.
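For context on the chance element: per the royalur.net rules linked below, each turn is decided by four binary dice (tetrahedral dice, each showing 0 or 1), so the move distance follows a Binomial(4, 1/2) distribution, and a roll of 0 forfeits the turn. A quick sketch to enumerate that distribution (plain Python, not tied to any particular implementation):

```python
from itertools import product
from fractions import Fraction

# Four binary dice: each die shows 0 or 1 with equal probability.
# The move distance is their sum, so it follows Binomial(4, 1/2).
counts = {}
for dice in product([0, 1], repeat=4):
    total = sum(dice)
    counts[total] = counts.get(total, 0) + 1

distribution = {k: Fraction(v, 16) for k, v in sorted(counts.items())}
for roll, p in distribution.items():
    print(f"roll {roll}: probability {p}")
```

This yields probabilities 1/16, 1/4, 3/8, 1/4, 1/16 for rolls 0 through 4, so a stochastic environment has only five chance outcomes per turn.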

The rules of the game are here: https://royalur.net/rules/
You can play the game here: https://royalur.net/game/
Bots for possible benchmarks are here: https://github.com/RoyalUr/RoyalUr-Analysis/blob/main/docs/Agents.md#-the-random-agent-
Discord community: https://discord.com/invite/Ea49VVru5N
Python implementation: https://github.com/RoyalUr/RoyalUr-Python

Another word about the Discord community. For a while the community has wanted to develop self-play bots, ideally superhuman ones, as has been done for all Pgx games. We have sufficient computing resources (high-end ML accelerators) and time for training AlphaZero, but we lack the JAX knowledge to implement the game.

If you implement the Game of Ur in your library, we'll be able to conduct training runs and testing against bots and humans, and happily share the results. If we are lucky, the strongest bots yet may be achieved and teach us how this old game really should be played.

Anyway, thanks for the library, and good luck at NeurIPS!

@sotetsuk
Owner

Thank you for proposing the Royal Game of Ur for the Pgx library! 👍 Your idea is truly captivating, especially given the limited number of stochastic games currently in Pgx. The addition of such a historic game is very appealing. While we're currently limited by resource constraints and cannot immediately embark on this project, we recognize its potential value and are considering giving it a higher priority among new game candidates. We greatly value your enthusiasm and contributions to the community. Please continue to share your progress with us. Your support is much appreciated! 🙏

@Alian3785
Author

Thanks, I'm glad you liked the idea. Most contributions to the community were not done by me personally. I still have one question and one (smaller) request:

Is the code for each game all in one file? Can I make a simple new game by editing only tic_tac_toe.py, for instance, or is other code involved elsewhere in the library?

A related request is about instructions for creating custom environments. Most RL libraries have such instructions for very simple envs, for instance this one: https://colab.research.google.com/github/araffin/rl-tutorial-jnrr19/blob/sb3/5_custom_gym_env.ipynb Could you make one? Ideally something with as little game-specific code as possible, and with comments explaining the JAX parts for comprehension.

@sotetsuk
Owner

sotetsuk commented Nov 14, 2023

> Is the code of the game all in one file?

Except for some complicated games, yes.
Tic-tac-toe logic is implemented in a single file.
Exceptions include chess and shogi.
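To illustrate the single-file pattern (purely as a hypothetical sketch with illustrative names, not Pgx's actual API): a JAX-friendly environment is typically a pair of pure functions over an immutable state, e.g. `init() -> state` and `step(state, action) -> state`. A minimal Ur-flavored race-game skeleton in plain Python:

```python
from typing import NamedTuple, Tuple

TRACK_LENGTH = 15  # squares a piece must travel to bear off (illustrative)

class State(NamedTuple):
    # positions[player][piece]: 0 = not yet entered, TRACK_LENGTH = borne off
    positions: Tuple[Tuple[int, ...], Tuple[int, ...]]
    current_player: int

def init(num_pieces: int = 7) -> State:
    """Fresh game: all pieces off the board, player 0 to move."""
    return State(((0,) * num_pieces, (0,) * num_pieces), current_player=0)

def step(state: State, piece: int, roll: int) -> State:
    """Move `piece` of the current player forward by `roll`; return a NEW state.

    Rosettes, captures, and legality checks are omitted for brevity;
    the turn simply passes to the other player.
    """
    player = state.current_player
    pieces = list(state.positions[player])
    pieces[piece] = min(pieces[piece] + roll, TRACK_LENGTH)
    positions = list(state.positions)
    positions[player] = tuple(pieces)
    return State(tuple(positions), current_player=1 - player)
```

In an actual Pgx port, these functions would be written with jax.numpy arrays and JAX control-flow primitives so they can be jitted and vmapped, but the pure-functional, immutable-state shape is the same.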

Instructions for creating a new environment are a good idea. I'll add them by the end of this month 👍

@Alian3785
Author

Alian3785 commented Nov 29, 2023

While you're at it: can Gumbel AlphaZero learn games with chance, like backgammon? Neither you nor the Gumbel AlphaZero authors test on stochastic games.

I have seen that Stochastic MuZero can, and so can basic AlphaZero. Does Gumbel AlphaZero retain this ability compared to basic AlphaZero?
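For reference, the usual way tree search accommodates dice (in expectimax and in stochastic extensions of AlphaZero/MuZero) is to add chance nodes whose backed-up value is the expectation over outcomes instead of a max. A toy sketch of the two backup rules with the Ur dice distribution hard-coded (illustrative only, not the Gumbel AlphaZero algorithm):

```python
# Probabilities of rolling 0..4 with four binary dice: Binomial(4, 1/2).
ROLL_PROBS = {0: 1 / 16, 1: 4 / 16, 2: 6 / 16, 3: 4 / 16, 4: 1 / 16}

def chance_value(value_after_roll) -> float:
    """Backup at a chance node: expected value over the dice outcomes."""
    return sum(p * value_after_roll(roll) for roll, p in ROLL_PROBS.items())

def decision_value(action_values) -> float:
    """Backup at a decision node: the player to move picks the best action."""
    return max(action_values)
```

For example, if the value after a roll happened to equal the roll itself, the chance-node value would be the mean roll, 2.0. Whether Gumbel AlphaZero's action-selection scheme interacts cleanly with such chance nodes is exactly the open question above.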

