An implementation of Deepmind's MuZero algorithm.


MuZero

This package started as a Google Summer of Code 2021 project.

Here's a blog post summarizing MuZero's summer journey.

This implementation is based on AlphaZero.jl and inspired by muzero-general.

TicTacToe Example

To train MuZero on tic-tac-toe, clone this repo and switch to the MuZero branch:

git clone https://github.com/michelangelo21/MuZero.git
cd MuZero
git checkout MuZero

then install the dependencies and start training:

julia --project -e 'import Pkg; Pkg.instantiate()'
julia --project ./MuZero/scripts/train_tictactoe.jl 

then, to observe the results, open TensorBoard in a different terminal:

tensorboard --logdir results

After some time, the curves should look like this (image: learning results).

Acknowledgement

This implementation wouldn't exist without Jonathan Laurent (project mentor, creator of AlphaZero.jl) and his valuable insights.
