Less than 1% of the biomass of all life is of organisms that have any neurons at all. That means over 99% of all life learns with DNA-replication + mutation alone. However, no modern ML techniques look anything like this. That should change. This repo can produce a perfect tic-tac-toe player in ~300 lines of code using DNA-like learning. There is no optimizer, no gradients, and no loss function. It is more robust, conceptually simpler, and far more beautiful than conventional ML techniques to solve tic-tac-toe.
A few dozen training runs look like this: When playing perfectly against a perfect player, games should last 9 moves. As you can see the DNA-based players learn to play 9 moves quite quickly. And training tends to converge at similar speed.
To train a tic-tac-toe player just run (can be visualized with tensorboard):
./batch_arena.py
To play against one of the players from the population you just trained:
./play.py
To play against a perfect player that was trained classically:
./play.py --perfect
- 20% of the population is hardcoded to be a perfect player, without that convergence is not reliable
- The architecture is handcoded and arbitrary, that should ideally also be learned