A package for using Communication Mod with Slay the Spire, plus a simple AI with a reinforcement-learning card-drafting module.
Install Communication Mod and prereqs
- ModTheSpire - Steam Workshop version
- BaseMod - Steam Workshop version
Run modded Slay the Spire with Communication Mod enabled
Update the Communication Mod config that is created afterwards. On Linux it is at
~/.config/ModTheSpire/CommunicationMod/config.properties
Set
command=python3 path_to_script/main.py
Set specific run configs
boto = False
: Set to True and update AWS resources to point to personal buckets for backing up training data to S3 buckets
solo = False
: Run a single seed from seed_list, currently 53HJXL2N4CEYI
control_group = False
: Set to True for using the built-in drafter (in priorities.py)
epochs = 2
: How many traversals of the seed_list. 1 means each seed is played once.
Launch Slay the Spire with mods, open the Communication Mod settings, and click "Start External Process"
Sit back and watch your Ironclad bot try and slay the spire!
- The mod SuperFastMode helps a -lot- for speeding up training.
The core idea of this drafter is that cards have synergies with other cards. These synergies can be positive or negative and are represented as a symmetric matrix whose rows and columns are cards.
Example:
- note that these are only 4 cards and are examples, true weights are learned over time!
Given that we know what cards are in our deck, we can multiply our deck against this matrix and generate scores for how well any card will synergize with our deck.
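The scoring step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the package's actual code: the four card names, the weights, and the helper names `score_cards` and `pick_card` are all hypothetical, and the deck is assumed to be stored as a vector of card counts.

```python
# Hypothetical cards and weights, for illustration only;
# the real weights are learned over time.
CARDS = ["Strike", "Defend", "Bash", "Inflame"]

# Symmetric synergy matrix: SYNERGY[i][j] == SYNERGY[j][i].
SYNERGY = [
    [1.0, 0.5, 1.2, 1.5],
    [0.5, 1.0, 0.8, 0.3],
    [1.2, 0.8, 1.0, 1.1],
    [1.5, 0.3, 1.1, 1.0],
]


def score_cards(deck_counts, synergy):
    """Multiply the deck count vector by the synergy matrix.

    deck_counts[i] is the number of copies of card i in the deck.
    The score of card j is sum_i deck_counts[i] * synergy[i][j].
    """
    n = len(synergy)
    return [
        sum(deck_counts[i] * synergy[i][j] for i in range(n))
        for j in range(n)
    ]


def pick_card(offered, deck_counts, synergy):
    """Return the index of the offered card with the highest score."""
    scores = score_cards(deck_counts, synergy)
    return max(offered, key=lambda j: scores[j])


deck = [5, 4, 1, 0]  # 5 Strikes, 4 Defends, 1 Bash, no Inflame
scores = score_cards(deck, SYNERGY)

# Offered Defend (index 1) or Inflame (index 3): with these made-up
# weights, Inflame scores higher against this deck.
choice = pick_card([1, 3], deck, SYNERGY)
print(CARDS[choice])
```

Because the matrix is symmetric, multiplying from the left or the right gives the same scores, so the orientation of the deck vector does not matter.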
Our bot tries many different synergy matrices and keeps the ones that reach higher floors of the Spire. So far the learning has improved on the built-in drafter's performance and has a lot of room for future growth!
This drafting bot uses reinforcement learning, looking to maximize the number of floors climbed.
On first start: initialize synergy matrix with all 1s
- The bot essentially is choosing cards randomly
- Play through a set of seeds, recording information about the run (floor reached, score, card choices offered, cards chosen)
- Adjust synergy weights of all cards offered during run
- Replay the same seeds, recording the same information
- If the average performance is better (bot got to higher floors), keep weights
- Else discard weights
- Adjust weights again
This loop continues for as long as desired. The epochs hyper-parameter sets how many times to update these weights and run the gauntlet of seeds.
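The loop above can be sketched roughly as follows. This is a simplified stand-in rather than the package's training code: `play_run` is a hypothetical callback that plays one seed with the given weights and returns `(floor_reached, offered_card_indices)`, the random `step` size is an assumption, and `seeds` is assumed to be non-empty.

```python
import random


def perturb(weights, offered, rng, step=0.1):
    """Return a copy of weights with entries for offered cards nudged.

    Each change is mirrored so the matrix stays symmetric.
    """
    new = [row[:] for row in weights]
    for i in offered:
        for j in range(len(new)):
            new[i][j] += rng.uniform(-step, step)
            new[j][i] = new[i][j]
    return new


def train(n_cards, seeds, play_run, epochs, rng):
    """Keep a candidate matrix only if its average floor improves.

    Each epoch is one traversal of the seed list, so epochs=1 plays
    every seed once with the initial all-ones matrix.
    """
    candidate = [[1.0] * n_cards for _ in range(n_cards)]
    best, best_avg, offered = None, float("-inf"), set()
    for _ in range(epochs):
        results = [play_run(candidate, seed) for seed in seeds]
        avg = sum(floor for floor, _ in results) / len(results)
        if avg > best_avg:
            # Better average floor: keep these weights.
            best, best_avg = candidate, avg
            offered = set().union(*(cards for _, cards in results))
        # Otherwise the candidate is discarded; either way, adjust
        # the best known weights again for the next traversal.
        candidate = perturb(best, offered, rng)
    return best, best_avg
```

A call might look like `train(n_cards, seed_list, play_run, epochs=2, rng=random.Random())`, where `play_run` would wrap an actual game played through Communication Mod.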
- Add simulated annealing to the learning method
- This should help the search escape local maxima and find a more global maximum for card synergies
- Expand matrix to include upgraded and common cards
- currently all cards are treated as the same regardless of upgrade
- Searing Blow decks would like to have a word
- Package this cleanly as a Steam Workshop mod
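For the simulated annealing item in the list above, one possible shape for the acceptance rule is sketched below. It is not implemented in this package. It uses the standard Metropolis criterion with a geometric cooling schedule; the function names, starting temperature, and decay rate are all assumptions chosen for illustration.

```python
import math
import random


def temperature_at(epoch, start=5.0, decay=0.9):
    """Geometric cooling: the temperature shrinks by a fixed factor per epoch."""
    return start * decay ** epoch


def accept(current_avg, candidate_avg, temperature, rng):
    """Metropolis rule for maximizing the average floor reached.

    Better (or equal) candidates are always kept. Worse candidates are
    kept with probability exp(delta / temperature), where delta is the
    (negative) change in average floor.
    """
    if candidate_avg >= current_avg:
        return True
    if temperature <= 0:
        return False
    delta = candidate_avg - current_avg
    return rng.random() < math.exp(delta / temperature)
```

Early on, when the temperature is high, slightly worse weights are often accepted, which lets the search move away from a local maximum. As the temperature falls, the rule approaches the current keep-only-if-better behaviour.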