Solving Wordle with Z3

pil0u · on Jan 11, 2022

In France, there is a TV show called Motus which is basically the same as Wordle but with 8-letter words (https://fr.wikipedia.org/wiki/Motus_(jeu_t%C3%A9l%C3%A9vis%C...)

Two young CS researchers took part in the competition after building their home-made algorithm (I suppose based on tree search) to find optimal words to search for, given the constraints. They learnt these optimal words/paths by heart before the show.

They never lost a game. Opponents scores were embarrassingly low, they actually broke the game. It was also very funny to watch because they spelled words that most people didn't know they exist.

I'll edit if I can find a video link (which is surprisingly struggling).

jerf · on Jan 11, 2022

One of the things that often surprises me is how unprepared people are coming in to game shows. You can win your yearly salary but you walk in off the street with no preparation at all.

Obviously, I don't expect everyone to determine and memorize the information-theoretic optimal technique for extracting information from a word game or anything. But for Pete's sake, if you're going to be on the Wheel of Fortune, clock some time with one of the dozen or so computer or console implementations. If you're going to be on the Family Feud, take a moment to explain to funny Uncle Harry that saying the stupid-but-funny answer that popped into his head may cost him and four people who know where he lives a few thousand bucks. If you're going to play the old Concentration game, which was based on rebuses, spend a few hours brushing up on common rebus symbols and patterns.

(That's another show that, were it on today, would probably be broken by someone on the Internet. There is probably less than meets the eye when it comes to what can be done on a rebus. Someone could probably analyze the show's run and work out which are the most common syllables they can use, what is phonetically difficult to represent with rebuses, etc.)

ISTR that the Wheel of Fortune, for instance, edits out even more mistakes than you'll see on the show, which is plenty, according to one of the "behind the scenes" things. It isn't that hard, anyone can get it in 15 minutes or less, which suggests a lot of contestants come in with not even that much experience.

This is for many people the most bang-for-the-monetary-buck they will ever see in their lives. There are other things that will be more important, like having families, but monetarily, this is probably the most money-per-minute they will ever see. Put at least a bit of work into it!

keville · on Jan 11, 2022

Wheel of Fortune is just as much about your luck on the wheel as your skill with the phrases, so practice doesn't guarantee victory. WoF's final puzzle awards contestants the "RSTLNE" letters now because nearly _everybody_ came in to the final prepared, knowing the most commonly-occurring English letters.

jerf · on Jan 11, 2022

You are right. I know there's no way to guarantee victory. But there's still a lot of basic errors, if you pay attention carefully to the contestants (rather than the more-fun trying to solve the puzzle before them). My favorite is actually when you get towards the end and they are calling out letters that couldn't possibly make any words with the 2-5 blanks left over, for instance. Saying a vowel after a spin also happens much more often than it should.

I don't watch WoF a lot but you can also tell they're playing games with the most common letters after RSTLNE too... simply picking the next three consonants and the next vowel is a great way to lose.

And, I mean, yes, very stressful. But you might also be a bit less stressed if you prepared a bit more. Not relaxed, of course, but less stressed.

dekhn · on Jan 11, 2022

my dad plays game shows as a hobby. My main complaint is that he's really conservative and will basically fold at the earliest opportunity (many of these games are basically double-down, so you lose everything if you make an incorrect guess). I told him the amounts are so small (he won $5K on Who Wants to Be A Millionare and $15K on jeopardy) it's better to just take the risk to win a larger prize.

He used to be a reference librarian so he's good with facts, and does crosswords (cryptics) constantly so he knows pretty much every word.

desdiv · on Jan 11, 2022

>It was also very funny to watch because they spelled words that most people didn't know they exist.

This kinda reminds me of the New Zealand Scrabble player who won multiple French World Scrabble Championships despite not speaking French. He just memorized the French Scrabble dictionary.

[1] https://en.wikipedia.org/wiki/Nigel_Richards_(Scrabble_playe...

Denvercoder9 · on Jan 11, 2022

A similar thing happened in the Netherlands. The TV show is called Lingo (https://en.wikipedia.org/wiki/Lingo_(Dutch_game_show)), it uses 5-letter words and you get the first letter for free. The guy who more or less beat it wrote a blog post about it: https://vincenttunru.com/hacking-a-gameshow/

a_imho · on Jan 11, 2022

In Hungary we have this table top game, iiuc this has the same mechanics just with colors. It was (maybe still is) popular in primary school aged 8-12.

https://okosjatek.cdn.shoprenter.hu/custom/okosjatek/image/d...

jhgb · on Jan 11, 2022

Basically Mastermind. We had Logik instead: https://cs.wikipedia.org/wiki/Logik_(hra)

But in Mastermind, you for example aren't told which pins are in correct positions, whereas in Wordle you are told that.

croon · on Jan 11, 2022

Yes, it's called Mastermind [0]. Just fewer colors than letters.

[0] https://en.wikipedia.org/wiki/Mastermind_(board_game)

dekhn · on Jan 11, 2022

And for the nerds: https://en.wikipedia.org/wiki/Mastermind_(board_game)#Worst_... (Knuth)

and In November 2004, Michiel de Bondt proved that solving a Mastermind board is an NP-complete problem when played with n pegs per row and two colors, by showing how to represent any one-in-three 3SAT problem in it.

schnitzelstoat · on Jan 11, 2022

I'd argue it's harder though as colors can form any sequence - they don't have to form legal words.

theFco · on Jan 11, 2022

I'd argue that at the same time it is easier because you do not need to find valid colour sequences (in wordle you can only enter actual words to explore the solution). I wonder if the result is easier or harder overall?

RaitoBezarius · on Jan 11, 2022

Here's the video : https://youtu.be/9SWG7RT4MrM :-)

motdiem · on Jan 11, 2022

You may be interested in sutom https://sutom.nocle.fr a hybrid wordle/motus for French that recently popped up

throwaway81523 · on Jan 11, 2022

This was a fun read, and it's cool that using z3 like that works at all, but the takeaway for me is that a SAT solver isn't that great a way to solve wordles. Using some basic probability and scanning the word list for max-entropy guesses seems like a more natural match to the problem. If you want to encapsulate the searching in an external library, sqlite3 might be a good choice from Python. Sqlite3's Python bindings let you connect Python functions to your sqlite3 instance, so you can call them from SQL. So you'd write Python functions for "matched letter in the right place", "matched letter in the wrong place", and "no match for that letter". Then all your searches would be simple SQL queries.

hiyer · on Jan 11, 2022

> the takeaway for me is that a SAT solver isn't that great a way to solve wordles.

It may not be the best way, but it seems to lead to very clean, compact and understandable code. I have no background with Z3 (or any SAT solver for that matter) but I didn't have much trouble following the author's approach.

toppy · on Jan 11, 2022

Don't get me wrong but what's your idea about representing 'matched letter in wrong place' in SQL instead of string indexing in Python for instance?

jeremysalwen · on Jan 11, 2022

> Python bindings let you connect Python functions to your sqlite3 instance, so you can call them from SQL

I think the idea is calling python functions from SQL to check 'matched letter in wrong place'.

notafraudster · on Jan 11, 2022

I treated yellow squares as two sequential rules: (1) regex with [A-Z] character classes less the characters you know are not in the exact spots -- this certifies you've eliminated the characters that shouldn't be there. These can be directly encoded into a single SQL query; (2) a series of in operators to check that the character is there at all. In SQL this could be `word LIKE '%S%'` after the initial (1) rule.

andopp · on Jan 11, 2022

Cool! I made a solver myself, yesterday. For every move, it ranks each legal word based on how much information that word will gain. It figures this out by trying all possible solutions that are still compatible with the information acquired so far. The word that gives the biggest reduction (worst case) in size of candidate solution pool, is the chosen word.

For yesterday's word, it yielded the following sequence of guesses: AESIR, DROPT, ABCEE, QUERY!

Most useful for us humans is that AESIR seems to be the best starting word.

Here is my java source code for this solver: https://pastebin.com/k1tTCyUR A bit obfuscated by some necessary optimziations.

loeg · on Jan 17, 2022

Mine does AESIR, DROPT, EVERY, then QUERY for Wordle 205. There are 3197 equally good guesses for the 3rd word (ABCEE / EVERY) on that puzzle, or 336 if you prioritize words on the answer list (which is why I land on EVERY instead of ABCEE).

Dwolb · on Jan 11, 2022

How are you calculating “biggest reduction in size of candidate pool”?

Wondering if you’re trying to split the pool 50-50 or more like 90-10.

jhgb · on Jan 11, 2022

It would seem to me that 50-50 would be the "biggest reduction" since presumably you're judging the pruning based on the worst case. In the latter case the worst case is that you're still left with 90 options.

Maybe I'm looking at it too much as some kind of decision tree but it looks to me like one, or at least something very similar. You don't want any branch to be too long.

Dwolb · on Jan 11, 2022

I think what we’re missing is information gained by understanding the letter location.

So if S is in 90% of words, well you also reveal whether or not it’s in the nth location. Whereas an incorrect guess gives you no location information.

So, you’d want to guess the word that equally splits the word pool given both the letter and letter location (or just letter in a given location).

andopp · on Jan 11, 2022

I try to cut it down to as small a set of candidates as possible. Any reason to think a different split is better?

Sohcahtoa82 · on Jan 11, 2022

A guess that eliminates 90% of possibilities is actually going to only eliminate 10% of the possibilities 90% of the time.

Trying to come up with a word that splits the list in half is actually ideal.

Dwolb · on Jan 11, 2022

I was thinking that if you guess S correctly and it is in 90% of words and not in 10% of words, you’ve only removed 10% of words from the pool.

Whereas if there were a letter closer to occurring in only 50% of the pool, you at least eliminate half.

Does that logic make sense here or no? I’m thinking I’m missing something related to “maximum information gain”.

andopp · on Jan 11, 2022

Imagine you have made a few guesses already. Then you have a set of candidate solutions; words that are compatible with the responses you’ve gotten from the game so far. The question then is, what word from the list of legal words should you now pick that no matter what the solution is, will ensure that that your candidate set becomes as small as possible based on the response you would receive? So basically, you have to compute, for each pair of candidate solution and legal word, the number of other candidate solutions that would yield the same response for that legal word.

Dwolb · on Jan 11, 2022

But what do you consider to be “as small as possible”? Is 50% the minimum?

andopp · on Jan 11, 2022

Say you have a 10 000 word list. In the beginning all those are candidates. Picking for instance SMOKE might give you a response that is only compatible with 1 000 words. You won’t know that before you guess it. But you can check, for any possible solution, what is the worst case scenario for SMOKE? Perhaps some solution would leave you with 1 500 possible words. Maybe FIRE as a guess would instead leave us with less than 1 000 words no matter what the solution is. Then that’s a better guess. So we pick the word that gives the lowest worst case remaining words.

ghusbands · on Jan 11, 2022

"Aesir" wouldn't be an accepted word in most word games, being a proper noun (capitalised). But "raise" uses the same letters, though obviously reveals differently positional info.

andopp · on Jan 11, 2022

When I tried with the /usr/share/dict/words dictionary it guessed "raise" first, so I can confirm it's a great way to start the game.

notafraudster · on Jan 11, 2022

ARISE and SERAI also have the same letters and different positional info. In the dictionary I used, SERAI was the optimal 5 letter word to start with.

thom · on Jan 11, 2022

These guesses seem counterintuitive in that they waste letters by repeating previous guesses. Interesting if this is optimal though.

ghusbands · on Jan 11, 2022

Reducing the possible position of letters can drastically reduce the set of possible words; but intuition probably tends towards getting more information about the presence or lack rather than the position of letters.

I don't think anyone has released an 'optimal' player, yet. I imagine getting all target words (in the wordle dictionary) in four guesses is doable and three is probably not.

SideQuark · on Jan 11, 2022

My bot averages ~ 3.5 moves, does all in 5, and less than a few % uses 5.

I've tried many, many approaches to get all in 4, and so far none work.

It is very nearly optimal (many of the parts are provably optimal, the only thing not yet optimal is doing an entire tree search, which is likely computationally impossible due to required tree size), using quite a bit of computation, precomputation, caching, etc., for the searching.

Oh, I also have a bot that solves Wordle in one move, every time.

Hint: the source code for Wordle is viewable from the page, and is easy to use to predict the word for each day.... But I've said too much now :)

ghusbands · on Jan 11, 2022

(Warning: To anyone who might follow the hint, maybe heed my warning in another comment on this HN page.)

Yeah, it might be bit cheeky to have a bot that guesses one word, and then gives you the correct answer and the date on which Wordle used or will use it.

Is the tree size still too big if you're only using Wordle's word list? (I'm almost interested enough to code something up, but asking you might mean I don't get fully nerd-sniped.)

SideQuark · on Jan 11, 2022

>Is the tree size still too big if you're only using Wordle's word list?

Yes, it's what I use.

Wordle has ~2300 words as possible hidden words, ~12,000 more allowed as guesses. To get best scores you need to sample from all ~15k words.

So, to build a tree: for each hidden word (2k), pick a guess word (15k), gain knowledge (729 possibilities, but only one per hidden/guess pair). This reduces your possible hidden list to 70-1kish. Repeat..

Worst tree is 5 levels deep, mine averages 3 levels deep, pruning and memoization is nearly nonexistant (I checked).

You now have 30M first move outcome nodes, 2k of which are wins. After second move, you have a billion plus (I've sampled these to gain knowledge of what to expect). The next few rounds push the compute time into crazy realms of time (again, I've sampled to estimate sizes).

I'd guess given a lot of computing power, it could be done, since around 3-4 levels most of the games complete. But since you cannot easily store this tree, I gave up for now.

You should be able to store the tree with best move only at each node, which is good for gaming, but loses interest for statistical knowledge of the tree.

Good luck :) Nerd sniping complete :)

andopp · on Jan 11, 2022

I think it's better to not repeat too much from previous guesses. I don't think my greedy algorithm is optimal, and also there are multiple ways to define "optimal". For instance, you might optimize average number of moves, or try to lower the upper bound for all possible words.

andopp · on Jan 11, 2022

Oh I misunderstood the comment I was replying to. Sometimes it is better to repeat letters, because it's a balance between discovering letters and figuring out the order.

jhgb · on Jan 11, 2022

You have a list of legal words? Where is it available?

bfung · on Jan 11, 2022

Goto wordle, right click “view source”, click and view the “main.<hash>.js” file.

Search a word from previous answers, there are 2 arrays that the game uses to check with. Copy those and that’s the dictionary.

Spoiler alert: the first array is the answer array for even future wordles.

jhgb · on Jan 11, 2022

> Spoiler alert: the first array is the answer array for even future wordles.

That's presumably the reason why I wouldn't want to view the source.

bfung · on Jan 11, 2022

This lists are pretty big where if you were to copy and paste them real fast, your eyes would glaze over them enough to not know the answers... maybe just the next day ;)

andopp · on Jan 11, 2022

There is also /usr/share/dict/words or https://github.com/dwyl/english-words which can be used.

jhgb · on Jan 11, 2022

I actually have several such lists already, but if the actual list of candidate words in this game is a subset of the ones I have, I imagine solution can be arrived at quicker by excluding guesses that couldn't possibly lead to the solution.

detaro · on Jan 11, 2022

in the source code of the site.

grenoire · on Jan 11, 2022

A little tip, you don't really have to generate the 5-letter word list, as Wordle's client-side script already does this validation over a known list of around 2200~ words. It's unobfuscated, you should be able to pick it up easily.

ghusbands · on Jan 11, 2022

WARNING: Do not do this if you care to try the puzzle in future. All of the past and future target words are listed in sequence in the source.

(I only got spoiled for tomorrow before I noticed.)

caitlinface · on Jan 11, 2022

This gave me an idea for a fun experiment...

Take the future word list, and each day use tomorrow's word as much as can be naturally fit into normal conversation.

Then see if all of my friends start to score better, since they were primed.

(I am too protective of spoilers to actually do this, but still.)

grenoire · on Jan 11, 2022

In order!? Oh man, I was just planning on reverse-engineering the network connection, it seems like there's none!

dejawu · on Jan 11, 2022

We can do a lot better than random on our first guess, though. Playing the game as a human, I found that it helped to eliminate (or confirm the presence of) vowels and common consonants ASAP. A quick search for five-letter words with three distinct vowels (can't get more than that without straying beyond common English) and the most common consonants (R, S, T) yields words like "autos", "raise", and "suite", which I've anecdotally found to make for great starting words.

bryans · on Jan 11, 2022

I almost feel like packing the first word with vowels isn't even particularly helpful. I've kept with the same two starting words -- slate and round -- and it has been very easy to guess by the third turn every day. In fact, the only day it took more (5) is when I strayed from using round as the second word. If nothing from those two words hits, the remaining letters leave only very obvious words using I and/or Y.

There really aren't that many 5-letter words, so there's going to be a few specific combinations of two starting words that rapidly whittle the list.

dfan · on Jan 11, 2022

You might enjoy the extra challenge of hard mode (access it through the gear icon), where every guess you make has to be a potential solution based on the existing history.

IanCal · on Jan 11, 2022

Something it doesn't stop is reusing eliminated letters, which adds a little more to it. You can do that yourself though.

Imnimo · on Jan 11, 2022

I wrote a script to calculate the expected information gain for any possible guess. It has several caveats - it probably doesn't share exactly the same word list as the game, it assumes that all valid words are equally likely, etc. But apparently TARES is the most informative first guess in expectation. RATES is also very close, and feels a bit more plausible as possible answer.

XYLYL is the worst.

thom · on Jan 11, 2022

I generally start with TEARS, INDOL and CHUMP, which seemed to match the 15 most common letters in the right order in the word list I had available at the time.

maxerickson · on Jan 11, 2022

Part of what makes it fun is that a random guess can be a lot stronger against a puzzle than the common word guess.

sgk284 · on Jan 11, 2022

ADIEU seems to be a popular four vowel choice. Seems to work pretty well in practice.

Imnimo · on Jan 11, 2022

It would be interesting to see if you could solve the puzzle by looking at the grids people post on Twitter. Each grid shows you the colors they got from the guesses, but not the letters. For a particular grid to be possible, there must be a sequence of valid words that produce that grid given the day's answer. If you have hundreds of peoples' grids for the day, is that sufficient to narrow down the space of possible words to 1?

RobertoG · on Jan 11, 2022

I would appreciate recommendations for learning about SMT solvers. A good book, for instance?

ahelwer · on Jan 11, 2022

Programming Z3 https://theory.stanford.edu/~nikolaj/programmingz3.html

fallingmeat · on Jan 11, 2022

Another "popular" option is Yices: https://yices.csl.sri.com

bfung · on Jan 11, 2022

Tangent: for people on the job interview circuit, wordle is basically a modified on hangman game. It's worth coding an algorithm here here in case you're asked to do a take home hangman code up!

zvrba · on Jan 11, 2022

So someone "reinvented" mastermind with letters instead of colors.

aaronsdevera · on Jan 11, 2022

or, you know, you could just paste this into the javascript console.

JSON.parse(window.localStorage.getItem("gameState")).solution