Theorycrafting and quantification

At its highest levels, Scrabble is a math game. As you proceed through the ranks, you will hear numerous theories about advanced techniques: perhaps you will even come up with some theories yourself. Many of these theories sound good at first, but are they good ideas? How much do these theories matter?

One skill that is important for any top player is evaluating and quantifying concepts to test whether they are valuable metrics that can help you improve at Scrabble. This article focuses on giving you the tools to evaluate and quantify the various ideas you might hear about Scrabble.

In this article, I'll be covering a handful of concepts and quantifying them, but the key here is not the evaluation of these particular concepts: it's showing you how to evaluate concepts so you can think about and evaluate them on your own. For this reason, I've developed a set of steps that you should use to evaluate any potential Scrabble concept:

1. Understand the hypothetical reasoning and logic behind the concept. Is the question generalizable, or are there subcases that have dramatically different effects?

2. Translate the concept into mathematics and statistics.

3. Think about how you would refute such a theory, and its possible negatives.

4. Devise an experiment that will allow you to test and/or quantify this theory. Often, this experiment will involve simulation of some sort.

5. Run the experiment and view the results. Review the results to see if they make any intuitive sense.
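
Steps 4 and 5 can be sketched roughly as follows, assuming the simulation output has already been collected as two lists of opponent scores, one list per candidate play. The helper name `compare_plays` and the sample numbers are hypothetical placeholders, not results from any real simulation.

```python
import math
import statistics

def compare_plays(scores_a, scores_b):
    """Return (mean difference, standard error) for two lists of simulated scores.

    Hypothetical helper: each list holds the opponent's score on the
    following turn across many simulated iterations of one candidate play.
    """
    diff = statistics.mean(scores_a) - statistics.mean(scores_b)
    # Standard error of the difference between two independent sample means
    se = math.sqrt(
        statistics.variance(scores_a) / len(scores_a)
        + statistics.variance(scores_b) / len(scores_b)
    )
    return diff, se

# Placeholder numbers for illustration only, not real simulation output
play_a = [10, 20, 30, 40]
play_b = [10, 15, 20, 35]
diff, se = compare_plays(play_a, play_b)
print(f"difference {diff:.2f} +/- {se:.2f}")
```

A difference that is small relative to its standard error suggests running more iterations before drawing any conclusion.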

Idea 1: Is it bad to leave an S as the last letter in a 3W alley?

Intuition: The S can pluralize any singular noun or unconjugated verb, making it extremely dangerous to leave in a 3W alley.

Translation: In reality, this question has nothing to do with a 3W alley: it's just an idea of quantifying how often an S pluralizes a word. First, it's useful to pull some statistics:

~34% of 8 letter words end in S, more than triple any other letter

~11% of 8 letter words *start* with S, more than any other letter
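
Statistics like these can be pulled from any word list. The following is a minimal sketch, assuming the lexicon is available as a plain list of words; the function name `letter_shares` and the tiny sample list are illustrative assumptions, and the sample is far too small to reproduce the percentages above.

```python
from collections import Counter

def letter_shares(words, length=8, position=-1):
    """Return {letter: fraction} for the letter at `position` among words of `length`.

    position=-1 looks at the last letter, position=0 at the first letter.
    """
    picked = [w.upper() for w in words if len(w) == length and w.isalpha()]
    if not picked:
        return {}
    counts = Counter(w[position] for w in picked)
    total = len(picked)
    return {letter: n / total for letter, n in counts.items()}

# Toy list for illustration only; a real run would load a full lexicon
sample = ["RETAINS", "STAINERS", "NOTARIES", "SENORITA", "ANTERIOR", "QI"]
ending = letter_shares(sample, length=8, position=-1)
starting = letter_shares(sample, length=8, position=0)
```

Sorting either dictionary by value gives the ranking of letters, which is how a claim like "more than triple any other letter" could be checked.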

It's unquestionable this is going to have an effect. Now we have to *quantify it*. The first question we have to ask is "compared to what". If it's compared to nothing, well, it's important to keep in mind that in most decision making processes it's going to be the S *versus* some other tile. Even putting a V in that position will have some effect that will increase your opponent's chances of playing a bingo.

We'd like to create as simple of a position as possible, but focus on the S only ending words. To do this, we'd first like to start with as plain of a board as possible and run some simulations.

We can run the following comparisons:

KILNS 8d vs. KILNS 8h (Score of opponent)

LINK vs. SLINK (opponent bingo percentage)

SLINK 8h vs. SLINK 8f: (Score); then compare to KILNS comparison

SLINK 8h vs. CLINK 8h (Opponent's score: focusing on ending words)

Some notes:

Opponent bingos over 25% of the time with a random rack. In game, most racks are not random, so it will often be higher.

The difference from Triple to Double should be around the same as the difference between Double to Single, and the difference between KILNS 8h and 8d includes scoring. Thus, the difference between KILNS 8h and 8d should be smaller than the other comparisons.

CLINK annuls the starting effect: it starts almost the same number of words as the S, but those bingos score 6 more points.

The S adds about a 2% likelihood to bingos as opposed to no S in LINK vs. SLINK: in other words, when 7 letter bingos are largely available, the S still does make more bingos playable.

Conclusion: The effect of the S is about 3 points in a Triple lane. This makes sense: with the opponent bingoing roughly 25% of the time, a 3 point average effect implies that at a 100% bingo rate, an S in a triple lane adds about 12 points (3 / 0.25) to an average score. Obviously, the higher the points on your rack, the higher the score difference, but the less likely the bingo.

Results:

CLINK vs. SLINK is about a 3 point effect.

KILNS vs. KILNS also implies about a 3 point effect (1.1 x 2 for Double, + 1 point for scoring effect)

SLINK vs. SLINK implies a 4 point effect, but it greatly neutralizes scoring as well as bingos, so 3 points also seems reasonable.

Thus, while this effect does exist and is significant, it might not be as significant as first thought. The S matters far more for making bingos playable than for increasing the score of those bingos.

Idea 2: Playing more tiles allows you to get to the blanks and S tiles more quickly

While Idea 1 could be answered primarily through data, Idea 2 will be answered more through straight up theory. The first thing we need to do is quantify the worth of the S and the blank.

Intuition: The blank and S are worth substantially more than all the other tiles, most of which are worth pretty close to zero. Getting these tiles first also deprives your opponent of these valuable tiles.

Mathematics: The blank is worth 24, while the S is worth 12.

Scrabble is a zero sum game, meaning that the effect of these tiles is doubled.

Hidden information often informs us that the bag is probably weaker than the opponent's rack in most cases, depending on the history of play.

Based on pure math, if the blank is the only tile in the bag and either we draw it or the opponent draws it, the value of drawing an extra tile is +48. If there are 3 tiles in the bag, it becomes 48/3, or +16. And if there are 12 tiles, it's 48/12, or +4 (as a tile equity, not accounting for entropy etc.). Again, this assumes that these tiles are in the bag.
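
The arithmetic can be written out as a short sketch. It assumes exactly one premium tile is in the bag and uses the article's values of 24 for the blank and 12 for the S; the function name `extra_draw_equity` is a hypothetical label, and the sketch ignores entropy and the rest of the bag, just as the paragraph above does.

```python
BLANK_VALUE = 24  # blank value as stated in the article
S_VALUE = 12      # S value as stated in the article

def extra_draw_equity(tile_value, tiles_in_bag):
    """Equity of drawing one extra tile when one premium tile sits in the bag.

    Zero sum: drawing the tile gains its value and denies it to the
    opponent, so the swing is twice the tile value, spread evenly over
    every tile that could be drawn.
    """
    if tiles_in_bag < 1:
        raise ValueError("the bag must contain at least one tile")
    swing = 2 * tile_value
    return swing / tiles_in_bag

for bag in (1, 3, 12):
    print(bag, extra_draw_equity(BLANK_VALUE, bag))
```

The same function applied to `S_VALUE` gives half these figures, which shows how quickly the turnover incentive shrinks once the bag holds more than a few tiles.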

This also requires accounting for the entire bag, including the bad tiles, entropy, etc. While turnover is extremely relevant in late game or preendgame situations (especially with two blanks in the bag), it isn't such a big factor earlier in the game, barring extreme circumstances.

Idea 3: How much is it worth to block off an entire section of the board?

Score: 154-78

Last play: Exch. 4

This question is significantly harder because now we're trying to quantify something in terms of valuation (points and leave) ALONG with entropy. Entropy is extremely situation dependent: some boards are going to be more closed than others, and the effect of closing a quadrant is never fully permanent, since it's almost always *possible* to open a closed quadrant.

As a general question, this is nearly impossible to answer. However, we can take an individual position and try to quantify it, and then take those answers in aggregate and look for general patterns that give a rough estimate of this effect in a more general way.

Idea 4: When you are behind, it is a good idea to play fewer tiles to prolong the game

This is a purely theoretical idea that has some merit. To take an extreme example, let's say that you are behind by 100 points and there are 14 tiles in the bag. Let's say you have a choice between playing off 3 tiles and playing a bingo next turn 50% of the time, playing off 2 tiles and playing a bingo 40% of the time, or playing off 1 tile and playing a bingo on your next turn 20% of the time. The problem is that if your opponent can play enough tiles such that there are 6 or fewer tiles in the bag, you are essentially dead. If there are exactly 7 tiles left, your opponent will know your exact tiles and will be able to block any chance at your last bingo a huge percentage of the time. Playing one tile is almost definitely the best option.

However, this idea clearly has less merit when there are, say, 50 tiles left in the bag and you are down by 70 points. In this case, giving up 5 points to keep an extra tile in the bag is somewhat silly, especially because the 50 point bingo bonus creates a huge discrete effect of randomness.

How do we quantify this?

The first thing to note is the *distribution* of this effect. At 14 tiles this effect is huge, but the effect at 16 tiles is much smaller, and at 17 tiles it is very small. On the other hand, at 20 tiles it's not much different than at 17 tiles, because your opponent will still play off enough tiles to get back into this situation. It's a weird, curvy exponential distribution that flattens out the further you get away from 14 tiles in the bag, in the above example. And that means that the utility curve also gets much flatter.

The second thing to note is how much valuation we are giving away. The natural approximation (albeit not exact) is that the value of playing a tile is something like #pointsbehind/tiles remaining. But that implies that Scrabble scoring follows a monotonic, normal distribution, which is clearly not the case because of the significant bingo bonus. As a result, when you subtract 50 from such a formula for the bingo, you ultimately get very small numbers.
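
To see how small the numbers get, here is a minimal sketch of that approximation. The function names are hypothetical, and subtracting the 50 point bonus from the deficit before dividing is only one assumed reading of "subtract 50 from such a formula".

```python
BINGO_BONUS = 50  # the 50 point bingo bonus

def naive_tile_value(points_behind, tiles_remaining):
    """Rough per-tile value: the deficit spread over the tiles remaining."""
    if tiles_remaining <= 0:
        raise ValueError("tiles_remaining must be positive")
    return points_behind / tiles_remaining

def bingo_adjusted_tile_value(points_behind, tiles_remaining):
    """Same approximation after removing one bingo bonus from the deficit.

    Assumed interpretation, not a formula given in the article.
    """
    adjusted_deficit = max(points_behind - BINGO_BONUS, 0)
    return naive_tile_value(adjusted_deficit, tiles_remaining)

# The two situations discussed above: down 100 with 14 tiles, down 70 with 50
for behind, tiles in ((100, 14), (70, 50)):
    print(behind, tiles,
          round(naive_tile_value(behind, tiles), 2),
          round(bingo_adjusted_tile_value(behind, tiles), 2))
```

Under this reading, the per-tile value with 50 tiles remaining falls well below a single point, in line with the claim that giving up 5 points to keep a tile in the bag is a poor trade there.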

Upon investigation, it seems fairly clear that while this effect exists, this isn't a very significant effect.

Idea 5: Leaving vowels next to the DLS square is not a bad idea

The main intuition here is that most plays using DLS squares on turn 2 really don't score that much and often give back a lot on turn 3 due to TLS squares being far more potent. These facts are definitely true, but the effect is not zero. The main issue is that this is a case of overgeneralization. Rather than think of them as squares, it's more useful to think of them as parallel opportunities. A play like PINITOL with the P on the star (exposing the I and O to the DLS squares) is very different than a play like CIVIC (exposing the I, but offering no parallel options).

As you can see, it turns out that it depends on the parallel-ability of the vowels next to the DLS squares. This also changes based on the letters that you keep or don't keep in your rack (read: high point tiles) and the parallel letters left (A, I, and O are more dangerous than E and especially U).

Idea 6: Passing with IORSTUV and like racks is viable when balanced by weaker racks

Since many of the ideas so far have been somewhat simplistic, I decided to include this much more advanced idea.

Intuition: IORSTUV and similar racks are extremely synergistic and only suffer from the unusual circumstance of a blank board. By passing, especially against opponents who are overly averse to putting consonants on the center square (since it often puts vowels next to DLS squares), you will get a high scoring bingo.

A huge part of this problem is that your opponent might figure this out and break your expectations if they do have a good rack. For example, your opponent might trade, then trade again, then bingo with a word like TONNERS but place the first N on the star because they suspect you hold exactly the kind of rack you have. And in reality there's no way to balance this, since there are so many consonants and since you can trade on your first rack. Against opponents who are capable of thinking on this level, this concept falls apart very quickly. Heck, they don't even need a bingo: even something like QI with the Q on the star, keeping a strong bingo leave, really takes a lot of the sting out of this strategy. Luckily, only the most advanced of players are capable of figuring this out.

Because of this, it isn't a very large mathematical effect, but only an exploitive effect.
