Visit the “Tak Talk” Discord and you’ll quickly notice that Tak players love to analyze games. Less common is a broader statistical analysis of many higher-level (i.e., non-casual) games, such as all the games of a tournament. Statistics is a powerful tool but, like any tool, it must be used properly. I am familiar enough with statistics to be dangerous…to myself! Nevertheless, let’s take a look at one of the premier Tak tournaments: the USTA's 2020 U.S. Open.
The 2020 U.S. Open gives us access to a rich new (though limited) dataset of competitive games to analyze. Just to be upfront, few variables correlated strongly with victory. This is too bad, as a “One Quick Tip to Dominate the Tak Board” would garner far more interest. Still, we can gain some insight into the game by examining the correlations that I did find, visualizing the data, and determining what the top players have in common. The following is a collection of my thoughts after spending time analyzing the data. As with all statistics, keep in mind that this is a limited dataset from a specific setting, and that correlation does not imply causation.
First Player Advantage (FPA) outpaced previous observations.
Out of 94 games, 60 (64%) were won by white and only 34 (36%) were won by black. This is a pretty large gap, though it had a limited effect on the tournament, as each player played an equal number of games as each color. In addition, this advantage was rarely enough to overcome discrepancies in skill level, as the games won by black were mostly won by much stronger players. The average Elo difference between players in games won by white was white +120, and 14 (30%) were won by the player with the lower Elo. The average difference in games won by black was black +203, and only 5 (15%) of games won by black were won by the player with the lower Elo.
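As a rough check on how unusual a 60 to 34 split would be if neither color had an edge, here is a minimal sketch of an exact two-sided binomial test using only the counts quoted above. It assumes every game is an independent coin flip, which is not really true in a tournament where the same players meet repeatedly and skill gaps matter, so treat the p-value as a ballpark figure. The function name `binomial_two_sided_p` is just an illustrative helper.

```python
from math import comb


def binomial_two_sided_p(wins: int, games: int) -> float:
    """Exact two-sided p-value for `wins` out of `games` under a fair 50/50 split."""
    # At p = 0.5 the distribution is symmetric, so double the tail that is
    # at least as extreme as the observed count (capped at 1).
    k = max(wins, games - wins)
    tail = sum(comb(games, i) for i in range(k, games + 1)) / 2 ** games
    return min(1.0, 2 * tail)


# Counts taken from the article: 60 white wins and 34 black wins.
white_wins, black_wins = 60, 34
games = white_wins + black_wins
white_share = white_wins / games
p_value = binomial_two_sided_p(white_wins, games)

print(f"White won {white_share:.0%} of {games} games; two-sided p = {p_value:.4f}")
```

A small p-value here only says the split is unlikely under a pure 50/50 model; it does not separate first player advantage from who happened to be paired with whom.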
Elo is a decent predictor of final tournament placement.
The lower and upper Elo rankings align with the final tournament placement as we would expect. However, the correlation breaks down somewhat from 3rd to 8th place. This could be a deficiency in the rating system (i.e., not enough games have been played by these players to capture their true skill). Another possibility is that there is simply not much of a difference in skill among these mid-level players. Perhaps day-to-day variability in performance is large enough to swamp the differences in true skill that Elo can capture only with a much larger dataset.
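For a sense of scale on the rating gaps quoted in the first player advantage section (+120 and +203), the standard Elo expected-score formula can be sketched as follows. This assumes the conventional logistic curve with a 400-point scale; the rating system actually used for these players may differ in its details, and the formula ignores color entirely, so the output is only a ballpark.

```python
def expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score for the higher-rated player under the standard Elo curve.

    `rating_gap` is (player rating - opponent rating). The 400-point scale
    is the conventional Elo choice and is an assumption here.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# Gaps of 120 and 203 are the average differences reported in the article.
for gap in (0, 120, 203):
    print(f"Elo gap {gap:+d}: expected score {expected_score(gap):.3f}")
```

Under these assumptions, even a 200-point favorite is expected to drop a meaningful share of games, which fits the idea that mid-table placements can shuffle without the ratings being wrong.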
Starting in adjacent corners was most common.
Only 29 games started in a position other than adjacent corners. Of these games, black won only 6 (21%). Of course, a few players accounted for a majority of the non-adjacent starts, which skews the data, but we still see a trend of higher-placing players utilizing adjacent starts. Below is the starting position of each game, organized by the tournament placement of the player who played black (black usually determines the starting position).
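Separately from the chart, the counts quoted in this article are enough to compare black's results across the two kinds of start. The sketch below derives the adjacent-corner numbers by subtraction (94 games and 34 black wins overall, 29 games and 6 black wins with non-adjacent starts) and then asks how often black would win 6 or fewer of 29 games if starting position made no difference. That last step is a one-sided hypergeometric tail, the same calculation behind Fisher's exact test. It assumes games are independent, which the skew toward a few players makes doubtful; `hypergeom_lower_tail` is an illustrative helper name.

```python
from math import comb


def hypergeom_lower_tail(k: int, draws: int, successes: int, population: int) -> float:
    """P(X <= k) when `draws` items are taken from `population` containing `successes`."""
    total = comb(population, draws)
    favorable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k + 1)
    )
    return favorable / total


# Counts taken from the article.
games, black_wins = 94, 34
non_adjacent_games, non_adjacent_black_wins = 29, 6

# Adjacent-corner counts follow by subtraction.
adjacent_games = games - non_adjacent_games
adjacent_black_wins = black_wins - non_adjacent_black_wins

print(f"Black win rate, non-adjacent starts: {non_adjacent_black_wins / non_adjacent_games:.0%}")
print(f"Black win rate, adjacent starts:     {adjacent_black_wins / adjacent_games:.0%}")

p_low = hypergeom_lower_tail(non_adjacent_black_wins, non_adjacent_games, black_wins, games)
print(f"Chance of 6 or fewer black wins in 29 games if starts did not matter: {p_low:.3f}")
```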
Capstone placement was early.
The average number of turns before a Capstone was placed was 6.8 for black and 8.2 for white. There was no correlation between when each side played its Capstone and the outcome of the game. However, there is a trend that players who finished higher in the tournament used their Capstones early, especially as black. This result may just reflect more experienced players following the meta of using their Capstone early, but we see that AI Tak bots also often play a Capstone early. This may be a tactic you should consider trying out if you find that your Capstones are less effective than you might hope.
Black did not play walls much earlier than white, on average.
The average number of turns until a wall is played is essentially the same for white and black (15 and 14.3, respectively). I also don’t see a clear pattern in when walls are first placed by players of different skill levels. More interesting is the number of walls used. On average, the winning player placed 1 wall while the losing player placed 1.5. We also see that our two top players rarely placed 0 walls, even as white (though I admit our number 3 player often used 0 walls).
Win conditions are unclear.
It is unclear whether one color is favored by specific win conditions. Only 8 (8.5%) of the games ended with a non-road victory (meaning the game ended with a timeout or a flat count). Black and white each won 4 of these games. This is not enough data to get a clear picture of what is going on, though we can certainly say that road victories are by far the most common! In the future, it would be interesting to further break down the road victories by Tinuë vs missed threats. This also doesn’t account for times a player “gave up” due to time or some other issue, such as falling behind on flats and making desperate moves.
The winner generally played more flats.
Obviously, flats are necessary to win and having more on the board helps. However, the number of flats placed is hard to analyze because the biggest predictor of the number of flats played is the number of turns in the game. In general, the winner of the game played more flats than the loser, and the flat difference is much narrower in games won by black than in games won by white. While the average difference in flats was 1.7 for games won by white and 1.2 for games won by black, this doesn’t tell the whole story. The most common flat difference in games won by black was black +0, and only 26% of those games were won with a difference of less than 0. In contrast, the most common flat difference in games won by white was white -1, and 38% of those games were won with a difference of less than 0. This indicates that achieving a small flat lead as black is often not enough. In order to win, the black player will often have to achieve a lead of two or more flats. This highlights the importance of forcing your opponent to move or throw walls rather than increasing their flat count.
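To illustrate why the average alone can mislead, here is a small sketch of the three summaries used above: the mean flat difference, the most common difference, and the share of wins achieved while behind on flats. The function `summarize_flat_diffs` and the list `placeholder_diffs` are hypothetical; the values are made up purely to show the calculation and are not tournament data. Each value is assumed to be the winner's flats minus the loser's flats.

```python
from statistics import mean, multimode


def summarize_flat_diffs(diffs: list[int]) -> dict:
    """Summarize flat differences (winner's flats minus loser's flats)."""
    return {
        "mean": mean(diffs),
        # multimode returns every value tied for most common.
        "modes": multimode(diffs),
        # Share of wins where the winner had placed fewer flats than the loser.
        "share_behind": sum(d < 0 for d in diffs) / len(diffs),
    }


# Made-up placeholder values, NOT taken from the 2020 U.S. Open dataset.
placeholder_diffs = [0, 0, 2, -1, 3, 0, 1, -2]
summary = summarize_flat_diffs(placeholder_diffs)
print(summary)
```

Running the same function separately on games won by white and games won by black would reproduce the kind of comparison described above, given the real per-game differences.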
Consider this an initial and tentative step toward exploring what statistical analysis of tournament-level games can (or cannot) tell us. Despite the limitations of this specific dataset, it was an interesting journey into looking at the USTA 2020 Open from a new perspective.
If you’d like to access the dataset to run your own analysis, please contact the Tak Times.