Making Jeopardy! Predictions: A Methodology

Note: This methodology has been updated.

You’ve probably seen me recently, either here on the sidebar or on Twitter, quoting the chances a player has of winning a certain number of games on the show.

As a note: this article will rely heavily on discussion of a Coryat score. Here’s the definition, via J! Archive:

n. a player’s score if all wagering is disregarded. In the Coryat score, there is no penalty for forced incorrect responses on Daily Doubles, but correct responses on Daily Doubles earn only the natural values of the clues, and any gain or loss from the Final Jeopardy! Round is ignored.

In other words: give a player credit (or neg) for the clue value of any clue they get correct (or incorrect). Negs on Daily Doubles don’t count as they are considered a “forced guess”. Coryat scoring tracks “unforced errors”, not “forced ones”.
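As a rough illustration of that definition, here is a minimal Python sketch. The `Response` record and its field names are made up for this example and are not J! Archive's data format. Final Jeopardy! is ignored simply by leaving it out of the list.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record for one clue a player responded to.
    clue_value: int            # natural board value of the clue
    correct: bool              # whether the response was correct
    daily_double: bool = False


def coryat_score(responses):
    """Sum natural clue values, ignoring wagers and Daily Double misses."""
    total = 0
    for r in responses:
        if r.correct:
            total += r.clue_value   # Daily Doubles earn only the natural value
        elif not r.daily_double:
            total -= r.clue_value   # unforced error costs the clue value
        # An incorrect Daily Double is a forced guess: no penalty.
    return total


# Made-up game: two regular clues and two Daily Doubles.
game = [
    Response(400, True),
    Response(800, False),
    Response(1200, True, daily_double=True),
    Response(2000, False, daily_double=True),
]
score = coryat_score(game)
```

Here the missed $2,000 Daily Double costs nothing, while the missed $800 clue is subtracted.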

This project started out when I asked myself “I wonder what someone’s chances of winning a game were, given their specific Coryat score?”

For this, I took a look at every single regular-play game in J! Archive from October 4, 2004 through to March 30, 2016 – a period of over 2,000 games. I then sorted each game’s Coryat scores into “winning” and “non-winning” and took the winning percentage at each level.
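The sorting step can be sketched roughly like this in Python. The `(coryat, won)` input format is an assumption for illustration, and the sample rows are invented, not real game data.

```python
from collections import defaultdict


def win_rate_by_coryat(results):
    """results: iterable of (coryat, won) pairs, one per player per game."""
    tally = defaultdict(lambda: [0, 0])   # coryat -> [wins, appearances]
    for coryat, won in results:
        tally[coryat][1] += 1
        if won:
            tally[coryat][0] += 1
    # Winning percentage (as a fraction) at each Coryat level, lowest first.
    return {c: wins / n for c, (wins, n) in sorted(tally.items())}


# Invented sample rows, only to show the shape of the output.
sample = [(15000, True), (15000, False), (8000, False), (22000, True)]
rates = win_rate_by_coryat(sample)
```

With real data, each level would need enough games behind it for the percentage to mean much, which is one reason to fit a trend line rather than read the raw rates directly.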

Why October 4, 2004? For the beginning of Season 21’s tapings, Jeopardy! realized that their challengers might not be getting enough rehearsal time on the buzzer; Ken Jennings’ streak made this very apparent. October 4, 2004 was the first episode of the “extra rehearsal time”.

The below graph has the plot of Coryat vs winning percentage, along with the trend line that fits the data.

Figure 1: Winning percentages for each Coryat score in regular play games from October 4, 2004 – March 30, 2016

That trend line thus gives us a corresponding win percentage for each Coryat score.

To account for certain improbable occurrences that have yet to actually happen, any Coryat lower than $3,200 is assigned a 1% chance of winning, and any Coryat higher than $28,000 is assigned a 99% chance of winning.
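The floor and ceiling could be applied along these lines. The fitted trend line itself is not given in this article, so `placeholder_trend` below is a purely hypothetical straight line standing in for it; only the $3,200 / 1% and $28,000 / 99% cutoffs come from the text.

```python
LOW_CORYAT, HIGH_CORYAT = 3_200, 28_000
FLOOR, CEILING = 0.01, 0.99


def placeholder_trend(coryat):
    """HYPOTHETICAL stand-in: straight line from (3200, 1%) to (28000, 99%)."""
    frac = (coryat - LOW_CORYAT) / (HIGH_CORYAT - LOW_CORYAT)
    return FLOOR + frac * (CEILING - FLOOR)


def win_probability(coryat, trend=placeholder_trend):
    """Win chance for a Coryat, with the 1% floor and 99% ceiling applied."""
    if coryat < LOW_CORYAT:
        return FLOOR
    if coryat > HIGH_CORYAT:
        return CEILING
    return trend(coryat)
```

Passing the real fitted curve as `trend` would swap out the placeholder without touching the cutoff logic.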

I’ll be updating my data and making small adjustments to the trend line about four times a season.

To run my simulations, I have a Python script that generates 1,000,000 normally distributed Coryats, using the mean and standard deviation of the player’s Coryats thus far. Averaging the winning percentage across those simulated Coryats gives the player’s chances of winning their next game. From there, I get a predicted length of their streak: r / (1 – r), where r is the player’s chances of winning one game. (That is the expected number of further wins if r stays the same from game to game.)
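A simulation of this kind might look like the sketch below. It is not the actual script: `win_probability` is a hypothetical linear stand-in for the fitted trend line, the use of the sample standard deviation is an assumption, and the three Coryats in the example are invented.

```python
import random
import statistics


def win_probability(coryat):
    """HYPOTHETICAL stand-in for the fitted trend line, clamped to 1%-99%."""
    if coryat < 3_200:
        return 0.01
    if coryat > 28_000:
        return 0.99
    return 0.01 + 0.98 * (coryat - 3_200) / (28_000 - 3_200)


def simulate(coryats, draws=1_000_000, seed=None):
    """Return (r, expected further wins) from normally distributed Coryats."""
    rng = random.Random(seed)
    mu = statistics.mean(coryats)
    sigma = statistics.stdev(coryats)   # sample std dev (assumption); needs 2+ games
    total = 0.0
    for _ in range(draws):
        total += win_probability(rng.gauss(mu, sigma))
    r = total / draws                   # mean win chance over all draws
    return r, r / (1 - r)               # streak estimate from the text


# Invented Coryats; fewer draws than 1,000,000 to keep the example quick.
r, streak = simulate([18_000, 21_400, 15_200], draws=100_000, seed=1)
```

Because the win chance is clamped at 99%, r never reaches 1 and the r / (1 – r) division is always defined.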

For runs that have yet to reach 5 games, I will be adding an extra game, with the mean Coryat, into the calculation, to account for the possible case that a player had an outlying performance.
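One way to sketch that adjustment in Python is below. Which mean the extra game uses is not spelled out here, so it is left as the `extra_coryat` parameter; the function name, the sample standard deviation, and the example numbers are all assumptions for illustration.

```python
import statistics


def padded_stats(coryats, extra_coryat, min_games=5):
    """Return (mean, stdev), adding one extra game for runs under min_games."""
    sample = list(coryats)
    if len(sample) < min_games:
        sample.append(extra_coryat)     # the extra game described above
    return statistics.mean(sample), statistics.stdev(sample)


# Invented two-game run with a made-up extra-game value of 13,000.
mean, stdev = padded_stats([24_000, 26_000], extra_coryat=13_000)
```

In this made-up case the extra game pulls the mean down from 25,000 to 21,000, which tempers a hot start.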

4 Comments on "Making Jeopardy! Predictions: A Methodology"

DeeDee McCormick | May 26, 2016 at 2:50 pm |

I have my husband tape it for me, so I can watch when I get home. It’s a pleasure seeing Buzzy win.
Knowledge Dropper | September 21, 2016 at 5:22 pm |

Does the methodology take into account the fact that many games feature fewer than 60 clues (30 clues each in the Jeopardy and Double Jeopardy rounds) revealed?
- Andy Saunders | September 21, 2016 at 5:55 pm |
  
  That’s the beauty of having all of the numbers from actual games make up the model. Because there are some 61-clue games and some games with significantly fewer clues in the model itself, looking at an independent $15,000 Coryat game (to throw an example number out there) gives you the winning percentage with the historical proportion of full-board and not-quite-full-board games calculated in already.
Harley M Littlejohn | December 18, 2017 at 10:05 pm |

Denise Littlejohn is my spouse’s name in Maryland. How ironic. 🙂 Might be related.