The Playoff Odds - An Explanation and AMA

Rockies GM (Dan)
Trade Panel Member

Posts: 2,279

Post by Rockies GM (Dan) on Aug 12, 2018 16:58:50 GMT -8

Explaining the Playoff Odds Model

The podcast isn’t the first time I’ve heard or read someone from the league dismiss the playoff odds. While I’m open to peer review, it does dishearten me to hear someone say “I don’t trust them” out of hand, without learning how they work or why they say what they do. What that tells me is that I’ve maybe been a bit too proprietary with my work and methodology. So, in the spirit of transparency, not only will I explain everything I can here, I will also open this up to any and all questions. If you wonder why or how I do something, please feel free to ask and I’ll respond.

Accumulation of Statistics

Before I get into the nitty-gritty of simulation, let’s talk about where I get my numbers. At the beginning of the season, I pull all projections from the FanGraphs Depth Charts projections, which combine ZiPS and Steamer. In the middle of the season, I add the Depth Charts rest-of-season (RoS) projections to the statistics each team has already accumulated, pulled straight from the Standings>>Season Stats tab in the league.

I automated a process in which the program opens a Firefox browser, signs into Fantrax and grabs the .csv available for each team from the team’s home page. I then parse each list and check every player against the large Pitching and Batting projection .csv files gathered earlier. As a double check, I print every player I can’t find to the Java console. I’ve got methods in there for first names that don’t match between databases and for names with Jr., as either can throw off the simple search.
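The name matching described above could be sketched in Java roughly as follows. This is not the actual program: the class name `NameKey`, the method `normalize`, and the two nickname entries are hypothetical, chosen only to illustrate handling mismatched first names and a trailing "Jr.".

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds a lookup key so the same player matches
// across two databases that spell the name slightly differently.
final class NameKey {
    // Illustrative aliases only; the real list of mismatched first names
    // is not given in the post.
    private static final Map<String, String> FIRST_NAME_ALIASES = Map.of(
            "nick", "nicholas",
            "jake", "jacob");

    static String normalize(String rawName) {
        // Lowercase and drop periods so "Jr." and "Jr" look the same.
        String s = rawName.toLowerCase(Locale.ROOT).replace(".", "").trim();
        // Drop a trailing generational suffix (Jr, Sr, II, III).
        s = s.replaceAll("\\s+(jr|sr|ii|iii)$", "");
        String[] parts = s.split("\\s+", 2);
        if (parts.length < 2) {
            return s; // single-word name, nothing more to do
        }
        // Map a known nickname to its long form, if there is one.
        String first = FIRST_NAME_ALIASES.getOrDefault(parts[0], parts[0]);
        return first + " " + parts[1];
    }
}
```

With this sketch, `NameKey.normalize("Jackie Bradley Jr.")` gives `jackie bradley`, and "Nick Castellanos" and "Nicholas Castellanos" normalize to the same key. Any player whose key still has no match in the projection files would be the one printed to the console.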

Once it pulls projections for all players, I modify those numbers. There is a loose correlation between number of AB/IP expected per week and the percentage of a stat that is accumulated (since you cannot play every player on your roster at all times). Those formulae are:

xRuns = totalRuns*(1.3939-0.00447015*(AB/week))
xHR = totalHR*(1.44006-0.00469675*(AB/week))
xRBI = totalRBI*(1.77222-0.00720542*(AB/week))
xSB = totalSB*(1.38918-0.00462705*(AB/week))
xW = totalW*(1.03494-0.002418843*(IP/week))
xK = totalK*(1.033395-0.002332322*(IP/week))
xHLD = totalHLD*(1.097794-0.00406*(IP/week))
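All of these formulae share one shape: the roster's total projection multiplied by a factor that shrinks as AB/week or IP/week grows. A minimal Java sketch of that shape, using the runs and home run coefficients from the list (the class and method names are hypothetical, not the program's):

```java
// Hypothetical helper applying the linear playing-time discount.
final class ExpectedStats {
    // factor = intercept - slope * (AB or IP per week)
    static double expected(double total, double intercept, double slope, double perWeek) {
        return total * (intercept - slope * perWeek);
    }

    // xRuns = totalRuns*(1.3939-0.00447015*(AB/week))
    static double xRuns(double totalRuns, double abPerWeek) {
        return expected(totalRuns, 1.3939, 0.00447015, abPerWeek);
    }

    // xHR = totalHR*(1.44006-0.00469675*(AB/week))
    static double xHR(double totalHR, double abPerWeek) {
        return expected(totalHR, 1.44006, 0.00469675, abPerWeek);
    }
}
```

As an illustration with made-up inputs, a roster projected for 1,000 total runs at 150 AB/week gets a factor of 1.3939 - 0.00447015*150 = 0.7233775, so `ExpectedStats.xRuns(1000, 150)` comes out to roughly 723.4 expected runs.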

There are 5 categories that do not adhere to this correlation:

    - Saves – I count all saves. Generally, there is no historical drop off between the total saves accumulated vs saves counted. The reason for this is the fact that there are only 30 closers in the league, therefore players expected to earn saves are generally played every day.

And then there are the rate stats, which don’t fit the AB/IP per week correlation. There’s no way for me to really guess who any team will use, so what I do is the following:

    - Multiply each player’s rate stat (AVG, OPS, ERA, WHIP) by their projected AB or IP, whichever is relevant to the stat, and add up the results
    - Divide that sum by the team’s total AB or IP to get a good, weighted number
        o The benefit here is that your hitting stars, who are more likely to get ABs, count more towards the hitting stats
        o On the pitching side, it’s a good way to weight the contributions of starters and relievers towards the rate stats. If you’re the Astros, whose relievers accumulate a lot of innings, the relievers carry more of the weight. If you’re starter heavy, like the Giants, the starters do
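The two steps above amount to a playing-time-weighted average. A minimal Java sketch, with hypothetical names and plain arrays standing in for whatever structures the program actually uses:

```java
// Hypothetical helper: playing-time-weighted team rate stat.
final class RateStats {
    // rates[i] is player i's projected AVG/OPS/ERA/WHIP;
    // playingTime[i] is that player's projected AB (hitters) or IP (pitchers).
    static double weightedRate(double[] rates, double[] playingTime) {
        if (rates.length != playingTime.length) {
            throw new IllegalArgumentException("arrays must be the same length");
        }
        double weightedSum = 0.0;
        double totalPlayingTime = 0.0;
        for (int i = 0; i < rates.length; i++) {
            weightedSum += rates[i] * playingTime[i];
            totalPlayingTime += playingTime[i];
        }
        if (totalPlayingTime == 0.0) {
            throw new IllegalArgumentException("no playing time projected");
        }
        return weightedSum / totalPlayingTime;
    }
}
```

For example, a starter projected for a 3.00 ERA over 180 IP and a reliever projected for a 4.50 ERA over 60 IP combine to (3.00*180 + 4.50*60)/240 = 3.375, much closer to the starter's number because he throws three times the innings.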

Next, I add the expected counting stats to the stats already accumulated to that point in the season. For the rate stats, the formula looks a little something like this (for AVG, for example):

    - ((AccumulatedAVG*AlreadyPlayedWeeks) + (ProjectedAVG*WeeksToPlay))/TotalWeeks

Finally, a note that if a team is not expected to accumulate the minimum IP/week or AB/week, I assign an AVG/OPS of .000 and an ERA/WHIP of 10, to account for the automatic loss of those categories in Fantrax.
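A minimal Java sketch of the week-weighted blend and the minimum-playing-time penalty follows. The names are hypothetical, and the minimum thresholds are passed in as parameters because the post does not state the league's actual minimums.

```java
// Hypothetical helper for mid-season rate stats.
final class SeasonBlend {
    // ((Accumulated*AlreadyPlayedWeeks) + (Projected*WeeksToPlay)) / TotalWeeks
    static double blend(double accumulated, int weeksPlayed, double projected, int weeksToPlay) {
        int totalWeeks = weeksPlayed + weeksToPlay;
        return (accumulated * weeksPlayed + projected * weeksToPlay) / totalWeeks;
    }

    // Hitting rate stats (AVG, OPS) become .000 below the AB/week minimum.
    static double hittingRateOrPenalty(double rate, double abPerWeek, double minAbPerWeek) {
        return abPerWeek < minAbPerWeek ? 0.0 : rate;
    }

    // Pitching rate stats (ERA, WHIP) become 10 below the IP/week minimum.
    static double pitchingRateOrPenalty(double rate, double ipPerWeek, double minIpPerWeek) {
        return ipPerWeek < minIpPerWeek ? 10.0 : rate;
    }
}
```

With made-up inputs, a team hitting .270 through 19 weeks with a .255 projection over the final 2 weeks blends to (.270*19 + .255*2)/21, or about .2686. The accumulated number dominates late in the season because it carries most of the weeks.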

And that’s how all the teams get their stats.

Simulation

Now that we’ve got all of our statistics, it’s time to simulate. But first we need to sort and rank the teams in each stat. Then I assign a winning percentage to each stat, based on how often teams at that position in the rankings have won the category in league history (the correlation was strong-ish). The rank I use in the formulae corresponds to what you’d see on the Fantrax Standings>>Season Stats page (30 points to the league leader in that category; 1 to the worst statistical performer). These winning percentages end up being the main numbers the program uses.
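The ranking step could be sketched in Java as below. The names are hypothetical, ties are simply left in input order (the post does not say how they are handled), and the conversion from rank to winning percentage is deliberately left out because its coefficients are not given here.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper: rotisserie-style points within one category.
final class CategoryRanks {
    // Returns points per team: n for the best value down to 1 for the worst.
    // higherIsBetter is false for ERA and WHIP, true for everything else.
    static int[] points(double[] values, boolean higherIsBetter) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Comparator<Integer> byValue = Comparator.comparingDouble(i -> values[i]);
        // Sort worst-to-best so the sorted position maps to points 1..n.
        Arrays.sort(order, higherIsBetter ? byValue : byValue.reversed());
        int[] pts = new int[n];
        for (int pos = 0; pos < n; pos++) {
            pts[order[pos]] = pos + 1;
        }
        return pts;
    }
}
```

In a 30-team league this hands 30 points to the category leader and 1 to the last-place team, matching the Season Stats page. With three teams and run totals of 700, 650 and 720, the points come out as 2, 1 and 3.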

I have hard-coded the schedule into the program. When I start the program, it simulates the season week-by-week. Each game works the same way:

Team 1 has a winning percentage in the runs category of A
Team 2 has a winning percentage in the runs category of B

The accepted formula in simulations (Bill James’ Log5 Formula) is the following:

Winning Percentage for Team 1 = (A*(1-B))/((A*(1-B)) + ((1-A)*B))

Or, in other words: the chance that Team 1 wins multiplied by the chance that Team 2 loses, divided by that same number plus the chance that Team 1 loses multiplied by the chance that Team 2 wins.

Thus, if Team 1 wins the runs category 60% of the time and Team 2 wins the category 40% of the time:

W% = (.6*(1-.4))/((.6*(1-.4)) + ((1-.6)*.4))
W% = .36/(.36+.16)
W% = .6923

So, when these teams play, Team 1 wins runs 69.23% of the time.
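The Log5 formula translates directly into code. A minimal Java sketch (the class and method names are hypothetical):

```java
// Hypothetical helper: Bill James' Log5 formula.
final class Log5 {
    // a = Team 1's winning percentage in the category,
    // b = Team 2's winning percentage in the same category.
    // Note: the denominator is zero when a and b are both 0 or both 1,
    // so callers would need to guard against that case.
    static double winProbability(double a, double b) {
        double team1WinsTeam2Loses = a * (1 - b);
        double team1LosesTeam2Wins = (1 - a) * b;
        return team1WinsTeam2Loses / (team1WinsTeam2Loses + team1LosesTeam2Wins);
    }
}
```

`Log5.winProbability(0.6, 0.4)` reproduces the worked example (.36/.52, about .6923), and two evenly matched teams come out at exactly .5.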

In each and every matchup, I calculate this winning percentage for each stat. Then I do a simple random-number simulation to pick the winner of each stat in this theoretical game, and record a win, loss or tie for the matchup (using actual league tiebreakers).
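One simulated matchup could look something like this in Java. It is only a sketch: the names are hypothetical, it stops at counting categories won, and it does not attempt the league tiebreakers, which aren't spelled out in this post.

```java
import java.util.Random;

// Hypothetical helper: simulates the categories of a single matchup.
final class MatchupSim {
    // Log5 chance that a team with category win rate a beats one with rate b.
    static double log5(double a, double b) {
        return (a * (1 - b)) / (a * (1 - b) + (1 - a) * b);
    }

    // team1Rates[c] and team2Rates[c] are each team's winning percentage
    // in category c. Returns how many categories Team 1 wins this time.
    static int categoriesWonByTeam1(double[] team1Rates, double[] team2Rates, Random rng) {
        int wins = 0;
        for (int c = 0; c < team1Rates.length; c++) {
            double p = log5(team1Rates[c], team2Rates[c]);
            // One random draw per category: below p means Team 1 takes it.
            if (rng.nextDouble() < p) {
                wins++;
            }
        }
        return wins;
    }
}
```

Passing in a seeded `Random` makes a run repeatable, which helps when checking the simulation against a hand calculation.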

I simulate the entire season game by game using this algorithm. I then determine the playoff teams, breaking ties by league rules, put them into ArrayLists by seed, play the matchups and eventually come up with the World Series champ.

Then, once that whole season is completed once, I do it 99,999 more times. That gives me a large enough sample size to smooth out most of the random noise.
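As a rough check on the sample size: each simulated season either puts a team in the playoffs or it doesn't, so a probability estimated from N independent seasons has the usual binomial standard error. This is standard sampling arithmetic rather than anything the program computes, and it covers simulation noise only, not errors in the projections or correlations.

```latex
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{N}}
\le \sqrt{\frac{0.25}{100{,}000}} \approx 0.0016,
\qquad
\mathrm{SE}(\hat{p})\big|_{p=0.95} = \sqrt{\frac{0.95 \times 0.05}{100{,}000}} \approx 0.0007
```

So at 100,000 seasons the odds wobble by at most about 0.16 percentage points from run to run, and by less than a tenth of a point for a team near 95%.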

The Numbers You See

I get that seeing the Colorado Rockies with a 95% shot at the division with 4 games left and a one-game lead can look like I’m putting bias into the program, but that’s simply not true. Those numbers can seem callous and cold, I understand, but they’re based on specific games and likely outcomes.

With 100,000 season simulations, you get almost every conceivable outcome. True, the Washington Nationals aren’t going to show you a 0-21 season, but that isn’t even realistic in this fantasy simulation. Many different outcomes go into that average wins number, and even that doesn’t give you a full appreciation for the variety of the outcomes.

FOR EXAMPLE, below you can see the win spread for each team by division if the season started today with a blank slate but with the same team rosters and accumulated statistics as they stand today.

How to Read These Graphs

The x-axis is the number of games won. The y-axis is the percentage of simulated seasons in which that outcome occurs. In general, you want to be the team with the top-right line in your division (the line that sits furthest right, with most of its weight towards the right).
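Each line on these graphs is a histogram of one team's simulated win totals. A minimal Java sketch of how such a curve could be tallied (the names are hypothetical, and this is not the program's plotting code):

```java
// Hypothetical helper: turns simulated win totals into a win-spread curve.
final class WinSpread {
    // winsPerSeason[s] is the team's win total in simulated season s.
    // Returns pct[w] = percentage of seasons that ended with exactly w wins.
    static double[] percentages(int[] winsPerSeason, int gamesPerSeason) {
        int[] counts = new int[gamesPerSeason + 1];
        for (int wins : winsPerSeason) {
            counts[wins]++;
        }
        double[] pct = new double[counts.length];
        for (int w = 0; w < counts.length; w++) {
            pct[w] = 100.0 * counts[w] / winsPerSeason.length;
        }
        return pct;
    }
}
```

For instance, four simulated seasons ending with 10, 11, 11 and 12 wins put 25% at 10 wins, 50% at 11 and 25% at 12. The real curves do the same thing over 100,000 seasons.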

Last Edit: Aug 12, 2018 16:59:35 GMT -8 by Rockies GM (Dan)

2020 World Series Champions

Post by Rockies GM (Dan) on Aug 12, 2018 17:03:07 GMT -8

IF THE LEAGUE STARTED TODAY (with all current stats)

[Graph: NL Wins Spread]
[Graph: AL Wins Spread]

Last Edit: Aug 12, 2018 17:05:11 GMT -8 by Rockies GM (Dan)

Post by Rockies GM (Dan) on Aug 12, 2018 17:13:01 GMT -8

Cons of this method

- None of the correlations are 100% perfect, so this is by no means a completely bulletproof algorithm.
- That being said, looking at the win spreads from completely re-simulating the season with current stats, everyone looks to be where they ought to be. It works and is accurate. I haven't noticed a single team whose current win pace doesn't appear somewhere on its probability curve. Some might be in low-percentile portions of that curve, but that's probability for you. The majority of teams are close to the peak of their most probable win totals, and that's a win for accuracy.

- This program cannot tell the future. I joke, but there was a HIGH degree of variability this whole offseason. Projections vary heavily, and no one really knows who will break out (on my own team alone, I don't think many people thought Bauer would be as good as he has been, certainly not the projections). Injuries also cannot be taken into account.
- This program takes time to normalize (I don't have an exact time frame), so the first few weeks things may vary more heavily week-to-week.

- There may be a better way to count expected stats than the IP/week or AB/week method. If you have a better idea, let me know. I'm not keen on deciding starters for each team, since I value certain things and you might value other statistics that I don't value as heavily.

Anyway, here's the AMA portion. If you want to know anything else about the way this works, or if you have questions about anything I wasn't clear on, ask me and I'll answer.

Nationals GM (Preston - Old)
Hall of Famer

Commissioner Emeritus '14 PB Fantasy Baseball Champions; '11, '13, '14, '15, '17 NL Champion

Posts: 2,622

Post by Nationals GM (Preston - Old) on Aug 12, 2018 18:27:00 GMT -8

Awesome work Dan.

Former Twins GM (Robin)
Hall of Famer

How about those Twins?

Posts: 1,665

Post by Former Twins GM (Robin) on Aug 12, 2018 18:27:44 GMT -8 via Tapatalk

Hmm seems like a smoke screen. I'm not sure I trust his method.

2014 AL Champions 2015 AL Central Champions 2019 AL Central Champions

Diamondbacks GM (Ethan)
All-Star

2019 NL Wild Card

Posts: 872

Post by Diamondbacks GM (Ethan) on Aug 12, 2018 18:32:12 GMT -8

You and your fancy numbers

Post by Diamondbacks GM (Ethan) on Aug 12, 2018 18:33:06 GMT -8

My only question is why FG projections are used and not something like BP’s PECOTA?

Post by Former Twins GM (Robin) on Aug 12, 2018 18:34:08 GMT -8 via Tapatalk

"Doesn't pass the eye test." All the Mets GM's ever.

Post by Rockies GM (Dan) on Aug 12, 2018 18:43:25 GMT -8

Aug 12, 2018 18:33:06 GMT -8 Diamondbacks GM (Ethan) said:

My only question is why FG projections are used and not something like BP’s PECOTA?

Because FanGraphs is free and downloads in two .csv files containing all pitchers and all batters. I don't know that BP is free or downloads quite as nicely.

Last Edit: Aug 12, 2018 18:43:50 GMT -8 by Rockies GM (Dan)

Rays GM (Donavan)
Moderator

Happily re-married (got it right this time) with 6 kids

Posts: 1,754

Post by Rays GM (Donavan) on Aug 12, 2018 18:59:18 GMT -8

Awesome work Dan, I trust the numbers, if the GM's can play the right guys....my bench hit way better than my starters this week...

Post by Rockies GM (Dan) on Aug 12, 2018 19:03:48 GMT -8

Aug 12, 2018 18:59:18 GMT -8 Rays GM (Donavan) said:

Awesome work Dan, I trust the numbers, if the GM's can play the right guys....my bench hit way better than my starters this week...

I feel your pain - I think one of my losses this year (to the Phillies?) would have been a win if only I'd remembered to make a couple switches I meant to make and played one of my starting pitchers I left on my bench.

My first time in the playoffs, I lost because I played Aaron Sanchez (a RP at the time) when I didn't actually have to. He ended up allowing like 5 ER in 0.0 IP, and it lost me the ERA category vs Beau, who I think won that playoff matchup by a tiebreaker.

Last Edit: Aug 12, 2018 19:05:25 GMT -8 by Rockies GM (Dan)

Post by Rays GM (Donavan) on Aug 15, 2018 3:29:06 GMT -8

Aug 12, 2018 19:03:48 GMT -8 Rockies GM (Dan) said:

Aug 12, 2018 18:59:18 GMT -8 Rays GM (Donavan) said:

Awesome work Dan, I trust the numbers, if the GM's can play the right guys....my bench hit way better than my starters this week...

I feel your pain - I think one of my losses this year (to the Phillies?) would have been a win if only I'd remembered to make a couple switches I meant to make and played one of my starting pitchers I left on my bench.

My first time in the playoffs, I lost because I played Aaron Sanchez (a RP at the time) when I didn't actually have to. He ended up allowing like 5 ER on 0.0 IP, and it lost me the ERA vs Beau, who I think won that playoff matchup by a tiebreaker

Yeah, left 4 HR's, 2 SB's, 1 win & 1 save on the bench, in a week where I scored 3 HR's (going down by 1), and tied saves with just 1.