Originally posted on Fangraphs  |  Last updated 2/21/13
In my first article, I wrote about the limitations of the linear weights system that wOBA is based on when it is applied to unusual team offenses. In my second, I explained how Tom Tango, wOBA’s creator, came up with a way of addressing some of these limitations by deriving a new set of linear weights for different run environments, thanks to BaseRuns. Today, I will tell you about the next step in the evolution of run estimators: the Markov model. Tom Tango created such a model, which can be accessed through his website, and I’ve turned that model into a spreadsheet that I’ll share with you here.

I’ve told you that the problem with the standard run estimator formulas is that they make assumptions about what a hit is going to be worth, run-wise, based on what it was worth to an average team. That means they don’t apply very well to an unusual team. What’s so great about the Markov is that it makes no such assumptions; it figures all of that out itself, specific to each team. And when I say it figures it out, I mean it basically calculates out a typical game for that team, given the proportion of singles, walks, home runs, etc. the team gets in its plate appearances. It therefore estimates the run-scoring of typical teams better than just about anything, but it also theoretically should apply much, much better to very unusual or even made-up teams.

Will this spreadsheet thing make my life complete? Well, not really. But it is fun to explore. The thing I think it’s most useful for is to guess how many runs a team would score with or without certain players.
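If it helps to see the idea of "calculating out a typical game" in miniature, here is a rough Python sketch of a Markov-style run estimator. To be clear, this is not Tango's model or my spreadsheet: the runner advancement rules are a deliberately crude assumption (every runner moves exactly as many bases as the batter, walks only force runners, and outs never move anyone), and the team rates at the bottom are made up for illustration.

```python
# Simplified Markov run estimator (illustrative only, not Tango's model).
# ASSUMPTIONS: runners advance exactly as many bases as the batter on a hit,
# walks only force runners, and outs never move runners (no steals, double
# plays, sacrifice flies, or errors).

HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
EMPTY = (False, False, False)


def advance(bases, event):
    """Return (new_bases, runs_scored) for a walk or a hit."""
    on1, on2, on3 = bases
    if event == "BB":
        if on1 and on2 and on3:
            return (True, True, True), 1   # bases loaded: a run is forced in
        if on1 and on2:
            return (True, True, True), 0   # both runners are forced up a base
        if on1:
            return (True, True, on3), 0    # only the runner on first is forced
        return (True, on2, on3), 0         # nobody is forced
    n = HIT_BASES[event]
    starts = [b for b, on in zip((1, 2, 3), bases) if on] + [0]  # 0 = batter
    new_bases, runs = [False, False, False], 0
    for start in starts:
        end = start + n
        if end >= 4:
            runs += 1
        else:
            new_bases[end - 1] = True
    return tuple(new_bases), runs


def runs_per_inning(probs, tol=1e-12):
    """Expected runs in one inning given per-PA event probabilities.

    `probs` maps "BB", "1B", "2B", "3B", "HR" to probabilities; the
    remainder of each plate appearance is treated as an out.
    """
    p_out = 1.0 - sum(probs.values())
    if p_out <= 0:
        raise ValueError("the out probability must be positive")
    dist = {(EMPTY, 0): 1.0}  # probability mass over (bases, outs) states
    expected = 0.0
    while dist and sum(dist.values()) > tol:
        nxt = {}
        for (bases, outs), mass in dist.items():
            if outs + 1 < 3:  # the third out ends the inning (mass is dropped)
                key = (bases, outs + 1)
                nxt[key] = nxt.get(key, 0.0) + mass * p_out
            for event, p in probs.items():
                new_bases, runs = advance(bases, event)
                expected += mass * p * runs
                key = (new_bases, outs)
                nxt[key] = nxt.get(key, 0.0) + mass * p
        dist = nxt
    return expected


# Made-up per-plate-appearance rates; whatever is left over is the out rate.
example_team = {"BB": 0.09, "1B": 0.15, "2B": 0.045, "3B": 0.005, "HR": 0.03}
patient_team = dict(example_team, BB=0.12)  # same hits, more walks

for name, rates in (("example", example_team), ("patient", patient_team)):
    print(name, round(9 * runs_per_inning(rates), 3), "runs per 9 innings")
```

The point of the sketch is only that nothing in it says what a single or a walk is "worth"; the value of each event falls out of how often the team puts runners on base ahead of it. A real model replaces my station-to-station advancement with estimated extra-base-taking rates.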
To demonstrate why this may be eye-opening for you, I’m going to show you how even two players with identical wOBA and wRC+ ratings could have significantly different offensive values to different teams.

Markov: I must break you…r perceptions of player values

In 2011, Mark Trumbo and Alberto Callaspo had identical wOBAs (0.328) and therefore identical wRC+ as well (108), seeing as how they both played for the Angels. However, they achieved these above-average wOBAs in very different ways: Callaspo with a 0.366 OBP and 0.375 SLG, and Trumbo with a 0.291 OBP and 0.477 SLG. So, let’s place these two onto various teams to see what happens. To keep things simple, let’s just pretend there’s no such thing as park effects.

Now, before I get into this, let me remind you that teams don’t have a fixed number of plate appearances per season, but their number of outs in a season is close to fixed; e.g. 162 games/season * 9 innings/game * 3 outs/inning = 4374 outs. Of course, it’s not exactly that, mainly because of extra innings and the fact that the home team won’t have a full 9 innings of offense in games they win. Anyway, I’m going to try to equalize Trumbo and Callaspo for playing time by giving them the same number of outs, defined as:

Outs = PA - H - BB - HBP + CS + GDP

where GDP is grounded into double plays. Ideally, that would also include other outs on the bases, but FanGraphs doesn’t provide that as of yet.

Another thing: I really ought to be removing a player from each of these teams to make room for Trumbo or Callaspo, but so as not to add the additional variable of different players being removed from different teams, we’ll just reduce each team’s outs (and the rest of their numbers proportionally) to make room. This means we’re basically just pretending that all the original players on that team had their playing time reduced a bit to make room.
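The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the outs-based scaling, not the spreadsheet's actual formulas; the function names are my own, and the stat lines are made-up numbers rather than any real team's or player's totals.

```python
# Outs-based playing-time equalization (illustrative sketch).
# The stat lines below are HYPOTHETICAL, not real 2011 data.

def outs(line):
    """Outs = PA - H - BB - HBP + CS + GDP, as defined in the text."""
    return (line["PA"] - line["H"] - line["BB"] - line["HBP"]
            + line["CS"] + line["GDP"])


def scale(line, factor):
    """Scale every counting stat in a line by the same factor."""
    return {stat: value * factor for stat, value in line.items()}


def insert_player(team, player, player_outs):
    """Give the player `player_outs` outs and shrink the team to make room.

    The team's original numbers are reduced proportionally so that the
    combined line uses the same number of outs as the original team.
    """
    team_outs = outs(team)
    scaled_player = scale(player, player_outs / outs(player))
    shrunk_team = scale(team, (team_outs - player_outs) / team_outs)
    return {stat: shrunk_team[stat] + scaled_player[stat] for stat in team}


# A nominal season's worth of outs, as computed in the text.
NOMINAL_SEASON_OUTS = 162 * 9 * 3  # 4374

team = {"PA": 6000, "H": 1400, "BB": 450, "HBP": 50, "CS": 40, "GDP": 120}
player = {"PA": 600, "H": 150, "BB": 30, "HBP": 5, "CS": 3, "GDP": 15}

combined = insert_player(team, player, player_outs=400)
print("team outs before:", outs(team))
print("team outs after: ", round(outs(combined), 6))
```

Because every stat is scaled by the same factor, each line's rate stats (OBP, SLG and so on) are unchanged by the scaling; only the mix between the original team and the new player shifts.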
So, without further ado, here’s what happens when 2011 Trumbo’s (T) or Callaspo’s (C) numbers are inserted into various especially good or bad offenses (all run figures are per game; a “?” means there is no actual total for that hypothetical lineup):

Season  Team or Player    OBP    SLG    Aggro   Actual  Markov (tweaked)  Markov (default)  BaseRuns  Runs Created
2011    Mark Trumbo       0.291  0.477  -0.193  ?       4.440             4.765             4.828     5.066
2011    Alberto Callaspo  0.366  0.375  -0.043  ?       4.988             5.211             5.125     5.219
1963    Colt .45s         0.283  0.301  0.190   2.864   2.837             2.774             2.921     2.959
1963    Colt .45s+T       0.284  0.318  0.154   ?       2.997             2.975             3.115     3.156
1963    Colt .45s+C       0.292  0.308  0.165   ?       3.023             2.978             3.114     3.162
1965    Mets              0.277  0.327  0.119   3.018   2.956             2.968             3.121     3.153
1965    Mets+T            0.278  0.342  0.089   ?       3.187             3.144             3.289     3.327
1965    Mets+C            0.286  0.332  0.105   ?       3.215             3.145             3.292     3.343
1968    Mets              0.281  0.315  0.238   2.902   2.945             2.850             3.035     3.110
1968    Mets+T            0.282  0.331  0.199   ?       3.094             3.040             3.214     3.289
1968    Mets+C            0.290  0.321  0.208   ?       3.120             3.042             3.216     3.300
2011    Mariners          0.292  0.348  0.195   3.432   3.454             3.385             3.538     3.608
2011    Mariners+T        0.292  0.361  0.159   ?       3.554             3.525             3.670     3.749
2011    Mariners+C        0.300  0.351  0.171   ?       3.590             3.537             3.681     3.763
1994    Yankees           0.374  0.462  -0.283  5.929   5.904             6.516             6.404     6.630
1994    Yankees+T         0.364  0.464  -0.271  ?       5.663             6.227             6.163     6.427
1994    Yankees+C         0.373  0.450  -0.246  ?       5.774             6.331             6.223     6.423
1996    Mariners          0.366  0.484  -0.197  6.168   6.098             6.526             6.452     6.765
1996    Mariners+T        0.360  0.483  -0.196  ?       5.911             6.328             6.279     6.602
1996    Mariners+C        0.366  0.473  -0.178  ?       5.989             6.397             6.323     6.607
1999    Indians           0.373  0.467  -0.161  6.228   6.119             6.547             6.454     6.688
1999    Indians+T         0.366  0.468  -0.162  ?       5.925             6.340             6.279     6.538
1999    Indians+C         0.373  0.457  -0.148  ?       6.006             6.414             6.321     6.535

A bit more explanation: besides the default version of the Markov that Tango has on his site, as well as the simple versions of BaseRuns and Bill James’ Runs Created that the webpage also produces, I’ve listed the results for a slightly altered version of the Markov that I came up with, which attempts to account for certain factors that are missing from the Markov (I’ll talk more about this later). The “aggro” factor is my stab at measuring base running aggression and effectiveness, which I use in the tweaked Markov.

So, in the top two spots on the list, we have the theoretical runs scored by teams full of clones of either Trumbo or Callaspo. This is basically the same idea as the RC27 you can find amongst ESPN.com’s sabermetric stats (which places Trumbo at 4.47 and Callaspo at 5.22, by the way). You can see right away that the Markovs favor Callaspo over Trumbo more than you might expect from their wOBAs and wRC+. Do you remember seeing the exponential growth curve of runs depending on team OBP in my last article? That explains why this is the case: it’s an important team effect that wOBA doesn’t try to account for.

You’ll also notice that relative to Trumbo, Callaspo is worth a lot more to the good offenses than to the bad ones. In particular he’s worth more to the high-OBP teams, as besides the exponential impact his better OBP has on runs, his relative lack of power hurts less. That’s because the value of a single to a high-OBP team is greater than it is to a low-OBP team, especially relative to a HR (see the graphs in my second article if that confuses you). There is a threshold of team suckitude at which 2011 Trumbo’s offense would become more valuable to a team than 2011 Callaspo’s, but it appears that even a bad team in the low-scoring era of the 1960s is still a little bit short of that.

Play along at home or work

I took a page out of Bradley Woodrum’s book and I’m giving you a peek via the Excel Web App.
Just click on the green Excel icon in the bottom right area of the app to download the spreadsheet (about 1 MB in size). Once you’ve downloaded it, you’ll be able to paste data from the Standard section of team batting numbers from FanGraphs (link) into the “Enter Data Here” tab of my spreadsheet, or enter whatever you want manually. You’ll then be able to see the results of the calculations on the “Results” tab (surprise), which you should be able to find near the bottom of the spreadsheet. Here ya go:

[Embedded Excel Web App spreadsheet]

The Perfect Run Modeler? Almost.

Tom Tango says his model is “mathematically perfect,” but readily acknowledges that it’s a bit simplistic, ignoring not only steals (SB) and caught stealing (CS), but also grounded into double plays (GIDP) and other outs on bases (OOB). To properly account for these factors would require a much more complicated model, but I’ve come up with some modifications that attempt to account for those factors without fundamentally changing Tango’s model.

The first thing I did was to reduce each team’s expected plate appearances per game by their expected GIDP and CS per game, along with an empirically derived OOB constant tied to their on-base rates. It’s not a perfect solution, because, for one, OOB rates aren’t so constant, as James Gentile recently pointed out at The Hardball Times (THT). You can, however, get OOB data from Baseball-Reference.com, if you have the patience and the desire. Another issue (I think) is that GIDP rates depend on how likely it is for a batter to have men on base, which would mean, for example, that I shouldn’t be penalizing a team full of 9 Trumbos so much for GIDP, because that team would be less likely to be able to hit into one. That could be worked out better, but it’s tricky.

The other main thing I did was to create the aforementioned base running aggressiveness modifier to the extra-base-taking rates that are essential to the model (they’re really the main assumptions in the model that are a bit tricky to estimate). It’s based on things like steals and caught stealing per runner on 1B, as well as 3B/2B. It’s probably not so proper that I’ve also included GIDP/PA as a major factor here, but the previous adjustment didn’t fully account for the negative impact of GIDPs. I also included team OBP and SLG as factors, as one can expect weaker teams to be more aggressive on the base paths due to low odds of scoring without taking extra bases.

Finally, I changed the default extra-base-taking rates to be more in line with Tango’s empirical findings. Of course, those rates aren’t entirely stable. Feel free to change anything in the “Results” tab that is bordered in red, as you see fit. You can even mess around with the “Calculations” tab if you know your stuff.

Well, that’s my time. Hope you’ve enjoyed. There’s plenty more I can say about this subject, if you’re interested. Let me hear your questions and comments, and tell me if you’d like to see me apply this to something else or make changes.