Originally written on Fangraphs  |  Last updated 11/14/14

01 MAR 2009: Jason Heyward of the Atlanta Braves walks out towards the outfield during the spring training game between the Philadelphia Phillies and the Atlanta Braves at Champion Stadium in Kissimmee, Florida. Photo via Newscom

Understatement: Jason Heyward’s 2011 season did not go quite as most of us expected. His (seemingly) long-awaited 2010 rookie campaign at age 20 mostly lived up to the hype, but he followed it with a stereotypical “sophomore slump” season marred by nagging injuries, benchings, and, most of all, poor hitting (relative to expectations, at least). Now 22, Heyward may not be hitting quite as well as he did in his rookie year, but he is doing well. His resurgence over the first quarter of the season is one reason the Braves are currently leading the National League East.

Many explanations have been given for Heyward’s 2011 issues, too many to deal with one way or the other. If it is not too boring, I want to focus on a couple different ways that “regression” was used in this particular case to see how it is sometimes misused, or, perhaps more accurately, used in a clumsy fashion.

Just so we are clear at the outset: a basic, layperson’s understanding (that’s all I’ve got, personally) of regression to the mean goes a long way in clearing up various “mysteries” we fans can get hung up on, e.g., why projection systems do not always project young players to get more awesome every year.

That last point helps us see how regression can be a good explanation for the legendary “sophomore slump.” I am sure there have been detailed studies done on that phenomenon (or myth), but for now let’s just stick with the obvious. We are never certain of a player’s true talent. However, for most players, true talent is probably closer to the average of the population from which they are drawn than their observed performance is. That is why, when estimating (“projecting”) true talent for a pool of players, regression is always involved. It may not get every player right, but given the uncertainty, it reduces the overall error. That is why we should see properly done statistical projections as humble rather than arrogant.
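To make the “sophomore slump” point concrete, here is a small simulation sketch. It is not drawn from any real player data: the talent distribution (centered on a .330 rate), the 500 plate appearances per season, and the function name simulate_two_seasons are all assumptions chosen for illustration. Every simulated player has the same true talent in both seasons, and we simply look at what happens to the group with the best observed first season.

```python
import random
from statistics import mean


def simulate_two_seasons(n_players=1000, pa=500, top_fraction=0.10, seed=0):
    """Simulate two seasons for players whose true talent never changes.

    Returns the average year-1 rate, year-2 rate, and true talent of the
    players who had the best observed year 1.
    """
    rng = random.Random(seed)
    # ASSUMPTION: true talent is normally distributed around a .330 rate.
    talents = [rng.gauss(0.330, 0.020) for _ in range(n_players)]

    def season(p):
        # One season is `pa` independent trials, each succeeding with chance p.
        return sum(rng.random() < p for _ in range(pa)) / pa

    year1 = [season(p) for p in talents]
    year2 = [season(p) for p in talents]

    # Select the top performers by observed year-1 rate only.
    n_top = int(n_players * top_fraction)
    top = sorted(range(n_players), key=lambda i: year1[i], reverse=True)[:n_top]

    return (
        mean(year1[i] for i in top),
        mean(year2[i] for i in top),
        mean(talents[i] for i in top),
    )


y1, y2, talent = simulate_two_seasons()
print(f"top group, year 1: {y1:.3f}")
print(f"top group, year 2: {y2:.3f}")
print(f"top group, true talent: {talent:.3f}")
```

Under these assumptions, the top group’s second season should land closer to its true talent than its first season did, even though nobody in the simulation got any worse. That is the “slump” regression predicts for a group selected on a great first year.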

Back to Heyward: obviously it made sense to expect improvement from a guy who hit well in the majors at age 20. That is not to say that regression is put aside, of course: a decent projection system takes both regression and an adjustment for age into account. But beyond simple projections, to what extent did “regression” apply to Heyward then and now?

[Of course, this leaves aside the scouting and medical dimensions. Some people saw all sorts of problems with Heyward's swing mechanics in 2011, and obviously shoulder problems are going to make things difficult. I am not dismissing those perspectives. They are vital. I am simply focusing on the statistical perspective. How to bring them together properly is a key issue that is beyond the scope of this post.]

Now, it would have been fair to expect Heyward to still improve despite the regression component in projections. Without looking again, I would guess that most projection systems did see his performance improving in 2011. Obviously, it did not. What went wrong? This is where well-meaning but ham-handed uses of regression might have led some people astray. BABIP, as is so often the case, is the first place people look, but did it really apply?

Some (retrospectively) may have looked at Heyward’s .335 BABIP and his 17.8 percent line drive rate and noted that he was unlikely to sustain that performance. Well, his BABIP did go down to .260 in 2011, and now it’s back up. That sort of quick analysis sometimes works, but if we look deeper, it tells the wrong story.

For various reasons that I will not go into here, I am not a huge fan of heavy reliance on xBABIP. However, it is not without its uses, and in this case it sort of applies because the line drive rate/BABIP assertions about Heyward and others are implicitly relying on the same assumptions. On the crude account of the reason for Heyward’s 2011 issues given above, we would expect a big gap between his BABIP and xBABIP in 2010. Using this xBABIP formula, here are the BABIP and xBABIP for Heyward for each of his three seasons in the majors:

2010 .335 BABIP, .338 xBABIP
2011 .260 BABIP, .267 xBABIP
2012 .300 BABIP, .307 xBABIP

As you can see, a more detailed xBABIP analysis does not show Heyward being significantly lucky or unlucky in any of his three seasons so far.
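For anyone who wants to see the gaps spelled out, here is a minimal sketch that subtracts xBABIP from BABIP for each season. The figures are copied from the three lines above; the helper name babip_gaps is just a label I made up for this example.

```python
# (season, BABIP, xBABIP) exactly as listed in the article.
seasons = [
    (2010, 0.335, 0.338),
    (2011, 0.260, 0.267),
    (2012, 0.300, 0.307),
]


def babip_gaps(rows):
    """Return {season: BABIP - xBABIP}, rounded to three decimals."""
    return {year: round(babip - xbabip, 3) for year, babip, xbabip in rows}


gaps = babip_gaps(seasons)
for year, gap in gaps.items():
    # A negative gap means the actual BABIP was below the expected one.
    print(f"{year}: {gap:+.3f}")
```

The largest gap in any season is seven points of BABIP, and in every season the actual figure sits slightly below the expected one, which is the opposite of what the “he was lucky in 2010” story would need.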

Even that tends to obscure whatever point I am trying to make. For one thing, while we fans generally have a basic understanding of what “regression” means, we also tend to apply it selectively. This is at least partly right. After all, BABIP “regresses more” than home run, strikeout, and walk rates. However, people have latched onto this and tend to simply look at a player’s current-season performance, “correct” his BABIP, and assume that they have a good measure of his true talent.

That might work better than just looking at current performance, but it tends to obscure the reality: even if the various components should be regressed differently, all of them are subject to regression.
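One common way to express “regressed differently, but all regressed” is to blend the observed rate with some amount of league-average performance, using more league-average “ballast” for noisier stats. The sketch below assumes that approach. Every number in it (the observed rates, league means, sample sizes, and especially the ballast values) is a made-up placeholder for illustration, not a published constant and not Heyward’s actual line.

```python
def regress(observed, league_mean, sample_size, ballast):
    """Shrink an observed rate toward the league mean.

    `ballast` is the number of league-average trials blended in; a larger
    ballast means the stat is regressed more heavily.
    """
    weight = sample_size / (sample_size + ballast)
    return weight * observed + (1 - weight) * league_mean


# HYPOTHETICAL inputs: none of these values come from the article or
# from any published study.
k_rate = regress(observed=0.25, league_mean=0.20, sample_size=150, ballast=100)
babip = regress(observed=0.35, league_mean=0.30, sample_size=100, ballast=800)

print(f"regressed strikeout rate: {k_rate:.3f}")
print(f"regressed BABIP: {babip:.3f}")
```

With these placeholder values, both estimates move toward the league mean, but the BABIP estimate keeps a much smaller share of its gap from average than the strikeout rate does. Neither stat is left untouched.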

Now, it just so happens that (grain of salt) the batted ball data probably shows much of why Heyward has been better this season than in 2011. (I will leave it to others to discuss his swing mechanics.) But that is not all that has changed. On the downside, he is striking out more in this (still young) season than in the past, probably as a consequence of making contact less frequently.

However, focusing just on BABIP from year-to-year also might cause us to miss other things Heyward has improved on so far. As one might expect from a player in his early 20s, looking at the rates, we also see that he has increased both his home run rate and his rate of extra-base hits on balls in play. Those rates are superior not only to his 2011 performance, but also to his 2010 season.

…and, of course, they need to be weighted accordingly, given that he has had only 150 plate appearances so far this season. Moreover, those rates also need to be regressed by appropriate amounts.

It would be irritatingly hindsight-ful to say that we should have seen Jason Heyward’s “sophomore slump” coming. One of the main “flags” (BABIP) was somewhat misleading. However, it should not have been totally unexpected. That is not because of BABIP, but because estimating the true talent of all players, even (or especially) very good, young ones, involves a great deal of uncertainty, and that is why we employ regression. Regression, and, to a lesser extent, BABIP, are our good old friends and an essential part of the saber-fan’s toolbox. However, when we invite them over, let’s make sure they stay in the right bedroom.

