Originally posted on Fangraphs  |  Last updated 5/18/12

Understatement: Jason Heyward’s 2011 season did not go quite as most of us expected. His (seemingly) long-awaited 2010 rookie campaign at age 20 mostly lived up to the hype, but he followed up with a stereotypical “sophomore slump” season that was marred by nagging injuries, benchings, and, most of all, poor hitting (relative to expectations, at least). Now 22, Heyward may not be hitting as well as he did in his rookie year, but he is doing well. His resurgence over the first quarter of the season is one reason the Braves are currently leading the National League East.

Many explanations have been given for Heyward’s 2011 issues, too many to deal with one way or the other. If it is not too boring, I want to focus on a couple different ways that “regression” was used in this particular case to see how it is sometimes misused, or, perhaps more accurately, used in a clumsy fashion.

Just so we are clear at the outset: a basic, layperson’s understanding (that’s all I’ve got, personally) of regression to the mean goes a long way in clearing up various “mysteries” we fans can get hung up on, e.g., why projection systems do not always project young players to get more awesome every year.

That last point helps us see how regression can be a good explanation for the legendary “sophomore slump.” I am sure there have been detailed studies done on that phenomenon (or myth), but for now let’s just stick with the obvious. We are never certain of a player’s true talent. However, for most players their true talent is probably closer to the average of the population from which they are drawn than their observed performance. That is why when estimating (“projecting”) true talent for a pool of players, regression is always involved. It may not get them all right, but given the uncertainty, it reduces the overall error. That is why we should see properly-done statistical projections as humble rather than arrogant.
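To make the "reduces the overall error" claim concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the league mean, the spread of true talent, the number of trials per player, and the simple "blend in league-average trials" form of regression. It is not how any particular projection system works.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

LEAGUE_MEAN = 0.300  # hypothetical population average for some rate stat
TALENT_SD = 0.020    # hypothetical spread of true talent in the population
TRIALS = 500         # hypothetical opportunities observed per player
PLAYERS = 1000       # size of the simulated player pool

# Ballast: how many league-average trials to blend in. This value follows
# from the assumed binomial noise and the assumed talent spread above.
ballast = LEAGUE_MEAN * (1 - LEAGUE_MEAN) / TALENT_SD ** 2


def regress(observed, trials):
    """Shrink an observed rate toward the league mean."""
    return (observed * trials + LEAGUE_MEAN * ballast) / (trials + ballast)


raw_sq_err = 0.0
regressed_sq_err = 0.0
for _ in range(PLAYERS):
    # Draw a true talent, then a noisy observed rate around it.
    talent = random.gauss(LEAGUE_MEAN, TALENT_SD)
    successes = sum(random.random() < talent for _ in range(TRIALS))
    observed = successes / TRIALS
    raw_sq_err += (observed - talent) ** 2
    regressed_sq_err += (regress(observed, TRIALS) - talent) ** 2

raw_mse = raw_sq_err / PLAYERS
regressed_mse = regressed_sq_err / PLAYERS
print(f"raw MSE: {raw_mse:.6f}  regressed MSE: {regressed_mse:.6f}")
```

Under these assumptions the regressed estimates miss on plenty of individual players, but their average squared error across the pool should come out smaller than that of the raw observed rates, which is the sense in which regression is the humble choice.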

Back to Heyward: obviously it made sense to expect improvement from a guy who hit well in the majors at age 20. That is not to say that regression is put aside, of course; both regression and an adjustment for age are taken into account by a decent projection system. But beyond simple projections, to what extent did “regression” apply to Heyward then and now?

[Of course, this leaves aside the scouting and medical dimensions. Some people saw all sorts of problems with Heyward's swing mechanics in 2011, and obviously shoulder problems are going to make things difficult. I am not dismissing those perspectives. They are vital. I am simply focusing on the statistical perspective. How to bring them together properly is a key issue that is beyond the scope of this post.]

Now, it would have been fair to expect Heyward to still improve despite the regression component in projections; without looking again, I would guess that most projection systems did see his performance improving in 2011. Obviously, it did not. What went wrong? This is where well-meaning but ham-handed uses of regression might have led some people astray. BABIP, as is so often the case, is the first place people look, but did it really apply?

Some (retrospectively) may have looked at Heyward’s .335 BABIP and his 17.8 percent line drive rate and noted that he was unlikely to sustain that performance. Well, his BABIP did go down to .260 in 2011, and now it’s back up. That sort of quick analysis sometimes works, but if we look deeper, it tells the wrong story.

For various reasons that I will not go into here, I am not a huge fan of heavy reliance on xBABIP. However, it is not without its uses, and in this case it sort of applies because the line drive rate/BABIP assertions about Heyward and others are implicitly relying on the same assumptions. On the crude account of the reason for Heyward’s 2011 issues given above, we would expect a big gap between his BABIP and xBABIP in 2010. Using this xBABIP formula, here are the BABIP and xBABIP for Heyward for each of his three seasons in the majors:

2010 .335 BABIP, .338 xBABIP
2011 .260 BABIP, .267 xBABIP
2012 .300 BABIP, .307 xBABIP

As you can see, a more detailed xBABIP analysis does not show Heyward being significantly lucky or unlucky in any of his three seasons so far.
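The gaps are easy to check by hand, but a small sketch makes the comparison explicit. The numbers are the ones listed above; the variable names are just illustrative.

```python
# (BABIP, xBABIP) per season, copied from the list above.
seasons = {
    2010: (0.335, 0.338),
    2011: (0.260, 0.267),
    2012: (0.300, 0.307),
}

# Positive gap would suggest "lucky", negative would suggest "unlucky".
gaps = {year: round(babip - xbabip, 3) for year, (babip, xbabip) in seasons.items()}

for year, gap in gaps.items():
    print(f"{year}: BABIP minus xBABIP = {gap:+.3f}")
```

Every gap is under ten points of BABIP, and 2010 (the season the crude account treats as the lucky one) has the smallest gap of the three.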

Even that tends to obscure whatever point I am trying to make. For one thing, while we fans generally have a basic understanding of what “regression” means, we also tend to apply it selectively. This is at least partly right. After all, BABIP “regresses more” than home run, strikeout, and walk rates. However, people have latched onto this and tend to simply look at a player’s current-season performance, “correct” his BABIP, and assume that they have a good measure of his true talent.

That might work better than just looking at current performance, but it tends to obscure the reality: even if the various components should be regressed differently, all of them are subject to regression.

Now, it just so happens that (grain of salt) the batted ball data probably shows much of why Heyward has been better this season than in 2011. (I will leave it to others to discuss his swing mechanics.) But that is not all that has changed. On the downside, he is striking out more in this (still young) season than in the past, probably as a consequence of making contact less frequently.

However, focusing just on BABIP from year-to-year also might cause us to miss other things Heyward has improved on so far. As one might expect from a player in his early 20s, looking at the rates, we also see that he has increased both his home run rate and his rate of extra-base hits on balls in play. Those rates are superior not only to his 2011 performance, but also to his 2010 season.

…and, of course, they need to be weighted accordingly given that he has only had 150 plate appearances so far this season. Moreover, those rates also need to be regressed by appropriate amounts.
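As a rough sketch of what "regressed by appropriate amounts" can look like, the snippet below shrinks several rates by different amounts over the same 150 plate appearances. The observed rates, league rates, and ballast values are all made up for illustration (they are not Heyward's numbers or published constants), and using plate appearances as the sample size for every rate is a simplification.

```python
PLATE_APPEARANCES = 150  # sample size mentioned in the article

# Hypothetical ballast (league-average PA blended in). A larger value means
# the stat is noisier and gets pulled harder toward the league rate.
BALLAST = {"strikeout_rate": 100, "walk_rate": 200, "babip": 1000}

# Hypothetical observed and league-average rates, for illustration only.
OBSERVED = {"strikeout_rate": 0.25, "walk_rate": 0.10, "babip": 0.33}
LEAGUE = {"strikeout_rate": 0.20, "walk_rate": 0.08, "babip": 0.30}


def weight_on_observed(sample, ballast):
    """Fraction of the estimate that comes from the player's own results."""
    return sample / (sample + ballast)


def regress(observed, league, sample, ballast):
    """Blend the observed rate with the league rate."""
    w = weight_on_observed(sample, ballast)
    return w * observed + (1 - w) * league


estimates = {
    name: regress(OBSERVED[name], LEAGUE[name], PLATE_APPEARANCES, BALLAST[name])
    for name in OBSERVED
}

for name, estimate in estimates.items():
    w = weight_on_observed(PLATE_APPEARANCES, BALLAST[name])
    print(f"{name}: observed {OBSERVED[name]:.3f}, "
          f"estimate {estimate:.3f}, weight on observed {w:.2f}")
```

The point of the sketch is only that no component gets a weight of 1.0: BABIP is pulled hardest toward the league rate here, but the strikeout and walk rates move too.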

It would be irritatingly hindsight-ful to say that we should have seen Jason Heyward’s “sophomore slump” coming. One of the main “flags” (BABIP) was somewhat misleading. However, it should not have been totally unexpected, not because of BABIP, but because estimating the true talent of all players, even (or especially) very good, young ones, involves a great deal of uncertainty, and that is why we employ regression. Regression and, to a lesser extent, BABIP are our good old friends and an essential part of the saber-fan’s toolbox. However, when we invite them over, let’s make sure they stay in the right bedroom.

