

Slaying the Dragon: WAR and the Battle for the Soul of Baseball


“WAR, huh, yeah, what is it good for? Absolutely nothing. Say it again, y’all…”


Wins Above Replacement (WAR) has become the Golden Calf of baseball analytics; at some point in the days of yore – yet somehow not so long ago – the people of America’s chosen sport, baseball, decided to melt down all the precious gold that had been the traditional way of understanding and valuing a player’s worth, and formed it into a new statistical idol to worship and adore. WAR, the idea that you could distill a player’s value into one single number by deciphering how many wins they added over your average “replacement player,” soon became not only the gold standard, but the cash cow of modern baseball analysis; but an examination of what has transpired over the last (at least) decade-plus has left me wondering, where’s the beef?

WAR has become almost untouchable and has not only led to many of the classic “counting stats” (average, RBI, Wins, etc.) being discarded and considered to be “stupid” or “irrelevant,” but has also sadly influenced many a commentator to draw some pretty insane conclusions, such as when they argued that despite Miguel Cabrera not only winning the triple crown but being the first since 1967 to accomplish said feat, it was actually Mike Trout who deserved to win the 2012 MVP award because he had the higher WAR. I should know, because before I took the time to look into the topic more in-depth, I too was making this argument. Oh, how foolish I was.

Sure, Miggy led the AL in batting average (.330) and the MLB in homers and RBI (44, 139), but since Anaheim’s rookie sensation had a WAR of 10.5 compared to Cabrera’s 7.1, Trout was apparently more deserving in the minds of many. Thankfully, sanity prevailed, and it wasn’t until a few years later that I came to realize why it was such a good thing that it did (but more on that in a moment).

To truly understand the problem with WAR, and why I believe it is an absolute failure of a stat in determining a player’s worth (let alone using it to compare one player to another), we need to look at definitions; because you see, the reason WAR cannot be trusted to do what Sabermetricians and many a fan believe it can, is because Wins Above Replacement is neither accurate, nor is it an actual statistic. At best, WAR is a useful, but subjective tool in evaluating how you want to build a team (in terms of what you value in your players), but it is not an objective way to rank what players are the best in the game.

Why? Well, statistics are foolproof, grounded, quantifiable, objective, and absolute, but WAR is none of these things; WAR is subjective and unquantifiable, and the proof of this can be found in its very definition. Baseball-Reference (one of the three main formulizers of WAR along with Baseball Prospectus and Fangraphs) tells us that:

“There is no one way to determine WAR. There are hundreds of steps to make this calculation, and dozens of places where reasonable people can disagree on the best way to implement a particular part of the framework,” and is forced to conclude that, “WAR is necessarily an approximation and will never be as precise or accurate as one would like.” Similarly, Fangraphs admits the lack of accuracy of WAR in their primer on the subject as well.

So, the question is: if there is no singular way to determine someone’s WAR, and it’s a process that requires hundreds of steps to figure out, and within those hundreds of steps there are literally dozens of places where REASONABLE people can actually disagree over how to even implement that part of the process, then how in the world is that quantifiable? The answer is that it’s not, which is why it’s not really a statistic to begin with and should not be valued as one.

Statistics cannot be an approximation, that’s not how math works. Math, like truth, is not relative, it is absolute; an exact science that relies on key steps to arrive at the right answer. When we were in school and didn’t follow the correct procedures and steps but still stumbled on the right answer, we got marked down on our grade because by not following the correct formula we risked getting the wrong answer; and using our own formula would not assure that we got it right in the future. Can you imagine using these principles and methods to try and build a bridge, for example, (Bridge Above Replacement)? No, thank you. I mean, would you be all that surprised to discover that Boeing was using “Wings Above Replacement” to construct their planes? The same applies to WAR, the fact that there is no one way to figure it out and many places to reasonably disagree on its formula, shows the error of viewing WAR as a stat rather than an approximation, which is really nothing more than an educated guess – potentially close but not fully accurate.

Older stats which now, more often than not, are fighting to not be done away with, such as batting average, on-base percentage, ERA, RBI, etc., are solid, grounded, objective, and exact stats that are unambiguous and easy to figure out. For example, batting averages are based on simple division, dividing the number of a player’s hits by the number of his at-bats, and rounding the decimal. Similar simple calculations can be made to derive other statistics like on-base percentage (hits + walks + hit by pitch, divided by at-bats + walks + hit by pitch + sacrifice flies) and ERA (dividing the number of earned runs allowed by the number of innings pitched and multiplying by nine), while other stats such as hits and RBI are simple addition, and each tells you exactly what you want it to.
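To show just how little math is involved, here is a minimal sketch of those three calculations in Python. The function names and the sample stat line are made up purely for illustration; they are not any real player's numbers. The on-base percentage function uses the standard denominator of at-bats + walks + hit by pitch + sacrifice flies, and the ERA function assumes innings pitched is passed as a true decimal (so one-third of an inning is roughly 0.333, not the ".1" you see in a box score).

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats, rounded to three decimal places."""
    return round(hits / at_bats, 3)


def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sac_flies: int) -> float:
    """Times on base divided by (AB + BB + HBP + SF), rounded to three places."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return round(times_on_base / opportunities, 3)


def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings, rounded to two decimal places."""
    return round(9 * earned_runs / innings_pitched, 2)


# Hypothetical stat line, invented for this example only.
print(batting_average(hits=150, at_bats=500))
print(on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5))
print(earned_run_average(earned_runs=70, innings_pitched=210.0))
```

Each function is a single line of arithmetic on numbers you can read straight off the back of a card, which is the whole point.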

Plainly speaking, the old stats are not a guess or an approximation, but are accurate truths, which is why some of the “newer” Sabermetrics like BABIP do in fact qualify as stats because they are objective and quantifiably accurate – there is no approximation involved. All that said, the basic rule of thumb should be that if your stats are not easily calculated by a 10-year-old baseball fan looking at the back of his favorite player’s card, you are missing the point; and if in order to make your “stat” work you need to create a complex and complicated formula, you are trying too hard!

Then there is, of course, the issue with the concept of a “replacement player,” because in the end, I don’t believe anyone can satisfactorily explain what a “typical replacement player” is, although they will try. Experience has taught me there is no such thing as a typical replacement player; every player is different, and any day your “average” AAA or AAAA replacement player can become hot or cold, succeed or fail – and some who are seen as replacement players end up becoming perennial All-Stars! Replacement players can range from Willie Harris and Shane Robinson to someone like Rajai Davis, or Andrew Velazquez; each player is different and in the blink of an eye can put up all kinds of varied numbers (2007 Shelley Duncan is a prime example of this). Even trying to take a baseline average found among upper minor leaguers and so-called quadruple-A players is unsatisfactory, given that baseball is such a crazy game that literally anyone can be a hero on any given day or for a prolonged period of time – and they want to try and quantify and compare that using a single number?

I understand the desire to be able to define a player’s value in singular terms, it’s a laudable – if not somewhat lazy – goal, but it misses the point of baseball statistics. The traditional, “back-of-the-card” stats are a rich and beautiful tapestry meant to be looked at as a whole story, defining how good a player is, and if we’re being honest, the vast majority of us have always been able to look at Ken Griffey Jr.’s counting stats and used the eye test if we watched him live, to know exactly how valuable he was to his team or compared to, say Paul O’Neill (who was no slouch himself). Where there was a question due to comparable stats and eye results, that’s where the fun debates existed!


“The thought of WAR blows my mind, WAR has caused unrest within the younger generation…”


I mentioned previously that there was a point a few years after the 2012 MVP debate where all this became crystal clear to me, and that occurred when Michael Fulmer won the 2016 AL Rookie of the Year over Gary Sanchez, mostly due to WAR. It didn’t matter that Sanchez’s stats were in many ways better in less than half a year than most catchers put up in a full season, it didn’t matter that Fulmer – while good – didn’t exactly put up ace numbers, nor did it matter that Sanchez, by nature of being a catcher, played slightly more than twice the number of games in less than half a season than Fulmer did in a full one; Fulmer still won, and WAR was the driving factor in essentially all the explanations. In one interaction I observed, it was said “Fulmer had a WAR of 4.9, Sanchez’s WAR was 3.0, Fulmer was obviously better.” But was he really? No, he wasn’t, and trying to determine which player is better than the other by using WAR is nonsensical, simply because by definition that is not what WAR calculates.

Remember, WAR’s raison d’être is meant to discover how much better a starter is over a replacement player, not a fellow starter or star. Even assuming WAR is correct and quantifiable, all these numbers show is that Fulmer provided 1.9 more wins over a replacement minor league starting pitcher than Sanchez did over a replacement minor league catcher; it does not tell us how many more wins Fulmer contributed to the Tigers than Sanchez did to the Yankees, and that’s because that calculation falls outside the definition and purpose of WAR. Even at face value, Sanchez would seem to have a much greater window of impact than Fulmer due to the fact that he played in 53 games to Fulmer’s 26 and was able to help the Yankees win games on both sides of the ball.

Theoretically, one of the best ways to figure out who was more valuable to his team compared to the other is to literally go through every box score of every game they played and won (and maybe even lost), and determine the impact their performance had on that win/loss, and while that sounds like an exhilarating task, to quote Kimberly “Sweet Brown” Wilkins, “Ain’t nobody got time for that!”

In essence, WAR to the game of Baseball, is akin to if there were two jars of jellybeans that needed to be sold, but there was a dispute over how to calculate the proper price at which to sell.

Imagine momentarily if you were to sit down one traditionalist and one Warmonger (sorry, had to), and place before them the two identical jars of jellybeans; they are the same size and shape, have the exact same number of beans inside them, and the beans inside are all fresh, equally sized, proportioned and flavored the same. The Sabermetrician judges the size and shape of the jar and derives a complex and rather subjective analysis through which he approximates that there are 285 jellybeans in the jar, thus making the price $5.50. The traditionalist on the other hand uses basic arithmetic, or just simple addition to count out each and every jellybean and finds that there are really 220 jellybeans in each jar and that the price should truly be $3.00.

Now, the Sabermetrician was somewhat close with his approximation, but since his method wasn’t fully accurate, he ended up miscalculating and overvaluing his jar and in so doing, ripped off his customer; as opposed to the traditionalist who used a tried, true, and quantifiable method to derive the exact value of his jar for public sale. This is what is essentially happening with the use of WAR to try and determine overall player value in today’s game, except that the Sabermetrician goes even one step further and tries to determine how much more valuable his Jellybeans are over a replacement candy, like Mike and Ike (as if such a replacement for Jellybeans ever truly existed)!

It often feels like most Sabermetrics, and WAR in particular, is essentially the embodiment of how CBS’ The Big Bang Theory treated theoretical physics. In the show, Dr. Sheldon Cooper and other scientists had to invent all-new dimensions, definitions, and scientific understandings in order to simply make the math work so that they could rationalize and “prove” their theories, whereas Sabermetricians have to invent complicated and arduous formulas just so that they can attempt to justify an approximation that redefines the way we look at baseball and supposedly renders the ways of the past outdated.


“Induction then destruction, who wants to die? Oh…”


At best, if we want to use WAR to compare two players against each other (and we REALLY shouldn’t), then we should only be comparing players who play the same position as each other, i.e. 2B vs 2B, instead of for example, 2B vs. DH, as is shown in the graphic below comparing Designated Hitter David Ortiz’ and Second Baseman Craig Counsell’s 2005 seasons:

I’m sorry, I know David Ortiz was “just a DH,” and you “adjust for the position because DH will give you more offense,” but if we are being 1000% honest, that doesn’t make up the difference! Additionally, Craig Counsell could have literally had the best fielding year in the history of mankind, heck, he could have had a better fielding year than when crop rotation was discovered, and it still wouldn’t have made him a more valuable player than Papi in 2005, or ever. The idea that Craig’s hitting, fielding, and positional adjustment provided as many or more wins (given that he had 0.3 more) than David produced is, frankly, asinine.

The same can be said for Miguel Cabrera having a lower career WAR than Pee Wee Reese (fielding and position be damned):

But even if we were to compare the same position, Starting Pitcher, the point still stands; take a look at Kevin Appier vs Sandy Koufax:

I beg every levelheaded observer to please look me in the eye and try to tell me with a straight face that Kevin Appier was more valuable to his team and provided them more wins than Hall of Famer Sandy Koufax did for his. You can’t because it would be inherently dishonest. Diving further into the example of these two pitchers, think about the context of their careers; Appier had 13-14 seasons as a full-time starter (not counting injury years), whereas Koufax had only 9 such seasons, and almost half of those nine seasons included his coming out of the pen for a number of games; Appier, simply put, had more opportunity to accumulate WAR than did Koufax, and yet, not only is Koufax remarkably close in career WAR, but any objective observer could tell you his career was markedly better than was Appier’s. I could even, reasonably, make the case that Appier had only five seasons in which he was worth being in the rotation at all out of the parts of sixteen seasons he played; that number for Koufax? Six of twelve; and I didn’t need WAR to tell me any of that.

This is why using WAR to compare the value of two starters and/or stars is a fool’s errand, and if you are going to use it at all, at least do so in its stated context of starter vs replacement player (which, as covered before, is foolish and unreliable as well). I do have more to say on comparisons between players in order to bring this full circle, but first, we must address WAR in the context it was originally intended to be discussed once more and see just how ridiculous relying on it truly is.

Let us consider the cumulative WAR totals for a few Hall of Fame/Hall of Fame-bound players, starting with two greats from the hottest rivalry in baseball history, Derek Jeter and David Ortiz, who own career WAR of 71.3 and 55.3 respectively. Now I understand that WAR can sometimes consider fielding (maybe even a bit too much) and that Jeter was not in the upper echelon of fielding Shortstops (though certainly not a liability), and that Ortiz as a Designated Hitter had almost no defensive prowess to speak of, but the idea that Jeter has only a 71.3 career WAR and Big Papi a 55.3 career WAR is ridiculous. Do you mean to say that two no-doubt, first-ballot Hall of Famers only provided that many wins over an “average” AAA/AAAA replacement player over the course of twenty-year careers?

Even assuming WAR is a quantifiable stat, those numbers are both too low for men who put together the careers that they did, which included a combined eight world championships and 24 seasons in which their teams made the playoffs. You would be hard-pressed to find many, if any, seasons in which Jeter and Ortiz did not play a significant role in getting their teams to the postseason, either by driving in runs, scoring runs, making amazing and clutch defensive plays or serving as protection behind other hitters who would get better pitches to hit by the pure virtue of them being in the lineup.

The Captain is top 10 all-time in hits (#6 overall), and Papi is top 20 in home runs (#17 overall), and even putting aside that DH is a relatively new position in the history of the game, the same cannot be said about Shortstop. There are twenty-four shortstops in the Hall of Fame and Derek Jeter is arguably better than all but perhaps Honus Wagner and Ernie Banks, yet he supposedly only provided for slightly less than 72 wins over an average replacement in 2747 games? Less than 72 more than the aforementioned Andrew Velazquez or one of my all-time favorite scrubs, Augie Ojeda? That is just unreasonably false. Heck, consider what Oswaldo Cabrera is doing for the Yankees to start off the year, and he wasn’t even certain to be a bench player coming into the season; he very easily could be categorized as a “replacement player,” yet he has been an indispensable starter in DJ LeMahieu’s absence.

This brings us to the promised full-circle moment, as once again, Mike Trout enters the picture. Now, I want to make sure I am absolutely clear here that I am not trying to slam Mike Trout; I think he is a great player, a future Hall of Famer, and seems to be a pretty good guy who is getting off to an unreal start this year, but – especially because of WAR – he is also by far the most overrated player of all time, and has been for almost his entire career. For many years now, I’ve heard Mike Trout being called “the best player in the history of baseball” unironically, even when (at the time) he had only played as little as eight years in the big leagues! Why? Because of WAR, of course.

Through his first eight full seasons (so not counting his 40-game cup of coffee in 2011), Mike Trout racked up a career WAR of 64, meaning that – if we truly believe judging a star player’s worth compared to another’s by WAR is legitimate – in two fewer seasons than you are required to play to even be eligible for Hall of Fame consideration, Mike Trout was only roughly seven wins shy (a typical year’s worth for him) of what a Hall of Famer and one of the greatest shortstops of all time in Jeter, put up in TWENTY? Even now, despite having only played in parts of fourteen combined seasons, Trout has surpassed Jeter by fifteen total wins and counting. Again, through his eighth season, Trout was already more valuable than Hall of Famer Mike Piazza (59.5) and Hall of Famer Ortiz (55.3), despite playing less time than both; and while I can appreciate great fielding, I’m hard pressed to believe it would make that drastic of a difference.

Do you want to take great fielding into account? Okay! Currently, Mike Trout has a career WAR of 86.5 and counting (+0.2 since I first started writing this, no joke), about twenty-one games into his fourteenth season, and yet Hall of Famer Ken Griffey Jr. who played twenty-two seasons (and like Trout has dealt with his fair share of injuries) only has an 83.8 career WAR. It is taking everything within me not to chuck my computer across the room while screaming “HOW!?!?!?!” Even if you want to argue that Griffey’s supposed replacement players were better during the steroid era than Trout has had available (debatable for a variety of reasons), you cannot tell me that the guy with 630 HR, 1836 RBI, 2781 H, 524 2B, and a .284 average (with a great glove), was less valuable through twenty-two seasons than the guy with 374 HR, 948 RBI, 1639 H, 311 2B, and a .301 average (with a great glove) has been in less than fourteen. Nope, not going to happen; there is a greater chance that the moon is made of cheese or that I will wake up tomorrow and find my eyesight fully healed than for that to be the case.

PHOENIX, ARIZONA – MARCH 07: Hitting coach Ken Griffey Jr. #24 of Team USA talks with Mike Trout #27 during a practice ahead of the World Baseball Classic at Papago Park Sports Complex on March 07, 2023 in Phoenix, Arizona. (Photo by Christian Petersen/Getty Images)

Fun fact, you have to add together Trout’s first twelve seasons in which he played a game (which includes his initial forty games, the pandemic-shortened season, and three years in which he suffered some sort of injury) for his stats to rival or equal Griffey’s first ten seasons, (which included both the 1994 strike year and an injury shortened season as well); but WAR would have you believe that Trout is somehow more valuable with an 82.3 WAR vs Junior’s 65.8. This is madness, and no amount of supposed talent availability in “replacement players” can explain such a discrepancy.

One final note on Trout; I was listening to an episode of the Chris Rose Rotation (a production of Jomboy Media), where Rose was interviewing one of the most feared sluggers of the 90s, Albert Belle, a man who frankly was so dominant that he probably would have hit 500 home runs and been a Hall of Famer had he not called it quits twelve years into his career. One of the things that Belle said in the interview that stood out to me was that his numbers through his first twelve years were actually better than Mike Trout’s, and yet they call Trout the GOAT while he gets almost no love; I checked it out and by golly, he was right! Belle was better than or at least similar to Trout in almost every counting stat category, including besting him in Home Runs, RBIs, Hits, and Doubles. Yet, Trout’s WAR through the same amount of time (and over one hundred fewer games played) stood at 82.3, whereas Belle’s career WAR is only 40.1.

Now, I’m not going to argue that Belle was a great fielder because he wasn’t, but are we truly going to be asked to believe that Trout’s fielding made him 42.2 more wins valuable than Belle over the same time period, and in fewer games? And if those numbers don’t take fielding into account, even with various “adjustments” being made, then considering just how similar the rest of their offensive statistics were, that should lead to even more doubt being cast regarding this comparison. Even if you think Trout was better during that time period, it most certainly wasn’t that drastic of a difference; and if you want to talk about actual potential GOAT players, just consider Mike Trout’s first 10 years compared to soon-to-be Hall of Famer Albert Pujols’:

But I want to leave this discussion on a high note. I don’t want to be all negative, baseball is fun after all!

You see, given that fans across the entirety of sports will often cry about how the official, be it Umpire or Referee, screwed up the call against their team or in favor of the opponent, thus causing their rooting interest to have victory unfairly ripped away from them; and considering it is all the rage to invent new statistics that try and quantify a person’s value or impact regarding wins, why shouldn’t I do the same?

That is why I, with a little help from my friend Dan, have determined that there must be a calculation that determines just how many of an Umpire’s missed calls, in either direction, impact the outcome of the game. This is why, ladies and gentlemen, boys and girls, children of all ages, I, Aaron Reale, am proud to present to you the most angelic of new statistics. I call it: WAH!

Or, to put it another way: Wins. Above. Hernandez.

Aaron is a Writer and communicator who has notably served on the communications team of the Westchester County Executive. Nicknamed "Mr. Baseball" in his youth, Aaron is a lifelong Yankee fan, Tino Martinez and Aaron Judge enthusiast, and a fierce defender of Craig Biggio's Hall of Fame worthiness. When he is not writing, or doing baseball related activities, Aaron is an avid foodie and culinarian. His non-baseball writing can be found at the Realety Check substack.
