David Sparks is the Arbitrarian. His column appears every Thursday here at Hardwood Paroxysm. This week’s stattastic column elaborates on the BoxScores measure he discussed previously. Your feedback is welcome in the comments, puny human… I mean… dear friends.

Two weeks ago, we explored a statistical estimator of value, BoxScores, which estimates player contributions to team success at the season level. Aside from the time-honored complaint that it doesn’t account for defense, there are at least two other improvements that I would like to make to the accuracy of this value estimator.

First is the problem of trades, and more generally, varying team success across the duration of the season. As it stands now, if player A is traded from team X to team Y in the middle of the season, his BoxScores are calculated by finding his PVC to team X’s entire season’s worth of MEV and multiplying that by X’s entire season’s worth of wins; then adding to that the same calculation for team Y to find the player’s season-cumulative BoxScores figure. This is good enough, for an estimate.
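To make the trade case concrete, here is a small Python sketch of the season-level calculation just described. It reads PVC as the player's share of a team's season MEV, which is my shorthand here rather than a restatement of the full definition, and every name and number in it is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stint:
    """One player's time with one team in a season (hypothetical figures)."""
    player_mev: float  # player's MEV accumulated with this team
    team_mev: float    # team's MEV over its entire season
    team_wins: int     # team's wins over its entire season


def box_scores(stints):
    """Sum (PVC * team wins) over every team the player appeared for."""
    total = 0.0
    for stint in stints:
        # Assumption: PVC is the player's share of the team's season MEV.
        pvc = stint.player_mev / stint.team_mev
        total += pvc * stint.team_wins
    return total


# Hypothetical player A, traded mid-season from team X to team Y.
player_a = [
    Stint(player_mev=600.0, team_mev=6000.0, team_wins=30),  # team X
    Stint(player_mev=500.0, team_mev=5000.0, team_wins=50),  # team Y
]
print(round(box_scores(player_a), 2))
```

Note that each stint is multiplied by the team's full-season wins, so the figure cannot tell whether those wins came before or after the trade.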

However, imagine if both teams are made significantly better by player A. Team X might be on pace for a very successful season up until the trade, and might begin to tank once he leaves. Team Y may have had an inauspicious start, but with the addition of player A, they might turn the season around. If this is the case, player A might be responsible for more success than his BoxScores indicate. Alternatively, similar situations can be envisioned in which much-injured players’ contributions are over- or under-estimated, since BoxScores (using season-level counting statistics) cannot account for game-level success and variations thereof.

Another problem is with comparability, especially comparisons of good players on bad teams to good players on good teams. According to BoxScores, Al Jefferson was less valuable in 2007-08 than was Andris Biedrins. This could be true, but it could be that while Al Jefferson did more every game to help his team win, he could not, (essentially) alone, carry his team enough to get very many wins. The point was made by a commenter on a previous post that a team of Michael Jordan and eleven pre-schoolers would never win an NBA game, though Jordan could be incredibly productive. BoxScores, multiplying productivity by success, would assign Jordan and his eleven weaker teammates the same value: 0. This is certainly an extreme example, but it highlights a possible shortcoming in the BoxScores methodology–wins are discrete, binary events. Either a team wins a game, or it does not. Regardless of whether the score was 101-100, or 130-70, a win counts the same.

The solution, in the form of a more specific metric

The appeal of BoxScores has been (among other things) that it can be applied to every professional basketball player, because season-level box score stats are very widely available. The downside to a more specific, game-level estimator is that the increased accuracy comes at the cost of universality: Game-by-game box score statistics are only available going back to the 1986-87 season. Nevertheless, here I will develop a value estimator that works at the game level, to give us an even more accurate picture of just how much each player contributes.

For each game, we first must calculate each player’s MEV. (See this post for a very detailed description of how this is done.) Then we calculate each player’s Marginal Victories Produced (MVP):

MVP = Player MEV / total MEV sum for both teams

As you can see, in each game, there is a total of 1.00 MVP to be allocated. Each individual’s contribution to the total production in the game is considered their Marginal Victory Production. This way, players on losing teams can be seen as producing valuable contributions–they might be valuable enough to get their team right to the cusp of victory–and this value shows up in MVP (but not in BoxScores).
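As a rough sketch of the allocation described above, the calculation could be written in Python as follows. The function name, team labels, and MEV figures are all made up for illustration and do not come from the data set.

```python
def game_mvp(mev_by_team):
    """Split one game's 1.00 MVP among all players in proportion to MEV.

    mev_by_team maps team name -> {player name -> MEV for this game}.
    Returns {(team, player) -> MVP}.
    """
    total_mev = sum(sum(players.values()) for players in mev_by_team.values())
    if total_mev <= 0:
        raise ValueError("total MEV must be positive to allocate MVP")
    return {
        (team, player): mev / total_mev
        for team, players in mev_by_team.items()
        for player, mev in players.items()
    }


# Hypothetical game: every name and number here is invented.
example = {
    "Home": {"Guard": 30.0, "Forward": 20.0, "Center": 10.0},
    "Away": {"Guard": 25.0, "Forward": 10.0, "Center": 5.0},
}
mvp = game_mvp(example)
print(round(mvp[("Home", "Guard")], 3))  # 30 / 100 of the game's production
print(round(sum(mvp.values()), 6))       # all shares together make one game
```

Keying the result by (team, player) is just a convenience so that two players with the same name on opposite teams do not collide.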

Here is an example of MVP calculated for a game on April 11, 2008, between the LA Lakers and New Orleans Hornets:

The Lakers won, 107-104. Total MEV for the Lakers was 110.7, and for the Hornets, it was 106.0, so the Lakers’ total MVP allocation was 0.511, versus the Hornets’ 0.489. If we were focusing on wins and losses alone, the Lakers would get 100% of the credit for this game. Arguably, though, the Hornets produced something of value here–they got within four points of winning, and thus MVP is a much more accurate estimator of value.

One interesting way to think of MVP numbers is to note that a team needs a total of at least 0.5 MVP to win a game.¹ Thus, in the game detailed above, Bryant got his team almost a third of the way to the win (0.165 MVP), Paul/Chandler/Stojakovic together got their team 2/3 of the way to a win (0.335 MVP), etc.
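The arithmetic behind these figures is simple enough to check in a few lines of Python. The team MEV totals and the MVP values are the ones quoted above; the function names and the 0.5 constant's name are just my labels for this sketch.

```python
WIN_THRESHOLD = 0.5  # a team needs at least half of a game's MVP to win


def team_shares(mev_a, mev_b):
    """Return each team's share of the game's total MEV."""
    total = mev_a + mev_b
    return mev_a / total, mev_b / total


def fraction_of_win(mvp):
    """How far a given amount of MVP gets a team toward the 0.5 needed."""
    return mvp / WIN_THRESHOLD


lakers, hornets = team_shares(110.7, 106.0)
print(f"Lakers {lakers:.3f}, Hornets {hornets:.3f}")
print(f"Bryant: {fraction_of_win(0.165):.2f} of the way to a win")
print(f"Paul/Chandler/Stojakovic: {fraction_of_win(0.335):.2f} of the way")
```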

MVP value at the season level

To estimate a player’s value for the duration of a season or career, we need only sum their game-level MVP. One nice property of MVP is that the sum total of MVP is equal to the total number of games played–the “value of each game” is divided among each participant, so that all games are accounted for in their entirety. Further, team season-total MVP can be translated to wins and losses by a method similar to the Pythagorean win projection (more on this sometime in the future). How many marginal victories did your favorite players produce? See below…
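Here is a minimal Python sketch of that summation, along with a check of the property that league-wide MVP equals the number of games played. The three "games" and their players are invented, and each game is assumed to already be expressed as player-level MVP shares that sum to 1.

```python
from collections import defaultdict


def season_mvp(games):
    """Sum game-level MVP per player.

    games is a list of dicts, each mapping player -> MVP in one game.
    """
    totals = defaultdict(float)
    for game in games:
        for player, mvp in game.items():
            totals[player] += mvp
    return dict(totals)


# Hypothetical three-game "season"; each game's shares sum to 1.
games = [
    {"A": 0.30, "B": 0.25, "C": 0.45},
    {"A": 0.20, "B": 0.40, "D": 0.40},
    {"C": 0.55, "D": 0.45},
]
totals = season_mvp(games)
league_total = sum(totals.values())
print(round(totals["A"], 2))   # player A's season MVP
print(round(league_total, 6))  # matches the number of games played
```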

The first tab (“07-08 MVP”) lists the total number of MVP for each player last season. I would argue that this is a valid way of identifying the league Most Valuable Player. Just as in the BoxScores rankings, Chris Paul comes out on top, followed by LeBron James and Kobe Bryant. However, the differences in the two estimators can be instructive. According to BoxScores, Al Jefferson is the 74th most valuable player–by MVP, Jefferson is 11th, just behind the player for whom he was traded, Kevin Garnett. Dwyane Wade moves from 188th most valuable (BXS) to 67th (MVP) for his injury-shortened season. Good players on bad teams are not “punished” for having low-quality teammates. Rather, everyone is rewarded based on their contributions to competitiveness, even if that competitiveness doesn’t result in winning every time.

The “86-08 MVP Seasons” tab lists just that–the most valuable seasons from my limited dataset according to MVP. Unsurprisingly, Jordan dominates this list, along with other modern luminaries. Keep in mind that the MVP number is not a number of wins–it’s “Marginal Victories”–but also keep in mind that teams need only 0.5 total MVP in a game to win it. One way, thus, to look at season-total MVP numbers is to say that, for example, Jordan in 87-88 contributed enough MVP to help his team win the equivalent of about 27 (13.52 / 0.5) games. Bear in mind, though, that this is just an interesting shorthand, because summing this figure for each team will not come close to matching that team’s win total. If Jordan had accumulated 0.5 MVP in each of 27 games, and sat out the rest of the season, his team would have won each of those games, and he’d be credited with 13.5 MVP. However, Jordan played for a whole season, and accumulated MVP in pieces (never as many as 0.5 at a time–no player won any game “single-handedly”), so the 27 win estimate is interesting, but not literal.
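A quick Python sketch shows both the shorthand and why it should not be taken literally. The 13.52 figure is Jordan's 87-88 total quoted above; the 82-game team that earns exactly half of every game's MVP is a hypothetical I am introducing purely for illustration.

```python
WIN_THRESHOLD = 0.5  # MVP a team needs in a single game to win it


def win_equivalents(season_mvp):
    """Shorthand: season MVP expressed in units of 'one game's worth of winning'."""
    return season_mvp / WIN_THRESHOLD


print(round(win_equivalents(13.52), 2))  # Jordan, 87-88: about 27

# Hypothetical team: 82 games, taking exactly half of each game's MVP.
team_season_mvp = 82 * 0.5
print(win_equivalents(team_season_mvp))  # far more than the team could plausibly win
```

Summed across that hypothetical roster, the shorthand yields 82 "win equivalents" for a team that split production evenly in every game, which is why the per-player figure is best treated as a way of reading the scale and not as a count of wins.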

The final tab, “86-08 MVP Careers,” lists the most valuable players during the period covered by the data set. Thus, many of Larry Bird’s and Magic Johnson’s best years are excluded, as are the first years of Jordan, Olajuwon, Stockton, etc. since they came prior to the 86-87 season. This is important to keep in mind when viewing game-average MVP numbers. Larry Bird falls relatively low on the list in no small part because we’re comparing his later years to the primes of LeBron James, Chris Paul, and Dwyane Wade.

Bearing this in mind, the list is still highly instructive. Since it takes a total of 0.5 MVP to win the game, players from Jordan down to Garnett are generating at least a quarter of the value their teams need to win (0.125 / 0.5 = 0.25). Players with MVP over 0.1 are doing more than a fifth of the work needed to get a win, and since no player plays over a fifth of his team’s minutes, these are obviously some of the most valuable players–overrepresented in value relative to playing time. The rankings on this list are unsurprising, and read like a roll-call of the best players of the last 20 years. These are the guys around whom you’d want to build a team.

Greatest single-game performances

What’s the point of having a game-by-game data set if you don’t look at game-by-game value? Below is a list of the 100 most valuable performances of the 07-08 season, and the 500 most valuable performances from 1986-2008. These are the herculean efforts from which legends are made. Here we can see how MVP automatically adjusts for pace, and assigns value above and beyond MEV’s measure of productivity. Since MVP is a percent of total production, it makes no difference how fast the game is played, how long the game is, or how much is produced in total; contributions to winning are measured against the other players in the game. Also, since as opponents’ MEV decreases, a player’s MVP increases, the better the player contributes defensively (i.e. outside of his box score stats, but visible in the other team’s production), the better will be his MVP. The margin column, incidentally, indicates the final point spread in favor of a given player’s team. If it is negative, that player’s team lost the game.
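Both properties, the pace adjustment and the indirect credit for defense, fall straight out of the ratio. The Python sketch below uses invented MEV figures and a hypothetical helper name to show them.

```python
def mvp_share(player_mev, team_mev, opp_mev):
    """Player's MVP: his MEV as a share of both teams' combined MEV."""
    return player_mev / (team_mev + opp_mev)


# Hypothetical baseline game.
base = mvp_share(player_mev=30.0, team_mev=100.0, opp_mev=100.0)

# Same game played 20% faster: everything scales up, the share does not move.
fast = mvp_share(player_mev=36.0, team_mev=120.0, opp_mev=120.0)

# Same box score for the player, but the opponent is held to less production.
stingy = mvp_share(player_mev=30.0, team_mev=100.0, opp_mev=80.0)

print(round(base, 4), round(fast, 4), round(stingy, 4))
```

The third case is the defensive one: nothing in the player's own line changed, yet his share rises because the opponent produced less.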

Both lists are topped by players you might expect to be there, but interspersed are some surprises: John Salmons? Willie Burton? It goes to show that on any night, any player can be a hero, and that a single sample can be very misleading. Nevertheless, there is a lot of data to be gleaned here. Note that the best games played see players generating over a third of the total production, which gets their team 2/3 of the way to a win. Not even the greatest can win completely on their own.

I’d like to digress here briefly, on the subject of Kobe’s 81 point game. Note that he produced about 1/3 of the total valuable contributions in that game, but look at his MEV: 68.96. That means that by missing 18 field goals, and doing very little other than shooting, he cost his team about 12 points in the final margin. The Lakers still won by 18 points, but to me the 81 point achievement is somewhat underwhelming, because of what it took to get there. Edit: Apparently, you put in one little paragraph about Kobe Bryant, and it makes your whole post about Kobe Bryant… All I’m trying to say here is that Kobe, by missing 18 shots (and turning the ball over, while not doing a lot of rebounding or box score defending) cost his team a few points. Most players couldn’t dream of generating 69 points, and this is an impressive feat, but also, most other players don’t even take 18 shots (doing so would put them in the 94th percentile of all games in the data set). All I’m saying is that it might be somewhat less impressive than some of the others on the list, like, for example, Jordan’s incredible performance against Cleveland.
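For anyone who wants to see where the "about 12 points" comes from, it is just the gap between MEV and points scored. A tiny Python sketch, using the two figures quoted above and a helper name of my own choosing:

```python
def mev_minus_points(mev, points):
    """Gap between a player's MEV and his raw point total.

    A negative value means missed shots, turnovers, and the like outweighed
    his non-scoring contributions in the box score.
    """
    return mev - points


gap = mev_minus_points(mev=68.96, points=81)
print(round(gap, 2))  # roughly -12
```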

The future

In the future, I plan on developing an approximation of MVP based on season-level statistics, for those seasons in which game-by-game data is unavailable. Next week, I am planning on applying some of the methods discussed here to the performance of the US Men’s Olympic basketball team. Today, I have three requests for you: First, please leave insights or any questions you might have in this post’s comments. Second, please take a moment to fill out the survey below with your thoughts, ideas, and criticisms. Third, if you found this post interesting, click the little “Buzz up!” button below, to express your approval.

¹ Game-total MEV margins correlate with game-level point margins at 0.947, and looking at MEV winners correctly classifies actual point winners 92% of the time.



16 Comments

  1. wondahbap says…

    Arbitrarian???

    Re:” That means that by missing 18 field goals, and doing very little other than shooting, he cost his team about 12 points in the final margin. The Lakers still won by 18 points, but to me the 81 point achievement is somewhat underwhelming, because of what it took to get there.”

    See. This is why I have a problem with these nonsense formulas. Did you realize that the Lakers were down 18 when Kobe started his scoring binge? Do you also know that he was 28 of 46? That’s a pretty good fg%. I know you can figure that out. Instead of “calculating” with your computer that Kobe cost his team 12 points, maybe you should use your EYES and see that, instead, he helped his team score 36 more points than the Raptors, and 18 more when the buzzer sounded. You want numbers to tell you what the 81 pts meant. I’m sure you want to say that if he didn’t score that much, other teammates would have scored more. But there lies the problem, they weren’t. That’s why the Lakers were down 18.

    These formulas you keep posting every week prove nothing except for the fact you may want John Hollinger’s job. I’d like to assume that you are smart enough to throw that trash paragraph in there to see if you get a response, because as you can see, Laker Nation has planted a flag here on HP. Well, trust me, more emails will come from others. You want to come up with absolutes, go ahead, but use logic also.

  2. dsparks says…

    wondahbap: Obviously, Bryant’s 69 MEV is a massive figure–perhaps I should have been clearer in the statement you quote–Bryant added 69 points to his team’s final margin, which is staggering, and very impressive. I only meant to note that had Bryant scored those points on fewer shots, or contributed substantially in ways other than scoring, the margin could have been even greater (not that such was needed).

    So, having established that Mr. Bryant’s contribution was substantial, I would like to direct your attention to the “86-08 MVP Highs” list. Bryant has two games in the top ten, which attests to his value. However, if you would be willing to consider my “nonsense formulas” for a moment, compare each player’s MEV-pts. I think that you will find, at least among the top of the list, that the lowest this ever gets is -3 or -4. Except for Kobe’s big games, in which the figures are around -12 and -8.

    In fact, if you sort the list of 500 by this measure (MEV-pts), Kobe occupies four of the bottom 13 spots. I will concede that to fully accept some of these observations as valid, one needs to agree on the premises, which you obviously do not, but surely you would be willing to tentatively agree that when Bryant has high-scoring games, his other contributions tend to suffer.

    Incidentally, HP welcomes all members of Laker Nation, as well as any other nation who wishes to “plant a flag” here. Perhaps we shall receive not only your emails, but also your comments. Also, for the record, I do think Bryant is a good and valuable player, and I don’t think that statistics are the only way to observe sports.

  3. wondahbap says…

    David,

    Thanks for responding. Sorry to sound so hostile. I didn’t want to turn it into a “Anti-Kobe-Laker bashing” rant. It’s just that I feel that formulas like this cannot possibly determine the value of a player in any given game. Yes, he didn’t do a lot of the other things, but who’s to say that’s what the team needed to get the win. I’m not concerned with how high Kobe may rank on your list. I wasn’t coming from that angle. It wasn’t a Kobe thing. Your mention of the 81 point game just highlights my opposition to your methods of evaluation. I feel it just leads to hypotheticals. Who’s to say contributing in other areas in “that” game, would’ve gotten it done?

    My Laker Nation comment was only a reference to the seemingly growing pro-laker fan base here, and contributions (knock on wood….I guess a Kobe Day Blog will do that) that I’m sure will raise an eyebrow when they read your article.

  4. dsparks says…

    wondahbap: Thanks for your reply. You are right that using methods like the ones I employ here takes away a large part of the specificity of the game situation. This can be both a good and bad thing: Good, because it may help us avoid subjectivity, but bad, because it may lead us to overlook important facts about the situation under investigation.

    In this case, I think that a healthy dose of subjective observations, specifically cited statistics, and more general analytics can give a much more complete view than any of those individually. It is valid to point out that contributing in other ways may not have been what was needed in that particular game, although it could just as easily have been the case that such contributions would have obviated the need for the scoring explosion.

    However, you are correct to suggest that we try to avoid many of these hypotheticals. Thank you for your commitment to keeping me honest here. If it weren’t for reader comments, I’d be totally out of control, making unsubstantiated claims or positing un-empirically verifiable theories left and right!

  5. FreeCashFlow says…

    Arbitrarian rocks.

    Everyone can become a more efficient player, even Kobe Bryant, who I regard as the best player in the NBA.

  6. Anonymous says…

    It is true that many things are potentially valuable to a team but ultimately the scoreboard only counts points.

  7. Anonymous says…

    this is completely ridiculous

  8. Anonymous says…

    I really dont understand how you are able to mash up some numbers and with those you are able to ASSUME this,that and the third…..doesnt anyone like the regular ol ppg rpg apg etc ….nice and simple….you guys really put to much thought on these stats ..nothing better to do i guess …but im assuming you are a very smart man because i dont understand half of what you said ….also i found it shocking for you to say an 81 point game is somewhat underwhelming and how it took to get there ….explain this please maybe i missed it

  9. Anonymous says…

    It might be interesting to break that 81 point game down by halves if the stats can be found to do that (I guess you could piece them together from the play by play).

    In the first half of that game, Kobe had 26 points, 0 assists, 1 rebound and 1 steal and the lakers trailed by 14.

    In the second half, Kobe had 55 points, 5 rebounds, 2 assists, 2 steals and 1 block. It certainly can be argued that Kobe didn’t do enough of the other things in the first half and that contributed to the deficit and then he did those things in the second half along with the points and got them the 18 point win.

    He didn’t say that Kobe had a bad game, he simply said there are ways to contribute more to a win by doing things other than scoring points. In 22 years of games, he came up with just 7 games that ranked ahead of Kobe’s.

    What I’d like to see is this methodology applied to playoff games.

  10. Anonymous says…

    Looks decent..from a certain perspective

    CP3 and Lebron over Kobe? Sounds about right..he didnt deserve that MVP. He was more valuable to his team during the year he got 81 points than the 07-08 season

    John Salmons is a pretty good player from what I hear..not a star..but good enough to have big games. He should have a better season with Artest out of the way

  11. Anonymous says…

    What are you talkin about (the dude above me). You do realize more goes into getting the mvp besides stats right? Lebron should have been 4th in the voting (which is what he was). Wins and losses are very important in determining the MVP, thats why KG was 3rd and gave Kobe the edge over Paul.

  12. Jason says…

    Arbitrarian, this metrics progression just gets more and more interesting, but I want to interject again that a specific stat is not always reflective of ability.

    It’s nitty-gritty as great rebounders, scorers, and set up men have the statistical consistency to always stay near the top of these lists, but on an individual basis, it bears knowing HOW people accrue their stats. When Rambis ran the court for a layup, was that as indicative of ability as when Kareem hit a tough skyhook? Were the assists Magic got on each of those plays equally valuable?

    Take Barkley and Malone for instance. They rank 9 and 10 on the MVP / gm list. The difference in MVP numbers is scanty. How much of it might be attributed to the higher ppg that Malone averaged, and how much of THAT difference do we attribute to Stockton? I understand that in terms of total production for the Jazz, Stockton is considered, but what about Karl’s individual scoring and fg%? He’s able to take fewer FGAs than he might if he had to create his own scoring opportunities more often rather than running the fastbreak and finishing the pick and roll play.

    Not to be down on the Mailman or up on Sir Chuck. I don’t really care who we consider better. It was just the easiest example I could find in the top 10.

  13. Anonymous says…

    Kobe got the edge over Paul because of W’s and L’s? Uh, the Lakers had 57 wins.. the Hornets had 56. It wasn’t that one win that made the difference, it was the voters wanting to give Kobe a lifetime achievement award… which he probably deserves, but not the MVP.

  14. OnTheBanksoftheRedCedar says…

    Another missed factor would be any given variation in teammates. Obviously better teammates provide more wins and more points that way… but if there are five Michael Jordans on a team, each Michael would only be allowed .10 MVP per game.

    Taking out one Michael and adding a Lamar Odom or another above average player would certainly make that player receive less of the allotted available points. Put Lamar Odom on a team of average players and Lamar would be vital to his team’s success and receive a much greater allotment of those available MVP points.

  15. Q says…

    Nice work David.

    Just curious and perhaps you will address this when you look at the US men’s basketball team…

    What are your thoughts on assigning credit in blowouts?

    I would argue that the players who help build a huge lead subjectively deserve more credit for the win than players who come in with the team up 20+. The same goes for players who finish out the game in garbage minutes when the opponents have probably given up or put their second unit in.

    This seems especially relevant to Team USA which had all-stars coming off the bench and putting up good numbers.

    It is probably less relevant in regular season NBA games when the second unit likely won’t see many minutes anyway.

    I can’t think of a way to account for this statistically without moment to moment numbers, but it becomes interesting if we’re trying to separate great players from average players.

  16. dsparks says…

    freecashflow, anonymous #4, and #5: You guys seem to get it; thanks.

    jason: You are right that this specific stat is not reflective of ability. Ability, I believe, would be almost impossible to measure with the statistics tracked by the NBA. Rather, I’m attempting to estimate value, which is different.

    I’m not sure how much of Malone’s success can be attributed to Stockton, but MVP divides credit among both of them. Since the Jazz won a lot of games, they typically had more than 0.5 of the total game MVP to divide amongst themselves; this helped both Malone and Stockton’s own MVP numbers. This is the way (scaling to success) that I try to account for interaction effects between players.

    onthebanksoftheredcedar: You bring up an interesting point, but I think MVP accounts for that. You are assuming that the five Jordans would only barely win (their team MVP would equal 0.5). I would contend, however, that a team of five Jordans would tend to dominate the MEV production to a much greater extent, regularly getting 60-75% (more or less) of the total share. Thus, each Jordan would have 0.12-0.15 MVP/gp, which would put them at the very top of the chart.

    q: I think what you’re looking for could only really be addressed with play-by-play data or plus/minus numbers. However, the players on the team who built the huge lead, to the extent that they generated a substantial portion of the MEV (relative to their own team’s bench unit and the players on the other team), would receive more credit for the win. I can’t/don’t control for the specific quality of the opposing players on the floor at any given time, but since I’m estimating value, and not ability/quality, I don’t think I need to do this.
