Last year, I presented a theoretical model of the talent distribution in MLB (and, actually, in the world at large).
Can we figure out what it actually is, and can we use that to figure out the replacement level?
Here are my quick thoughts on the matter...
What is the spread in talent among MLB players?
Just doing a quick analysis, and making some key assumptions, let's say we have:
- spread (1 SD) among nonpitchers = 18 runs per 680 PAs (includes offense and defense)
- spread (1 SD) among pitchers = 12 runs per 162 IP
- we have 9 nonpitchers and 9 pitchers, each playing equally
If the hitters are randomly distributed across teams, a team's nonpitchers as a group will have an SD of:
sqrt(18^2 * 9) = 54
and the pitchers at
sqrt(12^2 * 9) = 36
Anyway, assuming that pitchers and nonpitchers are also randomly distributed, the variances add, and that gives us:
sqrt(54^2 + 36^2) = 65 runs
That is, among a set of random teams made up of random players, 68% of the teams will have a true talent within 65 runs of the league average.
Using a 10.5 runs per win converter, that gives us 6.18 wins per 162 games, or 1 SD of .038 wins per game.
We would have expected .5/sqrt(162) = .039, which is the spread in win% from random variation alone over 162 games.
So, my quick model here is decent enough.
Note: I am aware that this does not model MLB exactly, and my theoretical model that I linked would make more sense. But, this is as far as I can go in 15 minutes and my current limited statistical abilities.
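The arithmetic above can be checked with a short Python sketch. It assumes, as the text does, that players are independent and that each of the 9 nonpitchers and 9 pitchers carries an equal share of playing time; the function and variable names are made up for illustration.

```python
import math

RUNS_PER_WIN = 10.5   # converter used in the text
GAMES = 162


def team_sd(player_sd, n_players):
    """SD of the sum of n independent players who all have the same SD."""
    return math.sqrt(player_sd ** 2 * n_players)


hitters_sd = team_sd(18, 9)    # 54 runs
pitchers_sd = team_sd(12, 9)   # 36 runs

# Independent groups add in variance, so square each SD before summing.
team_runs_sd = math.hypot(hitters_sd, pitchers_sd)   # about 64.9 runs
team_wins_sd = team_runs_sd / RUNS_PER_WIN            # about 6.18 wins
team_winpct_sd = team_wins_sd / GAMES                 # about .038

# SD of win% from chance alone for a .500 team over 162 games.
luck_sd = 0.5 / math.sqrt(GAMES)                      # about .039

print(round(team_runs_sd, 1), round(team_wins_sd, 2),
      round(team_winpct_sd, 3), round(luck_sd, 3))
```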
What does this tell us? Well, in this stylized example, we have established exactly what the distribution of baseball talent is! Furthermore, if you decide that the "replacement level" is, say, the cutoff for the bottom 16% of all players (1 SD below average), then we also know exactly what the replacement level is: 18 runs below average for nonpitchers, and 12 runs below average for pitchers. Or, if you want the replacement level to be the average of all players more than 1 SD below average, then... well, I'm sure someone has a handy function to tell us that. I'm guessing about 24 runs (per 162 GP) for hitters, and 18 runs (per 162 IP) for pitchers.
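As for the "handy function": if we assume talent is normally distributed (an assumption of this sketch, not something established above), the average of everyone below a cutoff is the mean of a truncated normal. A minimal Python version, with hypothetical names, looks like this. Under that assumption the average comes out near 1.5 SD below the mean, or roughly 27 runs for hitters and 18 runs for pitchers.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def mean_below(cutoff_z):
    """Mean of a standard normal restricted to values below cutoff_z (in SD units)."""
    return -STD_NORMAL.pdf(cutoff_z) / STD_NORMAL.cdf(cutoff_z)


share_below = STD_NORMAL.cdf(-1.0)   # about 0.159, the "bottom 16%"
avg_z = mean_below(-1.0)             # about -1.53 SD

# Scale by the player SDs assumed in the text (runs relative to average).
hitters_repl = avg_z * 18
pitchers_repl = avg_z * 12

print(round(share_below, 3), round(avg_z, 2),
      round(hitters_repl, 1), round(pitchers_repl, 1))
```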
***
The true distribution among teams is not random, as evidenced by the actual team win% having an SD of .080, meaning that the SD of true-talent team win% is .070 (that is, sqrt(.080^2 - .039^2), after removing the random variation).
***
Given the priors for player distribution, we should be able to come up with a distribution so that we end up with a true team distribution of .070. My question: how?
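One possible way to frame it, offered only as a sketch: keep the player SDs of 18 and 12 runs, but drop the independence assumption and suppose every pair of teammates shares a single common talent correlation (good teams tend to collect good players). That single-correlation setup is an assumption made here for illustration, not the theoretical model mentioned earlier, and the names below are hypothetical. The code solves for the correlation that moves the team SD from about 65 runs to the roughly 119 runs implied by .070.

```python
import math

RUNS_PER_WIN = 10.5
GAMES = 162

# Recover the true-talent SD from the observed SD and the luck SD.
observed_sd = 0.080
luck_sd = 0.5 / math.sqrt(GAMES)
true_sd = math.sqrt(observed_sd ** 2 - luck_sd ** 2)   # about .070

# Target team SD in runs, using the rounded .070 from the text.
target_runs_sd = 0.070 * GAMES * RUNS_PER_WIN           # about 119 runs

player_sds = [18.0] * 9 + [12.0] * 9
independent_var = sum(s * s for s in player_sds)        # 4212
# Sum of s_i * s_j over all ordered pairs with i != j.
cross_terms = sum(player_sds) ** 2 - independent_var    # 68688


def team_runs_sd(rho):
    """Team SD in runs if every pair of teammates has talent correlation rho."""
    return math.sqrt(independent_var + rho * cross_terms)


# Solve independent_var + rho * cross_terms = target_runs_sd ** 2 for rho.
rho = (target_runs_sd ** 2 - independent_var) / cross_terms

print(round(true_sd, 3), round(target_runs_sd, 1), round(rho, 3))
```

With rho set to 0 this reduces to the 65-run figure from the independent case, so the correlation is the only thing being changed.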