By the time a professional baseball player has played a few seasons of minor league ball, we're usually able to get a sense of how good a prospect he is. But determining what kind of player he's likely to become takes more than a cursory look at the numbers he's posted at each minor league stop. For more precise and fruitful analysis, we need to contextualize and aggregate the data. Once we do that, trends emerge.
Take the case of a journeyman minor leaguer like Simon Pond. Pond has had a varied career in three organisations, spending the last year and a half in the Toronto minor league system. Here is a list of the stops he's made in his pro career to date:
Montreal Expos
1994 | Gulf Coast Expos (R) |
1995 | Gulf Coast Expos (R), Albany (A) |
1996 | Vermont (A-) |
1997 | Cape Fear (A) |
1998 | Jupiter (A+), Harrisburg (AA/2 games) |
1999 | Jupiter (A+) |
2000 | Jupiter (A+) |
Cleveland Indians
2000 | Kinston (A+) |
2001 | Kinston (A+), Akron (AA) |
Toronto Blue Jays
2002 | Dunedin (A+) |
2003 | New Haven (AA), Syracuse (AAA) |
My approach to analysing professional players is to focus on core skills. I believe there are four core hitting skills that may be approximated from official batting stats. They are: 1) hitting for power, approximated by (2B+3B+2*HR)/(AB-K+SF); 2) line-drive hitting, approximated by ball-in-play average, (H-HR)/(AB-HR-K+SF); 3) drawing walks, measured as unintentional walks per opportunity, (W-IW)/(PA-IW-HBP); and 4) avoiding strikeouts, measured as K/(PA-IW).
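As a rough illustration, the four formulas above can be written out in a few lines of Python. This is only a sketch: the `BattingLine` class and its field names are hypothetical, and the sample line is made up for the example (it is not Pond's record). Plate appearances are passed in directly rather than derived, since the article does not say how PA is counted.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Official batting counts for one stint (field names are illustrative)."""
    pa: int       # plate appearances
    ab: int       # at-bats
    h: int        # hits
    doubles: int  # 2B
    triples: int  # 3B
    hr: int       # home runs
    bb: int       # walks (W), including intentional
    ibb: int      # intentional walks (IW)
    hbp: int      # hit by pitch
    k: int        # strikeouts
    sf: int       # sacrifice flies


def power(b: BattingLine) -> float:
    # (2B + 3B + 2*HR) / (AB - K + SF)
    return (b.doubles + b.triples + 2 * b.hr) / (b.ab - b.k + b.sf)


def bip_average(b: BattingLine) -> float:
    # (H - HR) / (AB - HR - K + SF)
    return (b.h - b.hr) / (b.ab - b.hr - b.k + b.sf)


def walk_rate(b: BattingLine) -> float:
    # (W - IW) / (PA - IW - HBP)
    return (b.bb - b.ibb) / (b.pa - b.ibb - b.hbp)


def strikeout_rate(b: BattingLine) -> float:
    # K / (PA - IW)
    return b.k / (b.pa - b.ibb)


if __name__ == "__main__":
    # Made-up season line, for illustration only.
    sample = BattingLine(pa=560, ab=500, h=150, doubles=30, triples=5,
                         hr=20, bb=50, ibb=5, hbp=5, k=100, sf=5)
    for fn in (power, bip_average, walk_rate, strikeout_rate):
        print(f"{fn.__name__}: {fn(sample):.3f}")
```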
The next step is to contextualize the performance. Since park data for nearly every level of minor league baseball is hard to come by, I normalize to the average performance in a given year in the appropriate minor league. Excepting some cases like El Paso and Colorado Springs (which figure to have a large impact on batting stats in relation to the other parks in that league), normalizing to league works well because whatever effect the home park has relative to the other parks in the league is counterbalanced by the road parks.
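One way to picture the normalization is as a percentage difference from the league average, with the sign flipped for strikeouts so that a positive number always means better than average. The exact formula is not given in this article, so the function below is an assumption, and the name `normalize` and the sample rates are hypothetical.

```python
def normalize(player_rate: float, league_rate: float,
              higher_is_better: bool = True) -> float:
    """Percent above (+) or below (-) the league average.

    Assumed form: (player - league) / league * 100, with the sign
    reversed for stats such as strikeout rate where lower is better.
    """
    if league_rate <= 0:
        raise ValueError("league_rate must be positive")
    pct = (player_rate - league_rate) / league_rate * 100
    return pct if higher_is_better else -pct


if __name__ == "__main__":
    # Made-up rates: a hitter with more power and fewer strikeouts than his league.
    print(round(normalize(0.150, 0.120)))                          # power
    print(round(normalize(0.150, 0.200, higher_is_better=False)))  # strikeouts
```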
The final step is to aggregate the data - to assemble it into larger, yet still meaningful, samples. Sample size is always an issue when evaluating a player, which is why every line of hitting data should include Plate Appearances.
In cases where a player spends parts of more than one season in a particular league or at a particular level, it may make sense to aggregate data by league and/or level.
League | Age | Years | PA | power | norm | $BIP | norm | walks | norm | K | norm | obp | slg |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SouthAtlantic | 20.4 | 95,97 | 574 | .048 | -59 | .295 | -2 | .070 | -15 | .124 | +39 | .318 | .309 |
FloridaSt | 22.4 | 98-00 | 961 | .101 | -12 | .279 | -9 | .082 | -5 | .161 | +8 | .322 | .348 |
Carolina | 24.0 | 00-01 | 374 | .170 | +43 | .371 | +24 | .084 | -5 | .164 | +18 | .388 | .500 |
Eastern | 25.5 | 01,03 | 700 | .165 | +23 | .325 | +9 | .089 | +13 | .149 | +20 | .367 | .469 |
Notes: Age is the average age, weighted by plate appearances; norm is the percentage above or below the league average performance [plus(+) always indicates a better than average performance, i.e. fewer strikeouts, more walks].
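The aggregation behind a row like those above can be sketched as follows. The PA-weighted age follows the note directly. Pooling a rate by summing the underlying counts across stints is an assumption about how the rows were built, since the article does not spell it out. Function names and sample numbers are hypothetical.

```python
from typing import Iterable, Tuple


def pa_weighted_age(stints: Iterable[Tuple[float, int]]) -> float:
    """Average age across stints, weighted by plate appearances.

    Each stint is an (age, plate_appearances) pair.
    """
    stints = list(stints)
    total_pa = sum(pa for _, pa in stints)
    if total_pa == 0:
        raise ValueError("no plate appearances to weight by")
    return sum(age * pa for age, pa in stints) / total_pa


def pooled_rate(parts: Iterable[Tuple[float, float]]) -> float:
    """Combine a rate stat across stints by summing counts (assumed method).

    Each part is a (numerator, denominator) pair, e.g. (K, PA - IW).
    """
    parts = list(parts)
    numerator = sum(n for n, _ in parts)
    denominator = sum(d for _, d in parts)
    if denominator == 0:
        raise ValueError("no opportunities to pool")
    return numerator / denominator


if __name__ == "__main__":
    # Made-up stints: 200 PA at age 19.5 and 300 PA at age 21.5.
    print(f"{pa_weighted_age([(19.5, 200), (21.5, 300)]):.1f}")
    # Made-up strikeout counts: 10 in 100 opportunities, then 30 in 200.
    print(f"{pooled_rate([(10, 100), (30, 200)]):.3f}")
```

Pooling counts rather than averaging the rates themselves keeps a short stint (such as a two-game call-up) from carrying the same weight as a full season.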
We can aggregate the data a bit more by grouping the Florida State and Carolina League data (since the two leagues are regarded as equivalent in terms of difficulty). The farther the data recedes into the past, the less relevant it is for current ability, so I will drop the South Atlantic League data from the following chart. It is also useful to separate out this season's performance, since Pond has hit much better than ever before.
Level | Age | Years | PA | power | norm | $BIP | norm | walks | norm | K | norm | obp | slg |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A+ | 23.5 | 98-02 | 1792 | .134 | +16 | .307 | +2 | .081 | -7 | .162 | +10 | .345 | .414 |
AA (pre 2003) | 24.6 | 98,01 | 432 | .166 | +22 | .296 | -1 | .068 | -9 | .166 | +14 | .319 | .440 |
AA/AAA (2003) | 26.7 | 2003 | 396 | .177 | +35 | .353 | +18 | .102 | +24 | .116 | +33 | .414 | .522 |
Simon Pond seems to have greatly improved his strike-zone judgement, displaying a large increase in his walk rate while striking out less often. He didn't do much of anything until he arrived in the Jays' organisation, which suggests that Toronto has had a hand in his dramatic improvement this year.
Nevertheless, the first and most likely hypothesis that presents itself is that Pond is playing over his head. If so, then he will ultimately be of little interest to the Blue Jays. If his performance in 2003 turns out to be legitimate, it suggests that Pond can be something like an average major league hitter (with defensive limitations, it should be noted). A September call-up would be a just reward for this minor league vet's perseverance.