Batter's Box Interactive Magazine

These park factors are for runs scored. They are derived from the ratio of runs scored per inning at home versus on the road, computed separately for batters and pitchers, after which the two results are averaged. The home park factors are then used to determine the road park factors for each team on an inning-by-inning basis. The final factors are arrived at by adjusting so that the league average is exactly 1. They have not been regressed towards the mean.
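
As a rough illustration of the first and last steps described above, the following Python sketch computes a home/road runs-per-inning ratio for batters and for pitchers, averages the two, and rescales so that the league mean is exactly 1. The function names and all run and inning totals are invented for illustration, and the road-park adjustment step is left out entirely, so this is a simplified sketch rather than the exact procedure behind the tables.

```python
def home_road_ratio(home_runs, home_innings, road_runs, road_innings):
    """Ratio of runs per inning at home to runs per inning on the road."""
    return (home_runs / home_innings) / (road_runs / road_innings)


def raw_park_factor(batting, pitching):
    """Average the batters' ratio and the pitchers' ratio for one team.

    Each argument is a tuple:
    (home_runs, home_innings, road_runs, road_innings).
    """
    return (home_road_ratio(*batting) + home_road_ratio(*pitching)) / 2


def normalize(factors):
    """Rescale so that the league average factor is exactly 1."""
    mean = sum(factors.values()) / len(factors)
    return {team: f / mean for team, f in factors.items()}


# Hypothetical totals for a three-team league (not real 2003 data).
league = {
    "Team A": ((420, 720, 380, 730), (400, 730, 370, 720)),
    "Team B": ((350, 725, 390, 725), (360, 725, 395, 725)),
    "Team C": ((380, 722, 385, 728), (375, 728, 380, 722)),
}

raw = {team: raw_park_factor(bat, pit) for team, (bat, pit) in league.items()}
factors = normalize(raw)

for team, pf in sorted(factors.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {pf:.3f}")
```

Dividing by the league mean is only one way to force an average of 1; it is used here because it keeps the ordering and relative spacing of the raw factors.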





NATIONAL LEAGUE, 2003 park factors for runs scored
Team                   Batter PF '03   Batter PF '02   Pitcher PF '03  Pitcher PF '02  Offence '03 (%)  Defence '03 (%)
Montreal Expos         1.126           .980            1.137           .981            -15.1            +15.4
Colorado Rockies       1.092           1.175           1.100           1.191           +5.7             -10.4
Arizona D-Backs        1.089           1.063           1.095           1.068           -12.5            +16.9
Houston Astros         1.047           1.060           1.049           1.063           +2.8             +14.0
San Francisco Giants   1.001           .938            1.000           .933            +2.6             +14.2
Milwaukee Brewers      .998            .985            .998            .982            -5.7             -16.4
Cincinnati Reds        .996            1.083           .996            1.087           -7.8             -18.9
Chicago Cubs           .982            .994            .981            .992            -1.9             +7.6
Pittsburgh Pirates     .981            1.058           .979            1.061           +2.2             -9.5
Saint Louis Cardinals  .980            .975            .977            .971            +18.2            -7.5
New York Mets          .978            .924            .977            .921            -11.3            -5.6
Atlanta Braves         .977            .999            .976            1.000           +24.6            -0.6
Philadelphia Phillies  .947            .907            .943            .903            +12.0            +1.1
Florida Marlins        .946            .968            .941            .967            +6.7             +1.6
Los Angeles Dodgers    .946            .939            .942            .935            -19.4            +21.8
San Diego Padres       .914            .950            .908            .945            -1.1             -23.6

AMERICAN LEAGUE, 2003 park factors for runs scored
Team                   Batter PF '03   Batter PF '02   Pitcher PF '03  Pitcher PF '02  Offence '03 (%)  Defence '03 (%)
Kansas City Royals     1.113           1.113           1.121           1.122           -4.2             +1.6
Texas Rangers          1.092           1.100           1.100           1.107           -3.8             -12.5
Boston Red Sox         1.056           .968            1.060           .968            +15.0            +4.6
Toronto Blue Jays      1.048           .990            1.051           .989            +9.4             -0.3
Minnesota Twins        1.021           .974            1.022           .972            -0.9             +9.4
Chicago White Sox      1.001           1.030           1.000           1.032           +1.7             +8.6
Tampa Bay Devil Rays   .991            .983            .991            .982            -9.0             -9.6
New York Yankees       .981            .967            .980            .965            +13.7            +8.5
Seattle Mariners       .974            .945            .973            .941            +4.7             +16.8
Baltimore Orioles      .959            .959            .956            .958            -2.7             -8.3
Cleveland Indians      .951            1.022           .947            1.022           -8.4             -3.1
Detroit Tigers         .946            .944            .941            .939            -22.2            -25.4
Oakland Athletics      .937            1.034           .932            1.036           +5.3             +12.4
Anaheim Angels         .929            .970            .925            .968            +1.3             -2.7

The final results are composite park factors - i.e. they can be applied to a player's overall stat line. For offence and defence ratings, plus (+) indicates the performance was better than league average; minus (-) indicates the performance was worse than league average.
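
One simple way to apply a composite factor is to divide a player's full-season run-based total by the factor for his team. The sketch below does this with the Blue Jays' 2003 batter factor from the table. The function name and the player's total are made up, and dividing a season total by the composite factor is an assumption about how the factor is meant to be used, not something spelled out in the article.

```python
def park_adjust(runs_created, park_factor):
    """Scale a full-season run total to a neutral-park context."""
    return runs_created / park_factor


TORONTO_BATTER_PF_2003 = 1.048  # from the AL table above

# Hypothetical season total for a Toronto hitter.
raw_runs_created = 110.0
adjusted = park_adjust(raw_runs_created, TORONTO_BATTER_PF_2003)

print(f"{raw_runs_created:.1f} runs created -> {adjusted:.1f} park-adjusted")
```

Because the factor is composite, it already reflects a schedule split between home and road games, which is why it is applied to the whole stat line rather than to home games only.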

2003 Park Factors | 36 comments
The following comments are owned by whomever posted them. This site is not responsible for what they say.
_Andrew Edwards - Wednesday, October 01 2003 @ 07:11 AM EDT (#89040) #
Cool, thanks.

A few strange results in there: Edison, Pac Bell, and Olympic are the most obviously contrary to expectations. This is why we use 3-year park factors when we can.

Still, generally consistent. SkyDome was a pretty strong hitter's park this year.
Dave Till - Wednesday, October 01 2003 @ 07:24 AM EDT (#89041) #
Are there enough numbers to compare the SkyDome with roof closed to the SkyDome with roof open? The "accepted wisdom" seems to be that the Dome is more of a hitter's park when the roof is closed.
robertdudek - Wednesday, October 01 2003 @ 08:00 AM EDT (#89042) #
Dave,

It's hard to get hold of data that separates SkyDome roof-open games from SkyDome roof-closed games. There are also a few cases where the roof was closed (or even opened) during a game.

Andrew,

3-year park factors have problems of their own. Every year there will either be new parks or existing parks will change in some significant way. That alters the park factors of ALL the teams in the league because a PF is always a measure in relation to the other parks.

In general, I don't think it's a good idea to apply park factors to a different player dataset. That is, if you want to look at what a player did over a three-year stretch it's okay to look at 3-year park factors, but for one-year performances using 3-year factors could create some problems (see above paragraph).

I think the best answer is to regress the 1-year park factors. I haven't had time to do that yet: I'd have to calculate the r value for general park factors over a sufficiently large number of years. I'll keep you posted.

Montreal is perfectly understandable, I think, when we remember that they played 22 games in San Juan this year. That has had an effect on all the NL park factors.
Craig B - Wednesday, October 01 2003 @ 08:33 AM EDT (#89043) #
Thanks, Robert. Tremendous stuff.

One big factor that affects park factors is the weather. Cool, dry weather will generally mean less hitting. Hot, humid weather will mean more hitting. That's one reason it's good to use one-year PFs (and yes, regressed might be best if you're measuring ability and not value) - the weather is never the same from year to year.
robertdudek - Wednesday, October 01 2003 @ 09:02 AM EDT (#89044) #
Correction:

There was an error in the 2002 PFs for Kansas City. Originally they were listed as .980 (batters) and .981 (pitchers). They have now been corrected to 1.113 and 1.122 respectively.
Craig B - Wednesday, October 01 2003 @ 09:19 AM EDT (#89045) #
Robert, can you confirm whether this is right?

In order to re-normalize (approximately) the AL numbers above to an AL without the Tigers, you would take

100 + the offence or defence rating

and multiply by 162 to give "factor A".

Then take 100 - the _opposite_ factor for the Tigers (i.e. if you're doing hitting numbers, use Tiger pitching)

and multiply by the number of games that team had against the Tigers to give "factor B".

Then subtract factor A from factor B.

Then divide by the number of games that team had against non-Tigers opponents.

Is this right?
Mike Green - Wednesday, October 01 2003 @ 09:24 AM EDT (#89046) #
Robert, these are very helpful. I agree that one should not use 3-year park effects to compare yearly performance. To illustrate, in evaluating Hudson's and Halladay's 2003 seasons for Cy Young purposes, their per-game efficiency is very close if one park-adjusts using only 2003 statistics, but Hudson's is much better if one takes into account 2002 statistics. But what relevance do the ballpark effects in 2002 (which were affected by, as Craig points out, the weather in Toronto or Oakland in 2002) have in evaluating how Hudson and Halladay performed in 2003?

It is not well known that Kauffman Stadium has for the last 4 years been almost as favourable to hitters as Coors. Does anyone know what happened? In any event, fans of Carlos Beltran and Mike Sweeney take note.
_Andrew Edwards - Wednesday, October 01 2003 @ 09:29 AM EDT (#89047) #
But what relevance do the ballpark effects in 2002 (which were affected by, as Craig points out, the weather in Toronto or Oakland in 2002) have in evaluating how Hudson and Halladay performed in 2003?

I suppose there are two questions. In evaluating how a player did this year, 1-year park factors would be absolutely right.

In predicting how a player will do next year, I'd rather use the three-year park factors.

I know there are problems with the three-year factors too; the building of new stadia is the obvious one. It just seems like there is so much variation between years that I feel a sample size of 81 games isn't enough.
Pepper Moffatt - Wednesday, October 01 2003 @ 09:32 AM EDT (#89048) #
http://economics.about.com
I know there are problems with the three-year factors too; the building of new stadia is the obvious one. It just seems like there is so much variation between years that I feel a sample size of 81 games isn't enough.

There was a great article in one of the Prospectuses a couple years ago (2001 maybe?) about how many years you should consider when designing park factors. The tests they ran showed that 1 year park factors ended up working the best as including past years tended to bring in too much noise. IIRC this surprised the author as he thought that 3 or 5 year PFs would be the best.

Cheers,

Mike
Craig B - Wednesday, October 01 2003 @ 10:17 AM EDT (#89049) #
The PF work is a big argument in favor of Pujols as MVP, as Robert has been contending.
Pepper Moffatt - Wednesday, October 01 2003 @ 10:27 AM EDT (#89050) #
http://economics.about.com
The PF work is a big argument in favor of Pujols as MVP, as Robert has been contending.

But is the PF enough for Ichiro to overtake Bill Mueller in the AL MVP race? An inquiring mind wants to know!

Mike
Craig B - Wednesday, October 01 2003 @ 10:31 AM EDT (#89051) #
Mike, I don't think anyone's going to be worried about 12th place in the AL MVP race. It's obvious that Shannon Stewart has that one roped and hogtied.
robertdudek - Wednesday, October 01 2003 @ 10:32 AM EDT (#89052) #
In projecting next year's performance, I would first park-adjust individual batting events using 1-year regressed event factors. Then I would weight them approximately as follows: .15*(Year-3) + .30*(Year-2) + .55*(Year-1), where Year is the season you want to project for.

I would then include an adjustment based on typical changes related to age and position. I think such information can be found somewhere on FanHome, but it would take some work to find.
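
The three-season weighting above can be sketched in a few lines of Python. The rates in the example are invented, the function and variable names are hypothetical, and the age and position adjustment is not modelled, so this shows only the weighted-average step.

```python
# Weights from the comment: years before the projected season -> weight.
WEIGHTS = {3: 0.15, 2: 0.30, 1: 0.55}


def weighted_baseline(rates_by_years_back):
    """Combine park-adjusted rates from the three previous seasons.

    rates_by_years_back maps 1, 2 and 3 (seasons before the projected
    year) to a park-adjusted rate for that season.
    """
    return sum(WEIGHTS[k] * rates_by_years_back[k] for k in WEIGHTS)


# Made-up park-adjusted rates (e.g. runs created per 27 outs).
rates = {3: 5.0, 2: 6.0, 1: 5.5}
baseline = weighted_baseline(rates)

print(f"weighted baseline: {baseline:.3f}")
```

The weights sum to 1, so the result stays on the same scale as the input rates.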

Craig,

I'm not quite sure what you're aiming to do. Does it involve the offence and defence ratings in the last two columns?

To convert those percentages into factors:

(100 + rating%)/100

E.g. Detroit's offence rating is -22.2%; therefore: [100 + (-22.2)]/100 = .778. I.e. they scored runs at a rate of 77.8% of that which they would have had they been an average offensive club.
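
The same conversion as a small Python function, using the Detroit figure from the example. The function name is made up for illustration.

```python
def rating_to_factor(rating_pct):
    """Convert a percentage rating (e.g. -22.2) into a multiplicative factor."""
    return (100 + rating_pct) / 100


detroit_offence = rating_to_factor(-22.2)
print(f"Detroit offence factor: {detroit_offence:.3f}")
```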

Mike Green,

Coors has been more of a pitcher's park this year, and is normally more of a hitter's park than Kauffman or Arlington (the two best hitter's parks in the AL).
Craig B - Wednesday, October 01 2003 @ 10:35 AM EDT (#89053) #
Yeah, Robert, it's the ratings. I want to normalize them to a league with no Tigers because the Tigers are so awful they throw the balance out of whack.
Pepper Moffatt - Wednesday, October 01 2003 @ 10:40 AM EDT (#89054) #
http://economics.about.com
Mike, I don't think anyone's going to be worried about 12th place in the AL MVP race. It's obvious that Shannon Stewart has that one roped and hogtied.

(joe sheehan mode on)

You guys don't understand value at all, you economic illiterate b****es!!!!!

(joe sheehan mode off)
Mike Green - Wednesday, October 01 2003 @ 01:06 PM EDT (#89055) #
Robert, I checked and you are right that Coors favored hitters significantly more than Kauffman in 2001 and especially 2000. The impact of Coors from 2001-2003 does seem to be significantly diminished on average from 1996-2000. While Coors is still the best hitter's playpen in baseball, the situation is not as extreme as it was 5 years ago, when Coors was in a completely different class on this score.
robertdudek - Wednesday, October 01 2003 @ 01:56 PM EDT (#89056) #
I think the addition of the BOB, San Juan (2003 only) and possibly Pittsburgh and Cincy's new parks have brought Coors down to earth a little.

Another factor might be that Colorado pitchers have learned to adapt to their home park as time has passed.
_Spicol - Wednesday, October 01 2003 @ 02:21 PM EDT (#89057) #
Another factor might be that Colorado pitchers have learned to adapt to their home park as time has passed.

...and the humidor effect, if there truly is anything to that.
Mike Green - Wednesday, October 01 2003 @ 03:10 PM EDT (#89058) #
I just noticed that San Fran's and Oakland's parks moved in different and extreme ways in 2003 from 2002. It's unlikely that this is due to temperature or humidity. Perhaps prevailing winds might be a factor. Does anyone know the orientation of the parks, i.e. whether centerfield in each park is north, south, west or east of home plate? It would be curious if the parks were polar opposites.
_StephenT - Wednesday, October 01 2003 @ 09:15 PM EDT (#89059) #
It's probably just a symptom of how erratic park factors are statistically.

Here's a relatively straightforward way to compute an approximate 95% confidence interval for SkyDome's 2003 park factor (based on Efron's Bootstrap (percentile method)):

1. Randomly select 81 Jays home games (with replacement) and 81 Jays road games (with replacement). Compute the Park Factor.

2. Repeat B times (e.g. B=10,000 or 100,000).

3. Sort the B park factors. Drop the bottom 2.5% and top 2.5%. The endpoints of what remains are an approximate 95% confidence interval for the actual SkyDome park factor this year.

I suspect the interval will be very wide, e.g. from 0.95 to 1.15, but I haven't tried bootstrapping park factor data before.
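
The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the park factor here is a plain ratio of runs per game at home to runs per game on the road (simpler than the per-inning batter/pitcher method in the article), the game-by-game run totals are randomly generated stand-ins rather than real Blue Jays data, and the function names are invented.

```python
import random


def park_factor(home_games, road_games):
    """Runs per game at home divided by runs per game on the road.

    Each list holds the total runs (both teams) for one game.
    """
    home_rate = sum(home_games) / len(home_games)
    road_rate = sum(road_games) / len(road_games)
    return home_rate / road_rate


def bootstrap_interval(home_games, road_games, b=10_000, seed=0):
    """Approximate 95% percentile-bootstrap interval for the park factor."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(b):  # Step 2: repeat B times.
        # Step 1: resample home and road games with replacement.
        home = rng.choices(home_games, k=len(home_games))
        road = rng.choices(road_games, k=len(road_games))
        estimates.append(park_factor(home, road))
    # Step 3: sort, then drop the bottom 2.5% and top 2.5%.
    estimates.sort()
    cut = int(0.025 * b)
    return estimates[cut], estimates[b - cut - 1]


# Made-up run totals standing in for 81 home and 81 road games.
data_rng = random.Random(1)
home_games = [data_rng.randint(2, 18) for _ in range(81)]
road_games = [data_rng.randint(2, 16) for _ in range(81)]

low, high = bootstrap_interval(home_games, road_games, b=2_000)
print(f"point estimate: {park_factor(home_games, road_games):.3f}")
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

A fixed seed keeps the interval reproducible between runs; with real data, a larger B (as suggested above) would give more stable endpoints.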
Pepper Moffatt - Wednesday, October 01 2003 @ 09:34 PM EDT (#89060) #
http://economics.about.com
I suspect the interval will be very wide, e.g. from 0.95 to 1.15, but I haven't tried bootstrapping park factor data before.

Wow... You sound way too much like my former econometrics prof James MacKinnon.

I can try out this sim the next time I'm in the office using @Risk in Excel. I'm working at home because I've caught a rather nasty cold.

Cheers,

Mike
_Michael - Thursday, October 02 2003 @ 05:02 PM EDT (#89061) #
Michael Moffatt (not me) correctly points to the 2001 BP as having an excellent article on PF. It is by Clay Davenport. However, you remembered the results incorrectly. Clay expected 1-year park factors to do well (partially because he is a meteorologist and wanted small weather fluctuations to play larger roles), but they did not. He did a number of comparisons.

First, he compared players who stayed on the same team in the same park when the park in question changed PF by at least 80 points in two consecutive years (where a neutral park is 1000). Looking at RMSE, you get: 3-year PF is slightly better than 0-year PF (i.e., no park factors), which is well better than 1-year PF. (He didn't list 5-year PF for this.)

Secondly, he compared players who stayed on the same team in the same park when the park in question changed PF by at least 80 points in just one year (rather than 2 consecutively). Looking at RMSE here you get: 0 year, 3 year, and 5 year all equal and all better than 1 year PF.

Thirdly, he compared all players who changed teams and/or parks between two years. (This is generally what PF are most interesting for. If I sign Jim Thome to play in my park, how good will he be? Ditto for Todd Helton.) Here he got the RMSE as: 5-year PF better than 3-year PF, which are better than 0-year PF, which were slightly better than 1-year PF.

All in all, it is relatively strong evidence that 1-year PF are bad to use, which is why from 2001 on BP uses 5-year PF (or as long as possible for parks that are younger than 5 years). It is also why it is very, very silly that Nelson's RC list, which he posts to r.s.b., is sorted based on YTD PF, sometimes after as little as 1/3 of the season.
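
For readers unfamiliar with the measure used in these comparisons, RMSE is the root-mean-square error between predicted and actual values; a lower figure means the predictions were closer. A minimal Python sketch follows, with invented numbers that do not come from the article and neutral method labels that are purely illustrative.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Invented figures: what four players actually did, and two sets of
# predictions produced with different (unspecified) park adjustments.
actual = [5.1, 4.8, 6.0, 5.5]
predictions_method_a = [5.6, 4.2, 6.7, 5.0]
predictions_method_b = [5.3, 4.6, 6.2, 5.4]

print(f"method A RMSE: {rmse(predictions_method_a, actual):.3f}")
print(f"method B RMSE: {rmse(predictions_method_b, actual):.3f}")
```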
Pepper Moffatt - Thursday, October 02 2003 @ 05:33 PM EDT (#89062) #
http://economics.about.com
Michael Moffatt (not me) correctly points to the 2001 BP as having an excellent article on PF. It is by Clay Davenport. However, you remembered the results incorrectly. Clay expected 1-year park factors to do well (partially because he is a meteorologist and wanted small weather fluctuations to play larger roles), but they did not. He did a number of comparisons.

Whoops! My memory must be going in my old age. I must admit I hadn't read the article in over a year. Thanks for the clarification. I can't believe I got it *exactly* backwards. Davenport's conclusion:

So, I am now convinced. One-year park factors are, as Gene Wilder said of Baron von Frankenstein's work in Young Frankenstein, "doo-doo". The longer the period of time over which you run the park factors, the better. And that is why this year's Baseball Prospectus is using five-year park factors (or as close to a five-year average as we can get), at both the major- and minor-league levels. (pg. 518)

Thanks for the correction. That's really quite embarrassing.

Cheers,

Mike
robertdudek - Friday, October 03 2003 @ 10:00 AM EDT (#89063) #
Michael,

Could you flesh out some details as to how the players were compared?

Did he use general park factors or event park factors? Did he regress the park factors?

I think the main problem with using longer than 1-year park factors is that as parks change every park's relative effect also changes. After 5 years, you'd have so many changes that you'd have no confidence that a 5-year park factor is at all applicable to a given year within that 5-year sample, unless you assume stability WRT the park population.

Whatever accuracy you gain by using multi-year park factors is nothing more than the reining in of extremes created by random fluctuations in the small sample size of 1-year park factors. The same thing, on a more theoretically sound basis, can be accomplished by regressing 1-year PFs towards the mean.

In sabermetrics, it's not only accuracy, but also theoretical soundness, that counts.
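
A minimal sketch of what regressing a 1-year factor towards the mean could look like. The shrinkage formula is the standard one (mean plus r times the deviation from the mean), but the value of r below is a placeholder: the thread does not give one, and it would have to be estimated from several years of park factor data. The function name is hypothetical.

```python
def regress_to_mean(park_factor, r, mean=1.0):
    """Shrink a one-year factor towards the league mean.

    r = 1 leaves the factor unchanged; r = 0 replaces it with the mean.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    return mean + r * (park_factor - mean)


R_ASSUMED = 0.5  # placeholder only; not a value from the thread

# 2003 batter factors taken from the NL table.
for team, pf in [("Montreal Expos", 1.126), ("San Diego Padres", 0.914)]:
    print(f"{team}: {pf:.3f} -> {regress_to_mean(pf, R_ASSUMED):.3f}")
```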
_pathos - Friday, October 03 2003 @ 11:13 AM EDT (#89064) #
Do the Montreal numbers only include games played in Montreal, or do they include Puerto Rican games, too?
Pepper Moffatt - Friday, October 03 2003 @ 11:16 AM EDT (#89065) #
http://economics.about.com
Whatever accuracy you gain by using multi-year park factors is nothing more than the reining in of extremes created by random fluctuations in the small sample size of 1-year park factors.

Right. I had remembered that the small changes in parks had outweighed the problems caused by sample size issues, but it was the other way around.

In sabermetrics, it's not only accuracy, but also theoretical soundness, that counts.

You're never going to get theoretical soundness in an applied model with sample sizes that small. Too many of the distributional assumptions rely on law-of-large-numbers arguments.

You should really invest in the old Prospectii. You can get them (2000-2003) cheap on eBay. The older ones are more difficult to find, but they're not as useful anyway.

Cheers,

Mike
robertdudek - Friday, October 03 2003 @ 11:26 AM EDT (#89066) #
Yes, San Juan games are included.
robertdudek - Friday, October 03 2003 @ 11:29 AM EDT (#89067) #
Mike,

Of course all of these things are estimates, since we don't know the real park factors and never will. But there are still better and worse methods of arriving at your estimates. Logic has to come into play.
robertdudek - Friday, October 03 2003 @ 11:31 AM EDT (#89068) #
Mike,

If you want to purchase and donate a few copies for me, I'll send you my mailing address.
Pepper Moffatt - Friday, October 03 2003 @ 11:41 AM EDT (#89069) #
http://economics.about.com
If you want to purchase and donate a few copies for me, I'll send you my mailing address.

We could start you a fund. They go for about $5 U.S. each on eBay + shipping. For a 600 page book, that's pretty cheap. Heck, next time I see you I'd be happy to loan you mine.

In my view they're one of those books that are a MUST HAVE for statheads. Sort of like the 1982-1988 James Baseball Abstracts and Palmer and Thorn's Hidden Game of Baseball. In any science it is important to know the important historical papers and books, and whether you love BP or hate them, they have had a huge impact on the field. Just ask Keith Law.

Mike
robertdudek - Friday, October 03 2003 @ 12:43 PM EDT (#89070) #
I'd be happy to have you loan me your copy(ies).
I'd only be interested in the odd essay that had to do with sabermetrics; I doubt team essays or player projections would be of much interest to me.

If they put out a condensed compendium of their theoretical work, I'd probably buy that.
Pepper Moffatt - Friday, October 03 2003 @ 12:56 PM EDT (#89071) #
http://economics.about.com
I'd only be interested in the odd essay that had to do with sabermetrics; I doubt team essays or player projections would be of much interest to me.

Sounds good. I'll try to remember to bring them to the next outing. You could always make a photocopy of the relevant articles (I think that would count as "fair use") and put them together in a binder.

Cheers,

Mike
Craig B - Friday, October 03 2003 @ 01:06 PM EDT (#89072) #
If they put out a condensed compendium of their theoretical work, I'd probably buy that.

As would I. They could call it _This Time Let's Not Only Eat The Bones, Let's Crack Them Open and Feast On The Marrow Inside_.
robertdudek - Friday, October 03 2003 @ 01:20 PM EDT (#89073) #
Craig,

I LOVE the title. Maybe we should co-author a book by that title. Not sure if we'd open ourselves to a plagiarism charge for that title, though.
_StephenT - Sunday, October 05 2003 @ 05:24 PM EDT (#89074) #
http://www.stephent.com/jays/
fyi: for the park factors in my 2003 Jays article, I used a 50% weight for the 2003 data, 25% for 2002 and 25% for 1999-2001 (except when the park changed). The 2003 and 2002 data were from the batter park factors given by Robert above (thanks Robert).

As an aside, in the STATS All-Time Major League Handbook (Second Edition), Bill James (et al.) decided to list park factors which in most cases were based 50% on the one-year data and 12.5% each for the previous two and following two years.
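
Both weighting schemes described above are plain weighted averages, which can be sketched as follows. The Toronto 2003 and 2002 batter factors come from the table in the article; the 1999-2001 figure and all five values in the second example are placeholders, and the function name is invented.

```python
def blended_factor(weighted_values):
    """Weighted average of park factors.

    weighted_values is a list of (weight, factor) pairs whose weights
    must sum to 1.
    """
    total_weight = sum(w for w, _ in weighted_values)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * v for w, v in weighted_values)


# 50% 2003, 25% 2002, 25% 1999-2001.
# 1.048 and 0.990 are Toronto's batter factors from the table;
# the 1999-2001 value is a placeholder, not a real figure.
pf_1999_2001 = 1.020
toronto = blended_factor([(0.50, 1.048), (0.25, 0.990), (0.25, pf_1999_2001)])
print(f"50/25/25 blend: {toronto:.4f}")

# 50% on the season itself, 12.5% on each of the two seasons before
# and the two seasons after. All five factors here are made up.
five_year = blended_factor(
    [(0.5, 1.05), (0.125, 1.01), (0.125, 0.99), (0.125, 1.03), (0.125, 1.00)]
)
print(f"50/12.5x4 blend: {five_year:.4f}")
```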
_Ben NS - Monday, October 06 2003 @ 03:23 PM EDT (#89075) #
Nice work, gentlemen.