- Thread starter
- #21
gp956
- Does "Career heatmaps" mean "Career-since-PITCHf/x-has-been-around heatmaps" ?? (for the young players it doesn't matter, but for Huff it would)
Brain fart. The data is from 2008 on. In a way that's good since players change over time.
If a batter NEVER swings at a pitch in a certain place, his wOBA would be 0 (very deep, dark blue), correct?
If a batter swings at a pitch in a certain place 10 times over his career and never gets a hit, it would be the same color as the above (never swinging at a pitch), correct?
If a batter swings at a pitch in a certain place 100 times over his career and never gets a hit, it would be the same deep blue color as above (never swinging at a pitch), correct?
Yes and no. The process assumes it's taking a sample from a finite region in space, and spreads the sample over that region based on the output of a spreading function. These "spreads" are then aggregated to give a weighted value for any particular region (this is in addition to the weight of the sample itself, its wOBA). It's a bit like averaging a series of contour maps superimposed on a neutral surface. You can see artifacts of the spread from pitches well away from the strike zone.
All that said, for those players with limited data, say fewer than 500 AB, it's best not to automatically assume they can't hit a particular zone simply because it's deep blue, especially if it's within a zone of non-blue. You really should look at these maps as "where these players like to hit the ball", and not "where these players can't hit the ball". That's why they're labeled "hotzone" maps.
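For anyone who wants to see the spreading and aggregation idea in concrete terms, here is a minimal sketch. It is not the actual code behind these maps: the Gaussian spreading function, the bandwidth, the neutral wOBA of 0.320, and the weight given to the neutral surface are all assumptions picked for illustration, and the function names are made up.

```python
import math

NEUTRAL_WOBA = 0.320   # assumed value of the "neutral surface"
BANDWIDTH = 0.25       # assumed spread of each sample, in feet
PRIOR_WEIGHT = 1.0     # assumed weight of the neutral surface

def kernel(dx, dz, h=BANDWIDTH):
    """Assumed Gaussian spreading function: weight falls off with distance."""
    return math.exp(-(dx * dx + dz * dz) / (2.0 * h * h))

def smoothed_woba(x, z, samples):
    """Kernel-weighted wOBA at location (x, z), blended with the neutral value.

    samples is a list of (px, pz, woba_value) tuples, one per pitch
    that ended a plate appearance.
    """
    weight_sum = PRIOR_WEIGHT
    value_sum = PRIOR_WEIGHT * NEUTRAL_WOBA
    for px, pz, woba_value in samples:
        k = kernel(x - px, z - pz)
        weight_sum += k
        value_sum += k * woba_value
    return value_sum / weight_sum

# Ten vs. one hundred hitless results at the same spot.
ten_outs = [(0.0, 2.5, 0.0)] * 10
hundred_outs = [(0.0, 2.5, 0.0)] * 100
print(round(smoothed_woba(0.0, 2.5, ten_outs), 3))      # 0.029
print(round(smoothed_woba(0.0, 2.5, hundred_outs), 3))  # 0.003
```

Under these assumptions a location with no nearby data sits at the neutral value rather than at zero, and ten hitless results leave the cell slightly less blue than a hundred would. How close that is to the real maps depends entirely on the spreading function and parameters actually used.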
The color can't tell us *how often* a batter swings at a location, just his success when he does swing, correct? (Say, he swung 5 times and had a wOBA of 0.300 vs. he swung 500 times with the same wOBA.)
So (grouping all of the above) the heat maps don't show us a batter's *absolute tendencies* (how often he swings at a high pitch) just the success that he has had swinging at pitches in that location in the past? (And, by extension, the locations where said batter *should* be swinging, and which he should be avoiding.)
This data only includes at-bats with results, e.g. a swing and miss on a 0-0 count is not reflected in this data. With that noted, the aggregation of the "spread functions" will reflect the density of data at that location. We should be able to tell the difference between one data point and five hundred even if they have the same wOBA, although it's certainly possible to have some ambiguity result from these operations with just the "right" combination of data. It's a balance in setting the granularity and spread of the sampling functions. There is some subjectivity in the choice of those parameters and, obviously, in the modeling of the functions themselves, as can be seen in heat maps produced by other sources.
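To illustrate the density point, the sketch below sums the spread weights at a location without looking at wOBA at all. The Gaussian shape and the bandwidth values are assumptions for illustration, and `sample_density` is a made-up name, not part of any real tool.

```python
import math

def sample_density(x, z, samples, h=0.25):
    """Sum of assumed Gaussian spread weights at (x, z).

    samples is a list of (px, pz, woba_value) tuples; the wOBA value
    is ignored here, so this measures only how much data is nearby.
    h is the assumed spread (bandwidth) in feet.
    """
    total = 0.0
    for px, pz, _ in samples:
        d2 = (x - px) ** 2 + (z - pz) ** 2
        total += math.exp(-d2 / (2.0 * h * h))
    return total

one_point = [(0.0, 2.5, 0.300)]
many_points = [(0.0, 2.5, 0.300)] * 500

# Same wOBA, very different amounts of data behind it.
print(sample_density(0.0, 2.5, one_point))    # 1.0
print(sample_density(0.0, 2.5, many_points))  # 500.0

# The spread parameter controls how far one pitch reaches.
narrow = sample_density(0.5, 2.5, one_point, h=0.1)
wide = sample_density(0.5, 2.5, one_point, h=0.5)
print(narrow < wide)  # True
```

A small spread keeps each pitch local and produces a spottier map; a large one smooths more but lets distant pitches bleed into a location. That is the granularity trade-off, and different choices are one reason heat maps from different sources don't match.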
And just to be clear, it's possible to do heat maps that show swings and misses, fouls, and such; and I did one such map for Schierholtz last year to show he had closed his "down and in" hole.
And lastly, the heat maps don't indicate the type of pitch (low and outside was a fastball, slider, curve, etc.), correct?
These don't, but it's possible to break it down by pitch type. I've got every bit of data generated by the MLB Gameday/PITCHf/x system, so there is a multitude of ways the data can be broken down. When there are questions about individual players it'll probably be worth taking a deeper look. And if there's a particular type of breakdown that is especially illuminating, I'll automate the process so it can be repeated easily.
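As a rough idea of what a pitch-type breakdown involves, the sketch below groups pitch records by type so each group could be mapped separately. The record layout and the function name are hypothetical, and the sample values are invented purely to show the shape of the data.

```python
from collections import defaultdict

def split_by_pitch_type(pitches):
    """Group (pitch_type, px, pz, woba_value) records by pitch type."""
    groups = defaultdict(list)
    for pitch_type, px, pz, woba_value in pitches:
        groups[pitch_type].append((px, pz, woba_value))
    return dict(groups)

# Invented example records: "FF" = four-seam fastball, "SL" = slider.
records = [
    ("FF", 0.0, 2.5, 0.9),
    ("SL", 0.5, 1.8, 0.0),
    ("FF", -0.3, 3.0, 0.0),
]
by_type = split_by_pitch_type(records)
print(sorted(by_type))     # ['FF', 'SL']
print(len(by_type["FF"]))  # 2
```

Each group has fewer data points than the combined set, so the small-sample caution above applies even more to per-pitch-type maps.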
Again, thanks for the heat maps, and thanks in advance for helping me to understand them better.