By Sam Linker
Introduction
Recently, I conducted a study on which statistics lead to teams scoring more runs, which can be found here. This project uses the same methods but focuses on a team’s pitching and how it prevents runs. The goals remain the same: find an equation that predicts how many runs a team will allow, and assign a value to each stat in terms of runs.
Data Collection
All terms and definitions can be found at fangraphs.com. I selected 16 variables that relate to batted ball data, pitch selection, pitch velocity, and pitch location. All of these statistics can also be applied to developing pitchers in the minor leagues, not only to finding the free agent who will best help your team prevent runs. The 16 variables are as follows:
LD%
GB%
FB%
FB% (pitch)
FBv
SL%
SLv
CT%
CTv
CB%
CBv
CH%
CHv
O-Swing%
F-Strike%
SwStr%
All data points, along with runs allowed, were collected from the 2004-2017 seasons, resulting in 420 data points (30 teams over 14 seasons). Unlike my last project, which went back to 2002, this one starts in 2004 because FanGraphs did not have data on the percentage of cutters thrown before then.
Data Analysis
I used the same regression and backward elimination process, with the 16 variables as the independent variables and runs allowed as the dependent variable. After eliminating the variable with the highest p-value, I repeated the regression/elimination process until I ended with a set of 9 variables with p-values under 0.05. They are:
GB%
FB%
FB% (pitch)
CT%
CB%
CBv
O-Swing%
F-Strike%
SwStr%
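A minimal sketch of the backward elimination loop described above, using only the Python standard library. It is an illustration under stated assumptions, not the workflow actually used for this study (the software is not named here). The data is synthetic, the column names `gb`, `sw`, and `junk` are hypothetical, and p-values come from a large-sample normal approximation rather than the exact t-distribution a statistics package would use.

```python
import random
from statistics import NormalDist


def invert(matrix):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_p_values(columns, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, p_values), intercept first. The p-values use a
    normal approximation (an assumption that is reasonable for large samples).
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    sigma2 = sum(r * r for r in resid) / (n - k)
    z = NormalDist()
    p = [2 * (1 - z.cdf(abs(beta[a]) / (sigma2 * inv[a][a]) ** 0.5))
         for a in range(k)]
    return beta, p


def backward_eliminate(data, y, alpha=0.05):
    """Refit repeatedly, dropping the predictor with the highest p-value,
    until every remaining predictor has a p-value below alpha."""
    names = list(data)
    while names:
        beta, p = ols_p_values([data[name] for name in names], y)
        worst = max(range(len(names)), key=lambda j: p[j + 1])  # skip intercept
        if p[worst + 1] < alpha:
            return names, beta
        names.pop(worst)
    return names, []


# Synthetic demo data (made up, not FanGraphs data): two real effects plus noise.
random.seed(0)
rows = 420
data = {
    "gb": [random.uniform(0.40, 0.50) for _ in range(rows)],
    "sw": [random.uniform(0.08, 0.12) for _ in range(rows)],
    "junk": [random.random() for _ in range(rows)],
}
y = [1500 - 1200 * g - 2500 * s + random.gauss(0, 20)
     for g, s in zip(data["gb"], data["sw"])]
kept, coefs = backward_eliminate(data, y)
print(kept, [round(c, 1) for c in coefs])
```

With real data, each dictionary entry would hold one of the 16 variables for all 420 team seasons, and `y` would hold runs allowed.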
The resulting linear equation had an R² value of 0.41 and reads as follows:
Runs Allowed = 2502.59 - 1221.68*GB% - 787.15*FB% - 313.56*FB%(pitch) - 359.10*CT% - 255.80*CB% + 7.09*CBv - 498.89*O-Swing% - 1512.49*F-Strike% - 2567.46*SwStr%
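The equation can be wrapped in a small function for experimenting with different inputs. This is a sketch under a few assumptions: rate stats are entered as fractions (0.45 for 45%), CBv is in miles per hour, the cutter term is read as a subtraction, and the out-of-zone term is keyed as O-Swing% to match the list of nine retained variables. The function name and the sample team are hypothetical.

```python
# Coefficients copied from the equation above (rounded to two decimals there).
INTERCEPT = 2502.59
COEFFICIENTS = {
    "GB%": -1221.68,
    "FB%": -787.15,
    "FB% (pitch)": -313.56,
    "CT%": -359.10,
    "CB%": -255.80,
    "CBv": 7.09,
    "O-Swing%": -498.89,
    "F-Strike%": -1512.49,
    "SwStr%": -2567.46,
}


def predict_runs_allowed(stats):
    """Return predicted runs allowed for one team season.

    `stats` maps each variable name to its value: fractions for the
    percentage stats (assumption) and mph for CBv (assumption).
    """
    missing = set(COEFFICIENTS) - set(stats)
    if missing:
        raise KeyError(f"missing stats: {sorted(missing)}")
    return INTERCEPT + sum(c * stats[name] for name, c in COEFFICIENTS.items())


# Made-up inputs for illustration only, not a real team.
example_team = {
    "GB%": 0.44, "FB%": 0.35, "FB% (pitch)": 0.57, "CT%": 0.05, "CB%": 0.10,
    "CBv": 78.0, "O-Swing%": 0.30, "F-Strike%": 0.60, "SwStr%": 0.10,
}
print(round(predict_runs_allowed(example_team)))
```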
I tested the equation’s accuracy by taking the data from the 2017 playoff teams and comparing the equation’s predictions to the actual runs allowed. The table below shows that the equation is reasonably accurate, with a majority of the percent error values in the range of 2-8%.
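The percent error used in that comparison is assumed here to be the standard definition: the absolute difference between predicted and actual runs allowed, divided by the actual value. A minimal sketch, with hypothetical numbers rather than real 2017 team results:

```python
def percent_error(predicted, actual):
    """Absolute percent error of a prediction relative to the actual value."""
    if actual == 0:
        raise ValueError("actual runs allowed must be non-zero")
    return abs(predicted - actual) / abs(actual) * 100.0


# Hypothetical numbers for illustration only.
print(percent_error(predicted=735.0, actual=700.0))
```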
Moving on to the main goal of this project, I used the coefficients to see how a 1% increase (one percentage point) in a variable affects how many runs a team allows. CBv is measured by a 0.1 increase instead of a 1% increase since it is the only non-percent data point. Since I am looking at runs prevented, a positive value in the chart represents the number of runs by which the total will decrease, while a negative value shows the number of runs by which the total will increase.
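The conversion from coefficient to runs prevented is a sign flip and a scaling by the step size. The sketch below assumes percentage stats are stored as fractions, so one percentage point is 0.01, and it uses the rounded coefficients from the equation, keyed as O-Swing% to match the list of retained variables. Because those coefficients are rounded to two decimals, the last digit can differ slightly from values computed from the unrounded fit.

```python
# Slopes copied from the regression equation (rounded to two decimals there).
SLOPES = {
    "GB%": -1221.68,
    "FB%": -787.15,
    "FB% (pitch)": -313.56,
    "CT%": -359.10,
    "CB%": -255.80,
    "CBv": 7.09,
    "O-Swing%": -498.89,
    "F-Strike%": -1512.49,
    "SwStr%": -2567.46,
}

# Step sizes: one percentage point (0.01 as a fraction) for rate stats,
# 0.1 for curveball velocity, as described in the text.
STEPS = {name: 0.01 for name in SLOPES}
STEPS["CBv"] = 0.1


def runs_prevented(name):
    """Runs prevented by one step of `name`; positive means fewer runs allowed."""
    return -SLOPES[name] * STEPS[name]


table = {name: round(runs_prevented(name), 2) for name in SLOPES}
for name, value in sorted(table.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {value:7.2f}")
```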
Conclusion
Using the data, I created an equation that predicts how many runs a team’s pitching staff will allow with reasonable accuracy, and I assigned each stat a “runs prevented” value. The batted ball data yielded generally predictable results, with ground balls being more valuable than fly balls. The most exciting part of the results was the pitch selection and velocity data. I started my analysis with 5 different pitches and their velocities and ended with only 3 pitches and 1 velocity being significant: the fastball, cutter, curveball, and curveball velocity all registered p-values under 0.05.
This is where the results contribute to the development of pitchers. Every pitcher has a fastball, so the emphasis is always put on developing secondary pitches. While I am not saying to teach pitchers only a cutter and a curveball, building a staff and bullpen around pitchers who can easily throw one of these two pitches for strikes, and get hitters swinging when they are not strikes, would be extremely valuable.
Also, with curveball velocity being the only significant velocity, and one that negatively impacts runs prevented, it seems reasonable to conclude that control is highly valued for all pitches. This is supported by the two largest runs prevented values, 25.67 and 15.13, being attached to swinging strike percentage and first pitch strike percentage, respectively. Velocity on a pitcher’s other pitches is no doubt important too, but if I had to pick between elite velocity with average control or average velocity with elite control, the latter is the better choice for the team. Overall, these results could change how pitchers are developed, not just how they are traded for or signed in free agency, since development gives teams another way to acquire the talent needed to limit runs and have a successful season.