2022 QB Analytical Review part 3
Author: Billy Jones
Introduction
Welcome back to the third installment of my blog series, where I perform an analytical review of quarterback fantasy performance from 2022. In this blog, we start to “take a look under the hood” of fantasy football player production. We will be exploring the relationships between data points that make up fantasy scoring and that simple logic would assume are closely linked, such as rushing yards and rushing touchdowns, passing yards and passing attempts, and more. We will be using visualization tools to help us determine whether these relationships actually exist and whether there are any outliers or anomalies in the data, and we will comment on what these might mean for player performance in 2023.
I'm particularly
excited about this blog, as we start to get a little deeper into the realm of
data science-y things. This blog touches on topics of correlation and
anomalies, and how they can be used when evaluating player performance. These
are easy concepts and tools that help build a comprehensive understanding of
what happened in 2022 and what could happen in 2023. Let's get started!
Statistics 101
In today’s blog we
will be adding two new data science terms to our study: correlation and
outliers.
Correlation is a statistic that describes the relationship between two variables (for example, rushing yards and rushing touchdowns). It is expressed as a number between -1 and 1, with 1 being a perfect positive relationship, -1 being a perfect negative relationship, and 0 indicating no linear relationship between the variables. A positive correlation means that as one variable increases, the other tends to increase. A negative correlation is the opposite: as one variable increases, the other tends to decrease.
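To make the definition concrete, here is a minimal sketch of how a correlation coefficient can be calculated by hand. It assumes the Pearson correlation, which is the most common flavor but is my assumption about what is meant here. The function name `pearson_r` and the numbers are made up for illustration and are not real player stats.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their averages.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up numbers for illustration only (not real player stats).
attempts = [20, 25, 30, 35, 40]
yards = [150, 190, 210, 260, 300]
r = pearson_r(attempts, yards)
print(round(r, 2))
```

A result close to 1 would say that more attempts tend to come with more yards; a result close to 0 would say there is little linear relationship.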
It's important to note
that just because there is a correlation between two variables, it does not
necessarily mean that one variable causes the other. This is a common pitfall
in statistical analysis, and one that we must be mindful of when analyzing the relationship
between offensive market share and fantasy production. However, it's also
important to remember that just because we can't pinpoint a clear direct
cause-and-effect relationship between two variables, it doesn't mean there
isn't something there. Correlation may still indicate that there is some
underlying factor or factors that are causing the data to behave the way it
does.
An outlier is an
observation in a data set that lies far away from other values. Outliers can have
a significant impact on the results of statistical analyses and can sometimes
be indicative of measurement error or data entry errors (or fluky football results
based upon a small-ish sample size).
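There are many ways to flag an outlier. The sketch below uses the common 1.5 x IQR rule of thumb, which is an assumption on my part and not necessarily how the outliers in this series were identified. The function name `iqr_outliers` and the weekly point totals are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the middle 50% of the data."""
    q1, _, q3 = quantiles(values, n=4)  # first quartile, median, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up weekly fantasy point totals for illustration only.
weekly_points = [14, 16, 15, 18, 17, 16, 15, 41]
print(iqr_outliers(weekly_points))  # prints [41]
```

The one huge week gets flagged, which is exactly the kind of fluky result worth a second look before drawing conclusions.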
Ground Rules
Before we jump back into the analytics, I would
like to remind readers of the ground rules we will be playing by. We will be
going with the most common scoring system, which is as follows:
- Passing Touchdowns: 4 points for each touchdown pass.
- Passing Yards: 1 point for every 25 passing yards.
- Rushing Touchdowns: 6 points for each rushing touchdown.
- Rushing Yards: 1 point for every 10 rushing yards.
- Fumbles Lost: -2 points for each fumble lost.
- Interceptions (INTs): -2 points for each interception thrown.
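The scoring rules above can be sketched as a small function. This assumes fractional points are awarded (for example, 12 rushing yards is worth 1.2 points), which the rules above do not spell out. The function and parameter names are hypothetical.

```python
def fantasy_points(pass_yds, pass_td, rush_yds, rush_td, interceptions, fumbles_lost):
    """Fantasy points for one game under the scoring system listed above.

    Assumes fractional scoring for yardage (an assumption, not stated in the rules).
    """
    return (
        pass_yds / 25         # 1 point per 25 passing yards
        + 4 * pass_td         # 4 points per passing touchdown
        + rush_yds / 10       # 1 point per 10 rushing yards
        + 6 * rush_td         # 6 points per rushing touchdown
        - 2 * interceptions   # -2 points per interception thrown
        - 2 * fumbles_lost    # -2 points per fumble lost
    )

# Made-up stat line: 250 pass yds, 2 pass TD, 30 rush yds, 1 rush TD, 1 INT, 0 fumbles lost.
print(fantasy_points(250, 2, 30, 1, 1, 0))  # prints 25.0
```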
Additionally, I want to note that we will be
focusing on a pool of 32 QBs (shown below). The data used for this analysis was
obtained from pro-football-reference.com, and all games where the QB had fewer
than 10 passing attempts were removed to limit the impact of games where the
player may have been injured or playing in garbage time. This will help ensure that the
analytics aren’t skewed by anomalous game results and allow us to gain comfort
in the conclusions we draw from the results.
Aaron Rodgers | Deshaun Watson | Josh Allen | Mac Jones | Taylor Heinicke
Andy Dalton | Geno Smith | Justin Fields | Marcus Mariota | Tom Brady
Brock Purdy | Jacoby Brissett | Justin Herbert | Matt Ryan | Trevor Lawrence
Dak Prescott | Jalen Hurts | Kenny Pickett | Matthew Stafford | Tua Tagovailoa
Daniel Jones | Jared Goff | Kirk Cousins | Patrick Mahomes |
Davis Mills | Jimmy Garoppolo | Kyler Murray | Russell Wilson |
Derek Carr | Joe Burrow | Lamar Jackson | Ryan Tannehill |
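The 10-attempt filter described in the ground rules can be sketched in a few lines. The field name `pass_att` and the game records are hypothetical and are not the actual pro-football-reference.com export format.

```python
MIN_PASS_ATTEMPTS = 10  # games below this threshold are dropped

def filter_games(games, min_attempts=MIN_PASS_ATTEMPTS):
    """Keep only the games where the QB had at least `min_attempts` passing attempts."""
    return [g for g in games if g["pass_att"] >= min_attempts]

# Made-up game logs for illustration only.
games = [
    {"qb": "QB A", "week": 1, "pass_att": 34},
    {"qb": "QB A", "week": 2, "pass_att": 6},   # e.g. left early with an injury
    {"qb": "QB B", "week": 1, "pass_att": 10},
]
kept = filter_games(games)
print(len(kept))  # prints 2
```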
This analysis only encompasses 32 quarterbacks, so it's possible that some of your favorite QBs were excluded. Every QB with more than 9 starts was included, with one exception: Zach Wilson, who was left out due to widespread agreement that he is not good at football.
Visualizations and Analysis
With the ground rules
established and the data sources defined, let’s get into analyzing some relationships.
In this analysis, we
are examining two commonly assumed relationships in offensive production: the
direct correlation among rushing attempts, yards, and touchdowns, and the same
for passing. Although we acknowledge that these are simplified views of
offensive production, our scatterplots provide clear indications that these
assumptions generally hold true.
* Visuals are colored based upon groupings identified in part 2. *
Analysis: Upon examining the data, the relationship appears strong for both rushing-related correlations. Additionally, three players stood out as particularly noteworthy. Lamar Jackson and Justin Fields had rushing yards production that outpaced their rushing touchdown production, making them strong candidates for positive touchdown regression in the 2023 season. Conversely, Jalen Hurts had rushing touchdown production that outpaced his rushing yards and attempts production. Are these touchdown figures sustainable for Hurts because his rushing profile differs from the rest as the Eagles' short-yardage runner, or is he a prime candidate for negative touchdown regression in 2023?
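One possible way to put a number on "touchdowns outpacing yards" is to fit a straight line through the scatterplot and measure how far each player sits from it. This is only a sketch of that idea, not necessarily the method behind the visuals above. The helper names, player labels, and numbers are all made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def td_residuals(names, yards, touchdowns):
    """Actual touchdowns minus the touchdowns the fitted line expects from yardage."""
    slope, intercept = fit_line(yards, touchdowns)
    return {
        name: td - (slope * yds + intercept)
        for name, yds, td in zip(names, yards, touchdowns)
    }

# Made-up season totals for illustration only (not real player stats).
names = ["QB A", "QB B", "QB C", "QB D", "QB E"]
rush_yds = [100, 300, 500, 700, 900]
rush_td = [1, 2, 8, 4, 5]

res = td_residuals(names, rush_yds, rush_td)
most_td_heavy = max(res, key=res.get)
print(most_td_heavy)  # prints QB C
```

A large positive gap means more touchdowns than the yardage would suggest (a candidate to score fewer next year), while a large negative gap means the opposite.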
Analysis: This final visual shows that Brock Purdy had the most touchdown-inflated fantasy profile of all the players in our study. Additionally, while Justin Fields had some anomalies in his production relationships at the detailed level, a more zoomed-out view showed his results were much more normal. This highlights the importance of taking a comprehensive approach to analysis, rather than simply reviewing a single statistic or visual in isolation.
General Point: Attempts vs. yards for both rushing and passing were strongly correlated. Some players are more or less efficient than others, but when looking for a quarterback who is going to put up points, look for a coach willing to make the quarterback the focal point of the offense. Yards/attempts vs. touchdowns for both rushing and passing had a much weaker correlation. The relationship definitely still exists, as players who produce more yards typically produce more touchdowns, but the data is much more spread out, with outliers on both sides. I hope to look into some of the key drivers of touchdown production in a later blog series.
Conclusion
In conclusion, our
analysis of the relationships between rushing yards and rushing touchdowns, as
well as passing yards and passing touchdowns, has yielded some interesting
insights into the offensive production of various players. Stay tuned for the
last blog post in this mini study where we look into yardage production consistency.
Bonus visual: Average completions by average attempts (completion percentage).
* This blog post was enabled by ChatGPT. The text was generated by me, and the content is my own, but some sentences and wording were provided by the model. I take full responsibility for all information produced in this blog. More information about OpenAI and their technology can be found at https://openai.com. *