Monday, March 4, 2013

MLS Tempo-Free Soccer (TFS) Rankings Methodology

Overview


I believe that framing things on a per-possession basis is an effective way to evaluate soccer teams and players.  What TFS attempts to answer is: When you have possession, what do you do with it?  Do you score? Do you turn it over?  What do your opponents do when they have possession?

TFS borrows heavily from tempo-free statistics pioneers Ken Pomeroy and Dean Oliver, who have done excellent work with basketball: all statistics are framed on a per-possession basis.  Luckily for Pomeroy and Oliver, it is relatively easy to determine what constitutes a possession in basketball.  Soccer is much trickier.

Possession


Would-be soccer statisticians have long been flummoxed by the lack of available statistics.  In recent years, however, companies such as OPTA have taken on the large task of compiling a more comprehensive picture of each game.  Indeed, it is OPTA's game reports for MLS that form the basis of these rankings.

My starting point is the "Tackled, Possession Lost" (TPL) metric.  TPL is assessed for any errant pass, interception, failed dribble, etc.  A typical total for a team is between 120 and 160 a game.

The second component in determining the number of possessions is the "Clearances" (C) metric.  According to OPTA, all Clearances are TPL, but not all Clearances are changes in possession.  For example, if a defender clears the ball out of bounds, then the team that originally lost the ball never really lost possession.  Therefore, the number of Clearances, less those Clearances where the defense maintains possession, is subtracted from the TPL total.  The resulting subtotal is the number of Turnovers Committed.

Since it is Tempo-Free Soccer's opinion that all possessions end in either a Turnover or an "Attempt on Goal" (AOG), the equation for Possessions can be written as:

Possessions = TPL - C + AOG
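As a rough illustration, the formula above can be sketched in a few lines of Python. The function name, the parameter names, and the game totals are all made up for this example (they are not real OPTA figures), and the clearance input is assumed to already be the net figure described above, i.e. Clearances less those where the defense kept the ball.

```python
def possessions(tpl: int, net_clearances: int, aog: int) -> int:
    """Estimate possessions as TPL - C + AOG.

    tpl            -- "Tackled, Possession Lost" events charged to the team
    net_clearances -- Clearances to subtract (assumed here to already exclude
                      clearances where the defense kept the ball)
    aog            -- Attempts on Goal by the team
    """
    turnovers_committed = tpl - net_clearances
    return turnovers_committed + aog


# Hypothetical single-game totals, for illustration only.
home = possessions(tpl=140, net_clearances=25, aog=12)
away = possessions(tpl=135, net_clearances=20, aog=9)

print("home possessions:", home)
print("away possessions:", away)
print("difference:", home - away)
```

With these invented inputs the two totals come out close but not equal, which is the kind of small mismatch discussed in the next paragraph.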


One problem with the Tempo-Free Soccer analysis is that the two teams' possession totals in a game are very rarely equal.  That said, the difference rarely exceeds 10 and is often closer to 5.  Over the course of a season the disparity appears to sort itself out.  The consistent disparities come from instances of "double possessions": a player may lose possession of the ball but it is adjudged to have gone off the other team, an attempt on goal is rebounded and another attempt is made, etc.  It is my belief that even if a discerning eye were to go back over the entire 90 minutes, a complete reconciliation would be nearly impossible.


Expected Goals

EG = [(AOG X Conversion %) + (SOG X Conversion %)] / 2

EG is a general metric that takes the two major components of scoring goals, Attempts on Goal (AOG) and Shots on Goal (SOG), into consideration.  Its aim is to roughly predict a team's expected goals scored per possession and goals allowed per possession.  Conversion rates (Goals/AOG and Goals/SOG) are highly variable from team to team in a single season but generally consistent over time.  Therefore, consistently creating more opportunities than your opponent is the primary driver of a positive goal differential, winning soccer games, and accruing more points over the course of a season.

EG is derived by calculating the league-average conversion rates for AOG and SOG and then assuming each team converts at the league average.  For now, I am weighting AOG and SOG equally, but this may be subject to change.
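A minimal sketch of the EG calculation described above might look like the following. The team totals are invented for illustration, the function names are my own, and the league-average rates are computed by pooling every team's goals and attempts, which is only one possible reading of "league average" (averaging the individual team rates would be another).

```python
def league_conversion_rates(teams):
    """Return (Goals/AOG, Goals/SOG) pooled across all teams.

    Pooling totals is an assumption of this sketch, not a confirmed detail
    of the TFS method.
    """
    total_goals = sum(t["goals"] for t in teams)
    total_aog = sum(t["aog"] for t in teams)
    total_sog = sum(t["sog"] for t in teams)
    return total_goals / total_aog, total_goals / total_sog


def expected_goals(aog, sog, aog_rate, sog_rate):
    """EG = [(AOG x Conversion %) + (SOG x Conversion %)] / 2, equal weights."""
    return (aog * aog_rate + sog * sog_rate) / 2


# Hypothetical season totals, for illustration only.
teams = [
    {"name": "Team A", "goals": 40, "aog": 400, "sog": 130},
    {"name": "Team B", "goals": 30, "aog": 350, "sog": 120},
    {"name": "Team C", "goals": 50, "aog": 450, "sog": 150},
]

aog_rate, sog_rate = league_conversion_rates(teams)
for t in teams:
    eg = expected_goals(t["aog"], t["sog"], aog_rate, sog_rate)
    print(f'{t["name"]}: actual goals {t["goals"]}, expected goals {eg:.1f}')
```

Dividing each team's EG by its possession count would then give the per-possession figure the rankings are built on.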

Luck

Luck = Points Per Match - Expected Points Per Match

For some reason, a long time ago someone decided that a win should count for three points, a draw for one, and a loss for zero.  Margin of victory matters not one bit (except as a tiebreaker).  From a statistical point of view, the win/loss/draw system is silly human artifice.  Some teams benefit from it (by winning close games) and others do not.

Expected Points Per Match is calculated by fitting a best-fit line in which Goal Differential is the X variable and Points Per Match is the Y variable.
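The Luck calculation can be sketched as follows. This example assumes the best-fit line is an ordinary least-squares fit (the post does not say which fitting method is used), and the team names, goal differentials, and points per match are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical league table, for illustration only.
names = ["Team A", "Team B", "Team C", "Team D", "Team E"]
goal_diff = [15, 5, 0, -8, -12]                 # X variable
points_per_match = [1.9, 1.5, 1.4, 1.0, 0.9]    # Y variable

slope, intercept = fit_line(goal_diff, points_per_match)

luck = {}
for name, gd, ppm in zip(names, goal_diff, points_per_match):
    expected_ppm = intercept + slope * gd
    # Luck = Points Per Match - Expected Points Per Match
    luck[name] = ppm - expected_ppm
    print(f"{name}: expected {expected_ppm:.2f}, actual {ppm:.2f}, luck {luck[name]:+.3f}")
```

Under a least-squares fit with an intercept, the Luck values across the league sum to zero, so one team's good fortune is always balanced by another's bad fortune.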

2 comments:

  1. Very interesting stuff.

    Are the number of possessions for two teams not the same because:

    a. definition

    b. the lack of data to define a possession

    ? For the rebound example you give, it's all the same possession. The other team never got the ball.

    Thanks.

    Ed

  2. Ed,

    The main problem with reconciling possessions is I have to rely on OPTA's chalkboards. Unfortunately, they don't have a "possession" # so I have to back into an approximated # of possessions. OPTA analysts are human and I can assume that of the cumulative ~260 Tackled, Possession Lost events that occur in a given game there will be a fairly large # of errors or other judgment calls that skew my formula.

    You are correct, in the rebound example I give that is the same possession. I am admitting defeat on those, however, as it is far too time consuming to try and reconcile that type of event using the chalkboards. If doing so would provide a perfect reconciliation then I would consider it, but I think the vast majority of the variance in # of possessions is due to the OPTA TPL counting issues I allude to above.

    The next time you watch a game, try and count when possession switches. You will quickly find what a difficult task OPTA analysts must undertake :)

    Thanks,

    TFS
