Tuesday, July 15, 2014

The Best Uncapped American Player + MLS Homegrown Growth

With Jurgen Klinsmann looking to start a youth movement in the US national squad over the coming months we will no doubt see a slew of fresh faces earning their first cap.  And who do we want to see earn their first cap the most?  We have written about him before, but we have no qualms about doing it again.  Wil Trapp, the young Columbus Crew homegrown player, has been turning heads with his intelligent, technical play (see above pass distribution from a recent game vs. Colorado).

When looking at Trapp's statistical profile, one metric in particular jumps out: the number of 25+ yard passes he completes (what Opta calls "long balls").  Trapp completes an astonishing 9.8 of these per game, at a very accurate 86% clip.  The majority of these are not hopeful long balls to a striker, but are what most fans might call a "switching ball" that shifts the point of attack from one flank to the other - an indispensable skill for a holding midfielder.

While we know Trapp has a great statistical profile, it is difficult to really assess a player without surveying for potential comparisons.  As such, we looked at every central or holding midfielder under the age of 25 who completes 4+ passes of 25+ yards a game in eight top divisions: MLS, EPL, Bundesliga, Serie A, Ligue 1, Eredivisie, Russian PL, La Liga.  We found 22 such players (including the Whitecaps' Gershon Koffie and Matias Laba).  Then, we compared them across three major factors: pass distribution, defensive contribution, and pass usage rate.

There are a lot of familiar names on this list, including seven players who featured at the most recent World Cup, but it is self-evident that statistical comparisons across different teams and leagues make drawing any conclusions exceedingly difficult.  Each player is used differently by his team, and it is no doubt much easier for Wil Trapp to play a switching ball at home against Chivas USA than it would be for Jordan Henderson playing at Chelsea.  Nevertheless, it is hard not to notice that of the players who complete 6+ passes of 25+ yards a game, two played in the World Cup, two others play for their national team, a fifth is worth ~8M euros, and the sixth is Wil Trapp.  Give the kid a cap, Jurgen.

MLS Homegrown Growth

Seeing the success of Trapp, DeAndre Yedlin, Diego Fagundez, and a litany of other MLS "homegrown" players, we cannot help but be excited about the future of the MLS academy program and what it portends for the league.  At the current rate teams are signing homegrown players (and if we assume an average MLS career of 5 years), it is not implausible that in 10 years over a third of the league will have been signed directly from an MLS academy.
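
The arithmetic behind that projection can be sketched as a simple steady-state estimate: if signings arrive at a constant yearly rate and each player stays for an average career length, the number of active homegrown players settles at rate times career length.  This is only a back-of-the-envelope sketch.  The function name and every input below except the 5-year career are hypothetical placeholders, not measured figures.

```python
def steady_state_homegrown_share(signings_per_year, career_years, league_roster_spots):
    """Share of roster spots held by homegrown players once arrivals and departures balance."""
    # With constant signings and a fixed average career, the active pool is rate x duration.
    active_homegrown = signings_per_year * career_years
    # A share cannot exceed the whole league.
    return min(active_homegrown / league_roster_spots, 1.0)


# Placeholder inputs for illustration only (assumptions, not league data).
share = steady_state_homegrown_share(
    signings_per_year=40,     # hypothetical league-wide homegrown signings per year
    career_years=5,           # average MLS career assumed in the text
    league_roster_spots=570,  # hypothetical total roster spots across the league
)
print(f"Homegrown share at steady state: {share:.0%}")
```

Changing any of the three inputs shows how sensitive the "over a third" figure is to the signing rate and the assumed career length.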

Thursday, July 10, 2014

Ranking World Cup Rating Systems

Before the World Cup began we looked at what some rating systems predicted for the group stage.  To recap, the four rating systems we looked at were:

Elo: Originally devised as a method to rank world chess players, it is one of the most robust international soccer rating systems.
SPI: Developed for ESPN by famed political prognosticator Nate Silver.  
Oddsportal: An aggregator of 10+ online betting house odds.  Reflects the opinion of the betting public.
EA Sports FIFA Video Game: Based on player and team ratings provided by the popular video game franchise.
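
For readers unfamiliar with Elo, the core of the system is an expected-score formula driven by the rating gap between two teams.  The sketch below uses the standard logistic form with a 400-point scale.  The function name, the optional home-advantage parameter, and the example ratings are our own illustrative assumptions, and note that Elo produces an expected score (win = 1, draw = 0.5), not separate win, loss, and draw probabilities.

```python
def elo_expected_score(rating_a, rating_b, home_advantage=0.0):
    """Expected score for team A against team B under the standard Elo formula."""
    # home_advantage is an optional rating bonus for team A (an assumption here).
    diff = rating_a + home_advantage - rating_b
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


# Hypothetical ratings, for illustration only.
print(round(elo_expected_score(1900, 1800), 3))
```

Two evenly rated teams get an expected score of 0.5 each, and the favorite's expected score rises toward 1 as the gap grows.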

When we ran each system's match-by-match projections, certain geographical biases became noticeable (see below).

To reiterate, we only looked at group stage games, and projected results were not modified as games were played.  In essence, the projections represent a "freeze frame" from before the World Cup began.  Many other people have been tracking the rating systems, perhaps most notably Seth Burns over at Stats Bomb, who has been making "bets" following Nate Silver's SPI model.


Granted, the differences between each rating system might seem small, but don't be fooled.  Over the course of 48 games (still a small sample size) it is significant that both Elo and SPI clearly outperform Oddsportal.  If you clicked on the Seth Burns link you would see that by betting with SPI during the group stage you would have ended up over 600% ahead!

Breaking the results down by region confirms what everyone knew watching the group stages:  the Americas came to play, while Asian teams were largely disappointing.

Tuesday, June 17, 2014

USMNT Fewest Passes per Possession (Klinsmann era 2011-)

The 10 lowest passes-per-possession games recorded during Jurgen Klinsmann's 49-game tenure.  You might notice that there are actually quite a few wins in the bunch, including the recent 2-1 World Cup victory over nemesis Ghana.

Thursday, June 12, 2014

Measuring Tactical Variance by League

This article originally appeared in Stats Bomb


Manchester City won the 2013-2014 Premier League with a diverse and international (and very expensive) squad.  Of the players who made 20 or more league appearances, a full eight different nationalities were represented (nine if you count their Chilean manager, Manuel Pellegrini).  Only one first choice squad player, goalkeeper Joe Hart, was English.
In many ways Manchester City is representative of what many see as the future of European football, one in which hyper cross-pollination of playing styles and tactics renders our old heuristics (Spain = tiki-taka, Italy = catenaccio, etc.) useless.  In this future world of European football, then, we might expect the distribution of formations/tactics to be fairly consistent across different leagues.  Of course that is not the case now, and quite possibly never will be.
In the complex world of game theory and football formations, sometimes it behooves a manager to stick with an unsuccessful setup for no better reason than that it is what everyone else in the league is doing; many people do not like to take risks, especially if their job is on the line.  Conversely, in a league like Serie A where using different formations/tactics from game to game is almost an obsession, an adherence to one formation might be frowned upon.
It should be stated that while this piece is about “tactical” variance, our only measurement tool is “formation” variance.  Formations and tactics are not necessarily the same thing.  For example, a 3-5-2 might in practice more resemble a 5-3-2, and any formation can exist in an attack-minded or defensive form.  However, to the extent we are measuring tactical heterogeneity/homogeneity, formation variance is probably as good a proxy as any.  Formation information comes from Opta, whose analysts watch every game for each team they are assigned.  Also noteworthy is that this data does not include any in-game changes and reflects only how each team lined up at the start of the game.  Information is from the last completed season (’13-’14) and includes only formations used more than 3% of the time.  Formations are listed from left (most used) to right (least used).
formations by league
A couple things stand out here:
1. The Eredivisie loves the 433 and Russia loves the 4231, almost to the exclusion of any other formation.
2. Serie A demonstrates a tactical diversity not seen in other leagues (see below).
team avg formations
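
For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of how formation usage shares could be computed from a list of starting formations, keeping only those used more than 3% of the time and ordering them from most to least used.  The Shannon entropy helper is our own assumed way of putting a single number on heterogeneity and is not the measure used in the charts.  The function names and the sample data are hypothetical.

```python
from collections import Counter
from math import log2


def formation_shares(starting_formations, min_share=0.03):
    """Usage share per formation, most used first, dropping shares at or below min_share."""
    counts = Counter(starting_formations)
    total = sum(counts.values())
    # most_common() orders by descending count, and dicts preserve that order.
    shares = {formation: n / total for formation, n in counts.most_common()}
    return {formation: s for formation, s in shares.items() if s > min_share}


def shannon_entropy(shares):
    """Entropy in bits of a share distribution; higher means more formation diversity."""
    total = sum(shares.values())  # renormalise in case rare formations were dropped
    return -sum((s / total) * log2(s / total) for s in shares.values() if s > 0)


# Made-up league: one starting formation per team per game (illustrative only).
league = ["433"] * 70 + ["4231"] * 28 + ["352"] * 2
shares = formation_shares(league)
print(shares, round(shannon_entropy(shares), 2))
```

A league that leans almost entirely on one formation scores close to 0 bits, while a league that spreads its usage evenly across many formations scores higher.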
The “favored” and “unfavored” formations are partially a symptom of the fairly eclectic mix of leagues included in the analysis.  If we just aggregated all the teams from the “Big 4” leagues (Bundesliga, EPL, La Liga, Serie A) this is what the results look like:
Big 4
The 4231 is certainly the fancied approach at the moment, but things can change.  For example, MLS has seen a rise in the use of the “diamond” 41212 in 2014.  Unfortunately, this analysis does not include any data from previous seasons.  Will the homogeneity in the Dutch approach and heterogeneity in the Italian approach hold in the face of football globalization?  It will certainly be worth watching.

Projecting the World Cup Group Stage

This article originally appeared in The Shin Guardian

Predicting the outcome of soccer matches, and World Cup matches in particular, with any confidence is an exercise for the foolhardy.  One fateful bounce, wonder strike, mistake, penalty, or offside call can change a nation’s entire trajectory.  But, what fun would this event be if we could not dissect and over-dissect all the matchups and possible outcomes?  We have rounded up some of the best regarded international soccer rating systems and played out W-L-D probabilities for every match of the group stage.  Let us meet our contenders:
Elo: Originally devised as a method to rank world chess players, it is one of the most robust international soccer rating systems.
SPI: Developed for ESPN by famed political prognosticator Nate Silver.  Get used to seeing their ratings thrown around a lot during ESPN’s World Cup coverage.
Oddsportal: An aggregator of 10+ online betting house odds.  Reflects the opinion of the betting public.
One way to decide it…

EA Sports FIFA Video Game: Ok, so using video game player ratings is not a statistically rigorous method, but this still seems a step up from Paul the Octopus.
The Predictions
(Note: all figures represent approximate expected points)
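
As a quick illustration of what "expected points" means here: each match contributes 3 times the win probability plus 1 times the draw probability, summed over a team's three group games.  The sketch below assumes each rating system has already been converted into win, draw, and loss probabilities per match.  The function name and the sample probabilities are hypothetical.

```python
def expected_points(matches):
    """Sum of expected points over (p_win, p_draw, p_loss) tuples, one tuple per match."""
    total = 0.0
    for p_win, p_draw, p_loss in matches:
        # Probabilities for a single match should sum to 1.
        if abs(p_win + p_draw + p_loss - 1.0) > 1e-9:
            raise ValueError("match probabilities must sum to 1")
        total += 3 * p_win + 1 * p_draw  # a loss is worth 0 points
    return total


# Hypothetical probabilities for one team's three group games.
group_games = [(0.50, 0.30, 0.20), (0.20, 0.30, 0.50), (0.40, 0.25, 0.35)]
print(round(expected_points(group_games), 2))
```

A team certain to win all three games would show 9 expected points, and a team certain to lose all three would show 0.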

Group A
Group B
Group C
Group D
Group E
Group F
Group G
Group H
Ranking Biases
Each one of the four rankings (EA Sports, Elo, SPI, Oddsportal) has its own relative biases.
In many instances, these biases follow along geographic lines (see table below).
For example, many of the online betting houses are based in Europe, so there is a noticeable bias against lesser known international sides from North/Central America and Asia.
Similarly, EA Sports player ratings are noticeably biased towards players and teams that feature in the major European leagues.  In SPI’s case there is a favorable bias to South American sides which is likely due to the heavy weight they placed on CONMEBOL World Cup Qualifying matches (as far as I am aware the raw SPI ratings I used do not include any continental adjustment for the event taking place in Brazil, which would otherwise be a plausible explanation).

By Region...

Friday, May 23, 2014

Joao Plata's and RSL's (Unsustainable) Efficiency

Joao Plata, the diminutive (5 ft 2 in) Real Salt Lake forward, has absolutely put his stamp on this MLS season.  He has scored or assisted on nine of RSL's 23 goals, many of them in crucial spots, and has helped his club to an undefeated start through 11 games.  Most impressive?  Plata has created those nine goals in only 557 minutes of play, a rate so efficient that his goals created per 90 minutes would, on its own, place him mid-table in MLS when compared with entire teams.
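
To make the per-90 arithmetic concrete, here is a short sketch using only the figures quoted above (nine goals created, 557 minutes, 23 team goals, 11 games).  The function and variable names are our own.

```python
def per_90(count, minutes):
    """Scale a raw count to a per-90-minutes rate."""
    return count / minutes * 90


plata_goals_created_per_90 = per_90(9, 557)
plata_share_of_rsl_goals = 9 / 23  # goals scored or assisted out of RSL's total
rsl_goals_per_game = 23 / 11       # team scoring rate over the same stretch

print(round(plata_goals_created_per_90, 2))  # 1.45
print(f"{plata_share_of_rsl_goals:.0%}")     # 39%
print(round(rsl_goals_per_game, 2))          # 2.09
```

In other words, Plata alone has been creating roughly one and a half goals per 90 minutes while his whole team averages about two goals a game.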

Plata's goals created per touch is best in the league, but what is truly remarkable is that he has achieved this absurd output while not turning the ball over nearly as much as the other players high on the list (Dwyer, Wright-Phillips, etc.).

It has not just been Plata contributing to RSL's success.  The entire team has been ruthless in front of goal.  Their conversion rate (goals/shots) on shots in the "danger zone" (defined here) of approximately 27% is not only the best in MLS this season, it is also better than that of any of the 78 teams in the "Big 4" leagues of Europe (EPL, La Liga, Bundesliga, Serie A).  Incidentally, the Crew, a team previously profiled here, are last.

But how sustainable is this?  Most of the soccer analytics community seems to agree that conversion rates tend to regress to the mean, and RSL (and Plata) are certainly due for that.  But in the meantime it is fun, as a fan, to see a player and team in such a fine run of form.