At the crux of all player scouting is data. Whether it’s measured or observed, statistical or anecdotal, data gives shape to the insights that we draw from player performance – which practitioners, coaches and entire franchises then use to help make predictions about future player performance.
But here’s the rub: In order to trust our eyes, and even our most accurate measurements, we need lots of data – and it needs to extend far beyond that of individual player profiles. And although many sports organizations now have the knowledge and resources to intelligently collect entire computer servers full of proprietary data, even the elite professional franchises have access to only a fraction of the information that would empower them to transform educated guesswork into something closer to a working hypothesis.
Everyone from performance practitioners to armchair quarterbacks and fantasy sports owners understands the importance of sample size when attempting to separate the randomness of statistical noise from a full-blown pattern or trend. But in addition to a depth of data, teams also need breadth. It helps to know a given athlete’s physical capabilities on her best and worst days, for instance, but it’s impossible to paint a full picture without knowing how that athlete compares to her competitors – or, more importantly, without knowing the accuracy of existing data sources. For now, that’s a key sticking point.
The Current Limitations of Proprietary Data
The challenge in scouting is reach. We consider Usain Bolt to be the fastest athlete in the world because we can compare his world-record 100-meter time (9.58 seconds) to those of sprinters across decades. Is it possible that a world-class athlete bested that time outside official competition, or even that a genetically gifted hunter from centuries ago ran faster? It seems unlikely, but we can’t be sure. We lack the breadth of data to prove otherwise. Our assessment falters further when confronted with shorter bursts, nonlinear sprints, decelerations and the nuances of explosiveness, all of which are pivotal to understanding an athlete’s full prowess.
Those limitations become even more pronounced when measuring dynamic movements, fatigue and other complex kinetic systems – and when factoring in the basic structure of modern competitive sports. Consider: The Premier League, the National Basketball Association and every other major sports league in the world operate on a zero-sum basis. For one club to win, another must lose. If your team isn’t in the playoffs, they’ve landed in the lottery. Given those stakes, why would any organization want to share proprietary information and risk “helping” a competitor?
The Game-Changing Potential of Data Sharing
The Herculean task today is convincing a multitude of clubs that sharing comparable data with one another is in all of their best interests. For example: Club A might believe that its player-tracking technology is superior to that of Club B – and it may even be right. But Club A has virtually no data on Club B’s players. That means it lacks the ability to accurately compare its own athletes to those of its competitors. Nor does it have any information that might flag an injury risk, or a potential hidden talent, should it want to trade for or sign a competitor’s player. Data sharing across teams – and even leagues – unlocks that knowledge.
In addition, legal challenges restricting the commercialization of player data remain. Current solutions have their own limitations as well: scaling existing video footage and deriving insights through AI analysis has some immediate potential, although its measurement of speed leaves something to be desired. The trouble is that video footage quickly becomes outdated, and in some cases is limited in its reach. What makes the most sense instead is a real-time, shared system of player tracking and data sharing.
Think of it: a network in which teams agree to collect and make available to one another data from all games played in their stadium or arena. The more teams that sign on, the deeper and more far-reaching – and, thus, the more valuable – that data becomes. Organizations would still compete for the best data analytics technology and personnel, the smartest executives and savviest coaches. Decision-making would still be the key competitive differentiator. A true data-sharing network would simply give all clubs a chance to make their best-informed decisions, facilitated by the advancements in data analysis. If you trust your staff, and you can trust the data, it’s a scenario that any organization should welcome. This collaborative approach would not only leverage technology but also create a layer of trust and accountability, ensuring that the shared data is used to elevate the game as a whole.