The Newest FIFA World Cup Referee: Human-in-the-Loop Machine Learning

by Jacob Bengtson

Posted in Business | December 07, 2022 4 min read

In case you were not aware, there’s a little event called the World Cup that’s happening right now. This World Cup has been notable for a few reasons. The first is the timing: no summer watch party barbecues this time around. Instead, FIFA is breaking from tradition and running the tournament in the northern hemisphere’s winter months to spare the players the experience of playing soccer (Cloudera is headquartered in the US, so it is “soccer”) in temperatures exceeding 41.5°C (Cloudera is headquartered in the US, but we also recognize the superiority of the metric system).

The second notable fact about the 2022 World Cup is that this is only the second World Cup to be held entirely in Asia, the first being the 2002 tournament held in South Korea and Japan. However, it is the first World Cup to be held in the Middle East region of the world!

The third, and most interesting, fact about the 2022 World Cup is the new and innovative ways that technology and data are being used to improve the beautiful game, both on and off the pitch. For off-the-pitch innovations, Qatar has implemented solutions like a state-of-the-art stadium cooling system, and even cameras and computer vision algorithms designed to prevent stampedes. For the fans, you don’t have to look far to find new and exciting ways that technology is enhancing their experience.

The data innovation that I was most excited to learn about, though, is the implementation of a human-in-the-loop (HITL) machine learning (ML) solution to assist referees in more accurately calling offsides. Officially, FIFA is referring to this ML solution as Semi-Automated Offside Technology (SAOT). Human-in-the-loop ML is not a new technology, but its use on soccer’s largest stage is a major step for ML as a mechanism to improve the quality of officiating at professional sporting events.

What is human-in-the-loop machine learning?

Machine learning is a subcategory of artificial intelligence where computer systems learn to do tasks based on data rather than being explicitly programmed to do so. HITL ML adds an additional step that requires a human (preferably a subject matter expert) to verify the tasks being performed by the computer system.

HITL ML essentially combines the strengths of both ML and humans. ML has the advantage of being able to scale across multiple systems and process data far faster than the human brain, allowing it to handle many more tasks than a human ever could. ML is not always perfect though, so by including humans who are subject matter experts in the training of the system and in the tasks being performed, you can minimize the likelihood of ML performing tasks incorrectly.

Humans can be involved in the training of the system by providing it with the data that it learns from, or, as in the case of SAOT, humans can be used to verify that the task was performed accurately.
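The two roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `HumanInTheLoop`, `toy_model`, and `toy_expert` are hypothetical names invented here, and the "model" and "expert" are simple threshold functions standing in for a trained model and a real subject matter expert.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class HumanInTheLoop:
    """Wraps a model so that a human expert confirms every prediction."""
    model: Callable[[float], str]        # hypothetical model: feature -> label
    expert: Callable[[float, str], str]  # hypothetical reviewer: returns the correct label
    labeled_examples: List[Tuple[float, str]] = field(default_factory=list)

    def decide(self, features: float) -> str:
        predicted = self.model(features)
        verified = self.expert(features, predicted)
        # Training role: the expert's verdict is stored as a labeled example.
        self.labeled_examples.append((features, verified))
        # Verification role: the expert's verdict, not the raw prediction, is final.
        return verified


def toy_model(x: float) -> str:
    # Stand-in for a trained model; deliberately wrong between 0.4 and 0.5.
    return "positive" if x > 0.4 else "negative"


def toy_expert(x: float, predicted: str) -> str:
    # Stand-in for an expert who knows the true boundary is 0.5.
    return "positive" if x > 0.5 else "negative"


loop = HumanInTheLoop(model=toy_model, expert=toy_expert)
decisions = [loop.decide(x) for x in (0.2, 0.45, 0.9)]
print(decisions)  # ['negative', 'negative', 'positive']
```

The middle input is the interesting one: the toy model says "positive", the toy expert overrules it, and the corrected label is kept for a future round of training.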

A world-class machine learning solution

The ML model implemented as part of SAOT is trained to classify a play as either offsides or not. It uses two main sources of data as inputs. The first is Adidas’ new IoT-enabled ball, the Al Rihla Pro. This revolutionary ball contains an inertial sensor that tracks the ball’s motion; that data is captured and reported a remarkable 500 times per second. The data from the ball provides a precise measurement of the direction in which the ball was kicked, at the moment it was kicked (well, within 1/500th of a second, that is).

The second input to the ML model comes from 12 cameras mounted just underneath the roof of the stadium. These cameras capture 29 data points on each of the 22 players on the field, at a rate of 50 times per second.

That means that 31,900 positional data points (29 points × 22 players × 50 times per second) from players on the field are used as inputs into SAOT’s model every second.
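As a quick check of the arithmetic, multiplying the figures quoted above (29 points per player, 22 players, 50 camera samples per second) gives 31,900 player data points per second. The sketch below also shows how a 500 Hz ball sample could be matched to the nearest 50 Hz camera frame. The function `nearest_camera_frame` is a hypothetical helper, and the assumption that both streams start at the same instant and share a clock is made purely for illustration.

```python
POINTS_PER_PLAYER = 29  # tracked body points per player (from the article)
PLAYERS_ON_FIELD = 22
CAMERA_HZ = 50          # camera samples per second
BALL_HZ = 500           # ball sensor samples per second

player_points_per_second = POINTS_PER_PLAYER * PLAYERS_ON_FIELD * CAMERA_HZ
ball_sample_interval_ms = 1000 / BALL_HZ    # time between ball samples
camera_frame_interval_ms = 1000 / CAMERA_HZ  # time between camera frames
ball_samples_per_frame = BALL_HZ // CAMERA_HZ


def nearest_camera_frame(ball_sample_index: int) -> int:
    """Index of the camera frame closest in time to a given ball sample.

    Assumes both streams start at t = 0 on a shared clock (a simplification).
    Integer arithmetic is used so that ties always round up.
    """
    return (ball_sample_index * CAMERA_HZ + BALL_HZ // 2) // BALL_HZ


print(player_points_per_second)   # 31900
print(ball_sample_interval_ms)    # 2.0
print(camera_frame_interval_ms)   # 20.0
print(ball_samples_per_frame)     # 10
print(nearest_camera_frame(1234))  # 123
```

In other words, the ball pins down the kick to within 2 ms, while player positions are refreshed every 20 ms, so each camera frame spans ten ball samples.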

With these two sources of data, inertial data from the ball and player positional data from the cameras, the SAOT ML model is able to classify each play as either offsides or onsides. Now here is where the HITL aspect of the solution comes into play. The offside prediction doesn’t go directly to the on-field referee. Instead, it is sent to the video match officials (VMOs), who then validate it. The model provides the kick point of the pass that resulted in an offsides play, as well as a generated offside line based on the 29 tracked points of the offensive and defensive players at the moment of the pass. If the VMOs agree that the play was indeed offsides, they inform the on-field referee.
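The flow from prediction to VMO to referee can be sketched as follows. To be clear about what is assumed: this swaps in a plain geometric rule for the trained model, reduces every position to a single x-coordinate along the length of the pitch, and ignores parts of the offside law such as the ball’s position, the halfway line, and which body parts count. The names `OffsideProposal`, `propose`, and `call_for_referee` are hypothetical, and each player has three sample points instead of 29 to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class OffsideProposal:
    kick_time_s: float       # moment of the pass, from the ball sensor
    offside_line_x: float    # second-last defender's deepest tracked point
    attacker_x: float        # receiving attacker's most advanced tracked point
    predicted_offside: bool


def propose(kick_time_s: float,
            attacker_points_x: List[float],
            defenders_points_x: Dict[str, List[float]]) -> OffsideProposal:
    """Build the proposal that is shown to the video match officials.

    Simplified geometry: the attack runs toward increasing x, so a larger x
    is closer to the defending team's goal line.
    """
    # Represent each defender by the tracked point nearest their own goal line.
    deepest_per_defender = sorted(
        (max(points) for points in defenders_points_x.values()), reverse=True
    )
    offside_line_x = deepest_per_defender[1]  # second-last defender
    attacker_x = max(attacker_points_x)
    return OffsideProposal(kick_time_s, offside_line_x, attacker_x,
                           attacker_x > offside_line_x)


def call_for_referee(proposal: OffsideProposal,
                     vmo_confirms: Callable[[OffsideProposal], bool]) -> Optional[str]:
    """Only a proposal that the VMOs confirm reaches the on-field referee."""
    if proposal.predicted_offside and vmo_confirms(proposal):
        return "offside"
    return None


# Made-up positions in metres, for illustration only.
defenders = {
    "goalkeeper": [99.0, 99.2, 99.4],
    "center_back": [80.0, 80.3, 80.6],
    "full_back": [75.1, 75.3, 75.5],
}
proposal = propose(kick_time_s=12.346,
                   attacker_points_x=[80.2, 80.7, 81.0],
                   defenders_points_x=defenders)
print(proposal.offside_line_x, proposal.predicted_offside)  # 80.6 True
print(call_for_referee(proposal, vmo_confirms=lambda p: True))   # offside
print(call_for_referee(proposal, vmo_confirms=lambda p: False))  # None
```

The design point is the last function: the automated proposal never reaches the referee on its own, and a VMO who disagrees simply stops it.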

Other applications for human-in-the-loop machine learning

A natural extension of this technology would be in other sports. Imagine if an automated system were used to inform NFL referees whether or not a player stepped out of bounds, or if the ball went across the goal line in the mayhem of a QB sneak from the one-yard line. In the NBA, HITL ML could be used to definitively classify a play as a block or a charge (the bane of any NBA fan’s experience).

What’s great about HITL ML is the speed at which it occurs. There would be no more five-minute reviews of the same camera angle while we all disagree about whether there is court/grass between a player’s foot and a line. Instead, the prediction is instantly available. Additionally, because trained officials are still used to verify, inaccurate predictions are caught (and there’s no possibility of Skynet going active and robots taking over the world).

In business, HITL methodologies can be used to minimize downtime caused by an incorrect prediction of a failure in a predictive maintenance application, and to give business stakeholders confidence that the output from ML models can be trusted.
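For the predictive maintenance case, a minimal sketch might look like the following. Everything here is hypothetical: the function `plan_shutdowns`, the machine names, and the inspection outcome are invented for illustration, and a real application would have its own alerting and work-order systems.

```python
from typing import Callable, Dict, List


def plan_shutdowns(failure_alerts: List[str],
                   technician_confirms: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Split model alerts into confirmed shutdowns and dismissed false alarms."""
    result: Dict[str, List[str]] = {"shut_down": [], "keep_running": []}
    for machine_id in failure_alerts:
        key = "shut_down" if technician_confirms(machine_id) else "keep_running"
        result[key].append(machine_id)
    return result


# Made-up alerts and a made-up inspection outcome, for illustration only.
alerts = ["pump-1", "pump-2", "press-7"]
actually_failing = {"pump-2"}
plan = plan_shutdowns(alerts, lambda machine_id: machine_id in actually_failing)
print(plan)  # {'shut_down': ['pump-2'], 'keep_running': ['pump-1', 'press-7']}
```

In this toy run, two of the three alerts are false alarms, so the technician’s check keeps two machines running that would otherwise have been taken offline.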

It takes more than machine learning to solve these problems

Notice that the semi-automated offside technology solution wasn’t just an ML model sitting in the cloud. It required data to be streamed, transformed, loaded, analyzed, and reported, all within a matter of seconds. A solution like that requires data services for every step of the process, and these data services have to work together seamlessly, both on premises (the ball and cameras) and in the cloud (model training, predictions, and reporting web applications).

This is why Cloudera has built the hybrid data platform (the Cloudera Data Platform) with integrated data services for every step of the end-to-end data lifecycle: anyone who has built ML solutions knows that it takes more than just an ML point solution in the cloud to deliver a business-ready solution.

If you would like to learn more about how the Cloudera Data Platform is the hybrid solution you’ve been looking for, go here to learn more.

1 Comment

by Amod Kumar on Dec 07, 2022 @ 9:26 pm EST

Excellent detailing. Agree in toto with you. It’s not only about building ML solutions but how the data is organised on various platforms. Cloudera just fits the bill appropriately.