Ultimate guide to keypoint annotation for sports AI
AI in sports has grown enormously in recent years, enabling a level of precision and depth in sports analytics that the industry has never seen before. Computer vision is now used to analyze player behavior, monitor game activity, and suggest improvements to performance and posture for professionals and amateurs alike. And of course, high-quality data is the key to enabling all of these advances.
The best thing about developing AI in sports is the wide availability of visual data: recordings of thousands of games across different sports, exercise videos from yoga and gym gurus, and so on. The challenge is to select the most useful, representative, and diverse subsets of this data and to enrich it with all the annotations needed to turn it into actual insights. And this is where a human in the loop becomes relevant!
Challenges and best practices
Based on our extensive experience annotating data for sports AI applications, here are some of our best practices and tips for making sure your AI project is a success:
| Challenge | Best practice |
| --- | --- |
| Keypoint annotation is a time-consuming process: in some cases, more than 150 keypoints need to be placed on a person's face alone! At scale this becomes especially challenging, above all when annotating sequences of images or video. | We use a number of tricks to accelerate keypoint annotation, such as creating a skeleton mesh that already contains all the keypoints, so that annotators only need to adjust the nodes and edges to match the figure rather than placing each point from scratch. |
| Pose estimation can be a very tricky process, especially when bodies appear in unconventional or hard-to-distinguish positions. Whenever there is more than one figure in the scene, an AI model may mistake one person's limb for another's, or lose track of a person entirely if they are occluded or the video/image quality is low. Unusual cases, such as a person wearing a prosthetic limb, also leave most generic AI models confused and underprepared. | To prepare AI models for any joint combination they may encounter, we collect datasets that are as diverse as possible and that show humans in extreme poses (such as our Extreme keypoints dataset above!). With our adversarial example collection services, we also take special care to include visual data of diverse humans, covering different genders, ethnicities, body shapes, and ability levels. During annotation, we make sure there are clear instructions on how to handle edge cases (e.g. a person who is occluded, who leaves the frame and comes back, or whose limb is cropped by the edge of the frame or disappears behind their back). |
| Sports AI models mostly operate on video, but video annotation is much more time-consuming than annotating single images. When there are multiple objects to track on each frame, each with its own ID, and a lot of action happening (as in every single soccer game!), annotation costs can skyrocket and a large-scale dataset can take a very long time to produce. Even pre-labeling can backfire, leaving annotators with more clean-up work than annotating from scratch would have required. | Using advanced automation techniques, we make video annotation far more efficient: some of our favorite tools support interpolation not only for bounding boxes but also for polygons and even keypoints (see the interpolation sketch below this table). Instead of purely linear interpolation, which fills the empty frames between the first and last annotated frame along a straight line, we can also use AI-based object tracking on the annotated object itself. We combine this with smart task assignment, splitting the dataset so that annotators track one figure at a time across the entire video rather than trying to track everyone at once. |
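To make the interpolation idea above concrete, here is a minimal sketch of linear keypoint interpolation between two annotated keyframes. The keypoint format (a list of (x, y) pairs per frame) is an assumption made for the example and is not tied to any particular tool.

```python
import numpy as np

def interpolate_keypoints(kps_start, kps_end, num_inbetween):
    """Linearly interpolate a set of (x, y) keypoints between two annotated
    keyframes, producing one keypoint array per in-between frame."""
    kps_start = np.asarray(kps_start, dtype=float)  # shape: (num_keypoints, 2)
    kps_end = np.asarray(kps_end, dtype=float)
    frames = []
    for i in range(1, num_inbetween + 1):
        t = i / (num_inbetween + 1)  # interpolation factor in (0, 1)
        frames.append((1 - t) * kps_start + t * kps_end)
    return frames

# Example: three in-between frames for a two-keypoint "skeleton"
start = [(100, 200), (150, 260)]
end = [(120, 210), (170, 250)]
for kps in interpolate_keypoints(start, end, 3):
    print(kps.round(1))
```

AI-assisted interpolation works in the same spirit, except that the in-between positions come from an object tracker following the annotated figure rather than from a straight line.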
This is why it’s really important to start with the right training data, to continuously expand your dataset with new adversarial and difficult examples, and to adopt a human-in-the-loop approach for validating your model’s predictions!
Types of annotation for Sports AI
Below we will feature several different use cases of keypoint annotation for pose estimation using AI and some of the best practices for each one:
Yoga pose detection
Many consumer-facing apps for yoga now use the latest AI technology in order to provide customized feedback to the users. Using keypoint annotation and tagging, AI models can be trained to recognize specific yoga asanas as well as to detect whether there are any mistakes in how the body is positioned.
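As an illustration of where the raw keypoints come from in such apps, here is a minimal sketch using the open-source MediaPipe Pose model (our choice for the example; any pose estimator that outputs body landmarks would work). It extracts 33 body landmarks which can then be fed to a downstream asana classifier. The image file name is hypothetical.

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

# Run single-image pose estimation on a photo of a yoga pose.
image = cv2.imread("warrior_pose.jpg")  # hypothetical file name
with mp_pose.Pose(static_image_mode=True) as pose:
    results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.pose_landmarks:
    # Each of the 33 landmarks has normalized x/y/z plus a visibility score;
    # these vectors are the features a simple asana classifier would consume.
    for idx, lm in enumerate(results.pose_landmarks.landmark):
        print(idx, round(lm.x, 3), round(lm.y, 3), round(lm.visibility, 3))
```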
Exercise monitoring
AI can also be used in apps for exercise monitoring in the gym or at home. For this purpose, keypoints are used as well as tagging of clips in order to identify the type of exercise being done. This can be used as a basis for recommendations for better posture and joint alignment during exercise.
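A common building block for posture and joint-alignment feedback is the angle at a joint, computed from three keypoints. Here is a minimal sketch, assuming (x, y) keypoint coordinates coming from annotation or a pose model:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (in degrees) at joint b formed by points a-b-c,
    e.g. shoulder-elbow-wrist for the elbow angle."""
    a, b, c = np.asarray(a, float), np.asarray(b, float), np.asarray(c, float)
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Elbow angle from normalized (x, y) keypoints; values approach 180 degrees
# as the arm straightens, which is what a posture check would look for.
print(round(joint_angle((0.40, 0.30), (0.45, 0.50), (0.43, 0.72)), 1))
```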
Rep counting
AI models can be trained to count reps for specific types of exercises in order to notify users once they have reached the required number of reps – as if they had a coach next to them! This is achieved through frame-level labeling on video clips for half and full reps, accompanied by detection of the type of exercise.
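A minimal rep-counting sketch, assuming the exercise is a biceps curl and using illustrative angle thresholds: a rep is counted each time the elbow angle goes from extended to flexed and back.

```python
def count_reps(elbow_angles, down_thresh=160.0, up_thresh=60.0):
    """Count full reps (e.g. biceps curls) from a per-frame elbow angle series.
    One rep is a transition extended -> flexed -> extended."""
    reps, stage = 0, "down"  # "down" = arm extended, "up" = arm flexed
    for angle in elbow_angles:
        if stage == "down" and angle < up_thresh:
            stage = "up"                       # arm has flexed: half rep
        elif stage == "up" and angle > down_thresh:
            stage, reps = "down", reps + 1     # arm extended again: full rep
    return reps

# Two simulated curls expressed as elbow angles over time
angles = [170, 120, 55, 50, 110, 165, 150, 58, 45, 130, 168]
print(count_reps(angles))  # -> 2
```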
Performance monitoring
In order to improve player performance in sports, coaches can count on enhanced analytics of player movements and actions. For example, AI can detect and count kicks, passes, tackles, fouls, etc. using bounding boxes on video and provide statistics, including pass and shot accuracy and other metrics.
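For instance, once pass events have been annotated on a clip, pass accuracy is simply the share of completed passes among all attempts. A minimal sketch over a hypothetical event annotation format:

```python
# Hypothetical event annotations: one dict per labeled event in a match clip
events = [
    {"player": 7, "type": "pass", "completed": True},
    {"player": 7, "type": "pass", "completed": False},
    {"player": 7, "type": "shot", "on_target": True},
    {"player": 7, "type": "pass", "completed": True},
]

passes = [e for e in events if e["type"] == "pass"]
accuracy = sum(e["completed"] for e in passes) / len(passes)
print(f"Pass accuracy: {accuracy:.0%}")  # -> Pass accuracy: 67%
```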
Team strategy analysis
On a team level, the availability of annotations for player positions, actions, and trajectories on the field throughout the game can provide detailed analytics such as ball possession statistics, team movement patterns, interactions, etc. These can be used to provide insights into team dynamics and shortcomings.
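A minimal sketch of how possession statistics fall out of such annotations, assuming a hypothetical per-frame possession label:

```python
# Hypothetical per-frame annotations: which team is in possession of the ball
frame_possession = ["home", "home", "away", "home", None, "away", "home"]

labeled = [p for p in frame_possession if p is not None]  # skip contested frames
home_share = labeled.count("home") / len(labeled)
print(f"Home possession: {home_share:.0%}")  # -> Home possession: 67%
```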
Penalty detection
In combination with semantic segmentation and bounding box tracking of objects such as balls, keypoint annotation can provide plenty of insights into penalties in different sports, such as line calls in tennis. In addition, the referee's actions can also be recorded, e.g. the number of red/yellow cards shown.
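As a toy illustration of a line call, the annotated bounce point of a tracked ball can be checked against the court boundary. Real systems rely on calibrated multi-camera geometry; the flat court model below is a simplifying assumption (dimensions are those of a singles tennis court).

```python
# Court modeled as an axis-aligned rectangle in court coordinates (metres);
# a singles tennis court is roughly 23.77 m long and 8.23 m wide.
COURT = {"x_min": 0.0, "x_max": 23.77, "y_min": 0.0, "y_max": 8.23}

def line_call(bounce_x, bounce_y, court=COURT):
    """Return 'in' if the ball's bounce point lies inside the court, else 'out'."""
    inside = (court["x_min"] <= bounce_x <= court["x_max"]
              and court["y_min"] <= bounce_y <= court["y_max"])
    return "in" if inside else "out"

print(line_call(12.1, 8.20))  # -> in  (just inside the sideline)
print(line_call(12.1, 8.45))  # -> out
```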
Tools we love
Here are some of our tried-and-tested platforms for keypoint annotation, which offer best-in-class features for both images and videos.
SuperAnnotate is a leading data annotation platform which includes keypoint annotation with advanced automation features, such as automatically switching from one point to the next.
V7 is one of our favorite tools for keypoint annotation, especially on video, given that it is the only tool on the market which supports interpolating keypoints from one frame to another.
How to use a human-in-the-loop for pose detection AI
Sports and physical activities are unpredictable, and there may be plenty of edge cases or out-of-distribution instances that your model is not prepared for. To deal with them, it's important to use human input on a continuous basis, not just for the initial training of your models. Here are some of the ways in which humans can be plugged into the entire MLOps cycle:
- Dataset collection: our humans in the loop on the ground can collect videos and images of people doing different types of exercises, with a wide variety of genders, ethnicities, backgrounds, camera types, and locations
- Ground truth annotation: in order to train your initial models, we offer full dataset annotation from scratch in batches: keypoint and bounding box annotation on images and video, as well as more complex polygonal and semantic segmentation tasks, even 3D annotation!
- Output validation with active learning: once you've trained an initial model, we can use it to pre-annotate a large part of the dataset, which increases both the speed of the annotators and the impact of their work, by setting up an active learning workflow and prioritizing the instances where your model is least certain (a minimal prioritization sketch follows this list)
- Adversarial example collection: once you’ve trained an initial model, we can expand your core dataset with additional difficult and challenging edge cases, such as unusual types of exercises, extreme poses, or challenging videos or camera angles, depending on the failure modes of your model during testing
- Real-time edge case handling: once you have a model in deployment, our humans in the loop are available 24/7 to handle potential edge cases as they appear in real time or close to real time: a simple API request returns the correct response within seconds, adding a second layer of verification for your model's most critical responses
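Here is a minimal sketch of the prioritization step in the active-learning bullet above, assuming your pose model outputs a confidence score per keypoint (the format and threshold-free sorting are illustrative assumptions):

```python
# Hypothetical model output: per-frame keypoint predictions with confidence scores
predictions = [
    {"frame": 0, "confidences": [0.97, 0.95, 0.91, 0.88]},
    {"frame": 1, "confidences": [0.52, 0.61, 0.43, 0.70]},
    {"frame": 2, "confidences": [0.85, 0.79, 0.92, 0.81]},
]

def frame_uncertainty(pred):
    """Use 1 - mean keypoint confidence as a simple uncertainty score."""
    confs = pred["confidences"]
    return 1.0 - sum(confs) / len(confs)

# Send the least certain frames to human annotators first
review_queue = sorted(predictions, key=frame_uncertainty, reverse=True)
print([p["frame"] for p in review_queue])  # -> [1, 2, 0]
```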
Does this sound like something you'd like to try out? Get in touch with us and we'd be happy to schedule a free trial to explore how we can best help you with your keypoint annotation and pose estimation needs!