Introduction to video annotation for surveillance AI

The surveillance, security, and identity verification industry has been transformed by recent improvements in AI technologies. Given its need to process enormous amounts of data, the industry has been one of the pioneers in the adoption of AI systems, both by public authorities and by private companies. However, it has also come under heavy scrutiny and public criticism, as it enables both public authorities and corporations to perform mass surveillance at unprecedented scale.

In this context, the processing of datasets and the training and improvement of AI models need to be done with extreme care and consideration for personal privacy and human rights. In addition, harmful racial and gender biases need to be mitigated. The data used to train such systems must be handled only after proper anonymization and data security procedures are in place. With our GDPR-compliant annotation teams, we at Humans in the Loop are the right partner if you are looking for dataset annotation and model monitoring services, even for the most high-risk and sensitive applications.

Challenges and best practices

The AI surveillance industry holds a lot of promise for ensuring public security and preventing the worst kinds of incidents, but it also poses major risks. Below we outline some of the challenges specifically related to dataset management and annotation, as well as model retraining and maintenance.

Challenge: The data for surgical AI applications frequently comes in the form of videos, some of which can be very long. A colonoscopy video, for example, may run for more than two hours. This makes detailed frame-by-frame annotation time-consuming, and splitting the videos into shorter clips may backfire: if the clips are annotated by different people who lack the full context, their interpretations of the content may differ.

Best practice: To perform video annotation efficiently without sacrificing quality and consistency, our annotators use a variety of automation techniques that allow them to interpolate annotations from one frame to the next. In this way, a single annotator can annotate an entire video in a short period of time while having access to its full duration, leading to better judgments. Many platforms on the market support interpolation with bounding boxes, and the best ones provide this feature even with polygons and semantic segmentation. In addition, once you have a trained model, you can use it to generate pre-annotations for new datasets, so annotators only check and validate the outputs rather than annotating from scratch.

Challenge: Each person's anatomy differs, and internal organs can vary widely in position, size, and orientation. Moreover, many abnormalities, such as polyps or cancerous formations, may be difficult to detect even for human experts, who may disagree on their presence, size, and type.

Best practice: We are conscious that medical AI annotation involves a great deal of subjective judgment, which is why we work with certified medical professionals. Our roster covers more than 20 specialties, including radiologists, surgeons, dentists, and ophthalmologists, who work closely with you to agree on standards for interpreting the data and to make sure the entire annotation team labels it consistently, according to industry standards and widely accepted taxonomies.

Challenge: Medical and dental data are not as readily available and accessible as other types of computer vision data. Data from different hospitals is frequently kept in siloed systems that are not interoperable, and there is an acute lack of diversity in patient demographics and in the geographic locations the data usually comes from. As a result, datasets for surgical AI applications frequently carry systematic biases.

Best practice: As a company committed to mitigating harmful biases in AI systems, we dedicate a lot of time to finding workable solutions for making datasets more representative and diverse. The most effective solution is an iterative approach in which a trained AI model is tested on new data during deployment and its performance is evaluated continuously. Whenever data drift is detected, the outliers are sent to our humans in the loop, who label the ground truth so that the model's predictions can be assessed. This supports the gradual improvement of models and the mitigation of harmful biases.
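To illustrate the keyframe interpolation technique used to speed up video annotation, here is a minimal sketch, assuming bounding boxes are (x, y, width, height) tuples keyed by frame index. The function name and data layout are our own illustrative choices, not a specific platform's API.

```python
# Sketch of keyframe interpolation for bounding boxes. Annotators label
# only the keyframes; intermediate frames are filled in by linear
# interpolation and then reviewed and corrected where needed.

def interpolate_boxes(keyframes):
    """keyframes: dict mapping frame index -> (x, y, w, h).
    Returns a dict with a box for every frame between the first and
    last keyframe, linearly interpolated between adjacent keyframes."""
    frames = sorted(keyframes)
    result = {}
    for start, end in zip(frames, frames[1:]):
        b0, b1 = keyframes[start], keyframes[end]
        span = end - start
        for f in range(start, end):
            t = (f - start) / span
            result[f] = tuple(a + t * (b - a) for a, b in zip(b0, b1))
    result[frames[-1]] = keyframes[frames[-1]]
    return result

boxes = interpolate_boxes({0: (10, 10, 50, 80), 4: (30, 10, 50, 80)})
# frame 2 sits halfway between the two keyframes: (20.0, 10.0, 50.0, 80.0)
```

In practice the interpolated boxes serve only as a starting point; the annotator scrubs through the video and adjusts any frame where the object deviates from a straight-line motion.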


This is why it is critically important to start with the right training data and to adopt a human-in-the-loop approach in order to continuously improve the surveillance models you are using. Two elements are key:

  1. High-quality ground truth metadata
  2. Edge case handling

Do you want to learn more about our approach and how it improves your model’s performance? 


Types of annotation for surveillance AI

Below we will feature several different use cases of image and video annotation for surveillance purposes and some of the best practices for each one.

Pedestrian tracking

This type of labeling is very relevant for applications dealing with foot traffic monitoring, traffic light control, or jaywalking detection. It consists of bounding box annotation with interpolation across frames, as well as per-frame or per-object labeling of the person's state (e.g., standing, walking, or running).
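A pedestrian track combines per-frame boxes, a stable identity, and state labels. Here is a minimal sketch of what such an annotation record might look like; the class and field names are hypothetical illustrations, not a standard interchange format.

```python
# Hypothetical minimal schema for a pedestrian track annotation:
# per-frame bounding boxes plus a per-frame state label, grouped
# under one track ID that stays stable across frames.
from dataclasses import dataclass, field

@dataclass
class FrameLabel:
    frame: int    # frame index within the video
    box: tuple    # (x, y, width, height) in pixels
    state: str    # e.g. "standing", "walking", "running"

@dataclass
class PedestrianTrack:
    track_id: int                       # unique ID for this person
    labels: list = field(default_factory=list)

    def states(self):
        """Summarize which states occur anywhere in this track."""
        return sorted({lbl.state for lbl in self.labels})

track = PedestrianTrack(track_id=7)
track.labels.append(FrameLabel(frame=0, box=(12, 40, 35, 90), state="standing"))
track.labels.append(FrameLabel(frame=1, box=(14, 40, 35, 90), state="walking"))
# track.states() -> ["standing", "walking"]
```

Keeping the ID on the track rather than on each box is what lets downstream systems count unique pedestrians instead of raw detections.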

Smart offices and buildings

For companies that want to introduce smart office applications for a more interactive and energy-efficient environment, datasets with bounding boxes and action labels may be very relevant. This includes foot traffic monitoring in different spaces, movement and gesture recognition, property damage detection, and more.

Event monitoring

For large-scale events such as festivals or football games, AI can be used to provide accurate headcount estimations and action monitoring for large crowds. This can yield useful insights into attendee flows for optimization purposes, and it can also help identify risks in real time with a human-in-the-loop.

Weapon detection

Weapon detection in public spaces and institutions such as schools is a critical application that benefits from a human-in-the-loop approach as well. The key to success here is a combination of comprehensive real-life and synthetic datasets of scenarios in the monitored spaces and real-time monitoring by human operators.

Patient monitoring

Monitoring patients' safety and state at home, in hospitals, and in homes for the elderly is extremely important, because an undetected fall may cause serious injuries. This use case also requires comprehensively annotated training data coupled with a human-in-the-loop approach.

Access control and intrusion detection

Finally, AI systems can be used to replace or work together with traditional security guards and badge verification systems for access control. In addition to consensually obtained datasets for the identification of each individual, this use case needs adversarial examples for spoofing detection.

Tools we love

Surveillance video annotation requires versatile tools and platforms that can handle a large number of annotations per image and a high density of action labels and IDs. Below are our two favorite tools for this purpose:

CVAT is the easiest tool to get started with for video annotation purposes, and it supports dense bounding box annotations on each frame. It also offers simple reporting and quality management functionality.

With its video annotation suite, Supervisely enables users to annotate videos with a multi-track timeline. It even supports AI tracking of objects across frames without relying on keyframes for the interpolation.


How to use a human-in-the-loop for surveillance AI

Surveillance AI applications are some of the most high-risk and critical ones, so having high-quality training data and frequent iterations of model testing and improvement is crucial. Here are some of the ways in which humans can be plugged into the entire MLOps cycle to provide input and verification on a continuous basis:

  1. Ground truth annotation: to train your initial models, we offer full dataset annotation from scratch, including a pre-annotation anonymization step to preserve the privacy of the individuals who appear in the images. Our annotators can label both images and videos with bounding boxes, action detection tags, unique ID tracking across frames, and other requirements.
  2. Output validation with active learning: once you've trained an initial model, we can use it to pre-annotate a large part of the dataset, which both speeds up the annotators and increases the impact of their work, by setting up an active learning workflow that prioritizes the frames or videos where your model is least certain.
  3. Real-time edge case handling: once you have a model in deployment, our humans-in-the-loop are available 24/7 to monitor live video streams coming from your systems or to handle alerts whenever one is triggered, preventing alert fatigue for the end user. In this way, we provide a critical second layer of human verification for your model's most critical responses.
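The active learning prioritization described in step 2 is often implemented with uncertainty sampling. Here is a minimal sketch, assuming the model outputs per-frame class probabilities; the frame IDs and function names are hypothetical.

```python
# Sketch of uncertainty-based frame prioritization for active learning.
# Frames where the model's predicted distribution has the highest
# entropy (i.e. where it is least certain) are sent to annotators first.
import math

def entropy(probs):
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def prioritize_frames(frame_probs, budget):
    """frame_probs: dict frame_id -> list of class probabilities.
    Returns the `budget` most uncertain frame ids, most uncertain first."""
    ranked = sorted(frame_probs,
                    key=lambda f: entropy(frame_probs[f]),
                    reverse=True)
    return ranked[:budget]

preds = {
    "frame_001": [0.98, 0.01, 0.01],  # confident prediction
    "frame_002": [0.40, 0.35, 0.25],  # uncertain -> annotate first
    "frame_003": [0.70, 0.20, 0.10],
}
# prioritize_frames(preds, 1) -> ["frame_002"]
```

Spending the annotation budget on high-entropy frames typically improves the model faster than labeling frames at random, because the labels land where the model's decision boundary is weakest.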

Wondering who is annotating your data?

When you hire a company to help you with your annotation needs, you rarely, if ever, meet the workers who are labeling your data. We want to change this and present to you the inspiring stories of our annotators!


Does this sound like something you’d like to try out? Get in touch with us and we’d be happy to schedule a free trial so as to explore how we can best help you with your video annotation needs!

Get In Touch

We’re an award winning social enterprise powering the AI solutions of the future