Thermal video annotation for accurate surveillance despite concept drift

Success Story





Type of service


Video annotation

Platform used

Hasty


frames annotated


conflict-affected individuals in Syria provided with work

The client

Milestone Systems is a Denmark-based company with a presence in more than 25 countries and a leading provider of open platform video management software. Built on an open platform, their video management software enables integration with the industry’s widest choice of cameras.

Milestone provides smart video solutions for various applications, including smart cities, where it offers new solutions to old problems such as crime, congestion and pollution. With a focus on sustainability, urban mobility, and safety, they have to date partnered with 170 cities around the world to implement safe city solutions.

The challenge

One of Milestone’s recent research projects was to explore concept and data drift in surveillance applications. It is well known that once computer vision algorithms step outside the lab and are deployed in real-life outdoor applications, their performance tends to drop significantly as conditions change over time, i.e. concept and data drift. Existing datasets usually favor coverage of multiple locations for short periods of time and are not suitable for exploring the long-term effects of concept drift.
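One simple way to make the effect of concept drift concrete is to track a deployed model's accuracy over consecutive time windows: a sustained downward trend suggests conditions have drifted away from the training distribution. A minimal sketch (the window size and the data below are illustrative, not taken from the project):

```python
def windowed_accuracy(correct_flags, window=100):
    """Mean accuracy over consecutive windows of predictions.

    correct_flags: list of 0/1 flags, one per prediction, in time order.
    A sustained downward trend across windows is a symptom of drift.
    """
    return [
        sum(chunk) / len(chunk)
        for chunk in (correct_flags[i:i + window]
                      for i in range(0, len(correct_flags), window))
        if chunk
    ]

# Synthetic example: a model right 95% of the time early on, 60% later.
early = [1] * 95 + [0] * 5
late = [1] * 60 + [0] * 40
print(windowed_accuracy(early + late))  # → [0.95, 0.6]
```

In practice the flags would come from comparing model predictions against fresh ground-truth annotations, which is exactly why long-term annotated datasets like this one are needed.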

For this project, Milestone partnered with the Visual Analysis and Perception Lab (VAP) at Aalborg University as well as the Human Pose Recovery and Behavior Analysis group at the Computer Vision Center (CVC) at the Universitat Autònoma de Barcelona. Together, they set out to develop a novel real-world, long-term thermal video dataset for surveillance which spans different seasons and encompasses a wide range of weather conditions, human activity, and recurring cycles such as weekdays, weekends, mornings and evenings.

To fully understand the changing composition of the dataset, Milestone required comprehensive and precise annotations of each video in the dataset. This is when they decided to partner with Humans in the Loop.

Kamal Nasrollahi, Director of Research at Milestone Systems and Professor of Computer Vision and Machine Learning at Aalborg University, told us:

I was really impressed by Humans in the Loop’s commitment to deliver high-quality annotation. They take care of the details, educate the annotators about the importance of details for each specific case, and provide a verification step to make sure the annotations are as precise as possible.


Kamal Nasrollahi, Director of Research, Milestone Systems

The solution

Humans in the Loop worked with Milestone to define the full scope of the annotation task and to spot the edge cases and subtleties in the data which would make the analysis of the dataset more insightful. The annotation team annotated a total of 10,039 videos, each 120 seconds long. To minimize the labeling effort required, one frame per second was extracted from each clip, resulting in 120-frame videos.
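The 1-frame-per-second sampling can be sketched as an index calculation over the clip. The source does not state the camera's native frame rate, so the 30 fps used below is an illustrative assumption:

```python
def sample_frame_indices(duration_s, native_fps, target_fps=1.0):
    """Indices of the frames to keep when downsampling a clip to target_fps.

    E.g. a 120 s clip at an assumed native 30 fps, sampled at 1 fps,
    keeps 120 frames: indices 0, 30, 60, ...
    """
    step = native_fps / target_fps          # native frames per kept frame
    n_kept = int(duration_s * target_fps)   # 120 s at 1 fps -> 120 frames
    return [round(i * step) for i in range(n_kept)]

indices = sample_frame_indices(duration_s=120, native_fps=30)
print(len(indices), indices[:4])  # → 120 [0, 30, 60, 90]
```

In a real pipeline these indices would be fed to a video decoder to extract the corresponding frames for annotation.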

All videos came from a single camera view overlooking a harbor front in Aalborg, Denmark. Because the imagery in the dataset is thermal, the anonymity of the individuals who appear in it was preserved and no facial blurring was necessary.

During the scoping and trialing process, it was discovered that part of the harbor scene was too far from the camera: objects in that area were too crowded, blurry or pixelated to annotate reliably. In addition, such small boxes would have confounded the AI model rather than led to improvements, which is why Milestone and Humans in the Loop agreed to exclude that part of the scene for the benefit of both the annotators and the AI model.


Thermal video annotation of the street scene with bounding boxes

The annotation process involved video annotation with bounding boxes of three classes: humans, bicycles, and vehicles, with tracking across frames. However, once the team started performing the annotation, they realized that in addition to bicycles, motorcycles also appeared occasionally, and these weren’t part of the taxonomy. This was flagged to Milestone, and the scope was subsequently amended to include motorcycles and to re-annotate some of the data.
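A per-frame record for this kind of tracked bounding-box annotation might look like the following sketch. The field names, coordinate convention, and validation rules are assumptions for illustration, not the project's actual export format:

```python
from dataclasses import dataclass

# Final taxonomy after motorcycles were added mid-project.
LABELS = {"human", "bicycle", "vehicle", "motorcycle"}

@dataclass
class TrackedBox:
    frame: int      # frame index within the clip (0-119 at 1 fps)
    track_id: int   # stays constant for the same object across frames
    label: str      # one of LABELS
    x: float        # top-left corner, pixels (assumed convention)
    y: float
    w: float        # box width, pixels
    h: float        # box height, pixels

def validate(boxes, frames_per_clip=120):
    """Reject unknown classes and out-of-range frame indices."""
    for b in boxes:
        if b.label not in LABELS:
            raise ValueError(f"unknown class: {b.label!r}")
        if not 0 <= b.frame < frames_per_clip:
            raise ValueError(f"frame {b.frame} outside clip")
    return True

# A short two-frame track of one pedestrian (values are illustrative).
track = [
    TrackedBox(frame=0, track_id=7, label="human", x=412.0, y=221.0, w=18.0, h=42.0),
    TrackedBox(frame=1, track_id=7, label="human", x=415.5, y=220.0, w=18.0, h=42.0),
]
print(validate(track))  # → True
```

A validation step like this is one way a mid-project taxonomy change (such as adding motorcycles) surfaces immediately: any box labeled with a class outside the agreed set is rejected rather than silently accepted.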

The result

The resulting annotations generated by Humans in the Loop were combined with other metadata for each video, such as timestamps, temperature, humidity, and precipitation. This gave Milestone the necessary information to perform people tracking and anomaly detection and to evaluate the performance of selected AI models.
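Combining annotations with per-clip metadata amounts to a join on a shared clip identifier. A minimal sketch, in which all clip IDs, counts, and weather values are invented for illustration:

```python
# Per-clip annotation summaries and weather metadata, keyed by clip ID.
# All IDs and values here are illustrative, not taken from the dataset.
annotation_counts = {
    "clip_0001": {"human": 14, "bicycle": 3, "vehicle": 6, "motorcycle": 0},
}
clip_metadata = {
    "clip_0001": {"timestamp": "2020-05-14T08:00", "temp_c": 11.2,
                  "humidity": 0.78, "precip_mm": 0.0},
}

def join_clip_records(counts, metadata):
    """Merge the two dictionaries clip by clip into one flat record each."""
    return {
        clip_id: {**metadata.get(clip_id, {}), **clip_counts}
        for clip_id, clip_counts in counts.items()
    }

merged = join_clip_records(annotation_counts, clip_metadata)
print(merged["clip_0001"]["temp_c"], merged["clip_0001"]["human"])  # → 11.2 14
```

Records in this joined form are what make it possible to ask how weather variables relate to model performance on each clip.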

Video annotation of people and cars

Milestone found that person tracking and anomaly detection models exhibit performance degradation, with temperature and humidity influencing the models the most, followed by the change between day and night and the activity level of the scene. The researchers also proposed a new baseline algorithm for reducing the effects of concept drift and demonstrated that more diverse training data helps to mitigate it. The group published their findings in an article and also launched a Seasons in Drift Challenge for concept drift detection as part of the 2022 edition of the European Conference on Computer Vision (ECCV).

As a final step in the annotation process, Milestone participated in a workshop together with members of the annotation team. The workshop was facilitated by researchers at the Weizenbaum Institute at TU Berlin with the goal of exploring the documentation and processes used in the project and of co-designing a documentation framework between the annotators and the client.

Interested in implementing a human in the loop in your AI pipeline? Get in touch with us and we would be happy to have a call.