Thermal video annotation for accurate surveillance despite concept drift
Milestone Systems is a Denmark-based company with a presence in more than 25 countries and a leading provider of open platform video management software. Because the platform is open, its video management software integrates with the industry’s widest choice of cameras.
Milestone provides smart video solutions for applications such as smart cities, where it offers new solutions to old problems such as crime, congestion and pollution. With a focus on sustainability, urban mobility, and safety, it has partnered with 170 cities around the world to date to implement safe city solutions.
One of Milestone’s recent research projects explored concept and data drift in surveillance applications. It is well known that once computer vision algorithms leave the lab and are deployed in real-life outdoor applications, their performance tends to drop significantly because conditions change over time, i.e. concept and data drift. Existing datasets usually favor coverage of multiple locations for short periods of time and are not suitable for exploring the long-term effects of concept drift.
For this project, Milestone partnered with the Visual Analysis and Perception Lab (VAP) at Aalborg University as well as the Human Pose Recovery and Behavior Analysis group at the Computer Vision Center (CVC) at the Universitat Autònoma de Barcelona. Together, they set out to develop a novel real-world long-term thermal video dataset for surveillance which spans different seasons and encompasses a wide range of weather conditions, human activity, and recurring cycles such as weekdays, weekends, mornings and evenings.
To fully understand the changing composition of the dataset, Milestone required comprehensive and precise annotations of each video. This is when they decided to partner with Humans in the Loop.
“I was really impressed by Humans in the Loop’s commitment to deliver a high-quality annotation. They take care of the details, educate the annotators about the importance of details for each specific case, and provide a verification step to make sure the annotations are as precise as possible.”
Kamal Nasrollahi, Director of Research, Milestone Systems
Humans in the Loop worked with Milestone to define the full scope of the annotation task and to spot the edge cases and subtleties in the data which would make the analysis of the dataset more insightful. The annotation team annotated a total of 10,039 videos, each 120 seconds long. To minimize the labeling effort required, 1 frame per second was extracted from each clip, resulting in 120-frame videos.
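The arithmetic behind this sampling step can be sketched as follows. This is only an illustration: the source frame rate of the camera is not stated in this case study, so the value of 30 fps and the helper name `sampled_frame_indices` are assumptions.

```python
from typing import List


def sampled_frame_indices(duration_s: int, source_fps: int, target_fps: int = 1) -> List[int]:
    """Return the indices of the source frames kept when downsampling to target_fps."""
    if source_fps % target_fps != 0:
        raise ValueError("source_fps must be a multiple of target_fps in this simple sketch")
    step = source_fps // target_fps  # keep every `step`-th frame
    return list(range(0, duration_s * source_fps, step))


CLIP_SECONDS = 120      # from the case study
NUM_CLIPS = 10_039      # from the case study
ASSUMED_SOURCE_FPS = 30  # assumption: the real camera frame rate is not given

indices = sampled_frame_indices(CLIP_SECONDS, ASSUMED_SOURCE_FPS)
frames_per_clip = len(indices)
total_frames = NUM_CLIPS * frames_per_clip

print(frames_per_clip)  # 120 frames per clip at 1 frame per second
print(total_frames)     # 1204680 frames to annotate in total
```

Whatever the true source frame rate, sampling at 1 frame per second leaves 120 frames per clip, or roughly 1.2 million frames across the dataset.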
All videos came from a single camera view overlooking a harbor front in Aalborg, Denmark. Because the imagery is thermal, the individuals who appear in it remain anonymous, and no facial blurring was necessary.
During the scoping and trialing process, it was discovered that part of the harbor scene was too distant from the camera: objects in that area were too crowded, blurry or pixelated to annotate reliably. In addition, such small boxes would have confounded the AI model rather than improved it, which is why Milestone and Humans in the Loop agreed to exclude part of the scene for the benefit of both the annotators and the AI model.
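One way such an exclusion could be applied to bounding boxes is sketched below. The case study does not say how the excluded area was defined or enforced, so the rectangular region, the center-point rule, the coordinates, and the names `Box`, `Region` and `drop_excluded` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float
    h: float
    label: str


@dataclass(frozen=True)
class Region:
    x: float
    y: float
    w: float
    h: float


def center_in_region(box: Box, region: Region) -> bool:
    """True if the box center lies inside the region (edges inclusive)."""
    cx = box.x + box.w / 2
    cy = box.y + box.h / 2
    return (region.x <= cx <= region.x + region.w
            and region.y <= cy <= region.y + region.h)


def drop_excluded(boxes: List[Box], excluded: Region) -> List[Box]:
    """Keep only boxes whose center falls outside the excluded region."""
    return [b for b in boxes if not center_in_region(b, excluded)]


# Hypothetical numbers: a band at the top of the frame stands in for the distant area.
distant_area = Region(x=0, y=0, w=400, h=50)
boxes = [
    Box(x=100, y=20, w=6, h=10, label="human"),      # tiny, far away: dropped
    Box(x=150, y=200, w=30, h=60, label="vehicle"),  # close to the camera: kept
]
kept = drop_excluded(boxes, distant_area)
print([b.label for b in kept])
```

Testing the box center rather than any overlap keeps objects that merely touch the boundary of the excluded area; a real project might prefer a polygon mask or a minimum box size instead.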
The annotation process involved video annotation with bounding boxes of 3 classes: humans, bicycles, and vehicles, with tracking across the frames. However, once the team started annotating, they realized that motorcycles also appeared occasionally and were not part of the taxonomy. This was flagged to Milestone, the scope was amended to include motorcycles as well, and some of the data was re-annotated.
The annotations produced by Humans in the Loop were combined with other metadata for each video, such as timestamps, temperature, humidity, and precipitation. This gave Milestone the information needed to perform people tracking and anomaly detection and to evaluate the performance of select AI models.
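A minimal sketch of how annotations and per-video metadata might be combined is shown below. The record layout, the field names (`clip_id`, `temperature_c`, and so on) and all values are invented for illustration; the actual dataset format is not described in this case study.

```python
from typing import Dict, List


def attach_metadata(annotations: List[dict], metadata: Dict[str, dict]) -> List[dict]:
    """Merge each annotation with the metadata of the clip it belongs to.

    Annotations whose clip has no metadata entry are skipped.
    """
    merged = []
    for ann in annotations:
        clip_meta = metadata.get(ann["clip_id"])
        if clip_meta is None:
            continue
        merged.append({**ann, **clip_meta})
    return merged


# Illustrative records only; none of these values come from the real dataset.
annotations = [
    {"clip_id": "clip_0001", "frame": 0, "label": "human"},
    {"clip_id": "clip_0001", "frame": 1, "label": "bicycle"},
    {"clip_id": "clip_0002", "frame": 0, "label": "vehicle"},
]
metadata = {
    "clip_0001": {"timestamp": "2021-01-15T08:00:00", "temperature_c": 3.5,
                  "humidity_pct": 81.0, "precipitation_mm": 0.0},
    "clip_0002": {"timestamp": "2021-07-02T21:30:00", "temperature_c": 18.0,
                  "humidity_pct": 64.0, "precipitation_mm": 1.2},
}

records = attach_metadata(annotations, metadata)
for record in records:
    print(record["clip_id"], record["label"], record["temperature_c"])
```

With every annotation carrying its clip's weather and time information, model performance can then be grouped by conditions such as temperature or time of day.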
Milestone found that person tracking and anomaly detection models exhibit performance degradation under drift, with temperature and humidity influencing the models the most, followed by the change between day and night and the activity level of the scene. The researchers also proposed a new baseline algorithm for reducing the effects of concept drift and demonstrated that more diverse training data helps to mitigate it. The group published their findings in an article and also launched a Seasons in Drift Challenge for concept drift detection as part of the 2022 edition of the European Conference on Computer Vision (ECCV).
As a final step in the annotation process, Milestone participated in a workshop together with members of the annotation team. The workshop was facilitated by researchers at the Weizenbaum Institute at TU Berlin with the goal of exploring the documentation and processes used in the project and co-designing a documentation framework with the annotators and the client.