Here at Humans in the Loop, we have been publishing regular reviews of the best annotation tools on the market for a while now (you can find previous reviews here and here, as well as our complete “Tools we love” series). It’s exciting to see that the ecosystem is as dynamic as ever and annotation platforms are coming up with better and better features. All of our reviews are completely honest and based on our hands-on experience annotating thousands of images and videos for a variety of projects and use cases.
This year, we want to share our list of what we think are the 4 best tools for labeling and annotation (in no particular order)!
Similar to our previous lists, we have evaluated the tools with regard to the following parameters:
- Functions
- Automation
- Project management
Darwin, a tool by V7 Labs, is one of the most versatile and advanced tools for image and video annotation. This tool comes from the same company behind other exciting computer vision projects such as AI Poly and Autonomous Retail. In 2021, a Forbes article even named them among the top 25 machine learning startups to watch!
1. Functions: Darwin’s interface is exceptionally user-friendly and clean. It is a Swiss Army knife for annotation, supporting a variety of annotation formats as well as various file types including .mp4, .mov, .avi, .bmp, .jpg, .jpeg, .png, .svs, .tif, .tiff, and .dcm in both RGB and YBR color spaces. For larger collections, one can also import files using their command-line option. Projects can also be exported in various formats such as COCO, CVAT, PASCAL VOC, PNG masks, and Darwin’s own format in JSON or XML.
2. Automation: Darwin’s ‘Auto-annotate’ polygon and semantic segmentation tool is one of the best on the market because it is class-agnostic and can detect any area or object of interest. It can even be used in videos, where you can apply auto-annotation on keyframes and watch how the interpolated polygons on the in-between frames smoothly morph into each other. However, currently, this does not seem to work as well on darker areas or very complex or unclear objects. For those specific use cases, V7 can train a custom model for polygon detection with an AutoML engine, made to fine-tune Auto-annotate to new datasets. They also offer the possibility to train a neural network directly within the platform (currently only available for Instance Segmentation), and each training session tests hyper-parameters and image augmentation techniques to find the optimal solution.
3. Project management: Darwin allows for setting annotation and review stages where work can be assigned to specific project members. During the review stages, annotators and reviewers can leave comments within the tool to ask questions or bring up issues. Data can also be sorted and listed in order of priority, and progress can be monitored as well. User management supports roles such as Worker, Workforce Manager, and Admin, and the productivity and quality of each separate user can be monitored on a per-project basis. Finally, a nice feature is the possibility to upload labeling instructions that appear when a user opens a project and to have thumbnails and descriptions for each class, which is quite handy for annotator management!
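To make the keyframe interpolation mentioned under Automation more concrete, here is a minimal sketch of how polygons on in-between frames can be derived from two keyframes. It is not V7's implementation: it assumes plain linear interpolation and that both keyframe polygons have the same number of vertices in the same order. The function names `interpolate_polygon` and `fill_between_keyframes` are hypothetical.

```python
def interpolate_polygon(poly_a, poly_b, t):
    """Blend two polygons vertex by vertex; t=0 gives poly_a, t=1 gives poly_b.

    Assumption: both polygons list the same number of (x, y) vertices
    in corresponding order.
    """
    if len(poly_a) != len(poly_b):
        raise ValueError("keyframe polygons must have the same vertex count")
    return [
        ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
        for (xa, ya), (xb, yb) in zip(poly_a, poly_b)
    ]


def fill_between_keyframes(frame_a, poly_a, frame_b, poly_b):
    """Return {frame_index: polygon} for every frame strictly between two keyframes."""
    span = frame_b - frame_a
    return {
        frame: interpolate_polygon(poly_a, poly_b, (frame - frame_a) / span)
        for frame in range(frame_a + 1, frame_b)
    }
```

With keyframes on frames 0 and 4, `fill_between_keyframes` produces polygons for frames 1, 2 and 3. Real tools typically also handle polygons with differing vertex counts, which this sketch does not.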
Deepen was founded in 2017 and they identify as a safety-first data lifecycle and services company specialized in the self-driving cars & robotics industry. Along with their tool, Deepen AI, which will be reviewed here, they also have other offerings such as their newly released Deepen Calibrate, a tool which can be used to calibrate multi-sensor data seamlessly. Deepen are also a project leader in Safety Pool, a global incentive-based brokerage that works with Autonomous Driving Systems testing, validation & certification. Their tool, the Deepen AI Suite, offers a complete set of features to annotate and validate image, video, and sensor fusion data (camera, LiDAR, radar, and more).
1. Functions: Deepen is the only platform offering a 3D LiDAR interface for annotation and semantic segmentation on this list. When launching a workspace, one can delve into the contents of the datasets and easily upload, annotate or even edit the data within them. Deepen AI Suite supports a variety of formats such as images, videos and 3D point clouds, which can be annotated as individual frames or as a sequence. They offer bounding box, polygon, line, point, 3D bounding box, 3D polygon, 3D line, 3D painting, 3D instance painting and scene labelling annotation options. What really impressed us were their unique offerings such as the ability to see all labels from multiple frames simultaneously, which helps especially with their pre-labeling feature.
2. Automation: Deepen offers a one-click bounding box feature which automatically generates a highly accurate and adjustable bounding box in 3D! They also offer a unique ML-powered Visual Object Tracking feature which, instead of relying on interpolation, predicts the location of the object based on its first frames, with a handy slider function for when the object leaves the scene. They also have a brush feature that can be used to automatically generate a segmentation mask over certain objects such as roads with the click of a button when labeling LiDAR data. They will also soon be releasing their superpixel segmentation capability, which helps to auto-segment images and reduce annotation time massively. Finally, their pre-labeling feature AI Sense can automatically pre-label images (up to 80 different objects in images, 8 different objects in sequences, and cars, pedestrians and bicycles in 3D).
3. Project management: Dataset and dataset profile creation, pipeline configuration, task assignment, and managing the workers and assigning them groups are just a few examples of what Deepen offers in regard to project management. The dataset profiles are really useful because the configuration of a project can be reused for similar projects in the future. Additionally, for QA the tool offers an issue creation option for highlighting and discussing issues with labeling, and even algorithm-assisted automatic quality assurance. The only downside at the moment is that the labelers use the same user interface as the project managers. However, we have been told that creating a separate portal for the annotators is on the roadmap and should be added in the near future.
Heartex is a data labeling tool company based in San Francisco. The founders were summiting Stok Kangri in the Himalayas when they decided to create a world-class open source data labeling solution, and Label Studio is the result. What distinguishes this platform from other labeling products is the open source offering, driven and supported by a very active and engaged user community. The Label Studio Community Edition is ideal for small teams, while the paid Enterprise Edition adds features that let larger organizations scale their labeling operations. This review looks at Label Studio Enterprise Edition.
1. Functions: Heartex’s Label Studio Enterprise Edition is one of the most flexible and customizable data labeling tools out there. You can work with almost every type of data, such as audio, image, text, HTML, and HTML NER, and also import CSV or JSON files to process time series data and data hosted elsewhere. When you set up a project, you can configure the tool with the unique Labeling Config, featuring over 50 annotation templates that you can customize to give annotators an intuitive UI with exactly what they need to label the data. You can even have multiple images on one screen in the labeling interface! This sort of customization is an amazing and unique feature that you can experience in the Label Studio Community Edition Playground.
2. Automation: Label Studio Enterprise Edition has a variety of algorithm-driven automation features, including a pre-labeling option that can pre-label data based on an existing machine learning model. They also offer a feature to improve machine learning model results, providing a continuous model improvement loop where corrections to the pre-labeled data can help an algorithm improve for future pre-labeling tasks. Label Studio can even calculate what to prioritize to adjust the future model prediction. Additionally, they offer a really cool feature in active learning mode, where Label Studio can determine a representative item from a selection. Finally, there is a new feature coming soon called weak supervision, which the founders assure us will be a game-changer for labeling and annotating large datasets. With this feature, the user just has to input a set of rules to guide the automated labeling process.
3. Project management: When it comes to project management, Label Studio Enterprise Edition offers an easy project management interface, with training and data all in one place as well as secure cloud hosting. It’s straightforward to create teams, add members to your team, and invite annotators to join particular projects. It also includes a way for you to control the order in which data is sampled for labeling. As a control, a project manager can put together a set of labeled data to treat as “Ground Truths” for the project. This data can then be used to compare how accurately a machine learning model or annotators label the items in a dataset. You can configure the acceptable level of agreement to your specifications. There are also well-presented ways to keep track of the quality assurance process and of the metrics and statistics of a given project.
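The rule-based weak supervision idea mentioned under Automation can be illustrated with a small sketch. This is not Label Studio's API or its upcoming feature; it only shows the general technique under simple assumptions: each rule either proposes a label or abstains, and the final label is a majority vote over the rules that fired. The rules, label names and the `weak_label` function are hypothetical.

```python
from collections import Counter

ABSTAIN = None  # a rule returns this when it has no opinion


def rule_refund(text):
    return "billing" if "refund" in text.lower() else ABSTAIN


def rule_invoice(text):
    return "billing" if "invoice" in text.lower() else ABSTAIN


def rule_password(text):
    return "account" if "password" in text.lower() else ABSTAIN


RULES = [rule_refund, rule_invoice, rule_password]


def weak_label(text, rules=RULES):
    """Apply every rule and return the most common non-abstaining vote.

    Returns ABSTAIN when no rule fires; ties go to the label voted first.
    """
    votes = [vote for vote in (rule(text) for rule in rules) if vote is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]
```

Items that no rule covers stay unlabeled and can be routed to human annotators, which is where a tool's manual labeling interface would come back in.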
Alegion is an Austin-based company founded in 2014 that offers a labeling platform which is most notable for the fact that it specializes in video annotation. Currently the image and text labeling versions of the platform are available as a managed service but they recently launched Alegion Control: a self-serve version of their video labeling platform where videos can be processed in 4k, which reduces labeling time and supports complex and large-scale collaborations.
1. Functions: Alegion can be used for annotation of video, image and language datasets as well as for general-purpose tasks. They offer precise object localization, object classification, attribute assignment, scene classification, instance recognition, object localization and classification, semantic segmentation, as well as instance segmentation, to name just a few options. The platform supports both 2D and 3D labeling and their video labeling solution is cutting edge. It also provides complex taxonomy support and nested entity relationship classifications. Access to multi-stage and conditional workflows is also available for higher tiers.
2. Automation: The automation features of the platform include their automated selection tool, their ML pre-labeling option and their object tracking in videos, which can be done in 4k! Additionally, they are using DEXTR in their SmartPoly feature, which allows labelers to create polygons and masks based on four clicks on an object’s outline. Their object tracking feature can track an object throughout a video based on just a couple of frames, using actual object tracking instead of simple linear interpolation. However, we would recommend annotating additional frames for increased accuracy.
3. Project management: When it comes to project management, Alegion has it covered. They offer “polymerized” workflows that ensure an efficient and accurate process using human or ML consensus by comparing annotators to each other or to the predictions of an ML model. The workflows may also include pre-labeling or any type of external API using a Lambda. Stages can even be chained together using conditional logic to create multi-stage workflows. In terms of QC, supervisors have the ability to score labeled files, which gets reflected on each labeler’s metrics. Additionally, annotator groups can be assigned based on defined criteria, which enables annotator teams to become specialized in specific tasks or stages of the labeling process.
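As an illustration of consensus checks with conditional routing, here is a minimal sketch that compares an annotator's bounding box to a reference box (from another annotator or an ML model) and decides the next stage. It is not Alegion's implementation: the intersection-over-union (IoU) metric, the 0.5 threshold, the stage names and the function names are all assumptions chosen for the example. Boxes are given as `(x_min, y_min, x_max, y_max)`.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def next_stage(annotator_box, reference_box, threshold=0.5):
    """Route to 'accept' when the boxes agree well enough, otherwise to 'review'."""
    return "accept" if iou(annotator_box, reference_box) >= threshold else "review"
```

For example, two 10×10 boxes that overlap by half have an IoU of one third, so `next_stage((0, 0, 10, 10), (5, 0, 15, 10))` sends the item to review under the assumed threshold.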
Hope this was helpful! If you are working on an AI project and are currently reviewing which tool might be the most appropriate for it, get in touch with us and we would be happy to have a call and advise you on the best way to build your pipeline.