How to efficiently onboard thousands of annotators to ensure consistent quality annotations
The development of perception systems for autonomous vehicles depends on humans to provide interpretations of the world, known as annotations, from which the algorithms learn. A major challenge is that humans interpret what they see differently, based on their own experiences and opinions. For the data we produce to be useful, we need to ensure consistency in the annotations. This is critical for building safe autonomous vehicles that can reliably interpret the world around them.
Kognic provides consistent, high-quality labeled data for safe autonomous vehicles. In the past few years, we have grown into a global team of thousands of annotators. We understand the importance of human input for machine learning applications, and the key is a platform, methods, and processes that enable high-quality production at this scale. This is a cornerstone of the Kognic platform and services.
To the untrained eye, data annotation may seem like a straightforward task. In reality, it is very complex. The digital world consists of ones and zeros and leaves little room for subjectivity, yet for humans, subjective reasoning is a big part of our consciousness and personality.
[Image: raw road scene referenced in the freespace example below]
Data annotation is a process in which tens, hundreds, or thousands of annotators label images according to a set of instructions. Problems arise when annotators label the same task differently and inconsistently. For example, if annotators are asked to annotate the drivable road, also known as freespace, in the image above, they may interpret it subjectively: some may perceive the area in front of the fence as freespace, while others with driving experience may not. Such inconsistencies result from differences in annotators' skills, experience, and understanding of the guidelines. Inconsistent annotations are a risk to your computer vision systems and can lead to disastrous results.
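To make "consistency" concrete, here is a minimal sketch (an illustration in Python with NumPy, not part of the Kognic platform) that compares two annotators' freespace masks for the same image using intersection-over-union: full agreement gives 1.0, and the disputed area in front of the fence pulls the score down.

```python
import numpy as np

def freespace_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union between two boolean freespace masks."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection / union) if union else 1.0

# Toy masks: annotator B also marks part of the area in front of the
# fence as freespace, annotator A does not.
mask_a = np.zeros((4, 8), dtype=bool)
mask_a[2:, :] = True                  # A: only the road itself
mask_b = mask_a.copy()
mask_b[1, :4] = True                  # B: adds the area in front of the fence
print(f"agreement (IoU): {freespace_iou(mask_a, mask_b):.2f}")  # 0.80
```

Aggregating a score like this across many annotators and tasks is one way to detect where guidelines leave room for subjective interpretation.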
How does Kognic ensure that our annotation team annotates consistently?
Kognic has a carefully designed process for onboarding annotators, called the Kognic Annotator Academy. This process identifies which skills, experience, and understanding of the task an annotator needs to be suitable for an annotation project. The findings are then translated into ways to educate, train, evaluate, and select suitable annotators. Only the annotators who pass the examination step and become certified Kognic annotators are selected to work on the task, ensuring the highest quality and fastest annotations. The Kognic Annotator Academy is based on three distinct steps:
- Information sharing
- Training annotators
- Examination and selection
1. Information sharing
The annotation teams are provided with functional instructions explaining the annotation tool and substantive instructions describing how to annotate the tasks. Our quality managers (QMs) are annotation experts, well trained to communicate these instructions effectively to annotators. At this step, we do not rely solely on the written guidelines to educate the annotators; we also share other essential information, such as annotation best practices, possible edge cases, and example images or videos of good annotations. This creates a shared understanding of quality among the annotators and reduces the risk of inconsistencies.
2. Training annotators
Training annotators is an essential part of the onboarding to make certain that the annotators produce consistent and reliable annotations. Our data science team strategically selects specific annotation tasks from the project, covering varying degrees of complexity and edge cases. These tasks are used to train the annotators to reach the required quality and speed levels. The length of the training phase depends on the complexity of the project, the annotator's experience, and their understanding of the guidelines.
During the training period, annotators are encouraged to ask questions if they are uncertain about how to apply the project guidelines or use the tool's functionality. Annotation speed and quality are analyzed, and real-time feedback is provided so that annotators can produce more consistent and reliable annotations.
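As a rough illustration of the bookkeeping such a feedback loop requires (the class, metrics, and thresholds below are hypothetical, not Kognic's actual tooling), one could track each trainee's quality and speed per training task and derive feedback from the running averages:

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class TraineeRecord:
    quality_scores: list[float] = field(default_factory=list)  # e.g. IoU vs. a reference annotation
    task_minutes: list[float] = field(default_factory=list)    # time spent per task

    def log_task(self, quality: float, minutes: float) -> None:
        self.quality_scores.append(quality)
        self.task_minutes.append(minutes)

    def feedback(self, min_quality: float = 0.9, max_minutes: float = 15.0) -> str:
        # Quality comes first; speed is only addressed once quality is on target.
        if mean(self.quality_scores) < min_quality:
            return "Review the guideline sections for the cases you missed."
        if mean(self.task_minutes) > max_minutes:
            return "Quality is on target; focus on annotation speed next."
        return "On track for the examination step."

trainee = TraineeRecord()
trainee.log_task(quality=0.87, minutes=12.0)
trainee.log_task(quality=0.92, minutes=10.5)
print(trainee.feedback())
```

The point of the sketch is the ordering of the feedback: trainees are first steered toward consistent quality, and only then toward speed.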
3. Examination and selection
Once the annotators have received the required information and training, a final examination is conducted to certify and select the most suitable annotators for the project. Annotators are given specific examination tasks used to measure and analyze each individual's performance, with continuous assessment of both quality and speed.
The annotators who pass the examination tasks are certified as Kognic annotators and selected to work on the project's tasks.
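A simplified view of that certification decision (the thresholds are illustrative; real projects set their own quality and speed targets) might look like this:

```python
def is_certified(avg_quality: float, avg_minutes_per_task: float,
                 min_quality: float = 0.95, max_minutes: float = 12.0) -> bool:
    """Pass the examination only if both quality and speed targets are met."""
    return avg_quality >= min_quality and avg_minutes_per_task <= max_minutes

print(is_certified(avg_quality=0.97, avg_minutes_per_task=10.5))  # True
print(is_certified(avg_quality=0.91, avg_minutes_per_task=9.0))   # False: fast but below the quality bar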
Conclusion
Human-assisted annotation is an essential part of machine learning applications; autonomous systems are not possible without annotated data. The quality and reliability of machine learning applications depend on the quality and reliability of the annotated data sets. While Kognic continues to develop a world-class annotation platform, we recognize the importance of effective annotator onboarding to obtain consistent, reliable, and timely annotated data.