Human-AI Collaboration for Data Annotation | Northeastern University Human-Centered Artificial Intelligence Lab (NEU HAI)

Data annotation represents a significant proportion of domain experts’ daily work. Doctors make diagnoses based on lab test results, and judges and jurors reach judgments based on facts and law articles; all of this work is a form of data annotation. Although LLMs show strong capabilities across many tasks and domains, they do not perform well on real-world tasks because they lack domain knowledge. Human ground-truth labels are therefore still needed to improve predictions. Another unsolved problem with LLMs is data privacy: in many domains, such as medicine, you cannot call a third-party API to run inference. To address these problems, we develop an AI-assisted data annotation platform that allows domain experts to annotate data samples and get real-time model suggestions from Active Learning models. In this way, we can minimize the annotation cost while providing high-quality outputs.
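To make the idea of real-time suggestions from an Active Learning model more concrete, here is a minimal sketch of one annotation round. It assumes uncertainty sampling with entropy as the query strategy, which is a common choice but is not confirmed to be what the platform uses. The names `entropy` and `suggest_and_rank`, the sample IDs, and the probabilities are all hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def suggest_and_rank(predictions, labels, budget):
    """Hypothetical helper: one round of uncertainty sampling.

    predictions: dict mapping sample id -> list of class probabilities
    labels:      class names, in the same order as the probabilities
    budget:      number of samples to send to the expert this round
    Returns (suggestions, queue).
    """
    # Model suggestion shown to the expert: the most probable label.
    suggestions = {
        sid: labels[max(range(len(probs)), key=probs.__getitem__)]
        for sid, probs in predictions.items()
    }
    # Samples the model is least sure about go to the expert first.
    ranked = sorted(predictions, key=lambda sid: entropy(predictions[sid]),
                    reverse=True)
    return suggestions, ranked[:budget]


# Made-up predictions from a locally hosted model (no third-party API call).
predictions = {
    "note-1": [0.95, 0.05],
    "note-2": [0.55, 0.45],
    "note-3": [0.80, 0.20],
}
suggestions, queue = suggest_and_rank(
    predictions, ["positive", "negative"], budget=2
)
print(queue)                  # ['note-2', 'note-3']
print(suggestions["note-2"])  # positive
```

In a full loop, the expert would confirm or correct the suggested labels for the queued samples, the model would be retrained on the enlarged labeled set, and the ranking would be recomputed. Spending the expert's time on the most uncertain samples is what keeps the annotation cost low.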