Using AI to Optimise the Calculation of Krippendorff’s Alpha
The Experiment
At the beginning of the year, we ran a small experiment at work. We hired four annotators and had them rate 900 sentences (the details are not important). To decide whether the inter-annotator agreement was significant, we calculated (among other measures) Krippendorff’s alpha coefficient.
I’d used Perl for everything else in the project, so I reached for Perl to calculate the alpha as well. I couldn’t find a module for it on CPAN, so I wrote one: I read the Wikipedia page and implemented the formulas.
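For readers unfamiliar with the coefficient: Krippendorff’s alpha is defined as 1 minus the ratio of observed to expected disagreement. Below is a minimal sketch for nominal data only, written in Python for illustration (the project itself used Perl, and this is not the module’s actual code); it assumes each unit’s ratings are given as a plain list of values.

```python
from collections import defaultdict

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: a list of lists; each inner list holds the values assigned
    to one unit (e.g. a sentence) by its raters. Units with fewer than
    two ratings carry no agreement information and are skipped.
    """
    # Coincidence matrix o[(c, k)]: for every ordered pair of values
    # given to the same unit by different raters, add 1/(m - 1),
    # where m is the number of ratings for that unit.
    o = defaultdict(float)
    for values in units:
        m = len(values)
        if m < 2:
            continue
        for i, c in enumerate(values):
            for j, k in enumerate(values):
                if i != j:
                    o[(c, k)] += 1.0 / (m - 1)

    # Marginals n_c (row sums) and grand total n.
    n_c = defaultdict(float)
    for (c, _k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())

    # Observed disagreement: coincidences of unequal values.
    d_o = sum(w for (c, k), w in o.items() if c != k)
    # Expected disagreement, estimated from the value marginals.
    d_e = sum(n_c[c] * n_c[k] for c in n_c for k in n_c if c != k) / (n - 1)
    return 1.0 - d_o / d_e
```

Perfect within-unit agreement yields an alpha of 1, and agreement at chance level yields 0. Note the nested loop over all pairs of ratings per unit: this is fine for a few hundred sentences, but it hints at where a naive implementation can get expensive on larger data.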
The Real Data
The experiment was promising, so we got additional funding. We hired three more annotators, and a few months later, another nine, bringing the number of raters to 16. So far, they’ve rated about 200K sentences. Each sentence has been annotated by at least two annotators (usually three).
One day, I decided to calculate the inter-annotator agreement for the new data. To my surprise, the calculation took more than six hours.