Using AI to Optimise the Calculation of Krippendorff’s Alpha

The Experiment

At the beginning of the year, we ran a small experiment at work. We hired four annotators and let them rate 900 sentences (the details are not important). To decide whether the inter-annotator agreement was significant, we calculated (among other measures) Krippendorff’s alpha coefficient.

I’d used Perl for everything else in the project, so I reached for Perl to calculate the alpha, as well. I hadn’t found any module for it on CPAN, so I wrote one: I read the Wikipedia page and implemented the formulas.
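For readers who haven’t met the coefficient, here is a minimal sketch of the calculation in Python (not the Perl module mentioned above). It assumes the ratings are nominal, which the text doesn’t state, and the function name krippendorff_alpha_nominal is made up for this illustration. It builds the coincidence matrix described on the Wikipedia page and returns 1 - D_o/D_e.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal ratings (assumed scale).

    `units` is a list of lists; each inner list holds the ratings one
    item received. Missing ratings are simply left out.
    """
    coincidences = Counter()  # o[c, k]: weighted count of value pairs
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # an item rated only once cannot show (dis)agreement
        # Every ordered pair of ratings from two different raters.
        for c, k in permutations(ratings, 2):
            coincidences[c, k] += 1 / (m - 1)

    totals = Counter()  # n[c]: marginal total of value c
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    # Nominal distance: 0 for equal values, 1 for different ones.
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k]
                   for c in totals for k in totals if c != k)
    if expected == 0:
        raise ValueError("alpha is undefined when only one value was used")
    return 1 - (n - 1) * observed / expected
```

For example, three items rated by two raters as (a, a), (b, b) and (a, b) give 1 - 5 * 2 / 18, i.e. about 0.444, while perfect agreement gives 1.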

The Real Data

The experiment was promising, so we got additional funding. We hired three more annotators, and a few months later, another nine. This increased the number of raters to 16. So far, they’ve rated about 200K sentences. Each sentence has been annotated by at least two annotators (usually three).

One day, I decided to calculate the inter-annotator agreement for the new data. To my surprise, the calculation took more than 6 hours.

Strong Password

The Weekly Challenge 287, Task 1

You are given a string, $str.
Write a program to return the minimum number of steps required to make the given string a strong password. If it is already strong, return 0.
Criteria:
  • It must have at least 6 characters.
  • It must contain at least one lowercase letter, at least one uppercase letter and at least one digit.
  • It shouldn’t contain 3 repeating characters in a row.
The following can be considered as one step:
  • Insert one character;
  • Delete one character;
  • Replace one character with another.
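The three criteria translate directly into a check. The following is a small Python sketch; the name is_strong is chosen for this illustration, and it assumes "letter" and "digit" mean ASCII characters.

```python
import re


def is_strong(password):
    """Return True if the password meets all three criteria."""
    return (
        len(password) >= 6                               # at least 6 characters
        and re.search(r"[a-z]", password) is not None    # a lowercase letter
        and re.search(r"[A-Z]", password) is not None    # an uppercase letter
        and re.search(r"[0-9]", password) is not None    # a digit
        # No character repeated three times in a row.
        and re.search(r"(.)\1\1", password, re.DOTALL) is None
    )
```

With this check, "PaaSW0rd" passes, while "Paaasw0rd" fails because of the three consecutive a’s.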

A Simplification

To make the algorithm simpler, let’s ignore deletion. Instead of deleting a character, we can always replace it with a character different to the original one and its neighbours (you can easily verify that it can’t break any of the three criteria: it doesn’t shorten the password, it doesn’t remove more characters than the deletion would have deleted, and it never creates repeating characters).

The Algorithm

Let’s keep a set of strings we need to check; we’ll call it the agenda. At the start of the program, the agenda contains the input string.
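The text only describes the initial state of the agenda, so what follows is one possible way to process it, not necessarily the author’s: a breadth-first search in Python that checks every string in the agenda and, if none is strong, replaces the agenda with all the strings one insertion or replacement away. It relies on the simplification above (no deletions). The nine-character ALPHABET and the names is_strong, neighbours and min_steps are assumptions made for this sketch; three characters per class are enough to always pick one that differs from both neighbours.

```python
import re

# Assumption: inserting or replacing with these characters is enough,
# because each class offers a character different from both neighbours.
ALPHABET = "abcABC012"


def is_strong(password):
    """Check the three criteria from the task."""
    return (
        len(password) >= 6
        and re.search(r"[a-z]", password) is not None
        and re.search(r"[A-Z]", password) is not None
        and re.search(r"[0-9]", password) is not None
        and re.search(r"(.)\1\1", password, re.DOTALL) is None
    )


def neighbours(string):
    """Yield every string one insertion or one replacement away."""
    for i in range(len(string) + 1):
        for char in ALPHABET:
            yield string[:i] + char + string[i:]              # insert
            if i < len(string) and char != string[i]:
                yield string[:i] + char + string[i + 1:]      # replace


def min_steps(password):
    """Breadth-first search: the agenda holds all strings `steps` away."""
    agenda = {password}
    seen = {password}
    steps = 0
    while True:
        if any(is_strong(candidate) for candidate in agenda):
            return steps
        agenda = {n for s in agenda for n in neighbours(s)} - seen
        seen |= agenda
        steps += 1
```

The search is exhaustive, so it is only practical for inputs that need a few steps: the agenda grows roughly a hundredfold with each additional step.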

Equalise an Array

The Weekly Challenge 270/2

In week 270, the second task was really interesting and difficult. Here’s a slightly reformulated version:

We’re given an array of positive integers @ints and two additional integers, $x and $y. We can apply any sequence of the following two operations:
  1. Increment one element of @ints.
  2. Increment two elements of @ints.
The cost of each application of operation 1 is $x; the cost of operation 2 is $y. What’s the minimal cost of a sequence of operations that makes all the elements of @ints equal?
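To make the task concrete, here is a brute-force reference in Python. It is not the algorithm the post goes on to develop; it is a plain Dijkstra search over array states, useful only for checking small inputs. It assumes that operation 2 increments two different elements by one each, that the array is non-empty and that both costs are positive. The name min_cost is made up for this sketch.

```python
import heapq


def min_cost(ints, x, y):
    """Cheapest way to equalise `ints` (exhaustive search, small inputs only)."""

    def normalise(values):
        # Only the differences matter, so shift the minimum to 0 and sort.
        low = min(values)
        return tuple(sorted(v - low for v in values))

    start = normalise(ints)
    best = {start: 0}
    queue = [(0, start)]
    size = len(start)
    # Operation 1 touches one index, operation 2 touches two different ones.
    moves = [(x, (i,)) for i in range(size)]
    moves += [(y, (i, j)) for i in range(size) for j in range(i + 1, size)]

    while queue:
        cost, state = heapq.heappop(queue)
        if cost > best[state]:
            continue  # stale queue entry
        if state[-1] == 0:
            return cost  # minimum and maximum are both 0: all equal
        for price, indices in moves:
            changed = list(state)
            for i in indices:
                changed[i] += 1
            changed = normalise(changed)
            if cost + price < best.get(changed, float("inf")):
                best[changed] = cost + price
                heapq.heappush(queue, (cost + price, changed))
```

For @ints = (4, 1) with $x = 3 and $y = 2, operation 2 never helps, so the answer is three applications of operation 1, i.e. 9.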

Why do I say it was difficult? I compared all the Perl and Raku solutions I could find in the GitHub repository and none of them gave the same results as mine. It took me several days to find an algorithm that would answer the tricky inputs I generated with pen and paper, and one more day to optimise it to find the solutions in a reasonable time.

Changes in MooX::Role::Parameterized

What is it good for?

If you’ve never worked with MooX::Role::Parameterized or MooseX::Role::Parameterized, you might wonder what a parameterized role is at all.

Roles are used when you need to share behaviour among several classes that don’t have to be related by inheritance. Normally, a role just adds a bunch of methods to the class that consumes it (there’s more to it: for example, you can specify which other methods the role expects to already exist).

A parameterized role makes it possible to provide parameters for the consumed role. This way, you can adjust the behaviour for each consuming class.

Step Counter (Advent of Code 2023/21)

The Task

We’re given a grid with obstacles and we’re supposed to count all the plots in the grid that can be reached in a given number of steps (we can only move one plot at a time, horizontally or vertically).

The sample input looks like this:

...........
.....###.#.
.###.##..#.
..#.#...#..
....#.#....
.##..S####.
.##..#...#.
.......##..
.##.#.####.
.##..##.##.
...........

where S is the starting position.
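The task as stated can be solved naively by tracking the set of plots occupied after each step. Below is a minimal Python sketch under two assumptions: # marks an obstacle, and "in a given number of steps" means exactly that many steps, so stepping back and forth is allowed. The names SAMPLE and count_reachable are made up for this illustration.

```python
SAMPLE = """\
...........
.....###.#.
.###.##..#.
..#.#...#..
....#.#....
.##..S####.
.##..#...#.
.......##..
.##.#.####.
.##..##.##.
...........
"""


def count_reachable(grid_text, steps):
    """Count the plots we can stand on after exactly `steps` steps."""
    grid = grid_text.split()
    start = next((r, c)
                 for r, row in enumerate(grid)
                 for c, char in enumerate(row)
                 if char == "S")
    positions = {start}
    for _ in range(steps):
        # Move every current position one plot in each of the 4 directions,
        # keeping only the plots inside the grid that are not obstacles.
        positions = {
            (r + dr, c + dc)
            for r, c in positions
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < len(grid)
            and 0 <= c + dc < len(grid[0])
            and grid[r + dr][c + dc] != "#"
        }
    return len(positions)
```

On the sample grid, count_reachable(SAMPLE, 6) returns 16.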