Set::Jaccard::SimilarityCoefficient v0.5.1

Set::Jaccard::SimilarityCoefficient lets you calculate the Jaccard Similarity Coefficient for either arrayrefs or Set::Scalar objects.

Briefly, the Jaccard Similarity Coefficient is a simple measure of how similar two sets are. The calculation is (in pseudo-code):

count(intersection(SET-A, SET-B)) / count(union(SET-A, SET-B))
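The coefficient is the number of elements the two sets share divided by the number of distinct elements across both. As a minimal sketch of that arithmetic using only core Perl (the function name jaccard_sketch is hypothetical and is not the module's API, and returning undef for two empty sets is an assumption of this sketch, not documented module behavior):

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; NOT the module's API.
# Takes two arrayrefs and returns their Jaccard coefficient.
sub jaccard_sketch {
    my ($set_a, $set_b) = @_;

    # Deduplicate each input by using its elements as hash keys.
    my %in_a = map { $_ => 1 } @{$set_a};
    my %in_b = map { $_ => 1 } @{$set_b};

    # grep in scalar context counts the elements present in both sets.
    my $intersection = grep { exists $in_b{$_} } keys %in_a;

    # |A union B| = |A| + |B| - |A intersect B|
    my $union = scalar(keys %in_a) + scalar(keys %in_b) - $intersection;

    # Assumption of this sketch: two empty sets have no defined coefficient.
    return undef if $union == 0;

    return $intersection / $union;
}

# Sets {a, b, c} and {b, c, d} share 2 of 4 distinct elements.
my $score = jaccard_sketch([qw(a b c)], [qw(b c d)]);
print "$score\n";    # prints 0.5
```

Identical sets score 1 and disjoint sets score 0, so the result always falls between those two values.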

There is already a Jaccard Similarity Coefficient routine on CPAN, but it is specialized for use by Text::NSP. I wanted a generic routine that anyone could use, so Set::Jaccard::SimilarityCoefficient was born.

The minimum Perl version for Set::Jaccard::SimilarityCoefficient is currently set to 5.8.8, but that needs to be revised because the module uses autodie and Test::Most, which is causing many CPAN Testers failures.


About Mark Leighton Fisher

Perl/CPAN user since 1992.