New Lingua::Identify::CLD
As if I did not have enough modules to take care of already, I just started a new one. It is still in beta, as I have not had much time to test it or to write a decent API. It is available in the usual place: https://metacpan.org/release/AMBS/Lingua-Identify-CLD-0.01_01
This is an interface to a Google library for language detection. As far as I understand, it is part of the Chrome browser and was only recently released as open source. Details here: http://code.google.com/p/chromium-compact-language-detector/
It is available on GitHub, and I am happy to receive issues or pull requests. Just bear in mind that no API is defined yet (although I have an idea of what I want) and that I do not have much time to solve your issues right away.
Finally, thanks to Jean Véronis, who pointed me to the library and kindly asked for a Perl interface to it.
I had the following problems with it:
Using cpanm, the dependency on ExtUtils::LibBuilder was not followed for some reason, although it seems to be listed in Build.PL.
I installed the module from the GitHub source code and was able to build and test it.
However, after it was installed manually from the source code, the shared library would not load:
I cannot see in the build script Build.PL where the library is even installed.
Hello, Ben
That is completely my fault: I forgot to write the code to install the library :) oops!
Regarding ExtUtils::LibBuilder, I can add it to the configure section as well, and hope cpanm installs it :)
Thanks for the feedback. Look for a second devel version, probably this year :)
Version 0.01_02 is available on CPAN (or should be in a few minutes). It now installs and works (at least, it worked here). If not, please poke me :)
Hi,
From the source I see that the library also has encoding information. Can it guess the encoding, or does it only use or transform encodings for the purpose of language detection?
If it can guess, it would be cool to compare it with Encode::Detect::Detector, which uses Gecko's library to detect encodings when they are not provided. If it cannot, then the two will work smoothly together.
Anyway, good to have this around. Thanks.
Hey, Ruslan.
I do not know (yet) if it detects encodings. :) But I'll try to find out ;)
Cheers