Apart from the modules mentioned here, there is extensive parsing of the user agent string in the program "awstats", although I am not sure how modular it is (I don't know how easy it would be to use that in another program).
The main thing I would be interested in from a module like this would be detecting if a user agent was a robot or not.
I'll have a look at awstats, thanks.
I'll extend my comparison to include checking whether an agent is a robot/crawler. Will take a little while, as I'll have to update my test corpus.
Thanks for putting this together. I was surprised at how HTTP::BrowserDetect stacks up against the others in your breakdown. It will be 12 years old in February (according to BackPAN), so it may well be the crustiest of all of them. There are some design decisions from over a decade ago which are still firmly entrenched in the code.
It can be a lot of work to stay up to date with UA strings, so I would think everyone would stand to benefit from some sort of pooling of efforts.
Hi Neil
Congratulations. I can see a huge effort has gone into this.
If you have the time, I'd like to see this released as a module under the Benchmark::Featureset::* namespace.
I had a look at awstats. The code is inline, not a separate module, alas.
Your robots comment was a good nudge, as one module is clearly the best on this front.
Ah, I was about to advise you not to look at the awstats code. It is some of the most insane Perl code I've ever seen in my life.
Thanks for the great article. It was very timely, as I was just trying to find a better mobile user-agent detection package recently. I have been using HTTP::BrowserDetect, which I find pretty good, and it is actively being updated. After I read your article, I tried the others on your list, but I ended up staying with HTTP::BrowserDetect because my app needs robot and mobile detection, and this module is by far the best in those categories at this point. It's still not perfect, but it's much better than anything I've found so far. I'd love to find a more reliable way of detecting robots and mobile browsers (as well as browser type, though that turns out to be third in importance for my needs).
Thanks again, and please post updates.
By the way, for mobile detection, I experimented with the 3 most promising modules for this purpose. I wasn't as diligent as you, but I did some experiments on some of our traffic that's mostly mobile: HTTP::BrowserDetect detected over 90% of unique user-agents as mobile, which isn't bad at all; HTTP::MobileAgent detected less than 5% (it may work well with Japanese carriers, but it doesn't seem very useful for general mobile detection); HTTP::DetectUserAgent detected none as mobile (or robot). I didn't try Mobile::UserAgent because it hasn't been updated since 2005, so it's unlikely to be useful at this point.
HTTP::BrowserDetect is also the best I've found for robot detection, as your section on robot detection also indicates. If anyone knows a more reliable robot/crawler detection scheme, I'd love to know about it because this is another important requirement I tend to have in traffic analysis.
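For reference, here's a minimal sketch of how I'm calling it (assuming HTTP::BrowserDetect is installed; the user-agent strings below are just illustrative placeholders, not pulled from my logs):

    use strict;
    use warnings;
    use HTTP::BrowserDetect;

    # Placeholder user-agent strings, one mobile browser and one crawler
    my @user_agents = (
        'Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 Mobile/9A334 Safari/7534.48.3',
        'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    );

    for my $string (@user_agents) {
        my $ua = HTTP::BrowserDetect->new($string);

        # robot() is true for known crawlers, mobile() for mobile browsers
        printf "%s\n  robot: %s, mobile: %s\n",
            $string,
            ( $ua->robot  ? 'yes' : 'no' ),
            ( $ua->mobile ? 'yes' : 'no' );
    }

Since robot() and mobile() just return true or false per user agent, filtering a traffic corpus is basically a one-liner per record.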
Thanks again!
@Al, please feel free to get in touch with me about improving robot detection in HTTP::BrowserDetect. It is far from perfect, but I do my best to keep up with user contributions etc. You can open issues at https://github.com/oalders/http-browserdetect/ or contact me directly. My contact info is at https://metacpan.org/author/OALDERS
Thank you for this comprehensive and well-written review; I have found it most useful. You may have set a standard here.
Thank you. This is very useful information for me :-)
The executive summary mentions that HTTP::DetectUserAgent is recommended if robots are important.
The conclusion, however, seems to recommend HTTP::BrowserDetect.
Is this a mistake?
Thanks tangent.
Thanks Anders, that was a mistake, which I've now corrected. I keep getting mixed up between modules with similar names.
Any chance you’ll revisit this to include HTTP::UA::Parser in the line-up?
Aristotle: definitely. I'm working on a new review at the moment, but then plan to do updates, as a number of my reviews have modules that need adding. You could +1 my play-perl quest for this :-)