Do your piece to fix TIOBE or stop talking about it
Many people talk about TIOBE and how it's bad, or irrelevant, or broken, or many other vague descriptors of why it should be ignored.
All people talking about TIOBE miss one crucial point: It is software, it has an algorithm, and it is not "bad", it is buggy. That means it can be fixed.
So either fix it, or stop talking about it.
Here's why you can fix it
The TIOBE algorithm is to search for "[language] programming" on a number of search engines, then apply a weight to the resulting count, based on the search engine, and sum the results up to get a score.
They do this because C is a letter, Ruby is a stone, Python is a snake (or a comedy troupe). If they searched only for those words they'd get a lot of trash results. Adding "programming" means they will mostly get results that contain the phrases "[language] programming" or "[language] programming language". The latter is especially important: people talking about the other languages often add "programming language" to disambiguate, while Perl developers have no reason to do so (since the word is mostly unique) and simply don't, thus removing themselves from what TIOBE can find.
Here are the top four search engines and their weights:
- Google: 28%
- Blogger: 28%
- Wikipedia: 13%
- YouTube: 7%
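As a rough illustration of the weighting step described above, here is a minimal Python sketch. It covers only the four engines listed, whose weights add up to 76%, and it takes the description literally (weight times raw count, summed). The function name is made up for illustration; this is not TIOBE's actual code.

```python
# Weights for the four engines quoted above. They sum to 76%, so the
# remaining engines are simply left out of this sketch.
WEIGHTS = {"Google": 0.28, "Blogger": 0.28, "Wikipedia": 0.13, "YouTube": 0.07}


def weighted_score(hit_counts, weights=WEIGHTS):
    """Sum each engine's hit count multiplied by that engine's weight.

    Engines missing from hit_counts are treated as contributing zero hits.
    """
    return sum(weights[engine] * hit_counts.get(engine, 0) for engine in weights)
```

Calling `weighted_score({"Wikipedia": 179})` would count 13% of those 179 hits and nothing else, since engines without a count contribute zero.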
What is notable? All of these sites either point to other content, or are directly editable. I'll go on with Wikipedia, as that's the easiest example to use. Here are some search terms and their current result counts:
search term | results
perl | 5194
perl programming | 179
python programming | 289
python | 7239
By comparing the list of results for "perl" and "perl programming" you'll find that many perfectly fine results simply won't show up for the latter and thus won't be visible to TIOBE either.
It's not that the talk about Perl isn't there, the problem is that TIOBE can't see it. And that can be changed.
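To put a number on how little of that talk is visible, the Wikipedia counts above can be turned into a share. This is only a back-of-the-envelope sketch: it assumes every "[language] programming" result is also counted under the bare name, and the bare-name counts include pages about snakes and comedy troupes, so the shares are rough. The function name is just for illustration.

```python
# Wikipedia result counts quoted in the list above.
COUNTS = {
    "perl": 5194,
    "perl programming": 179,
    "python": 7239,
    "python programming": 289,
}


def visible_share(language, counts=COUNTS):
    """Fraction of the bare-name results that also match '<language> programming'."""
    return counts[f"{language} programming"] / counts[language]


for language in ("perl", "python"):
    # Prints "perl: 3.4%" and then "python: 4.0%"
    print(f"{language}: {visible_share(language):.1%}")
```

By this rough measure, only about one in thirty Wikipedia results for "perl" is something TIOBE's query can see.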
Here's how you can fix it
The solution is simple: The string "perl programming language" needs to be added to results that are valid for Perl, but don't currently contain it and as such remain invisible to TIOBE. Once TIOBE can actually see the talk about Perl, it will at least stop being a source of false bad press for Perl.
- Google: go through this list and ask site owners to update their site.
- Blogger: go through this list and ask blog owners to update their blog or entries.
- Wikipedia: go through this list and update the page where appropriate.
- YouTube: go through this list and ask uploaders to update their video descriptions.
Lastly, don't forget: if you own or create any of the kinds of content that appear in the result lists above, make sure you mention "perl programming language" as well.
Updated my own perl stuff.
*points to the footer of this ’ere page*
(Of every single page on this ’ere site, in fact.)
https://github.com/blogs-perl-org/blogs.perl.org/pull/197
:)
Oh that was you! The first I saw of this was the internal discussion about it (“would this be scummy or are we fine with it?”, which unanimously came down on the side of “fine” after a moment’s consideration), not the pull request itself – so I didn’t make the connection. Plus, for some reason my ability to connect nicks and real names has been highly inconsistent in recent times.
Anyway – thanks for prompting this!
I bet my name for this blog doesn't help either. :D
But yeah, i did ask to have that done and also prompted a similar change on perlmonks, after i did a lightning talk on this topic at the German Perl Workshop 2012 in Berlin. A recent post here simply prompted me to put it in blog post form. :)
What about search.cpan.org and metacpan? I don't see even "Perl" on their pages. No meta-keywords, no meta-description.
And of course module authors don't put "written in perl programming language" into .pod (for obvious reasons).
Good point. We should ask the maintainers of the respective sites to change that. I'm not sure where to start with sco though, but a pull req for metacpan should be easy.
Given that you never actually talked about it, that comment is fairly disingenuous. I'd like to think that we, as a community, are above that kind of thing.
I haven't cared about TIOBE and don't normally talk about it. That said, this seems to be convincing logic and I will do my best to try to help. In the meantime, is there any chance that TIOBE would listen to this logic and allow "Perl" to work in place of "Perl programming" since it's not ambiguous? Probably not, but you never know.
Wouldn't it be better to try to get TIOBE to treat "perl" as a synonym for "perl programming?" It would be a special case for them but it's easier (fix in one place vs. update the entire Internet), more robust (you'll never get everyone to update their content) and forward-facing (handles new content by people who don't follow the "rule"). Ultimately, it should be TIOBE's problem if their algorithm doesn't handle the way people discuss the language. This problem wouldn't seem to be unique to Perl, either. I wouldn't expect "PL/SQL programming" to be as common as just "PL/SQL."
@Joel
@Michael
I honestly don't know how they'd stand on that. In the past they seem to have had a negative stance towards Perl in general, so i'm not getting my hopes up. Would one of you be willing to try and talk to them?
I've heard the "negative stance towards Perl" comment a number of times, but I've not seen evidence of this. Can you provide a reference?
@Joel
@Michael
perl 5194
perl programming 179
python programming 289
python 7239
If they count "perl" results, it will get a roughly 18 times higher index than "python programming" (5194 vs. 289).
Thing is, Python (I mean the programming language) is also used as a sole word, without "programming".
I'd love to, but every time they update their index they simply delete the old postings. All i remember dimly is them gloating quite a bit about Perl's "fall" in the past.
They have a list of exceptions for search queries.
http://www.tiobe.com/index.php/content/paperinfo/tpci/tpci_definition.htm
(for example, ABC is a language unless it's "ABC tv")
and grouping (Awk = Gawk).
I wonder if Perl5 (or even Perl6?) could be added as an alias for Perl?
also:
> Artifacts or ideas on improving the calculation of the TIOBE index will be received with gratitude (tpci@tiobe.com).
Craig Treptow on twitter found a post that shows their stance on Perl: https://twitter.com/CraigTreptow/status/365524850044968960
I don't think that having "perl programming" on every page that talks about Perl will solve the main problems. I don't think this will suddenly increase the number of people looking for Perl.
BUT
I do think that TIOBE has an impact on perception and the comments of RickTick and then of mithaldu on Reddit convinced me how the lack of "perl programming" on the Perl sites lies to TIOBE and any other organization who checks these numbers.
"Of the first 20 hits for Perl, 17 are actual hits on Perl. Meanwhile for the 20 Python results, 5 are actually about the programming language."
BTW You could also try "perl -programming" and "python -programming" to see which pages could be updated.
So I added "Programming Perl" to all the pages on the Perl programming weekly, and on the Perl Programming Maven sites. That's about 400 pages.
Now we only need Google to re-index those pages and to see them as "important".
Thanks for updating your site.
You're right, having that won't change Google trends, but as you realized, this is about perception. Imagine a headline of "Perl spikes on TIOBE, deathsayers wrong!".
Also, the links under "How to fix it", already link to searches that show pages that need to be updated. :)
@Mithaldu, those links lead to google.de, which shows various non-programming Perl sites in German....
Don't hold your breath for such a title.
Nor for any "spike".
Since then I also updated the two Perl Mongers sites I maintain and sent out a call to all the Perl Monger admins to use their web assets wisely.
Oh and to further brag a bit I also added perl programming tags to all the interviews with perl programmers.
BTW I think it would also be important that people who update their sites with the magic phrase mention it here.
Others, even if they don't have web sites, could then share those links on Google+, further encouraging Google to take those pages seriously and maybe even reindex them sooner.
A couple more issues, as I don't really understand the numbers.
I just searched on Google:
perl programming is higher here than python programming, but if Google really makes up 28% of the TIOBE index, then either they see different numbers than I did, or Python outweighs Perl by a lot in the other searches.
Let's see YouTube:
There you go. On YouTube Python Programming is more than 10 times bigger than Perl.
I have no idea how to search on Blogger.com, but I found search.blogger.com that redirected to Google with the "Blog" tab lit up.
I think these are the corresponding searches:
I don't understand this. There are 3 times more hits for perl programming there than for python programming, so if this is 28% of the weight, then I think the 10-times lead on YouTube with its 7% weight should not have such a big impact.
Very strange.
Gabor, thanks for pointing out the google links, i updated them to point at .com.
Also, your searches are flawed. The search term is "[lang] programming", including the quotes. All your searches leave the quotes out and don't reflect what TIOBE sees.
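For anyone who wants to reproduce such a search, here is a small Python sketch of building the quoted term. The exact-phrase quotes are the important part; the base URL and the `q` parameter name are assumptions for illustration and may not match what TIOBE itself requests.

```python
from urllib.parse import urlencode


def quoted_term(language):
    """Return the exact-phrase search term, including the double quotes."""
    return f'"{language} programming"'


def search_url(language, base_url="https://www.google.com/search"):
    """Build a search URL for the quoted term (base_url and 'q' are assumptions)."""
    query = urlencode({"q": quoted_term(language)})
    return f"{base_url}?{query}"


# Prints https://www.google.com/search?q=%22perl+programming%22
print(search_url("perl"))
```

The `%22` at both ends of the query is the URL-encoded double quote; a search without it matches pages containing the two words anywhere, which is not the same result set.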
Let's roll with what you say. Are these the searches they use?
Google:
Google blogs:
YouTube
Exactly.
I think the basic problem with TIOBE is that their concept is wrong. They are using a proxy for interest, a proxy that doesn't really mean what they say it does, and then making conclusions from it. Much like the perlmonks' CB stats. Except there I'm quite explicit that the stats are invalid.
TIOBE isn't doing proper sampling to extrapolate anything. Their core methodology is simply flawed. Adding "programming" after the word "Perl" doesn't fix it. It simply does some SEO to game the system.
If TIOBE wanted to reach relevance, they would have to tweak their algorithms to fit reality, to reduce the number of legitimate pages skipped and yet minimise the number of unrelated pages counted. In a way that handles all languages, not just the ones whose communities have rallied around it.
Right now, the strongest conclusion that can be drawn from their numbers is, "oh, that's interesting." Anyone who draws stronger conclusions is putting far too much trust in the system and doesn't understand enough statistical analysis to comprehend what the numbers really mean (i.e., nothing).
Tanktalus, while you're correct, it seems you also missed the simple reality that newbies and managers, both groups of people highly important to the survival of a language, do think TIOBE is relevant and not only draw conclusions from it, but even make decisions based on it. As such we can't afford to ignore it.
Now, if you would like to fix it, well volunteered! Please go and contact TIOBE. Please do report here when you do so and report back with results. (I hope you came here to do something and not just talk about TIOBE. ;))
In the meantime i'll work on the path of SEO to affect their data sets.
perlenespanol.com forum, from 2120 results, to 3250, and going up.
I'm curious if we can use the <abbr> tag or title attribute to tag the word Perl with "Perl programming language" and still have it be indexed. That way we can use Perl and still have the text flow naturally.
I've added a quest stencil on Questhub: update 5 pages as per Mithaldu's directions. I've just kicked it off by taking 5 easy wins on wikipedia :-)
Mithaldu, I do wish you had just let sleeping dogs lie and forgotten all about TIOBE. If we try to game TIOBE, then so will the user communities of other programming languages, and then we are going to start an arms race.
In any case, since I am unhappy with TIOBE being taken for granted, without close reconsideration, I decided to do my part in educating people against it using my newly created anti-TIOBE page (which is just another page in my section of pages against bad software). Since this was inspired by this post and its publicity, I guess I should thank you for it.
@zaki: I don't know, but giving it a try can't hurt. :)
@neilb: Thanks, i'll have to publicize that in a future post.
@shlomi: The fact that you think an arms race can take place here means you've missed some facts about TIOBE. Please do poke me on IRC if you'd like me to elaborate. Also, all your links only explain why TIOBE is bad, which is correct because it's faulty, but miss that TIOBE matters despite being bad. Again something i'll happily explain to you on IRC.
Still as needed as ever...