Unfortunately, the idea contained a fatal flaw. See the following post for explanations.
Once upon a time I faced a huge pile of HTML files which I had to analyze. Say, there were about 1 000 000 of them. Say, 100 GB of data.
Most of you would say “It’s not that much!”. And you are right. It’s not.
But then I decided to estimate the time required to process that pile of files. I quickly put together XPaths for what I needed and got a prototype in Web::Scraper. And here we go: ~0.94s per file, I/O overhead not included. That came to roughly 11 days on my laptop. Phew!
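The estimate is plain arithmetic. Here is a minimal sketch in Python that takes the post's rounded figures as given (about 1 000 000 files, ~0.94s each) and assumes strictly sequential processing with no I/O cost. The function name is only illustrative.

```python
# Back-of-the-envelope estimate using the rounded figures from the post.
FILES = 1_000_000          # "about 1 000 000" HTML files
SECONDS_PER_FILE = 0.94    # prototype time per file, I/O not included
SECONDS_PER_DAY = 86_400


def estimate_days(files: int, seconds_per_file: float) -> float:
    """Return the total sequential processing time in days."""
    return files * seconds_per_file / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{estimate_days(FILES, SECONDS_PER_FILE):.1f} days")
```

With these inputs the pure processing time lands close to 11 days. Any I/O overhead would come on top of that.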
There are only a few more days left to chip in to sponsoring work on Pinto. If you're still unconvinced or haven't thought about it yet, let me give my point of view on why you should spare a few minutes and a few bucks to sponsor Pinto.
My experiment to crowd fund Jeff Thalhammer's Pinto development is going well. It's 87% of the way there. We need $503 to reach the campaign minimum. We have a week left to get that remaining 13% to get the campaign to "tilt", and I think we can get even more than that. Our secondary money goal is $5,000, all of which goes to Jeff to work on open source features of Pinto. I like $6425 (two perfect squares next to each other). That's 0b0001100100011001 (repeats the bit pattern) or 0x1919 (repeated, and the same prime next to itself).
On Monday Sean Quinlan became the 100th contributor. We have a week left to get 128 contributors. Part of the experiment is to get as many people involved as we can, at any level. I don't care how much you donate: a $1 donation is just as good as $100 when we are counting contributors.
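For the curious, the number trivia about $6425 can be checked mechanically. This is a small Python sketch. The helper names are made up for illustration, and "the same prime next to itself" is read here as the digit pair 19 taken as a decimal number.

```python
import math

TARGET = 6425  # the dollar figure mentioned in the post


def is_perfect_square(n: int) -> bool:
    root = math.isqrt(n)
    return root * root == n


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def describe(n: int) -> dict:
    """Collect the properties the post points out for a four-digit number."""
    digits = str(n)
    hex_digits = format(n, "04x")   # four hex digits, zero-padded
    bits = format(n, "016b")        # sixteen bits, zero-padded
    return {
        # 64 and 25 are both perfect squares
        "decimal_halves_are_squares": is_perfect_square(int(digits[:2]))
        and is_perfect_square(int(digits[2:])),
        "hex": hex_digits,
        "hex_halves_equal": hex_digits[:2] == hex_digits[2:],
        # the digit pair "19", read as a decimal number, is prime
        "hex_half_reads_as_prime": is_prime(int(hex_digits[:2])),
        "bits": bits,
        "bit_halves_equal": bits[:8] == bits[8:],
    }


if __name__ == "__main__":
    for key, value in describe(TARGET).items():
        print(f"{key}: {value}")
```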
Padre, the Perl IDE, is the work of a number of people with the goal of creating an IDE written in Perl itself.
According to the Release History page, Padre 0.98 has finally been released, 1 year and 1 week after 0.96.
This is a long time between releases. In part this can be put down to me as the Release Manager. As things go, we all have interests and busy times in our lives that can take us away from projects that we give up our free time to contribute to. For me, it's been a case of discovering photography. So instead of looking at code I'm looking at images I have taken.
This has meant that, try as I might, I never focused on Padre enough to get the new version out the door.
For several years I've been using a shell script that runs once a day from a cron job, and automagically subscribes me to the RSS feeds that rt.cpan creates for my modules. That means that whenever I release a new module, within a day or two I'll be subscribed to its bug reports, and within another day or so I'll start getting bug reports automatically emailed to me.
At some point RT got upgraded and my script broke. When I became aware of it, I fixed it, and I also put it on GitHub. I hope you find it useful.
To run it you will need to set the RTUSER and RTPASS environment variables and possibly edit the variables at the top of the script. I assume that you use rss2email for reading RSS feeds. Use of any other tool for RSS is a bug, but if you wish to be buggy I'm sure you can work around that.
"First we ask, what impact will our algorithm have on the parsing
done in production compilers for existing programming languages?
The answer is, practically none." -- Jay Earley's Ph.D thesis, p. 122.
In the above quote, the inventor of the Earley parsing
algorithm poses a question.
Is his algorithm fast enough for a production compiler? His answer is a
stark "no".
This is the verdict on Earley's that you often
hear repeated today, 45 years later.
Earley's, it is said, has too high a "constant factor".
Verdicts tend to be repeated more often than examined.
This particular verdict originates with the inventor himself.
So perhaps it is not astonishing
that many treat the dismissal
of Earley's on grounds of speed to be as valid today as it
was in 1968.
But in the past 45 years,
computer technology has changed beyond recognition
and researchers
have made several significant improvements to Earley's.
It is time to reopen this case.
I'm on IRC just about all the time (my handle is "thaljef"). But I thought it might be interesting to actually schedule a session and invite people to come in and ask questions about Pinto, suggest a feature, report a bug, or just say "Hi".
So there will be two one-hour jam sessions in the #pinto channel on irc.perl.org this Thursday, May 2. The first will be at 14:00 and the second will be at 18:00 (all times GMT). If you haven't used IRC before, this is an excellent guide.
Welcome to Perl 5 Porters Monthly, a summary of the email traffic of the
perl5-porters email list. This is the last monthly catch-up. I am planning
to do weekly summaries for the week starting April 29, 2013. (But the road
to hell is paved with yada yada yada...)
The plugin simply fetches a random title and link from the front page of Reddit's TodayILearned subreddit.
I used WWW::Shorten::Simple to return bitly links, and Mojo::JSON to decode the response from Reddit's API.
Example:
curtis: !TIL
ircbot: curtis: TIL that the fighter squadron with the highest number of kills in the Battle of Britain during WWII were actually from Poland, and showed up two months after the battle had begun. http://bit.ly/ZN0qdB
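The fetch-decode-pick flow can be sketched roughly as follows. This is Python rather than the plugin's actual Perl, and it is only an illustration: the listing layout (data, children, data, title/url) is an assumption about Reddit's JSON, the sample payload is invented, and both the network fetch and the link shortening are left out.

```python
import json
import random

# Hypothetical sample shaped like a Reddit listing response (assumed layout:
# data -> children -> data -> title/url). This is not real API output.
SAMPLE_LISTING = json.dumps({
    "data": {
        "children": [
            {"data": {"title": "TIL example fact one", "url": "http://example.com/1"}},
            {"data": {"title": "TIL example fact two", "url": "http://example.com/2"}},
        ]
    }
})


def pick_random_til(listing_json, rng=random):
    """Decode a listing and return (title, url) of one randomly chosen post."""
    listing = json.loads(listing_json)
    posts = [child["data"] for child in listing["data"]["children"]]
    if not posts:
        raise ValueError("listing contains no posts")
    post = rng.choice(posts)
    return post["title"], post["url"]


def format_reply(nick, title, url):
    """Build the bot's reply in the 'nick: title link' shape of the example."""
    return f"{nick}: {title} {url}"


if __name__ == "__main__":
    title, url = pick_random_til(SAMPLE_LISTING)
    print(format_reply("curtis", title, url))
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which is handy when testing a bot command.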
Hannover.pm is organising the 16th German Perl Workshop 2014 ( GPW 2014 ) in Hanover.
An official act.yapc.eu website is currently in the making and will be published in early June. Give us some time to understand and fully configure its back end.
The gpw2014 will take place from March 26th to 28th 2014 (Wednesday to Friday). The CeBIT will take place from March 11th to 15th, the Hannover Messe will take place from April 7th to 11th. We're smack-dab in the middle of those two big fairs, but hotel rooms will be affordable during that week.
I'll blog about the gpw2014 at least every month to keep you informed. But please also have a look at the official act website for major news.
If you like to chat you can join the IRC channel #gpw (#gpw2014 is for the organisers) on irc.perl.org.
Because building KDE takes hours, and you won't need it other than for cachegrind.
But there's a Qt variant coming with kcachegrind, called qcachegrind.
Maybe ports wants to use this variant. Or not, because kdelibs3 is listed as a dependency.
Description: KCachegrind visualizes traces generated by profiling, including a tree map and a call
graph visualization of the calls happening. It's designed to be fast for very large
programs like KDE applications.
Homepage: http://kcachegrind.sourceforge.net/
Library Dependencies: kdelibs3
Platforms: darwin
License: unknown
Maintainers: nomaintainer@macports.org
My first job was as a bus conductor, and my second one was as a student trainee in an engineering company - proper engineering, with production lines, big machines, hot things, and "danger of death" notices on equipment. In both of these, safety was an important concern, and especially in the second one it was drilled in to me that safety and quality are closely related and arise from systems, not merely from individual endeavour. While I never completed my degree in manufacturing/systems engineering (I dropped out because I was fed up after too many years in the classroom) I still retain an interest in the subject.
I recently came across the excellent Disastercast podcast by Drew Rae. Of particular interest to programmers is the sixth episode, which looks at the report into a fatal rail crash caused by a poor safety and testing culture.
1: Right man. I was there when it started. At the German Perl Workshop this March in Berlin, Richard ignited a lot of controversy with his unofficial keynote. All that was said wasn't new, or was IMO just opinion or chatter and not relevant. Later I spoke with him at the social meeting in the computer game museum. (Seriously, is there a better place for such an event?)
During our conversation I found out: he listens to people, he really loves Perl, and he's the right kind of person to do that, with the right set of experience. Even if I don't share some of his views and considerations on what is important.
Stratopan is a new service for hosting custom repositories of Perl modules in the cloud. Private beta trials will begin early this summer. If you'd like to participate in the trials, please stop by https://stratopan.com and leave us your email address. We'll contact you with all the details when the trials begin.
Stratopan will host both public and private repositories with any combination of proprietary and open source Perl modules. And Stratopan is built on Pinto, the open source tool for creating custom CPAN-like repositories, so it has the same helpful tools for managing your application dependencies.