[UPDATED] Beginning Perl For Bioinformatics
One of my favorite tweeps asked for some up-to-date resources to help her teach Perl to her university's Biology students. "Biology students?", you ask. Yes! If you've mostly used Perl for Web development and systems support, you might be surprised to learn that Perl is huge in the bioinformatics domain. Hell, perlgeeks played a crucial role in cracking the human genome. SCIENCE!
Anyway, since I'm neither all-knowing nor all-seeing (shocking, right?) I figured I'd put the list of resources up here so others could chime in.
- The BioPerl Wiki - Tons of information here, including a section specifically for BioPerl beginners and a series of tutorials.
- Beginning Perl For Bioinformatics - While I haven't read it, the table of contents for this O'Reilly title seems like just the thing for biologists looking to get started with Perl.
- chromatic's excellent Modern Perl book. Even though it is not tuned specifically to the programming beginner, it is an excellent resource for learning to write 21st-century Perl code. The fact that there are free PDF and ePub versions of the book makes it a must-have for students on a budget.
A constant stream of new users is how Perl gets smarter and continues to thrive. If you know of other good beginning Perl introductions, please drop them in a comment and I'll add 'em to this list. (note: If you link to a good intro that also focuses on bioinfo for its code examples, I'll buy you a beer at the next YAPC::NA. :-)
Update 10/18/2011
Dr. Louise Johnson, the aforementioned biology prof whose tweet prompted this post, has graciously provided a list of resources that other perlgeeks have suggested for her students:
- O'Reilly's Learning Perl (the classic Llama Book) was recommended several times. That's what I used, and I still refer back to it. I found the end-of-chapter practice exercises particularly useful.
- Keith Bradnam (@kbradnam on Twitter) runs a Perl course for biologists at UC Davis and is currently writing a book, which sounds excellent. Some course materials are available here: http://korflab.ucdavis.edu/Unix_and_Perl/ and Keith also pointed us toward this: http://www.perl.com/pub/2000/06/27/perlbook.html which is among the many helpful resources available at the Perl site itself.
- Casey Bergman (@bergmanlab), computational biologist at the University of Manchester and all-round fab person, recommended this book chapter: http://onlinelibrary.wiley.com/doi/10.1002/0471223921.ch17/summary
- Wiley were kind enough to send me a copy of 'Perl Programming for Biologists' by Curtis Jamison, which is now in the hands of one of the students. I will ask him to report back! From a brief flick-through, it is clearly written and has exercises at the end of each chapter. The focus is on sequence analysis.
- Stephen Henstridge (@HenstridgeSJ) thought 'Perl By Example' by Ellie Quigley would be suitable as it has a focus on learning by doing, which sounds very appropriate for someone who isn't all that interested in the theory. I'm pestering the library to get hold of a copy.
Thanks for the update, Doc!
A former colleague of mine wrote an introduction to Perl for biologists at the University of Göttingen (Germany). Unfortunately it is only available in German, but it is the best I have seen so far: http://www.tcrass.de/en/wissen/informatik.html
> Beginning Perl For Bioinformatics
I read part of that book when I visited Gene Campus in Cambridge and sadly have to report that it's pretty bad as a Perl book. It actively encourages mixing HTML generation with code and shows how to build HTML using the various functions from CGI.pm. I'd only suggest it to people who already have a FIRM footing in Perl, can recognize bad patterns, and just want to find out how to connect bio with Perl.
Incidentally, BioPerl is one of the flat out worst distributions on CPAN. I've been meaning to find time to dig in and refactor it for a long while.
Modern Perl should really be first on that list.
Fair enough, but consider the audience: these are people who, while smart, know absolutely nothing about programming. It's more important, then, to offer introductions that address problems they are already familiar with, so they have at least some conceptual handle to grab onto. For you and me, things like separation of concerns (not mixing HTML and code) are a big deal; to someone who is just learning what a for() loop is? Not so much. Small, early (perhaps messy) successes matter more.
Wow, people still use German? Who knew? ;->
I'm kidding, of course. Thanks for the link.
Speaking as the maintainer of BioPerl, I partially agree (and gladly welcome any help with the code!). If you can specify exactly what you find terrible, then maybe we can focus on tackling those areas first.
Scientific programming is by its nature a messy enterprise compared to the more production-oriented commercial projects that most programmers are used to. On the other hand, I saw a book on text mining with Perl that, while useful on the scientific aspects of NLP programming, encouraged some pretty awful Perl practices. I think it's the job of us Perl-literate scientific programmers to provide exemplars of good practice in our fields wherever possible.