The Perl Toolchain: developing your module

This is the second in a series of blog posts about the Perl toolchain and the collection of tools and modules around it that are central to the CPAN we have today. In the first post we introduced PAUSE and CPAN, and outlined how you release a module onto CPAN, and how someone else might find it and install it. In this article we're going to cover what comes before the release: creating, developing, and testing a module for release to CPAN.

This post is brought to you by ActiveState, who we're happy to announce are a gold sponsor for the QA Hackathon this year. ActiveState are long-term supporters of Perl, and shipped the first Perl distro for Windows, ActivePerl.

What does a distribution look like?

If you've got a module Foo::Bar, it will be released to CPAN (via PAUSE, remember) in a Foo-Bar distribution. The following shows the directory structure for a release of this distribution with version 1.01. This is what you might get if you downloaded Foo-Bar-1.01.tar.gz from CPAN and unpacked it:

    Foo-Bar-1.01/
        Changes
        LICENSE
        MANIFEST
        META.yml
        Makefile.PL
        README
        lib/
            Foo/
                Bar.pm
        t/

Here's what those files and directories are:

  • Changes: a text file that lists the changes in each release, usually listed from most recent to oldest. This can be any format you like, but the most common format is described in CPAN::Changes::Spec.
  • LICENSE: the text of the open source license under which this distribution is released. You can maintain this yourself, but it's easier to rely on Software::License to generate it for you.
  • MANIFEST: a list of all the files in this release. In the old days you'd maintain this by hand; these days you can use ExtUtils::Manifest, or one of the tools that builds on it, to generate it.
  • META.yml: metadata for the distribution, such as what external modules it relies on, the minimum version of Perl required, etc. More recent releases will often have a META.json file as well. The specification for the content of these files is CPAN::Meta::Spec.
  • Makefile.PL: used to generate a Makefile (usually via ExtUtils::MakeMaker), which can then be used to build, test, and install the distribution. Some distributions may also/instead have a Build.PL file (typically built on Module::Build).
  • README: a short text file describing what's in the distribution. Sometimes this will be a Markdown file, README.md, instead.
  • lib/: this directory contains all the modules that will be installed.
  • t/: this directory contains the tests for the distribution. More on tests below.

These are the things you'll find in a typical pure-Perl distribution. For XS modules, you'll also find a .xs file, header file(s), and various other things. XS is beyond the scope of this series.

You can learn more about the files and directories in a distribution in this blog post.
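
To make the Makefile.PL entry in the list above a bit more concrete, here is a minimal sketch of what one might contain for the Foo-Bar example. The author details, license choice, and version numbers are placeholders I've assumed for illustration, not something your distribution has to copy.

```perl
# Makefile.PL: a minimal sketch for the example Foo-Bar distribution.
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME             => 'Foo::Bar',
    VERSION_FROM     => 'lib/Foo/Bar.pm',   # read $VERSION from the module
    ABSTRACT_FROM    => 'lib/Foo/Bar.pm',   # read the abstract from its POD
    AUTHOR           => 'A. N. Author <author@example.com>',  # placeholder
    LICENSE          => 'perl_5',           # assumed: same terms as Perl
    MIN_PERL_VERSION => '5.006',            # assumed minimum Perl
    PREREQ_PM        => {},                 # runtime dependencies go here
    TEST_REQUIRES    => { 'Test::More' => '0.88' },  # 0.88 added done_testing
);
```

Running perl Makefile.PL in the distribution directory writes the Makefile; the prerequisites and minimum Perl version given here are also what end up in the generated META files.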

Creating a new distribution

You've had an idea for a new module, which you're pretty sure you'll want to release to CPAN. How do you create the right directory structure?

One approach is to find a similar distribution on CPAN and copy it, changing things to refer to your module. This can work, but there are pitfalls: the distribution you find might differ from yours in one or more ways, so you might end up with a Makefile.PL or META.yml that isn't quite right.

There are a number of dedicated tools for creating new distributions:

  • h2xs is the original tool for this task. Originally meant for building an XS module from C header files, it can also be used to bootstrap a pure-Perl distribution.
  • The Dist::Maker module comes with a dim ("distribution maker") script.
  • The Module::Starter module comes with a module-starter script.

I'm sure there are others that I've missed on CPAN.

There are a number of more general distribution builders which can be used to create new distributions:

  • Dist::Zilla - the most widely used of the options here, with a rich collection of plugins on CPAN.
  • Dist::Milla is essentially a specific Dist::Zilla configuration. If you're happy with MIYAGAWA's decisions, this may be easier for you than DZ itself.
  • Minilla is similar to Dist::Milla in philosophy, but is a standalone tool based on some conventions. For simple pure-Perl modules this is probably a good place for beginners to start.

A number of prolific CPAN authors have written their own tools, such as TOBYINK's Dist::Inkt and INGY's Zilla::Dist, but these aren't widely used.

These tools make it easy to create a new dist, but their real strength comes from everything else they can do for you: creating documentation, generating tests, creating various files that a release should contain, packaging up a release, and uploading it to CPAN for you.

Just remember to replace the boilerplate with real content! There are a surprising number of modules documented with filler text.

Editing your module(s) and other files

There is syntax highlighting support for Perl in most mainstream text editors, and some of these offer plenty beyond colouring your code.

Beyond text editors, there are a number of IDEs that you might use:

  • Padre is an open source Perl IDE that's written in Perl. It hasn't seen a release since 2013 though.
  • Komodo IDE is a professional IDE from ActiveState.
  • EPIC is a Perl IDE based on the Eclipse IDE.

Tests for your module

The testing mindset is baked into the Perl community: almost all CPAN distributions come with a testsuite, and there are various modules, tools, and systems to support you in your testing. By convention the testsuite for a distribution lives in a directory called 't', and each test file has a .t extension. Small distributions might have a single .t file, larger distributions might have one .t file per module, or one per feature — it's totally up to you!

If a distribution has a Makefile.PL, then you can run the testsuite thusly:

    perl Makefile.PL
    make test

If it has a Build.PL, then:

    perl Build.PL
    ./Build test

Since 5.8, Perl has also shipped with the prove utility, which makes it easier to run tests:

    prove -lr t

This script is a front-end to the Test::Harness module, which knows how to run a collection of test files in the t directory.

At the heart of Perl's test framework and tools is TAP, the Test Anything Protocol. This is a simple text format that is emitted by each .t file. TAP was originally created for Perl, but there are now TAP-based frameworks for plenty of other programming languages.

At its most basic, a test file prints out the number of tests, followed by the result of each test:

    1..3
    ok 1 - load module
    ok 2 - passing no arguments should croak
    not ok 3 - passing 'plugh' should result in a hollow voice

Your test file can just print out this text, and that's how it used to be done in the old days. For very simple test scripts you might use the Test::Simple module, but most people use Test::More, which, as you might guess, offers a bit more than Test::Simple.
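
As a rough illustration of what a Test::More test file looks like, here is a small sketch. The Foo::Bar::shout function is a hypothetical stand-in that I've defined inline so the file runs on its own; a real test file would load your module with "use Foo::Bar;" instead.

```perl
#!/usr/bin/perl
# t/basic.t: a sketch of a test file written with Test::More.
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the module under test, defined inline so that
# this file is self-contained. A real test would "use Foo::Bar;" instead.
{
    package Foo::Bar;
    use Carp qw(croak);

    sub shout {
        my ($word) = @_;
        croak 'shout() needs a word' unless defined $word;
        return uc($word) . '!';
    }
}

# Each assertion prints one "ok" or "not ok" line of TAP.
ok( defined &Foo::Bar::shout, 'shout() is defined' );
is( Foo::Bar::shout('plugh'), 'PLUGH!', "shout('plugh') returns PLUGH!" );

# eval traps the exception, so we can check that one was thrown.
my $lived = eval { Foo::Bar::shout(); 1 };
ok( !$lived, 'passing no arguments should croak' );
like( $@, qr/needs a word/, 'the error message says what went wrong' );

# Prints the plan (the "1..N" line) at the end, so you don't have to
# count the tests by hand.
done_testing();
```

You can run a file like this directly with perl, or through prove. Using done_testing() rather than declaring the number of tests up front is a matter of taste; both produce valid TAP.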

There are many more modules in the Test:: namespace, and they all work together because they stand on the shoulders of Test::Builder. Test::Builder provides the building blocks for writing test libraries, and also provides the plumbing which ensures you can use test modules from a range of authors and still get a sensible stream of "ok 1" etc at the end of the day.
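
Here is a minimal sketch of how a test library can build on Test::Builder. The Test::Hypothetical::Even package and its is_even function are invented for this example, and are defined inline only to keep the sketch in one file; a real library would live in its own .pm file and export its functions.

```perl
use strict;
use warnings;

# A hypothetical test library (not a real CPAN module).
{
    package Test::Hypothetical::Even;
    use Test::Builder;

    # Test::Builder->new returns the shared builder object that Test::More
    # uses too, which is what keeps the test numbers in sequence.
    my $Test = Test::Builder->new;

    sub is_even {
        my ($number, $name) = @_;

        # Report failures at the caller's line rather than this one.
        local $Test::Builder::Level = $Test::Builder::Level + 1;

        my $ok = $Test->ok( $number % 2 == 0, $name );
        $Test->diag("$number is not even") unless $ok;
        return $ok;
    }
}

use Test::More;

ok( 1, 'an assertion from Test::More' );
Test::Hypothetical::Even::is_even( 42, 'an assertion from our own library' );
is( 2 + 2, 4, 'Test::More again' );

done_testing();
```

Run on its own, this prints three "ok" lines numbered 1 to 3, followed by the plan 1..3, even though the assertions come from two different packages.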

Test::Builder has served us well over the years, but has been creaking somewhat recently. So a replacement for Test::Builder is in development, and is currently a collection of modules under the Test2 namespace. This was heavily discussed at the 2015 QA Hackathon, and will no doubt see a lot more discussion and work at the 2016 QAH. Many other Test:: modules have been worked on at the QAH over the years.

If you want to learn a bit more about testing Perl, you could start with the Test::Tutorial documentation, then look at Gabor's collection of test-related blog posts, and also consider the book Perl Testing: A Developer's Notebook.

Version control / source code management

In the early Perl days CVS was the most widely used version control system. That gave way to Subversion for a while, but these days Git and Mercurial are probably the most widely used in open source projects. Since we're talking about open source, you should think about not only the version control system you're using, but also which public repository hosting service you'll use. The most widely used hosting service for CPAN distributions is GitHub: just over 35% of distributions have a GitHub repo. All the other hosting services added together cover less than 3% of CPAN.

The following is a typical release cycle for GitHub users:

  1. Work on changes until all tests pass
  2. Make sure the Changes file documents all major changes
  3. Make your changes atomic: all changes related to a particular fix or feature should go in a single commit, including the Changes entry.
  4. Bump the version
  5. Upload to PAUSE
  6. Tag with the version. By convention for version 1.01 the tag would be 'v1.01'
  7. Push to GitHub
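
Some of these steps are easy to forget, so you might automate a sanity check before uploading. The following is only a sketch of one such check, under a couple of assumptions: that the module sets its version with a line of the form our $VERSION = '1.01'; and that each release entry in Changes starts with the version number at the beginning of a line. The function names and file paths are my own, chosen for illustration.

```perl
use strict;
use warnings;

# Extract the version from module source, assuming the common
# "our $VERSION = '1.01';" form (other styles exist).
sub module_version {
    my ($source) = @_;
    return $source =~ /^\s*our\s+\$VERSION\s*=\s*['"]?([\d._]+)['"]?\s*;/m
        ? $1
        : undef;
}

# True if some line of the Changes text starts with the given version.
sub changes_mentions {
    my ($changes, $version) = @_;
    return $changes =~ /^v?\Q$version\E(?:\s|$)/m ? 1 : 0;
}

# Read a whole file into a string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't read $path: $!\n";
    local $/;    # slurp mode: no record separator
    return scalar <$fh>;
}

# Hypothetical paths, following the Foo-Bar example distribution.
my ($module_file, $changes_file) = ('lib/Foo/Bar.pm', 'Changes');

# Only run the check when both files are present.
if ( -e $module_file && -e $changes_file ) {
    my $version = module_version( slurp($module_file) )
        or die "No \$VERSION found in $module_file\n";
    changes_mentions( slurp($changes_file), $version )
        or die "Changes has no entry for version $version\n";
    print "Version $version is documented in Changes\n";
}
```

Run from the top of your distribution before step 5, this dies if you bumped the version but forgot to add a matching entry to Changes. The release tools mentioned earlier can perform checks of this kind for you.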

There are many more things you could do, and should do when other CPAN distributions are relying on yours: ensure tests pass on all recent versions of Perl, on the most widely used operating systems; test and spellcheck your doc; test the distributions that use your module; check what minimum version you should specify for the modules you use; and plenty more besides.

One thing to be aware of: there are various files which should be included in a CPAN release, but which are automatically generated. Such files shouldn't be in your source control repo. Exactly which files depends on which tools you're using. For example, if you're using Dist::Zilla, then the Changes, README, MANIFEST, META.yml and META.json, and Makefile.PL files will probably be generated when you release your dist. If you're not using a tool like Dist::Zilla and have a hand-rolled Makefile.PL, then you'll be maintaining most of those files manually, but META.yml and META.json will be generated when you run make dist. Read this blog post to learn more about what files are generated.

About ActiveState

Since 1997, ActiveState has been providing software application development and management solutions, especially around Perl. ActivePerl (the original "Perl for Windows" distro), Perl Dev Kit, and polyglot Komodo IDE are some of the flagship products that have made them renowned by Perl developers around the world. More than two million developers and 97% of Fortune 1000 companies use ActiveState's solutions to develop, distribute, and manage their software applications. Global customers like Bank of America, CA, Cisco, HP, Lockheed Martin and Siemens rely on ActiveState for faster development, ensuring IT governance and compliance, and accelerating time to market.

Thank you to ActiveState for their support of the QA Hackathon and the Perl community.

About Neil Bowers

Perl hacker since 1992.