An introduction to CPAN distribution metadata
All CPAN releases (these days) include a metadata file which has information about the distribution. It can be used by tools like CPAN clients (when installing modules), but it's also helpful for other tool writers, and people analysing the structure of CPAN. The metadata file will be called META.yml or META.json, and recent releases often contain both.
In this blog post we'll introduce some of what's in the files and how they're used by CPAN clients.
This post is brought to you by FastMail, a gold sponsor for this year's Toolchain Summit, which is being held in Lyon, France in May. The summit is only possible with the support of companies like FastMail. We'll be doing a series of toolchain-related blog posts, to thank our sponsors.
This article assumes you're familiar with the difference between module, distribution, and release. Read this CPAN Glossary if you're not sure.
What's in the metadata?
A lot of different things can be put in the metadata, including:
- The current author(s).
- The version.
- External modules it relies on (also known as prerequisites, shortened to "prereqs", or dependencies).
- What licence(s) the distribution is released under.
- What modules are included in the release.
- Where the source code repository can be found (e.g. GitHub).
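To make the list above concrete, here is a minimal sketch that decodes a trimmed-down META.json and pulls out those fields. The distribution name, author, and URL are invented for illustration, and the sketch reads the raw JSON with the core JSON::PP module; a real tool would more likely use the CPAN::Meta module.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# A hypothetical, trimmed-down META.json; every name and value is made up.
my $meta_json = <<'JSON';
{
   "name" : "Example-Dist",
   "version" : "1.23",
   "abstract" : "An illustrative distribution",
   "author" : [ "A. N. Author <author@example.com>" ],
   "license" : [ "perl_5" ],
   "dynamic_config" : 0,
   "meta-spec" : { "version" : 2 },
   "prereqs" : {
      "runtime" : { "requires" : { "perl" : "5.006", "JSON::PP" : "2.27" } }
   },
   "provides" : {
      "Example::Dist" : { "file" : "lib/Example/Dist.pm", "version" : "1.23" }
   },
   "resources" : {
      "repository" : { "type" : "git", "url" : "https://example.com/example-dist.git" }
   }
}
JSON

my $meta = decode_json($meta_json);

# Collect the fields from the list above into one summary structure.
my %summary = (
    authors    => $meta->{author},
    version    => $meta->{version},
    licenses   => $meta->{license},
    modules    => [ sort keys %{ $meta->{provides} } ],
    repository => $meta->{resources}{repository}{url},
    requires   => [ sort keys %{ $meta->{prereqs}{runtime}{requires} } ],
);

print "Version:    $summary{version}\n";
print "Authors:    @{ $summary{authors} }\n";
print "Licences:   @{ $summary{licenses} }\n";
print "Modules:    @{ $summary{modules} }\n";
print "Repository: $summary{repository}\n";
print "Requires:   @{ $summary{requires} }\n";
```

Note that author and license are lists in version 2 metadata, which is why the sketch treats them as arrays.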
The full spec for what can be included in the metadata is CPAN::Meta::Spec. There are two main versions of this to be aware of:
- Version 2+ data is what you’ll find in META.json
- Version 1.4 data is what you’ll find in META.yml
There are a number of differences, but a key one is that version 2 lets you specify prereqs more precisely. For example you can say that certain modules are only needed to run tests (if you use cpanm to install modules, you can use the --notest option to not run tests when you install a module). The other main change in version 2 was the ability to specify that a prereq was recommended or suggested, meaning that the target distribution can still be installed, even if a recommended or suggested prereq can't be installed. We'll cover that in a later post.
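As a rough illustration of the version 2 layout, prereqs are grouped first by phase (such as configure, runtime, and test) and then by relationship (such as requires, recommends, and suggests). The sketch below uses invented Example::* module names and a hypothetical helper, modules_for, to show how a tool could pick out what is needed with and without running tests. It only collects module names; a real client would also merge version requirements, for example with CPAN::Meta::Requirements.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical "prereqs" section in the version 2 layout:
# phase, then relationship, then module => minimum version.
my $prereqs = decode_json(<<'JSON');
{
   "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } },
   "runtime" : {
      "requires"   : { "perl" : "5.006", "Example::Needed" : "1.00" },
      "recommends" : { "Example::Faster" : "0.50" },
      "suggests"   : { "Example::Extra" : "0" }
   },
   "test" : { "requires" : { "Test::More" : "0.88" } }
}
JSON

# Return the sorted module names for the given phases and relationships.
sub modules_for {
    my ($prereqs, $phases, $relationships) = @_;
    my %seen;
    for my $phase (@$phases) {
        for my $rel (@$relationships) {
            my $mods = ($prereqs->{$phase} || {})->{$rel} or next;
            # The special "perl" key is a minimum Perl version, not a module.
            $seen{$_} = 1 for grep { $_ ne 'perl' } keys %$mods;
        }
    }
    return sort keys %seen;
}

my @with_tests    = modules_for($prereqs, [qw(configure runtime test)], ['requires']);
my @without_tests = modules_for($prereqs, [qw(configure runtime)],      ['requires']);
my @optional      = modules_for($prereqs, ['runtime'], [qw(recommends suggests)]);

print "Needed when running tests:  @with_tests\n";
print "Needed when skipping tests: @without_tests\n";
print "Optional at runtime:        @optional\n";
```

Skipping the test phase drops Test::More from the required set, which is the kind of saving the finer-grained version 2 prereqs make possible.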
Older releases will only have a META.yml file, as that came first. More recent releases tend to have both, to support any tools that can only process META.yml. Really old distributions don't have either, but there aren't many of those left on CPAN.
Where does it come from?
The metadata file or files are generated when you build your distribution (don't write them by hand). With ExtUtils::MakeMaker or Module::Build-based distributions, this happens when you run the step that builds the release tarball. With Dist::Zilla, it's when you run dzil build (or another dzil command that builds the distribution).
Sometimes you'll come across a re-purposed metadata file that has been copied from another distribution, but there's no need to do that these days, as the files can be generated for you.
And because they're generated, as a general rule you shouldn't include either META.json or META.yml in your source code repository.
Installing a distribution
When a CPAN client is installing a distribution, once it's downloaded the latest release, one of the first things it needs to do is check whether it depends on any other CPAN distributions, and if necessary, install those first.
In the early days of CPAN, distributions didn't come with machine-readable metadata, so dependencies were just listed as part of the documentation, and you had to manually install them first. You can imagine how much fun, and how reliable, that was. This led the toolchain developers to introduce prerequisites in the metadata.
So now when you're using a CPAN client to install a distribution, it looks in the metadata, determines the prerequisites, and installs them first, if needed. And if any of those have unsatisfied prerequisites, then it recurses on those.
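The recursion described above can be sketched in a few lines. This is a simplification under stated assumptions: the %requires index, the %installed set, and the install_order function are all hypothetical, and the sketch works with distribution names only. A real client also has to map module names to distributions, compare installed versions against the required ones, and actually download and build things.

```perl
use strict;
use warnings;

# Hypothetical index: distribution name => distributions it requires.
# A real client would derive this from each distribution's metadata.
my %requires = (
    'App-Example'  => [ 'Example-Web', 'Example-Util' ],
    'Example-Web'  => [ 'Example-Util', 'Example-Net' ],
    'Example-Util' => [],
    'Example-Net'  => [],
);

# Hypothetical set of distributions already present on the target system.
my %installed = ( 'Example-Net' => 1 );

# Return the install order for $dist: prerequisites first, then $dist itself.
# %$seen records what has already been scheduled, which also guards against cycles.
sub install_order {
    my ($dist, $seen) = @_;
    $seen ||= {};
    return () if $installed{$dist} || $seen->{$dist}++;
    my @order;
    push @order, install_order($_, $seen) for @{ $requires{$dist} || [] };
    return (@order, $dist);
}

my @order = install_order('App-Example');
print "$_\n" for @order;
```

Example-Util is needed by two distributions but is scheduled only once, and Example-Net is skipped because it is already installed.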
When looking in the metadata for prerequisites, the first thing a CPAN client should do is look for the dynamic_config entry. For example, look at the META.json file for Try-Tiny. If dynamic_config is false, then the CPAN client can just work with the metadata file that came in the tarball.
If dynamic_config is true (1), then the client has to regenerate the metadata on the target system. It does this by running perl Makefile.PL or the equivalent. When you do this, you'll see that MYMETA.yml and MYMETA.json are generated. The CPAN client will then use one of these files, rather than META.yml or META.json, for example when checking what third-party modules are relied on (and whether they need to be installed first).
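The decision a client makes here can be sketched as follows. The prereq_source function and the two example documents are hypothetical, and the step that actually runs the configure script is left out. Treating a missing dynamic_config entry as true is a conservative assumption of this sketch, not something stated above.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Decide which metadata file a client should read prereqs from.
# Assumption: a missing dynamic_config is treated as true, to be safe.
sub prereq_source {
    my ($meta) = @_;
    my $dynamic = exists $meta->{dynamic_config} ? $meta->{dynamic_config} : 1;
    # Dynamic: read the MYMETA file regenerated on the target system.
    # Static: the META file shipped in the tarball is enough.
    return $dynamic ? 'MYMETA.json' : 'META.json';
}

# Two hypothetical, heavily trimmed metadata documents.
my $static  = decode_json('{ "name" : "Example-Static",  "dynamic_config" : 0 }');
my $dynamic = decode_json('{ "name" : "Example-Dynamic", "dynamic_config" : 1 }');

for my $meta ($static, $dynamic) {
    printf "%s: read prereqs from %s\n", $meta->{name}, prereq_source($meta);
}
```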
Why is this done? Typically it's because the prereqs might change, depending on:
- The version of Perl, or
- The operating system, or
- The version of
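As a sketch of the first two factors in the list above, here is the sort of logic a configure script might run on the target system to decide its prereqs. The prereqs_for function, the Example::* module names, and the version thresholds are all hypothetical; a real Makefile.PL or Build.PL would pass the result on to its build tool rather than print it.

```perl
use strict;
use warnings;

# Pick prereqs for a given Perl version and operating system.
# All module names and thresholds here are invented for illustration.
sub prereqs_for {
    my ($perl_version, $os) = @_;
    my %requires = ( 'Example::Core' => '1.00' );

    # Hypothetical backport, only needed on older Perls.
    $requires{'Example::Backport'} = '0.10' if $perl_version < 5.010;

    # Hypothetical helper, only needed on Windows.
    $requires{'Example::Win32Support'} = '0.05' if $os eq 'MSWin32';

    return \%requires;
}

# $] is the running Perl's version and $^O is the operating system name.
my $here = prereqs_for($], $^O);
print "$_ => $here->{$_}\n" for sort keys %$here;
```

Because the answer depends on where the code runs, it can't be fixed in the tarball's metadata, which is why such a distribution sets dynamic_config to true.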
MYMETA files should never be included in a release, and they shouldn't be included in your source code repository.
When releasing distributions to CPAN, make sure they include both META.yml and META.json, and don't include them in your source code repository. Don't write these files by hand -- make sure they're generated for you!
There's a lot more to know about distribution metadata; we'll cover some of that in a later post.
Thanks to David Golden and Karen Etheridge for their input on this article.
FastMail is a commercial hosted email service founded in 1999, which has established a reputation for technical leadership in the hosted email space, with a focus on security, privacy, and reliability. It is run by FastMail Pty Ltd, an Australian company based in Melbourne. From their early days they've been users and supporters of Perl, and several of their developers are CPAN authors: BRONG, ROBN, and ROBM (one of the founders). In late 2015 they acquired pobox.com, another hosted mail company and longtime user and supporter of Perl. Pobox's tech team includes RJBS and WOLFSAGE.
Our thanks again to FastMail for continuing to support this event.