A convention for Changes files
This post is an attempt to pull together in one place all the discussions about a/the changelog format for CPAN modules. Brian Cassidy's CPAN::Changes::Spec defines a format to use for the file, which is currently used by 41% of the distributions on CPAN. Discussions are currently happening in a number of places, including a MetaCPAN issue, a pull request, and a Questhub quest.
I think the goals for a Changelog format / convention are:
- Easy to grok.
- Easy to write.
- Flexible to allow variations.
- Doesn't assume English.
- Amenable to automatic processing, even if just at the chunking level (i.e. "this block of text describes what changed in version 0.NN of this module").
You might think "why have a spec for this? just let people do whatever they want". I have a number of thoughts on that:
- On the "easy to grok" point, the most important factor is that it should be easy to work out what changes are associated with which CPAN release.
- People uploading their first module might not know about any conventions, and often don't really care that much, but would like to "fit in". On my quest to fix up Changes files I've had a good number of people thank me, saying they didn't know what format to use.
- CPAN has very few hard and fast rules, and this wouldn't be one of them. I think a convention is entirely appropriate, and I suspect 90%+ of CPAN authors would be happy to follow any convention (based on my experience doing over 100 pull requests and bug reports for reformatted Changes files), as long as it made sense and didn't make their life harder.
- It makes it easier when looking at a changed module. If everyone just listed the most recent release first, that would be an improvement.
Here's a minimal Changes file template that satisfies CPAN::Changes::Spec:

Revision history for Perl module Foo::Bar

You can put some preamble text here.

0.03 2013-09-12
    * First thing I changed.
    * Second thing I changed.

0.02 2013-08-17
    * Meh.

0.01 2013-02-01
    * First release to CPAN.
Each release is introduced with a 'header line': the version number (without a leading 'v') followed by the date. The header line is left aligned and everything else (apart from the preamble) is indented, making it easy to chunk, both visually and programmatically. It uses the ISO 8601 date format, which is country and language agnostic. You can put anything you like after the date, which might be the name of the person who made the release, the time and timezone, or anything else. I think we could have additional conventions for the header lines; see below.
You can make the description a bullet list, but it could just be some text, as long as it's indented. Releases are listed from newest to oldest.
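To illustrate how the left-aligned header line makes chunking straightforward, here is a minimal Python sketch. The `HEADER` pattern and the `chunk_changes` helper are assumptions made for illustration; they are not the grammar or API of CPAN::Changes, and they only recognise a version followed by a `YYYY-MM-DD` date.

```python
import re

# Header line: a left-aligned version (no leading "v"), whitespace, an
# ISO 8601 date, then optional free text. Illustrative pattern only; it is
# not the official CPAN::Changes::Spec grammar.
HEADER = re.compile(r"^(\d\S*)\s+(\d{4}-\d{2}-\d{2})(.*)$")

def chunk_changes(text):
    """Split a Changes file into (preamble_lines, releases)."""
    preamble, releases, current = [], [], None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            version, date, note = match.groups()
            current = {"version": version, "date": date,
                       "note": note.strip(), "changes": []}
            releases.append(current)
        elif not line.strip():
            continue  # blank lines only separate blocks
        elif current is None:
            preamble.append(line)  # text before the first header line
        else:
            current["changes"].append(line.strip())
    return preamble, releases

SAMPLE = """\
Revision history for Perl module Foo::Bar

0.03 2013-09-12
    * First thing I changed.
    * Second thing I changed.

0.02 2013-08-17
    * Meh.
"""

preamble, releases = chunk_changes(SAMPLE)
for release in releases:
    print(release["version"], release["date"], len(release["changes"]))
```

Anything after the date on a header line ends up in `note`, so a releaser's name or a timezone would pass through untouched.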
I think this should aim to be "Markdown for Changes files". Here's a bit of the Markdown philosophy:
Markdown is intended to be as easy-to-read and easy-to-write as is feasible.
Readability, however, is emphasized above all else. A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.
Which I think is hard to disagree with.
All that said, I know there are some people who dislike this format, and others who think it's got some problems that need fixing before more widespread adoption. So, what are the problems? What are the minimum changes to the spec that would see you adopting this?
Other changelog formats
A quick tour around other changelog formats, to see what we could borrow from them. Add links to other formats / conventions in the comments please.
GNU
GNU has a simple changelog format, which is very similar to the format described above:
1993-05-25  Richard Stallman

        * man.el: Rename symbols `man-*' to `Man-*'.
        (manual-entry): Make prompt string clearer.

        * simple.el (blink-matching-paren-distance): Change default to 12,000.

1993-05-24  Richard Stallman

        * vc.el (minor-mode-map-alist): Don't use it if it's void.
        (vc-cancel-version): Doc fix.
The main problem I have with this is that it's chunked and dated on each commit or set of commits. When I've seen Changes files done this way, it is sometimes hard to work out the boundaries between CPAN releases.
Debian
Debian defines a changelog format, which is used for an overall release. Here's a sample entry:
foo (1.2.3-1) unstable; urgency=low

  * New upstream release.
  * Dropped 02_manpage_hyphens.dpatch, fixed upstream.
  * Added 04_edit_button_crash.dpatch: fix a crash after pressing the edit
    button. (Closes: #654321)
  * debian/control: foo should conflict with libbar. (Closes: #987654)

 -- John Doe  Fri, 30 Nov 2007 15:29:42 +0100
For Debian the date of the individual entries isn't so important, but it is for CPAN releases, so I think it's right that the date is on the header line. The urgency entry is good, and prompted some thoughts for additional conventions for the header line:
- For developer releases, include "developer release" on the header line. Someone might not know that 0.09_01 indicates a developer release, and it's easily missed when skimming the file.
- If this release of a module is backwards incompatible in any way, indicate that.
- If users of your module really should upgrade (security issue, critical bug, huge performance improvement), then indicate that.
What's the goal here? Why am I doing this?
This is never going to produce a convention that 100% of CPAN authors agree to and use. There will always be people who want to do their own thing, and a small number of "power users" who have outlier requirements (or who insist on formatting their changelog in LaTeX, you know how it is). What I do think is useful is a simple and easy-to-understand format, that we can all agree is a good approach for most authors, and in particular can be suggested as a start point for new users.
I think that most new authors are best helped with a simple convention that says "do it this way, and go read here for more details / options".
> So, what are the problems. What are the minimum changes to the spec that would see you adopting this?
Allow different filenames: Changes, CHANGES, ChangeLog, CHANGELOG etc
https://rt.cpan.org/Public/Bug/Display.html?id=87045
@vsespb: yep, sounds entirely reasonable, assuming there's a finite (and short) list of acceptable names.
Probably there is, because (as specified in the RT ticket) both CPAN and MetaCPAN understand ChangeLog (so there should be a finite list somewhere in their code).
As I've said elsewhere the only general spec I would support is this:
If people want to provide more structure, fine, and if parsers learn to look for common structures, fine, but I see no good reason for people to modify their personal changelog style beyond what I've specified above.
Vim users may find the following useful:
If you're in insert or replace mode, this inserts the current date in the proper format at the cursor when the user hits the <F1> key.
In normal mode,
places the current date in the same format into the default register, whence it can be pasted using p or P.
The current spec does not accept the default date format from dzil. This is IMHO quite a big problem, to be fixed either in dzil or in the spec (or its implementation). I opened an issue: https://github.com/bricas/cpan-changes/issues/17
...this actually hinders me finishing my quest :-P http://questhub.io/realm/perl/quest/51f81b0acc80951f7c000009
I'd also add to the spec that the most recent changes come first. I've seen some where the first release is at the top, and it's not quite as user-friendly to have to scroll all the way down to see the most recent changes.
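A newest-first rule like this is easy to check mechanically once the release dates are in ISO 8601 form. The following is a small Python sketch; the `is_newest_first` helper is a hypothetical name and is not part of any existing CPAN tooling.

```python
from datetime import date

def is_newest_first(release_dates):
    """Return True if the ISO 8601 date strings run from newest to oldest."""
    parsed = [date.fromisoformat(d) for d in release_dates]
    # Every date must be on or after the one listed below it.
    return all(newer >= older for newer, older in zip(parsed, parsed[1:]))

print(is_newest_first(["2013-09-12", "2013-08-17", "2013-02-01"]))  # True
print(is_newest_first(["2013-02-01", "2013-08-17", "2013-09-12"]))  # False
```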
There are benefits in standardising the Changes format. CPAN::Changes::Spec is not the first attempt to do so, and is probably not perfect, but it's the first attempt that has achieved traction. Some have argued that it's too strict, requiring W3C datetime format. Others have said that it's too lax, for example, not putting any restrictions on the format of the individual change lines; nor specifying whether versions should be listed in (either forward or backward) chronological order. Without addressing the merits of these individual points, I'd say that if some people think a spec is too strict, and others too lax, then it's probably achieved roughly the right balance.
Nobody's forcing anyone to use CPAN::Changes::Spec. However, if you do, tools that expect Changes files to be formatted according to CPAN::Changes::Spec will work better. Same situation with META.json and CPAN::Meta::Spec; you can publish a CPAN distribution with no META.json file, or one that doesn't conform to the spec, and PAUSE will still accept it; it should still be indexed OK; and if your Makefile.PL/Build.PL outputs the correct MYMETA.yml/json, dependencies will still be automatically resolved. (And if not, I believe CPAN.pm is still capable of scraping dependencies out of the Makefile.)
MetaCPAN is aware of CHANGES, Changes, ChangeLog, Changelog, CHANGELOG and NEWS. See https://github.com/CPAN-API/metacpan-web/blob/master/lib/MetaCPAN/Web/Model/API/Release.pm#L265
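A tool that wants to accept the same set of names could do a simple lookup along these lines. This is a Python sketch and not MetaCPAN's actual code: the `find_changelog` helper and the order in which names are tried are assumptions, and only the list of names comes from the comment above.

```python
import os
import tempfile

# File names taken from the list above; the lookup order is an assumption.
CANDIDATE_NAMES = ["Changes", "CHANGES", "ChangeLog", "Changelog", "CHANGELOG", "NEWS"]

def find_changelog(dist_dir):
    """Return the path of the first recognised changelog file, or None."""
    present = set(os.listdir(dist_dir))
    for name in CANDIDATE_NAMES:
        if name in present:
            return os.path.join(dist_dir, name)
    return None

# Demonstration with a throwaway directory standing in for an unpacked dist.
with tempfile.TemporaryDirectory() as dist_dir:
    open(os.path.join(dist_dir, "ChangeLog"), "w").close()
    print(os.path.basename(find_changelog(dist_dir)))
```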
Thanks Olaf.
Agreed @ether. I'd been assuming that was in there, as that's one of the changes I've made to a number of files.
You sent me a Changes file for a module, with the dates of the release versions etc. How did you make the Changes file? I want to be able to generate my own retrospective Changes file.
Ben: where dates for releases are missing, I get them from backpan. Eg for you: http://backpan.perl.org/authors/id/B/BK/BKB/
OK thanks. I made a script, I don't know if it is any use to you but here it is:
http://www.lemoda.net/perl/perl-retro-changes/index.cgi
Another standardized format that some people use, although not a standardized changelog format, is YAML. Most notably, Ingy uses it for his many distributions (for example, Test::Base). Although I've used YAML for changelogs in the past, I personally follow CPAN::Changes::Spec now because, unlike with most things, I don't hold a strong opinion on the topic and moved toward others' attempt at standardization.