The Importance of Public Repositories

A repository used to be only a place of work, and the distribution was the actual result of that work. Only the contents of the distribution mattered. People would read the README and INSTALL files from the distribution after downloading it.

Not so anymore. Today the repository is out in the open on GitHub, GitLab, Codeberg or another shared hosting site. Meanwhile, the documentation in the distribution is often ignored, because distribution packages are rarely downloaded manually; a package manager installs them automatically instead.

A publicly viewable repository has in fact become much more than just a place of work. It is also an advertisement for the project and, if there is more than one author or contributor, for the community behind it.

When a potential user first finds the project repository, the hosting site commonly presents them with the project README file. That makes the README file in effect the welcome page of the project. Its purpose has changed from being purely informational to being an advertisement which competes for the user’s attention with bright colors, animated pictures, videos, exciting diagrams, shapes and “bumper stickers”.

But under all the exciting cover it must also remain true to its nature: present the project as precisely as possible and stay up to date with its development.

The README might not be the only file that needs to be kept up to date because it is read directly in the (public) repository. Other candidates include INSTALL, Changes and CODEOWNERS.

Many files therefore contain text which must be updated at least at the time of release: version numbers, API documentation, examples, file lists.

As every programmer knows, it is difficult to keep these files in sync with the code, just as it is with documentation. The Dist::Zilla plugin Dist-Zilla-Plugin-WeaveFile prevents the files from falling out of sync because their content is tested continuously.

There are other ways to do this, for instance Dist::Zilla::Plugin::CopyFilesFromBuild.

It is my philosophy that nothing in the repository should be changed behind the programmer’s back. Silent changes can also be dangerous to a programmer who does not commit to Git frequently. Failed local tests are much safer. And when a test fails, it is easy to run dzil weave to update the files.
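The difference between a read-only test and an explicit update step can be sketched in a few lines. This is only an illustration in Python, not the plugin's code: the function names render, is_in_sync and weave are hypothetical, and string.Template stands in for a real templating system.

```python
from pathlib import Path
from string import Template
import tempfile


def render(template_text: str, values: dict) -> str:
    """Fill the $placeholders of a definition with current project values."""
    return Template(template_text).substitute(values)


def is_in_sync(path: Path, expected: str) -> bool:
    """Test step: a read-only comparison that never modifies the file."""
    return path.exists() and path.read_text(encoding="utf-8") == expected


def weave(path: Path, expected: str) -> None:
    """Update step: writes the file, and runs only when explicitly requested."""
    path.write_text(expected, encoding="utf-8")


if __name__ == "__main__":
    definition = "# $name - $version\n"  # hypothetical one-line README definition
    with tempfile.TemporaryDirectory() as tmp:
        readme = Path(tmp) / "README.md"
        weave(readme, render(definition, {"name": "My-Dist", "version": "0.001"}))

        # After a version bump the committed file is stale, so the check fails
        # and the file on disk is left untouched.
        expected = render(definition, {"name": "My-Dist", "version": "0.002"})
        print("in sync before weave:", is_in_sync(readme, expected))

        # The programmer decides when to regenerate.
        weave(readme, expected)
        print("in sync after weave:", is_in_sync(readme, expected))
```

The point of the split is that the test only reports a mismatch; the file changes only when the programmer runs the update step.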

Dist-Zilla-Plugin-WeaveFile

The plugin Dist-Zilla-Plugin-WeaveFile works very much like my earlier plugin Dist-Zilla-Plugin-Software-Policies. It consists of three pieces: the Dist::Zilla command weave; the plugin WeaveFile, which is used to define the configuration in the dist.ini file; and the plugin Test::WeaveFile, which creates tests for the distribution that check that the defined files exist and match their definitions.

Example from a dist.ini file:

; Uses default config file .weavefilerc
[WeaveFile / README.md]

; Uses a custom config file and specifies file explicitly
[WeaveFile]
config = install-weave.yaml
file = INSTALL

[Test::WeaveFile]

And the definition file .weavefilerc would then contain, for example:

---
snippets:
    badges: |
        [![CPAN](https://img.shields.io/cpan/v/My-Dist)](https://metacpan.org/dist/My-Dist)
    license: |
        # LICENSE
        [% USE date -%]

        This software is copyright (c) [% date.format(date.now, '%Y') %] by [% dist.author %].

        This is free software; you can redistribute it and/or modify it under
        the same terms as the Perl 5 programming language system itself.
files:
    "README.md": |
        [% snippets.badges %]

        # [% dist.name %] - [% dist.version %]

        [% dist.abstract %]

        [% pod("My::Module", "SYNOPSIS") %]
        [% pod("My::Module", "DESCRIPTION") %]

        [% pod("bin/myprog", "EXAMPLE") %]

        [% snippets.license %]

The templating system is Template-Toolkit. I am planning to change this so that the user can choose another templating system if desired, which would make Template-Toolkit an optional dependency. Allowing the output format (currently Markdown) to be changed is also planned. All pod text is converted to Markdown.
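To illustrate how a definition made of reusable snippets and per-file templates expands into a finished file, here is a minimal sketch in Python. It is not how the plugin is implemented: it uses string.Template instead of Template-Toolkit, a plain dict instead of the YAML file, and the names DEFINITION and render_file as well as the sample values are hypothetical.

```python
from string import Template

# Hypothetical in-memory stand-in for a definition file:
# reusable snippets plus one template per output file.
DEFINITION = {
    "snippets": {
        "license": "# LICENSE\n\nThis software is copyright (c) $year by $author.\n",
    },
    "files": {
        "README.md": "# $name - $version\n\n$abstract\n\n$license",
    },
}


def render_file(definition: dict, filename: str, dist: dict) -> str:
    """Render one output file from its template, snippets and dist values."""
    # Expand the snippets first so they can use distribution values too.
    snippets = {
        key: Template(text).substitute(dist)
        for key, text in definition["snippets"].items()
    }
    # Then expand the file template with both dist values and rendered snippets.
    return Template(definition["files"][filename]).substitute({**dist, **snippets})


if __name__ == "__main__":
    dist = {
        "name": "My-Dist",
        "version": "0.001",
        "abstract": "Does things",
        "author": "A. U. Thor",
        "year": "2025",
    }
    print(render_file(DEFINITION, "README.md", dist))
```

Keeping snippets separate from file templates means a block such as the license text is written once and reused in every file that needs it.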

With a configuration like the above, dzil test fails if the static files README.md and INSTALL are not in sync with their definitions. The user can then run:

dzil weave

or

dzil weave README.md
dzil weave INSTALL

Future

There might be additional generated information which we will be forced, for practical reasons, to commit into the repository. cpanfile could be one example. GitHub repositories are scanned by various AI tools, which could benefit from having such information at hand in the repository instead of it being generated and available only in the distribution on MetaCPAN. This does go against the DRY principle or, in this case, “do not commit generated files”, but it could be the lesser evil.

I have recently learned that Devin, the AI software engineer, is being used to create summaries and presentations of GitHub repositories in DeepWiki. For an example of a Perl project, see my Env-Assert.

About Mikko Koivunalho

Perl Programmer for fun and office. CPAN modules and command line tools.