Musings Archives

Importance of Repositories in Public

By Mikko Koivunalho on April 23, 2026 8:39 PM under How-To, Introduce-Package, Musings

It used to be that a repository was only a place of work and the distribution was the actual result of that work. Only the contents of the distribution mattered. People would read the README and INSTALL files from the distribution after downloading it.

Not so anymore. Today the repository is out in the open on GitHub, GitLab, Codeberg or another shared hosting site. The documentation in the distribution, on the other hand, is often ignored, because distribution packages are rarely downloaded manually. A package manager installs them automatically.

A publicly viewable repository has in fact become much more than just a place of work. It is also an advertisement for the project and for the community behind it, if there is more than one author or contributor.

When a potential user first finds the project repository, the hosting site commonly presents him with the project README file. That makes the README file, in effect, the welcome page to the project. Its purpose has changed from being purely informational to being an advertisement which competes for the user’s attention with bright colors, animated pictures, videos and exciting diagrams, shapes and “bumper stickers”.

But under all the exciting cover it must also remain true to its nature: present the project as precisely as possible and stay up to date with its development.

README might not be the only file which needs to be kept up to date because it is read in the (public) repository. Other candidates include INSTALL, Changes and CODEOWNERS.

Many of these files contain text which must be updated at least at release time: version numbers, API documentation, examples, file lists.

It is difficult to keep these files in sync with the code, just as it is with documentation, as every programmer knows. The Dist::Zilla plugin Dist-Zilla-Plugin-WeaveFile prevents the files from falling out of sync because their content is tested continuously.

There are other ways to do this, for instance Dist::Zilla::Plugin::CopyFilesFromBuild.

It is my philosophy that nothing in the repository should be changed behind the programmer’s back. Automatic changes can also be dangerous to a programmer who is not a frequent Git committer. Failed local tests are much safer. And when a test fails, it is easy to run dzil weave to update the files.

Dist-Zilla-Plugin-WeaveFile

The plugin Dist-Zilla-Plugin-WeaveFile works very much like my earlier plugin Dist-Zilla-Plugin-Software-Policies. It consists of three pieces: the Dist::Zilla command weave; the plugin WeaveFile, which defines the configuration in the dist.ini file; and the plugin Test::WeaveFile, which creates tests for the distribution that check that the defined files exist and match their definitions.

Example from dist.ini file:

; Uses default config file .weavefilerc
[WeaveFile / README.md]

; Uses a custom config file and specifies file explicitly
[WeaveFile]
config = install-weave.yaml
file = INSTALL

[Test::WeaveFile]

And the definition file .weavefilerc would then contain, for example:

---
snippets:
    badges: |
        [![CPAN](https://img.shields.io/cpan/v/My-Dist)](https://metacpan.org/dist/My-Dist)
    license: |
        # LICENSE
        [% USE date -%]

        This software is copyright (c) [% date.format(date.now, '%Y') %] by [% dist.author %].

        This is free software; you can redistribute it and/or modify it under
        the same terms as the Perl 5 programming language system itself.
files:
    "README.md": |
        [% snippets.badges %]

        # [% dist.name %] - [% dist.version %]

        [% dist.abstract %]

        [% pod("My::Module", "SYNOPSIS") %]
        [% pod("My::Module", "DESCRIPTION") %]

        [% pod("bin/myprog", "EXAMPLE") %]

        [% snippets.license %]

The templating system is Template-Toolkit. I am planning to change this so that the user can choose another templating system if wanted, which would make Template-Toolkit an optional dependency. Allowing a different output format (currently Markdown) is also planned. All pod text is converted to Markdown.

With a configuration like the one above, when the user runs dzil test and the static files README.md and INSTALL are not in sync with their definitions, the user can run:

dzil weave

dzil weave README.md
dzil weave INSTALL
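
The idea behind the test (compare what the definition generates now against what is committed, and fail on any difference) does not depend on Dist::Zilla. The following is only a minimal sketch of that idea in Python, not the plugin's implementation. The function names check_in_sync and weave, and the file contents, are hypothetical.

```python
import tempfile
from pathlib import Path


def check_in_sync(path: Path, expected: str) -> bool:
    """Return True when the committed file exists and matches the generated text."""
    return path.is_file() and path.read_text(encoding="utf-8") == expected


def weave(path: Path, expected: str) -> None:
    """Overwrite the committed file; a deliberate step run by the programmer."""
    path.write_text(expected, encoding="utf-8")


def demo() -> list:
    """Run the fail, regenerate, pass cycle in a throwaway directory."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        readme = Path(tmp) / "README.md"
        readme.write_text("# My-Dist - 0.001\n", encoding="utf-8")
        generated = "# My-Dist - 0.002\n"  # hypothetical output of the definition
        results.append(check_in_sync(readme, generated))  # stale file: test fails
        weave(readme, generated)  # the manual regeneration step
        results.append(check_in_sync(readme, generated))  # test passes again
    return results


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the check itself never writes anything. Only the explicit regeneration step touches the working tree.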

Future

There might be additional generated information which we will be forced, for practical reasons, to commit into the repository. cpanfile could be one such file. GitHub repositories are being scanned by different AI tools which could benefit from having such information at hand, instead of it being generated and only available in the distribution on MetaCPAN. It does fight the principle of DRY, or, in this case, “do not commit generated files”, but it could be the lesser evil.

I have lately learned that Devin, the AI software engineer, is being used to create summaries and presentations of GitHub repositories in DeepWiki. For an example of a Perl project, see my Env-Assert.


GitHub and the Perl License

By Mikko Koivunalho on November 27, 2025 11:07 PM under How-To, Introduce-Package, Musings

When we publish our Perl module repository on GitHub, we might notice something peculiar in the "About" section of our repository: GitHub doesn't recognize the Perl 5 license. This can be a bit confusing, especially when we've explicitly stated the licensing in our LICENSE file.

Without a properly defined license, GitHub ranks the quality of a repository lower. This is also unfortunate because it limits the "searchability" of our repository: GitHub cannot index it by license and users cannot search by license. This matters more today than ever before, as many enterprises rule out open source projects purely on the grounds that their license is poorly managed.

The Problem: Two Licenses in One File

The standard Perl 5 license, as used by many modules, is a dual license: Artistic License (2.0) and GNU General Public License (GPL) version 1 or later. Often, this is included in a single LICENSE file in the repository root.

GitHub's license detection mechanism, powered by Licensee, is designed to identify a single, clear license. When it encounters a file with two distinct licenses concatenated, it fails to make a definitive identification.

Here's an example of a repository where GitHub doesn't recognize the license. Notice the missing license badge in the "About" section:

The "quick select" banner above the README file does not show which license applies either.

The Solution: Separate License Files

The simplest and most effective solution is to provide each license in its own dedicated file. This allows Licensee to easily identify and display both licenses. This is perfectly valid because the Perl 5 license explicitly allows for distribution under either the Artistic License or the GPL. Providing both licenses separately simply makes it clearer which licenses apply and how they are presented.

(The other reason for having multiple licenses is a situation where different parts of the repository are under different licenses. But this is not our problem here.)

For example, instead of a single LICENSE file containing both, we would have:

LICENSE-Artistic-2.0
LICENSE-GPL-3

Let's look at an example from my own env-assert repository. In this repository, I've separated the licenses into LICENSE-Artistic-2.0 and LICENSE-GPL-3.

And here's how GitHub's "About" section looks for env-assert, clearly recognizing both licenses:

As we can see, GitHub now correctly identifies "Artistic-2.0" and "GPL-3.0" as the licenses for the project.

The same is also visible in the "quick select" bar:

Automating with Software::Policies and Dist::Zilla::Plugin::Software::Policies

Manually creating and maintaining these separate license files for every module can be tedious. Fortunately, there is a way to automate this process if you are using Dist::Zilla for authoring.

Dist::Zilla::Plugin::Software::Policies

If we're using Dist::Zilla for our module authoring, Dist-Zilla-Plugin-Software-Policies can automatically check that we have the correct license files. It uses Dist::Zilla's internal variable license to determine the correct license files.

The Dist::Zilla plugin uses Software-Policies as a backend to do the heavy lifting.

Software::Policies

Software::Policies is a module that provides a framework for defining and enforcing software policies, including licensing. It comes with a pre-defined policy for Perl 5's double license. It can also generate other policy files, such as CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md.

By using Software::Policies, we can programmatically check for the presence and content of our license files.

This approach not only solves the GitHub license detection problem but also helps us maintain consistent and correct licensing across all our Perl modules, integrating it directly into our build workflow.

By configuring this plugin in our dist.ini, we can ensure that our distribution always includes the correct and properly formatted license files, making GitHub (and other license scanners) happy.

Here's a simplified example of how we might configure it in our dist.ini:

[Software::Policies / License]
policy_attribute = perl_5_double_license = true

[Test::Software::Policies]
include_policy = License

This configuration tells the Dist::Zilla plugin Test::Software::Policies to apply the Perl licensing policy, which typically means Artistic License 2.0 and GPL. When we build our distribution with Dist::Zilla, the plugin creates a test file that checks for the existence and content of the LICENSE-Artistic-2.0 and LICENSE-GPL-3 files. During the testing phase, when running dzil test or dzil release, the test files are run, and if the license files are missing or incorrect, the tests fail.

To generate the files, we can run the command dzil policies License or just dzil policies. This creates the files according to the [Software::Policies / License] section of dist.ini.

We cannot create the files automatically during build because then they will only be included in the release, not in the repository. It is precisely in the repository that we need them for GitHub's sake. So the process to create or update the license files has to have this small manual stage.
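
To picture what such a repository-level check has to verify, here is a small standalone sketch in Python. It is not how Software::Policies is implemented: it checks only that the files exist and are non-empty, not that their content is correct. The function name missing_license_files is hypothetical; the file names come from the example above.

```python
import tempfile
from pathlib import Path

# File names taken from the example above.
EXPECTED = ("LICENSE-Artistic-2.0", "LICENSE-GPL-3")


def missing_license_files(repo_root: Path, expected=EXPECTED) -> list:
    """List the expected license files that are absent or empty."""
    missing = []
    for name in expected:
        candidate = repo_root / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing


def demo() -> tuple:
    """Check a throwaway repository before and after adding the second file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "LICENSE-Artistic-2.0").write_text(
            "The Artistic License 2.0\n", encoding="utf-8"
        )
        before = missing_license_files(root)
        (root / "LICENSE-GPL-3").write_text(
            "GNU GENERAL PUBLIC LICENSE\n", encoding="utf-8"
        )
        after = missing_license_files(root)
    return before, after


if __name__ == "__main__":
    print(demo())
```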


Perl in a Business Application

By Mikko Koivunalho on May 3, 2017 7:54 AM under Musings

Perl in a Business Application - Musings of an Architect

Everybody knows that Perl is not the right language for a large scale enterprise application. This is common knowledge, right? But why is that? There are as many explanations as there are people explaining. Everything from "it's a scripting language, therefore slow" to "its free syntax breeds incoherence" to "Perl developers are horrible individualists".

Well, I didn't believe this, and I went on to help in a startup which wants to build some fintech systems, the first aim of which is to integrate with Finnish banks and collect daily payments from a customer's bank account.

It was decided to use Perl as the core language. If Perl is (was) good enough for Goldman Sachs and Morgan Stanley it surely is good enough for us. So off to build a framework!

Two Failed Attempts for System Architecture

We decided to do the web part with Dancer2 and build our own object system with mappers to read from and write to database and a clever filing metaphor with a class called BusinessObjectManager which creates, stores, restores and retires (removes) one object at a time. I had previously worked with a similar kind of metaphor in a C++ system. That was, of course, much more rigid, whereas with Perl we didn't seem to be able to create the safeguards we wanted to prevent future developers from abusing the system. The design just grew more complicated. I considered employing Moose but then instead took a whole different approach.

It was an approach familiar from Java web programming, Java internal services, which concentrates on business processes instead of business objects. I built an example service called SimulateBank with a simple structure. At the top was the Dancer2 package SimulateBank::Web, which created the WWW interface. It used the SimulateBank::Client package, which in turn called the JSON REST API. The API was built with Dancer2 in the SimulateBank::Api package, which internally used SimulateBank::Service, which spoke to the database directly. SimulateBank::Service implemented "services" like find_deposits, create_deposit, report_deposit and cancel_deposit. These are processes which codify the business rules of the company. We do not care about objects, we only care about interfaces and processes.

This second approach was much more convenient and efficient. However, it still required code duplication and synchronization between Client and Service. What's more, I wasn't happy with it because it didn't feel "Perlish". I was reasonably happy with the "super-structure" of the code, the division between Web and Api. But I felt I wasn't using the possibilities of the Dancer2 framework with its plugins, like Dancer2::Plugin::Database and Dancer2::Plugin::Queue. I felt I was doing the same job twice when implementing my own interfaces to the database and the message queue instead of using the readily available Dancer2 facilities.

├── SimulateBank
│   ├── Api
│   │   └── Transactions.pm
│   ├── Client
│   │   └── Transactions.pm
│   ├── Common.pm
│   ├── Service
│   │   └── Transactions.pm
│   └── Web
│       └── Transactions.pm
└── SimulateBank.pm

And then it struck me! I had looked at the whole problem from the wrong angle. My approach was code first. I tried to create a perfect structure for the future Perl programmers to use.

Towards a Perlish Approach

The Perlish approach is result-oriented. After all, do we not pride ourselves on using a language which is fast to program with? Software only matters if it's put into production. But what about all that horrible Perlish hacking, the quick-and-dirty way?

That is the Perl way! To create a working solution today. Not to worry about tomorrow.

In my experience the most far-reaching problems in software development are not caused by programmers, or they are caused by programmers who have been forced into roles that should be filled by others, such as database designer, integration architect and user interface designer. These are the people who should do the worrying. They are paid to design long-term solutions!

Business App Blues

The purpose of most enterprise applications is to collect data, and then deliver or distribute it, or act upon it. So is ours. From the very beginning we planned our data models and the resulting database schema meticulously. For instance, the application has only SELECT and INSERT access to many of our tables, to prevent the loss of past state information. The whole schema has only one sequence, and its value is inserted in every table so that the whole flow and order of operations in the database is trackable.
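
The two schema rules above, insert-only tables and one sequence shared by every table, can be illustrated with a small self-contained sketch. It uses Python's built-in sqlite3 module purely for illustration. The table names are hypothetical, and the triggers stand in for the database permissions (SELECT and INSERT only) that a real deployment would use. This is not our actual schema.

```python
import sqlite3

SCHEMA = """
-- The single schema-wide sequence: every row in every table draws from it.
CREATE TABLE global_seq (id INTEGER PRIMARY KEY AUTOINCREMENT);

CREATE TABLE deposit (
    seq     INTEGER NOT NULL UNIQUE REFERENCES global_seq(id),
    account TEXT    NOT NULL,
    cents   INTEGER NOT NULL
);

-- A cancellation is a new fact; the deposit row itself is never touched.
CREATE TABLE cancellation (
    seq         INTEGER NOT NULL UNIQUE REFERENCES global_seq(id),
    deposit_seq INTEGER NOT NULL REFERENCES deposit(seq)
);

-- Stand-in for granting only SELECT and INSERT on the table.
CREATE TRIGGER deposit_no_update BEFORE UPDATE ON deposit
BEGIN SELECT RAISE(ABORT, 'deposit is insert-only'); END;

CREATE TRIGGER deposit_no_delete BEFORE DELETE ON deposit
BEGIN SELECT RAISE(ABORT, 'deposit is insert-only'); END;
"""


def next_seq(conn):
    """Draw the next value from the shared sequence."""
    return conn.execute("INSERT INTO global_seq DEFAULT VALUES").lastrowid


def create_deposit(conn, account, cents):
    seq = next_seq(conn)
    conn.execute(
        "INSERT INTO deposit (seq, account, cents) VALUES (?, ?, ?)",
        (seq, account, cents),
    )
    return seq


def cancel_deposit(conn, deposit_seq):
    seq = next_seq(conn)
    conn.execute(
        "INSERT INTO cancellation (seq, deposit_seq) VALUES (?, ?)",
        (seq, deposit_seq),
    )
    return seq


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    created = create_deposit(conn, "FI-1", 1000)
    cancelled = cancel_deposit(conn, created)
    try:
        conn.execute("DELETE FROM deposit WHERE seq = ?", (created,))
        blocked = False
    except sqlite3.DatabaseError:
        blocked = True  # the trigger refused to destroy past state
    remaining = conn.execute("SELECT COUNT(*) FROM deposit").fetchone()[0]
    conn.close()
    return created, cancelled, blocked, remaining


if __name__ == "__main__":
    print(demo())
```

Because both tables draw from the same sequence, sorting all rows by seq reproduces the order in which the operations happened, regardless of which table they landed in.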

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.

Fred Brooks, The Mythical Man-Month: Essays on Software Engineering (1975, 1995), https://en.wikiquote.org/wiki/Fred_Brooks

With Fred Brooks' quote in mind, I approached the issue of code first again. Code could not come first. Data and data models came first. Code second. In fact, code comes third, because second place is given to APIs, most notably the REST APIs created with Dancer2.

Interfaces and Silos

Database schemas and system APIs are both interfaces to data. They are the longest lasting parts of the system, and the ones that are the most difficult to change later. Code isn't. It can always be refactored and improved and tested against the unchanging interfaces.

If the interfaces are locked, especially the database schema, and we can be 99% sure that our data is always protected from a malfunctioning program, then it's time to give the programmer a free hand to create the best code he can.

Furthermore, the future of our application is in constant change, as in many startups. Microservices are a natural extension of this way of thinking. Different parts of the system become microservices and silos whose implementation code is private to them. This code can be changed quickly, and it must have no connection to any other silo's code. This allows very radical changes if need be, such as switching the web programming framework, the math packages or even the Perl interpreter version.

And what's best, none of the changes in the code threaten the stability of the whole system. Code quality becomes a matter of code reviews. Our backs (interfaces) covered, coding with Perl is fast and fun, because it just works (TM).

The Perl Way

There were several times when we second-guessed our decision to use Perl. This whole story happened in the course of one year. I consider myself lucky to have been given the chance to go through that whole mental process. I believe I understand Perl a lot better now, perhaps not as a language but as a way of seeing software development and organizing development projects.

The first version was indeed only good for throwing away, as Brooks writes in The Mythical Man-Month. We saved some parts and also some ideas from the second version. Speed is of the essence. The internal services for accessing the database are mostly skipped and the Dancer2 database plugin is used to fetch the data directly. Most action happens right in the same package where the Dancer2 REST interface endpoints are defined, because in most cases the data fetched from or written to the database requires no additional handling. So there is no need to create additional layers, especially when the rigid database schema assures that fetched data is always sound (no nulls, no missing values or missing foreign fields). While quality control was earlier exercised only via code reviews, those are now complemented with API tests and rigid database schema modelling.

[Software engineering is the] establishment and use of sound engineering principles to obtain economically software that is reliable and works on real machines efficiently.

Friedrich Bauer (1972) "Software Engineering", In: Information Processing. p. 71


About Mikko Koivunalho

Perl Programmer for fun and office. CPAN modules and command line tools.
