His story starts with a comment about Perl 6, to which someone replied, "Does anyone actually use Perl 6?" (or words to that effect).
"My first thought," he writes, "was, I bet more people use Perl 6 than Haskell, and it's well known that people use Haskell."
What is the relationship between popularity and viability?
Does a language need to be popular to be viable? Cook compares the popularity of Perl to Haskell and several other languages:
"Common Lisp has been around since 1982, and was standardizing a language that had been in development since 1958."
"Erlang has been around since 1986."
"There is not a huge community devoted specifically to F#, but it shares tooling and libraries with C#…"
Popularity provides many hands to work on projects and many eyes to address issues with the platform. However, both Lisp and Erlang, he says, "have many of the benefits of popularity…accumulated over time" (emphasis mine). And all of these languages would be "safer bets for a production software project than several more popular languages."
As it is with art and entertainment, a long-lived cult-classic technology can be more viable than one that is popular in the moment.
He used the TIOBE index for his stats. I prefer the PYPL index, which uses Google Trends to track the number of searches on the term "<language> tutorial." They explain in their FAQ:
"C programming" is used much more than "PHP programming," because PHP does not need the qualifier. Tutorial is a word used frequently by developers learning any new language: it makes a good leading indicator. What is a "python tutorial," if not a tutorial on the programming language?
So this methodology should measure not how many people are using a given language, but how many are learning it.
Common Lisp, Erlang, and F# are missing from PYPL, because none of them is popular enough. Perl and Haskell do appear on the index, both at <1% and sinking.
However, PYPL (like TIOBE) measures popularity as a share of searches, not their absolute volume. So if language A's share grows because more people are searching for its tutorials, language B's graph line descends, even if language B's overall search volume continues to increase.
When I'm doing research, I'm usually more interested in the absolute volume. Therefore, I took to Google Trends myself. I downloaded data for a number of programming languages (number of searches per week), smoothed the raw search counts with a Gaussian weighting function, and plotted them on a logarithmic y-axis. The languages I looked up included all the ones in John D. Cook's article, plus Go, Perl 6, and Dart. I also included the variation "GoLang tutorial" and added it to "Go tutorial."
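The smoothing step can be sketched in a few lines of Perl. This is a minimal illustration, not the actual script used for the graph; the kernel width ($sigma, in weeks) is an assumed parameter.

```perl
use strict;
use warnings;

# Gaussian-weighted smoothing of a series of weekly search counts.
# $sigma is the kernel width in weeks (an assumed value; pick to taste).
sub gaussian_smooth {
    my ($counts, $sigma) = @_;
    my @smoothed;
    for my $i (0 .. $#$counts) {
        my ($sum, $norm) = (0, 0);
        for my $j (0 .. $#$counts) {
            my $weight = exp(-(($i - $j) ** 2) / (2 * $sigma ** 2));
            $sum  += $weight * $counts->[$j];
            $norm += $weight;
        }
        push @smoothed, $sum / $norm;
    }
    return \@smoothed;
}

my $smoothed = gaussian_smooth([ 3, 10, 4, 8, 2 ], 1.5);
```

Each output point is a weighted average of all the input points, with nearby weeks weighted most heavily, which flattens week-to-week noise while preserving the overall trend.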
And here's what I saw:
(Click on the above image to zoom in.)
I also noticed some other interesting phenomena:
Perl is indeed more popular than Haskell, Lisp, Erlang, etc. Or at least more people are looking up Perl tutorials than tutorials on those other languages.
No one seems to be searching for "Perl 6 tutorial." Literally, zero searches recorded as far as I can determine. Google must be filtering them out somehow or lumping them together with "Perl tutorial," even though the two searches return distinctly different results. So Google does seem to know the difference, but Google Trends is still not showing any searches for P6 tutorials, and I don't believe that no one is searching for them.
Dart is still a minor player. There are some searches for "Dart tutorial," but so few that I didn't even include it in the graph.
There was a noticeable spike in searches for "Go tutorial" for about a month during July 2016. GopherCon 2016 was held July 11th through the 13th. Probably not a mere coincidence.
There were quite a few searches for "Go tutorial" back in 2004, years before the language was even invented. I'm not sure what those searches were pulling up back then, but I'm pretty sure current searches are talking about the programming language. (This also belies the PYPL's idea that "What is a 'go tutorial' if not a tutorial on the Go programming language?")
I concur with John D. Cook on Haskell, Lisp, Erlang, and F#.
GoLang is a viable language, being almost a decade old now, supported by Google, and increasingly used in production projects. It seems to me its growing popularity is a symptom (not a cause) of its viability.
The most interesting manifestation of the popularity-viability conundrum, however, is Dart. Despite Randal Schwartz's enthusiasm for Dart, the language is still seeking its niche. It's been around for 7 years, but it has not yet reached the adoption chasm, much less crossed it. It may have found a "killer app" in Flutter (which is a truly cool framework for mobile apps, and Randal Schwartz's recent lightning talk on Flutter frankly didn't do it justice). But whether Dart will make it big remains to be seen.
Even so, despite its lack of popularity and its newness, Dart is a viable platform for production projects.
Yes, people actually do use Dart for production projects, mostly in web and mobile development.
If Dart is viable, then how much more so Perl?
This post originally appeared on The Perl Shop blog as “How Viable is Perl?”
BEGIN {
    my $bak;
    *A:: = $bak;
    BEGIN {
        $bak = \%A::;
        *A:: = \%a::b::c::d::e::f::;
    }
    sub A { return 'a::b::c::d::e::f'; }
    A::g();
    A->g();
}
That could also conceivably be done in a package.
What am I missing?
The behavior depends on when *A:: = $bak runs.
For example:
BEGIN {
    my $bak;
    *A:: = $bak;
    BEGIN {
        $bak = \%A::;
        *A:: = \%a::b::c::d::e::f::;
    }
    A::g();
    A->g();
}
This prints:
I am a::b::c::d::e::f::g.
I am A::g.
...because the call to A::g() is resolved at compile time, but the call to A->g() is resolved at runtime.
How can we use long lists of symbols from an imported package and still keep the code readable?
I usually prefer use statements of the form:
use My::Module qw(symbol1 symbol2 symbol3);
Except for specially understood modules, like Moose and Test::More, I don't like to just import everything. Rather I like to explicitly call out only the specific symbols I need.
But what if you need to:
use My::Module qw(
symbol1 symbol2 symbol3 symbol4 symbol5 etc and so many symbols
that it takes up several lines all the time in every package
that uses it
);
There are a few alternative approaches.
We could just fall back on importing everything.
So… Five years from now, I'm going to revisit that code, and I'll see:
wiggle_foo_gadget(WIGGLE_WOBBLE);
And my first question is going to be, "Which of those 17 modules does that come from?!" And then I'll need to grep through the entire codebase to find it.
That's why I like to call out the specific symbols being imported, and that's our standard practice here at The Perl Shop. If the symbol is explicitly imported, a simple search through the current module will find it, and it will be clear where the symbol came from.
Additionally, when we import everything, we have zero control over what symbols are imported. If a later version of a used module exports more symbols, then they'll automatically get imported in our package, whether we want them there or not. This could result in name clashes and action-at-a-distance bugs down the line.
So alternative option #2…
That is, we can just use the full package name. So:
use My::Module ();

do_something_with($My::Module::scalar_var);
do_something_else_with(@My::Module::array_var);
My::Module::do_the_hustle();
This is generally my preferred alternative. Additionally, when reading My::Module::do_the_hustle(), having the module cited adds context and can make the code easier to read.
But what if My::Module is actually more like My::External::Module::With::Several::Levels? Copying and pasting that monstrosity everywhere is going to get pretty confusing pretty fast. And in my sample code, this was indeed what I was facing. So not a viable alternative.
This is a well-established idiom, implemented directly by Exporter.
With a tag, the use line would look something like this:
use My::Really::Long::Module::Name qw(:some_symbols);
This is much more concise, and it gives us at least some level of control over what is being imported. But it still doesn't make clear where the symbols came from.
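On the exporting side, a tag is defined with Exporter's %EXPORT_TAGS hash. A minimal sketch (the symbol names and bodies are placeholders):

```perl
package My::Really::Long::Module::Name;

use strict;
use warnings;
use Exporter qw(import);

# :some_symbols pulls in this whole group at once.
our @EXPORT_OK   = qw(symbol1 symbol2 symbol3);
our %EXPORT_TAGS = ( some_symbols => [qw(symbol1 symbol2 symbol3)] );

sub symbol1 { 'one' }
sub symbol2 { 'two' }
sub symbol3 { 'three' }

1;
```

A caller saying use My::Really::Long::Module::Name qw(:some_symbols) then gets everything listed under the tag.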
This is the option I chose in my sample code. Mostly I did so for the reasons above. In production code, I would probably have opted for the very last of the options below ("Lexical aliasing"). But in this case, it was sample code for a testing book, and I really didn't want to take a section of the book to explain that non-standard idiom, because it had nothing to do with testing.
This is very nonstandard, but one thought that occurs to me is that one could export read-only hashes of symbols.
So in My/Really/Long/Module/Name.pm:
package My::Really::Long::Module::Name;

use Readonly;
use Exporter qw(import);
our @EXPORT_OK = qw(%some_symbols);

Readonly::Hash our %some_symbols => (
    symbol1 => \&symbol1,
    symbol2 => \&symbol2,
    symbol3 => \&symbol3,
    # ... and so forth
);
Then we can:
use My::Really::Long::Module::Name qw(%some_symbols);

$some_symbols{symbol1}->();
$some_symbols{symbol2}->();
$some_symbols{symbol3}->();
That's a little awkward, but it is succinct (at least in the using module) and does identify the symbol source.
But it doesn't exactly work with variables. That is, as soon as you put a reference to anything in a read-only data structure, the anything itself becomes read-only.
There's a better alternative…
That is, use My::Really::Long::Module::FooBar and then refer to it as simply FooBar.
We can accomplish this with Package::Alias:
use Package::Alias FooBar => 'My::Really::Long::Module::FooBar';

FooBar::make_it_rain();
say $FooBar::is_raining;
The Package::Alias line above is actually syntactic sugar for:
BEGIN {
    use My::Really::Long::Module::FooBar;
    *{FooBar::} = \*{My::Really::Long::Module::FooBar::};
}
The disadvantage here is that the alias is global, not lexical. That is, if one module aliases FooBar, then another module can't also alias FooBar (whether or not they're aliasing to the same target). Package::Alias will warn if you attempt that and then ignore the second and subsequent attempts.
(Note also that Package::Alias will load the target package only if it hasn't been used previously. And it actually does use the default use, which imports all available symbols into the caller's namespace, but only if no other package has formerly loaded it. For this reason, this functionality is only appropriate for packages that do not export symbols. Otherwise, you need to explicitly use My::Module () before using Package::Alias.)
What I'd really like is to alias the target in lexical scope.
namespace::alias claimed to do that. It uses Perl internals to accomplish its black magic. Unfortunately, there are several critical documented issues, and it has not successfully built since Perl 5.18. It was last released in 2012. So not an option.
aliased does something similar. It creates a short-named subroutine that returns the long name of a target package. It also has magic to figure out what the subroutine should be named and other features. I haven't tried the module, but it looks like quite an elegant design: simple and powerful. Unfortunately, it really only works for object-oriented code. So a tool for the toolbox, but not applicable to this blog post.
The best I was able to come up with was this:
use My::Really::Long::Module::FooBar ();

my $FooBar = \%My::Really::Long::Module::FooBar::;

$FooBar->{do_something}->();
$FooBar->{do_something_else}->();
my $is_done = $FooBar->{is_it_done}->();

# Postfix dereference requires Perl 5.24+ (or the postderef feature):
$FooBar->{scalar_var}->$* = 'foo';
push $FooBar->{array_var}->@*, qw(bar baz qux quux);
$FooBar->{hash_var}{spam} = 'eggs';
This is slightly awkward, but it works. And it's a nonstandard idiom, but it's not so confusing that a competent Perl programmer can't quickly figure out what's going on. And maybe if we used it more, it would catch on.
This is an idiom I will keep in mind for future projects.
I think there are a couple of reasons why programmers use Test Hierarchy.
Test Hierarchy may appear to “just make sense” at first blush. After all, you have a hierarchy of classes under test—superclasses and subclasses—and you have a collection of test classes. It seems very symmetrical to have the test classes mirror the classes under test.
However, there’s no design justification for the test classes to be arranged in a parallel hierarchy. The easiest way to see this is to consider what happens when developers get tired of Test Hierarchy. What do they do? They drop back to procedural tests, with no inheritance at all. If you don’t need Test Hierarchy to test object-oriented code using procedural tests, why do you need it when using Test::Class? Answer: You don’t.
Test inheritance should only be used to meet the needs of the tests, not the needs of the code under test.
This usually means that if we have test superclasses, they specifically contain shared setup and teardown code or test utility functions. See the Testcase Superclass pattern, by which a test class can inherit common functionality from an abstract test superclass. Using this pattern, the test hierarchy is organized in order to share common code across the entire project’s tests or an entire subsystem’s tests.
(But use SharedTestModule qw(shared_function) is still preferred over inheritance, because it more explicitly states what is being shared and where.)
I’ve also seen programmers appeal to the Liskov Substitution Principle. This is the idea that if Bat is a subclass of Mammal, then any code that requires a Mammal can be handed a Bat without ill effects. Barbara Liskov and Jeannette Wing formally defined it like this:
Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T. (“Behavioral Subtyping Using Invariants and Constraints.” Barbara Liskov, Jeannette Wing. CMU-CS-99-156. July 1999.)
In other words, a subclass adheres to the same interface contract as its superclasses.
Some programmers will say that if Bat is a subclass of Mammal, then a Bat can do anything that a Mammal can. In other words, a Bat “is a” Mammal. Therefore, it directly follows that BatTest should test all the Mammal behaviors that Bat inherits.
No, it doesn’t, and no, it shouldn’t.
To understand why, consider a couple simple cases.
If a Mammal has a care_for_young() method, then every Bat must also be able to care_for_young(). This does not mean that the way bats care for their young is exactly the same as every other mammal. In fact, it’s distinctly different, because baby bats have needs that are distinct from the needs of other baby mammals. Indeed, Mammal::care_for_young() may even be an abstract method that dies with a “must be implemented in subclass” error.
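Such an abstract method might look like this sketch (the class names follow the running example; the body is illustrative):

```perl
package Mammal;

use strict;
use warnings;

sub new { bless {}, shift }

# Abstract method: every subclass must provide its own implementation.
sub care_for_young {
    my ($self) = @_;
    die ref($self) . " must implement care_for_young()\n";
}

1;
```

A subclass that forgets to override care_for_young() dies the moment the method is called, which is exactly the contract-without-shared-behavior point being made here.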
Similarly, every Animal can move(). That means every Mammal also can move(), because Mammal is a subclass of Animal. By extension every Bat can also move(), because Bat is a subclass of Mammal. Now explain to me how the way a bat moves is identical to the way a sloth moves or the way a tarantula moves. It isn’t.
A subclass adheres to the same interface contract as its superclasses. It does not necessarily implement identical behaviors.
Therefore, just because a Bat “is a” Mammal, that doesn’t mean that a BatTest “is a” MammalTest. Actually, no, a BatTest is not a MammalTest. Not even close. Both BatTest and MammalTest are just tests. Or they might, at most, be derived from an OrganismTest abstract class which contains helper methods to set up and manage test fixtures common to all organisms.
Test::Class can test anything straight Test::More can, and vice versa. The power in Test::Class is not, as legend says, in testing object-oriented code. The power Test::Class brings is its ability to collect related test methods together, run them independently, and inherit setup and teardown. (I’ll explore more of these details in Testing Strategies for Modern Perl.) Each test class should be derived directly from Test::Class or from an abstract subclass thereof. Never inherit test methods. Just don’t do it.
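A sketch of that recommended shape, deriving directly from Test::Class and inheriting only setup machinery, never test methods. (Bat here is a stub class, for illustration; Test::Class is assumed to be installed.)

```perl
package Bat;                        # stub class under test
sub new { bless {}, shift }
sub fly { 1 }

package BatTest;
use strict;
use warnings;
use parent 'Test::Class';           # derive directly from Test::Class
use Test::More;

# Runs before every test method, so each test gets a fresh fixture.
sub setup : Test(setup) {
    my ($self) = @_;
    $self->{bat} = Bat->new;
}

# An independent, self-describing test method.
sub fly_works : Test(1) {
    my ($self) = @_;
    ok $self->{bat}->fly, 'expected fly() to return true';
}

package main;
Test::Class->runtests;
```

Note there is no MammalTest or AnimalTest anywhere in the inheritance chain: the test class answers only for the code in Bat.pm.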
Peace, love, and may all your TAP output turn green…
This post originally appeared on The Perl Shop blog as “Why Programmers Use the Test Hierarchy Antipattern.”
Testing a 6,000-line module is as difficult as it sounds. Long methods and large classes are code smells that made Martin Fowler’s Refactoring book. It’s often nigh impossible to unit-test a monolith. It might be time to do some separating of concerns. This will also—no surprise for testing aficionados like me—make it easier to maintain and extend the framework. Rocky notes some specifics under “The need for modularity” in his post, e.g., don’t repeat yourself, separate data from presentation, separate interface from implementation, and separate version-specific behaviors from version-agnostic behaviors.
Understanding a 500-line test is as difficult as it sounds. Long, complex tests made Gerard Meszaros’s list of test smells, in his book xUnit Test Patterns. And as we’ll see as we go on, we can’t ignore the stink just because we’re Test::More purists: his advice applies equally to class-based and procedural testing.[1]
Tests should not depend on each other. Rocky noted that “the slightest error” would generate thousands of lines of follow-on test failures. He didn’t explicitly say so, but I suspect that indicates that the first test failure left the fixture in an invalid state, causing all other tests to fail. This is what Meszaros calls “Data Sensitivity.” Each test needs to start with a clean fixture.
Each test should test one clear feature or unit of code. This is part of what causes “frail” tests, what Meszaros calls “Fragile Tests.” One big takeaway from Rocky’s post is the technique of round-trip testing. That is, you take the output of Deparse, which should be executable Perl, run it, and see if it generates the same behavior as the original snippet of code. He uses this with self-testing scripts so that the decompiled test code can verify itself. We can use a similar principle to validate HTML or other markup, for example, by asking what output we expect the markup to produce and rendering only those aspects of the generated markup.
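The round-trip idea, reduced to its essence with core B::Deparse (a sketch; Rocky’s actual harness is considerably more elaborate):

```perl
use strict;
use warnings;
use B::Deparse;

# Decompile a sub, recompile the generated text, and check that the
# copy behaves like the original.
my $orig    = sub { my ($x) = @_; return $x * 2 + 1 };
my $deparse = B::Deparse->new;
my $text    = $deparse->coderef2text($orig);   # a "{ ... }" block of Perl
my $copy    = eval "sub $text" or die $@;

print "round trip OK\n" if $copy->(20) == $orig->(20);
```

If the decompiler is correct, the recompiled sub is behaviorally indistinguishable from the original, which is exactly what a self-testing script can assert.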
Each test failure should clearly identify which feature or piece of code is broken. This is related to the bullet above but from the perspective of the programmer running the tests. If you can’t tell what went wrong, your tests are not helping you understand or write your code. Meszaros calls this “Obscure Test,” and one consequence is that you can’t use your tests as documentation. It can also result in buggy tests and high maintenance costs. This is why I prefer the four-phase test structure—setup, test, verify, and cleanup—with which you can see at a glance what each test does and why it failed. It’s also important to use complete assertion messages: each test failure should read like a good bug report, indicating what action we took, what we expected to happen, and what occurred instead.
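Here is what the four-phase shape looks like in a plain Test::More script. A generic sketch: the “code under test” is just a line count over a temp file.

```perl
use strict;
use warnings;
use Test::More tests => 1;
use File::Temp qw(tempfile);

# 1. Setup: build a clean, private fixture for this one test.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\n";
close $fh;

# 2. Exercise: run the code under test.
open my $in, '<', $file or die $!;
my $lines = () = <$in>;
close $in;

# 3. Verify: the assertion message reads like a bug report.
is $lines, 2, 'counted the lines of a two-line file; expected 2';

# 4. Teardown: UNLINK => 1 removes the fixture automatically.
```

Because each phase is visually distinct, a failure report tells you at a glance what was set up, what was done, and what was expected.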
All in all, an insightful read.
Peace, love, and may all your TAP output turn green…
This post originally appeared on The Perl Shop blog as “Testing Insights from B::DeparseTree.”
[1] This distinction is a bit of a myth as well, which I touch on in Testing Strategies for Modern Perl. Test::Class and other xUnit-like frameworks don’t supplant procedural testing practices. Rather, they add new tools by which you can manage your tests. In particular, they can help you manage a collection of small, well-defined test methods. And yes, I can already hear you saying, “I do that with subtest $test_name => sub {}.” Yes, that’s exactly what I mean. I just happen to use sub test_name : Test() {} instead. Plus I can (a) run a named test method in isolation from the command line, (b) set up test fixtures before running each method in a module with a line of code, (c) automatically clean up after each test method (even failing ones) with a line of code, (d) inherit test fixtures across a whole set of test classes, and (e) abort a failed test method with a single line of code without affecting any other test methods.
A unit test, by definition, tests a unit of software, no more, no less. On the one hand, we have unit tests, which test a single module or class. On the other hand, we have integration tests, which test how multiple modules or classes work together. We want each unit test to poke and prod only the class that it tests. We want each subsystem integration test to test a natural subsystem, e.g., the data-export subsystem. We want our system tests to test the whole system. And we don’t want any test to be affected by any other units, subsystems, or systems.
When our software depends on other software that may change over time, our tests may suddenly start failing because the behavior of the other software has changed. This problem, which is called Context Sensitivity, is a form of Fragile Test…Whatever application, component, class, or method we are testing, we should strive to isolate it as much as possible from all other parts of the software that we choose not to test. This isolation of elements allows us to Test Concerns Separately and allows us to Keep Tests Independent of one another. It also helps us create a Robust Test by reducing the likelihood of Context Sensitivity caused by too much coupling between our SUT [system under test] and the software that surrounds it. (xUnit Test Patterns: Refactoring Test Code. Gerard Meszaros. Addison-Wesley Professional, 2007.)
Test Hierarchy produces tests that purport to be unit tests but that don’t actually test isolated units.
Let’s say we have a Bat class that is a subclass of Mammal. If the tests use Test Hierarchy, then BatTest not only tests the Bat code, but also the superclass Mammal code and its superclass Animal.
This seems to make some intuitive sense, because after all, Bat can do all the things that Mammal and Animal can do, all the methods that it inherits from those classes. But this intuition misses an important distinction. The unit is whatever code is in the Bat.pm module, not whatever the Bat class can do.[1]
When a Bat unit test fails, it should indicate that we made a mistake in Bat.pm, not any other module.
Our bad BatTest doesn’t just test the code in Bat.pm, but also the code in Mammal.pm and Animal.pm. This makes it an integration test (not a unit test), because it doesn’t just test its own module but other modules as well.
And it’s an integration test we don’t need. Generally, we write integration tests that exercise some system feature, like “export Foo data in CSV format.” This might involve setting up the Foo data fixture, invoking the appropriate export feature, then validating the CSV file that it generates. But we don’t need module tests that invoke low-level methods on other modules.
In fact, who said that there’s only one test per class?
Here at The Perl Shop, we generally create a separate test module per method or feature. So we’d create move.t, eat.t, and breathe.t, each of which tests a different Animal method. This way, we can group together tests by class method, and easily self-document which tests correspond with which feature.[2]
We also inline our test classes in our .t scripts, which keeps the test code close to the test script and cuts in half the number of files we need to maintain. It also makes it impossible to subclass them.
We’ve had great success with these practices, and they’re fundamentally incompatible with Test Hierarchy.
You might have also noticed this in Testing Strategies for Modern Perl. In chapter 2, we create a TicTacToe::BusinessLogic::Game class, which is tested by new.t, board.t, and move.t.
In the concluding post, I’ll reflect on some of the reasons developers use Test Hierarchy, and why these reasons don’t stack up.
Peace, love, and may all your TAP output turn green…
This post originally appeared on The Perl Shop blog as “Test Hierarchy Produces Poor Unit Tests.”
[1] Formally, Bat.pm doesn’t directly define the Bat class. Rather, it defines an implicit “subclass mixin”—all the methods and attributes in Bat.pm that are added to (“composed with”) its superclass Mammal. The object system, then, composes the Bat mixin with Mammal to create the full Bat class. Similarly, Mammal.pm conceptually defines a Mammal mixin that is composed with its superclass Animal in order to create the Mammal class.
[2] Another alternative is to have a separate test class per fixture, which is useful if different test methods have different fixture requirements.
Test::Class is particularly good at testing object-oriented code, or so it is said. You can create a hierarchy of test classes that mirrors the hierarchy of classes under test. But this pattern, common in Perl projects, is conspicuously missing from the rest of the xUnit world, and with good reason.
We've all heard of it.
Our project has a class Animal that implements method move() (because all animals can move). This class has a test class AnimalTest that derives from Test::Class and has a test for $animal->move().
So far so good.
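In code, the arrangement so far might look like this (a sketch; Animal is a stub, and Test::Class is assumed to be installed):

```perl
package Animal;                     # stub class under test
sub new  { bless {}, shift }
sub move { 1 }

package AnimalTest;
use strict;
use warnings;
use parent 'Test::Class';
use Test::More;

# One test for $animal->move().
sub move_works : Test(1) {
    my ($self) = @_;
    my $animal = Animal->new;
    ok $animal->move, 'expected move() to return true';
}

package main;
AnimalTest->runtests;
```

Nothing objectionable yet; the trouble starts when we mirror the class hierarchy in the test classes.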
Our project also has a class Bat that derives from Mammal, which derives from Animal, and implements method fly(). So we create a test class BatTest that derives from MammalTest, which in turn derives from AnimalTest, and has a test for $bat->fly(). That means that BatTest not only exercises all the behavior of Bat but also of Mammal and Animal, because it inherits all the tests in its ancestors.
Wow! What a cool feature! We get all that testing functionality essentially for free by inheritance!
And if the foregoing description sounded confusing, just imagine how good it's going to get as we extend the object hierarchy.
Repeat for umpteen different classes.
This arrangement of test classes is what I mean by Test::Class Hierarchy, or more generally, Test Hierarchy.
Multiple prominent sources in the Perl community recommend Test Hierarchy.
Whenever I bring up Test::Class, I'm sure to hear at least one person complain about "the rabbit hole of inheritance," as jnap once put it in a conversation. Of course, not every project misuses Test::Class and its support for inheritance, but as he noted, "I've just seen it so wildly abused." (And to be fair, this is not the only abuse of Test::Class, but it's the pattern I'm examining at the moment.)

Enough Perl projects use Test Hierarchy that it pops up in criticisms of Test::Class itself, and the negative effects are a significant point when they do.
It bears noting, however, that this practice is strongly discouraged in the rest of the programming world. It's so rare, in fact, that Gerard Meszaros doesn't even mention it in his book xUnit Test Patterns.
Rather, the recommended practice is to inherit our test classes directly from Test::Class (or possibly from a project- or subsystem-specific test base class—but that's a different post). In general, we use as little hierarchy as possible, and whatever hierarchy we do use is organized according to the needs of the tests, not the needs of the system under test. And we never inherit test methods (although we may inherit setup and teardown code).
In summary, we write independent test classes, and we never inherit test methods.
That qualifies Test Hierarchy as a Perl antipattern.
Peace, love, and may all your TAP output turn green…
This post originally appeared on The Perl Shop blog as "Test::Class Hierarchy Is an Antipattern."