December 2012 Archives

Refactoring When Tests Are Failing

Today I'm working on some code. That's not surprising. It's also not surprising that the person I'm working on the code with is on holiday until the new year starts. Before he left, he made some significant changes to optimize our code, but for one subsystem he didn't have the time to apply his changes. This means that we have 7 tests failing out of 397. In a quick check we found there were plenty of conflicts between his code and mine and my work was generating even more conflicts. This meant we had two choices for how to deal with this while he was gone:

  1. I could continue my work with passing tests and deal with even more conflicts when he returns
  2. I could merge his code, deal with some failing tests, get rid of the conflicts and have a much faster code base

We opted for the latter, but when running the tests, how do I know if I broke something? This is one reason that we argue that all tests must pass for a test suite: do you scan through all of the broken tests to see if they've somehow broken differently? If you do, you'll start to ignore failures because that's normal. Eventually you'll get serious failures that showed up in testing but that you didn't notice.

So what follows isn't perfect, but if you have to temporarily deal with failing tests, it's far better than just trying to remember which tests are "allowed" to fail.
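One way to picture the idea, as a minimal sketch rather than the approach the full post goes on to describe: keep an explicit list of the tests that are currently allowed to fail and compare it with the tests that actually failed on the latest run. The helper name compare_failures and the test file names are made up for illustration, and how you collect the list of failing tests (from your test runner's output, for example) is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: compare the tests that failed on this run against
# the tests we have agreed are allowed to fail for now.
sub compare_failures {
    my ( $known, $actual ) = @_;
    my %known  = map { $_ => 1 } @$known;
    my %actual = map { $_ => 1 } @$actual;

    # Failing now but not on the allowed list: probably something I broke.
    my @unexpected = grep { !$known{$_} } sort keys %actual;

    # On the allowed list but no longer failing: time to shrink the list.
    my @fixed = grep { !$actual{$_} } sort keys %known;

    return { unexpected => \@unexpected, fixed => \@fixed };
}

# The file names below are invented for the example.
my $report = compare_failures(
    [ 't/cache.t', 't/export.t' ],    # allowed to fail
    [ 't/cache.t', 't/login.t' ],     # failed on this run
);

print "Unexpected failure: $_\n" for @{ $report->{unexpected} };
print "Now passing: $_\n"        for @{ $report->{fixed} };
```

The point of returning both lists is that the "allowed" list should only ever shrink: anything in unexpected needs attention right away, and anything in fixed should be removed from the list so it cannot quietly break again.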

What are the files in a CPAN distribution?

I generally have no problem understanding what's in a CPAN distribution and why, but lately (for years, now), it seems that tools are building files that I don't know about. Is there a canonical place to find information about what absolutely, definitely needs to be in a CPAN distribution? I generally rely on my tools to do this stuff for me, but honestly, what are META.json, MYMETA.json, META.yml and MYMETA.yml? And then I wind up with files like MANIFEST.bak and MANIFEST.SKIP.bak, and I honestly don't know the format of many of these files, why they're duplicated or where to look up this information. Which of these files should I commit to source control and which should git ignore?

It's an embarrassment because honestly, I should know this stuff, but I don't. Surely there's a place where I can find all of this explained at once, perhaps in conjunction with a description of the files in my .cpan, .cpanm directories?

Test::Class + Moose?

I recently wrote about finding duplicate code in Perl and I've just uploaded Code::CutNPaste to the CPAN. Hopefully folks will find it useful.

However, what's not yet on the CPAN is Test::Class::Moose. People have asked me repeatedly how to mix the two, so I whipped up a quick alpha.

More On Finding Duplicate Code in Perl

While at the Quack and Hack, I wrote Code::CutNPaste. This tries to find duplicate code in Perl and does a fairly decent job, right down to finding code where people have changed the variable or subroutine names.

At the suggestion of Liz, I added a --jobs switch. For one project with almost 400 .pm files, I originally had this:

time find_duplicate_perl lib > report.txt
real    65m52.922s
user    39m2.998s
sys     24m27.776s

I now have this (a multi-core machine helps):

time find_duplicate_perl --jobs 4 lib > report.txt
real    22m49.700s
user    41m22.387s
sys     34m36.146s

It finds plenty of duplicated code, too. It also now does a reasonable job of not reporting this as a duplicate (see the (misspelled) --threshhold switch):

            };           |         };
        }                |     }
        return \@result; |     return \@result;
    }                    | }
    sub _confirm {       | sub _confirm {

Plus, one person is already threatening patches. When it's a bit more reasonable, I'll put it on the CPAN for you.

Finding Duplicate Code in Perl

When working with legacy code, it's useful to have a variety of tools that let you better understand your code base. For example, I recently wrote about finding unused subroutines. That heuristic approach was fine because I was still going to inspect the code manually rather than automatically remove the code.

So now I hacked out a rough "duplicate code finder" for Perl. It focuses on cut-n-drool code and has found more than I would have thought (even in my code!). If a developer changes variable names, it won't find it, but if I hacked around with B::Deparse, I could fix that, too.
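To make the idea concrete, here is a minimal sketch of one naive way to spot copied code. It is not the implementation behind the tool described here: it simply trims whitespace, drops blank lines, slides a fixed-size window over each file and reports any window of text that turns up more than once. The helper name find_duplicate_chunks, the window size and the sample sources are all assumptions for the example. Like the rough version described above, it is defeated by renamed variables.

```perl
use strict;
use warnings;

# Hypothetical helper. Takes a hash ref of name => source text and a
# window size in lines; returns a hash ref mapping each repeated chunk
# of text to the "name:line" locations where it starts.
sub find_duplicate_chunks {
    my ( $sources, $window ) = @_;
    my %seen;

    for my $name ( sort keys %$sources ) {

        # Keep [ line number, trimmed text ] for every non-blank line.
        my @lines;
        my $line_no = 0;
        for my $line ( split /\n/, $sources->{$name} ) {
            $line_no++;
            $line =~ s/^\s+|\s+$//g;
            push @lines, [ $line_no, $line ] if length $line;
        }

        # Slide a window over the lines and index each chunk by its text.
        for my $i ( 0 .. $#lines - $window + 1 ) {
            my $chunk = join "\n",
              map { $_->[1] } @lines[ $i .. $i + $window - 1 ];
            push @{ $seen{$chunk} }, "${name}:$lines[$i][0]";
        }
    }

    # Only chunks seen in more than one place count as duplicates.
    my %duplicates =
      map { $_ => $seen{$_} } grep { @{ $seen{$_} } > 1 } keys %seen;
    return \%duplicates;
}

# Invented sample input: two subs that differ only in their names.
my %sources = (
    'A.pm' => <<'END',
sub total {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}
END
    'B.pm' => <<'END',
sub sum_all {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}
END
);

my $duplicates = find_duplicate_chunks( \%sources, 3 );
for my $chunk ( sort keys %$duplicates ) {
    print "Duplicated at: @{ $duplicates->{$chunk} }\n$chunk\n\n";
}
```

Because the windows overlap, one long copied block shows up as several overlapping reports; merging those into a single range, and ignoring trivial chunks made mostly of closing braces, is the kind of refinement a real tool needs.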

About Ovid

Freelance Perl/Testing/Agile consultant and trainer. See http://www.allaroundtheworld.fr/ for our services. If you have a problem with Perl, we will solve it for you. And don't forget to buy my book! http://www.amazon.com/Beginning-Perl-Curtis-Poe/dp/1118013840/