Refactoring Very Old Perl 5 in Perl 6

By Aaron Baugher on August 6, 2015 11:28 AM

Back when I was first learning Perl, I'd been doing Unix system administration for a couple years, and one command I ran a lot was this one:

ps auxww | grep something

(On some systems it was 'ps -ef'.) That would get a full listing of all running processes and grep them for "something." I soon got tired of typing all that, so I made a shell alias:

alias pst='ps auxww | grep '

Then I could just run pst something, so it saved typing. But it still wasn't great. It left out ps's header line that showed what all the columns were, and they'd vary from one OS to another, so it wasn't always easy to tell from the data. Also, the grep process itself would show up in the list, which was annoying. (I already knew it was running, because I ran it.) So one of the first Perl scripts I wrote was this one, which I've been using ever since because it worked, even though the code is embarrassingly bad now:

#!/usr/local/bin/perl

open(IN,"ps axuww |")||die("Unable to get process listing\n");

$header = <IN>;
print "$header";

while(<IN>){
    next if ($_ !~ /$ARGV[0]/);
    s/^\s+|\s+$//g;
    @v=split(/\s+/);
    next if $v[1] == $$;
    print;
    print "\n";
    $trs+=$v[4];
    $drs+=$v[5];
    $size+=$v[6];
    $swap+=$v[7];
    $rss+=$v[8];
    $shrd+=$v[9];
    $lib+=$v[10];
    $dt+=$v[11];
}

print("-"x80);
print("\n$header");
printf("Totals:%22d%6d%6d%6d%6d%6d%6d%6d\n",$trs,$drs,$size,
       $swap,$rss,$shrd,$lib,$dt);

So much ugly by my standards now: no warnings or strict, unnecessary parentheses all over the place, 2-arg open with global filehandle, unnecessary $_, giving split its default argument, no my on variables, and more. Worst of all, I was accumulating totals on some of the columns, and that doesn't even make sense! Maybe it did on an OS I was using back then, but it doesn't on any I have now.

So it was long overdue for an update, and I thought I might as well do it in Perl 6. Here is the result, with numbered comments below:

#!/usr/bin/env perl6
use v6;

sub MAIN( $string ){                   # 1
    my $p = run 'ps', 'auxww', :out;   # 2
    my $header = $p.out.get;           # 3
    say $header, '-' x 80;             # 4

    for $p.out.lines {                 # 5
        next unless m/ $string /;      # 6
        .trim;                         # 7
        my @v = .words;                # 8
        next if @v[1] == $*PID;        # 9
        .say;                          # 10
    }
    say '-' x 80, $header;             # 11
}

I feel better just looking at it. The accumulation of totals is gone entirely, and here are notes on the rest:

(1) The MAIN sub handily replaces @ARGV (though I think that may still be available). By specifying an argument for it, I automatically get in $string what would have been in $ARGV[0], plus perl6 automatically throws an error if I don't supply it. So no need for a "die unless" on command line arguments anymore.

(2) This opens a Proc object ($p) to a running process provided by IO.run(). The :out adverb says I want the output of the process. By the way, look at the Proc docs for examples of this, not IO.

(3) Calling get() on the output stream (an IO::Handle object) returns a single line from the stream. That grabs the line of column headers that ps provides.

(4) Print out the header (which still has its newline), then a separator line of 80 hyphens. The 'x' operator still works on strings like it did in Perl 5, but see the 'xx' operator for repeating lists.

(5) Using IO::Handle::lines() in a for loop is more-or-less the equivalent of while(<$fd>) in Perl 5: it gets one line at a time until exhausted. The line will be in $_, and that will be the default object for any methods called as .method, like .trim and .say. Oh, and it auto-chomps, which is nice.

(6) This looks like Perl 5, except whitespace is allowed in the regex. Also, the string in $string will not be interpolated as it would be in Perl 5, so whatever argument I provide will be searched for literally. If I wanted to be able to enter patterns with meta-characters and have them interpolated, I'd need to put angle brackets around it, like this: <$string>. As I understand it, without the angle brackets, scalar values in regexes are automatically quotemeta'd.

(7) Str.trim() is the equivalent of that ugly regex in my original script, which trims off whitespace from both ends. I probably don't need that on my current systems, but I think I ran into some systems where ps didn't left-justify the first column.

(8) Str.split() no longer has a default pattern of splitting on whitespace, but the new Str.words() does that now.

(9) $*PID replaces the old $$, containing the process ID of the script itself. I don't want that in the output, so I skip the line that has that in the PID (process ID) column.

(10) If it reached this point, print the line with newline.

(11) Reprint the separator and header at the bottom, so I can see them there if the top scrolls out of the terminal.

And that's it! Corrections, suggestions, and questions welcome.

12 comments

Tagged as:

Perl 6, processes, shell, Unix

12 Comments

Liz | August 6, 2015 4:27 PM | Reply

Great post! Some comments:

1. You're using .trim the wrong way: .trim does *not* modify $_. It should probably warn in that case, actually.

2. You don't need .trim at all, as .words ignores any whitespace at the begin or at the end.

3. You can directly index on .words: since .words is lazy, it would also mean it would parse until the 2nd element is found.

4. If you're just looking for the occurrence of a string in another string, it's faster to use .index

So I would write lines 6-10 as:
.say if .index($string).defined && .words[1] != $*PID
We should probably have a better way to check for existence of a substring, I agree :-)

Neil Bowers | August 6, 2015 6:14 PM | Reply

While waiting for a late meeting I hacked up a quick perl5 version without using any CPAN modules:



    my @lines  = `ps auxww`;

    my $header = shift @lines;

    my $rule   = '-' x length($header)."\n";

    print $header, $rule;

    print grep { /$ARGV[0]/ && !/^\S+\s+$$\s/ } @lines;

    print $rule, $header;

That regex for ignoring the current process probably isn't portable, but you could use something like P9Y::ProcessTable to portably get at process info.
In version 2 I'd probably only print the footer if more than N lines matched.

So far I'm finding it just slightly disconcerting seeing lines like:



    .trim;

probably down to javascript, where something like that looks like part of a chain of calls, rather than invoking on the topic.

Martin Hradil | August 6, 2015 6:34 PM | Reply

> We should probably have a better way to check for existence of a substring, I agree :-)

We actually do now :) .. index now returns 0 but True, so .index($string) is enough.

Aaron Baugher | August 6, 2015 7:10 PM | Reply

Yeah, I realized later that trim() wasn't doing anything for me anyway, since none of my lines had whitespace at the beginning and they're being autochomped. Good to know words() trims, though, and those are some nice improvements; thanks! This "lazy" stuff is taking some getting used to.

ugexe | August 7, 2015 12:42 AM | Reply

Note: run is a method of Proc, not IO

Liz | August 7, 2015 3:25 AM | Reply

Martin Hradil: re .index returning 0 but True? Alas, it didn't make it past TimToady's vetting in the long run :-( I removed that feature yesterday again,

Long story: the feature was implemented using a subclass of Int, basically:
class Index is Int { method Bool { self.defined } }
Coercing an Int to an Index would work, but coercing an Index into an Int shouldn't (as it was already an Int). I made that work, however, but that was really against expectations (well, not mine, obviously).

Anyways, we're most likely going to have another way of handling the
if "foo".index("f") -> $pos { say $pos }
gotcha in another way.

Aaron Baugher replied to comment from ugexe | August 7, 2015 6:18 AM | Reply

ugexe, are you sure about that? I find run() listed as a method on the IO page, not on the Proc page. Am I misunderstanding something?

ugexe | August 7, 2015 12:22 PM | Reply

Yes, I am sure

http:/github.com/rakudo/rakudo/blob/a190b23bf9d81d84670a36ad448280ec8941144e/src/core/Proc.pm#L127

ugexe | August 7, 2015 12:24 PM | Reply

https://github.com/rakudo/rakudo/blob/a190b23bf9d81d84670a36ad448280ec8941144e/src/core/Proc.pm#L127

Aaron Baugher replied to comment from ugexe | August 7, 2015 1:34 PM | Reply

Ok, thanks. I see it on the Proc page now, in the section on new(). It just doesn't have its own =head section, so it doesn't show up in the method list at the top. Now I need to read more about Roles and find out why it does show up on IO's page.

Aristotle | August 11, 2015 10:52 PM | Reply

If I caught the most recent developments right, .index($substr).defined is now called .contains($substr). With that, the entire loop body becomes just

.say if .contains($string) && .words[1] != $*PID

Aristotle replied to comment from Aristotle | August 13, 2015 4:11 AM | Reply

Turns out I did not. The above comment is wrong. However, what has happened is that .index now returns an Index value, which is just like an Int except it is only false when it is undefined. So you can just say

.say if .index($string) && .words[1] != $*PID

Name

Email Address

URL

Remember personal info?

Comments (You may use HTML tags for style)

About Aaron Baugher

I'm a programmer and Unix sysadmin who uses Perl as much as possible, operating from the Midwest USA. To hire me for sysadmin or programming work, contact me at aaron.baugher @ gmail.com or as 'abaugher' on #perl6.

More info »

Aaron's Perl 6 Blog