Perl Weekly Challenge 28: File Type and Digital Clock
These are some answers to the Week 28 of the Perl Weekly Challenge organized by Mohammad S. Anwar.
Spoiler Alert: This weekly challenge deadline is due in a couple of days from now (October 6, 2019). This blog post offers some solutions to this challenge, please don't read on if you intend to complete the challenge on your own.
Challenge # 1: File Type
Write a script to check the file content without explicitly reading the content. It should accept file name with path as command line argument and print “The file content is binary.” or else “The file content is ascii.” accordingly.
On most operating systems (VMS is an exception to a certain extent), there is no 100%-reliable algorithm to know whether a file is text (ASCII or UTF-8) or binary, but only some heuristic guesses. Usually, programs that attempt to find out whether a file is text or binary read a raw block of bytes (often 4096 bytes) and make some statistics on the number of bytes corresponding to ASCII printable and space characters versus non-printable characters. If the number of non-printable character exceeds a certain fraction of the whole (for example one third, or 10%, or whatever), then the file is deemed to be binary. Also, any file containing a zero byte in the examined portion is considered a binary file.
File Type in Perl 5
In Perl 5, the -T
and -B
file test operators more or less work as described above. In the program below, we're first using some other file test operators (-e
, -z
and -f
) to check, respectively, that the file exists, that it is not empty and that it is a regular file and then use the -T
and -B
file test operators.
#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
die "Please supply a file name as argument" unless @ARGV == 1;
my $file = shift;
die "File $file does not exist" unless -e $file;
die "File $file is empty" if -z _;
die "File $file isn't a plain file" unless -f _;
if (-B _) {
say "$file is a binary file";
} elsif (-T _) {
say "$file is a text file (ASCII or UTF8)";
} else {
say "Could not determine file type"; # Probaby can't happen
}
Note that once we've used the first such file test operator (here -e
), or the stat
function, on the file name received as an argument, we can use the special _
filehandle for the other file test operators: this tells Perl that we want to use the same file, and Perl then knows it can use the same stat
structure retrieved in the previous call and thus avoids to make further system calls.
This is the output with some arguments:
$ perl file_type.pl
Please supply a file name as argument at file_type.pl line 7.
$ perl file_type.pl foobar
File foobar does not exist at file_type.pl line 9.
$ perl file_type.pl watched-values.pl
watched-values.pl is a text file (ASCII or UTF8)
$ perl file_type.pl amazon.pl.gz
amazon.pl.gz is a binary file
File Type in Perl 6
Perl 6 has most of the Perl 5 test file operators (albeit with a slightly different syntax), but operators equivalent to Perl 5 -T
and -B
file test operators currently do not exist (or are not yet implemented). We will use these existing file test operators to check that the file exists, that it is not empty and that it is a regular file, but we have to roll out our own is-binary
subroutine to try to mimic the Perl 5 -T
and -B
operators. This subroutine will read a raw block of the first 4096 bytes of the file and examine each byte in turn to make some statistics on space characters and printable characters versus non-printable characters.
The slight difficulty, though, is to determine exactly what should be considered a non-printable character. For lack of a standard definition of such characters, I've decided to consider that byte decimal values 0 to 8 and 14 to 31 correspond to ASCII non-printable characters. Those values will be stored in a set. With such a small number of non-printable characters compared to the full extended ASCII, the proportion of non-printable character would be around 10% on a random bytes binary file. I have decided to consider that a file shall be deemed to be text (ASCII) if there is less than one byte out of 32 that is non-printable, and binary otherwise. In addition, any file for which the buffer contains at least one null byte (value 0) is considered to be binary.
use v6;
sub is-binary ($file) {
my constant non-printable-bytes = (0..8).Set (|) (14..31).Set;
my constant block-size = 4096;
my $fh = $file.IO.open(:r, :bin);
my $buf = $fh.read(block-size);
$fh.close;
my ($printable, $non-printable) = 0, 0;
for $buf.list -> $byte {
return True if $byte == 0; # null byte
if $byte (elem) non-printable-bytes {
$non-printable++;
} else {
$printable++;
}
}
return True if $non-printable * 31 > $printable;
False;
}
sub MAIN ($file) {
die "File $file does not exist" unless $file.IO ~~ :e;
die "File $file is empty" if $file.IO ~~ :z;
die "File $file isn't a plain file" unless $file.IO ~~ :f;
say is-binary($file) ?? "File content is binary" !! "File content is text (ASCII)";
}
This appears to work as desired:
$ perl6 file-type.p6
Usage:
file-type.p6 <file>
$ perl6 file-type.p6 foobar.baz
File foobar.baz does not exist
in sub MAIN at file-type.p6 line 23
in block <unit> at file-type.p6 line 1
$ perl6 file-type.p6 file-type.p6
File content is text (ASCII)
$ perl6 file-type.p6 amazon.pl.gz
File content is binary
A Digital Clock
Write a script to display Digital Clock. Feel free to be as creative as you can when displaying digits. We expect bare minimum something like “14:10:11”.
A Digital Clock In Perl 5
For this challenge, we can just write a simple one-liner:
$ perl -e '$|++; while (1) { printf "\r%02d:%02d:%02d", (localtime)[2,1,0]; sleep 1; }'
22:13:27
Two interesting things to say about it: first, we use the \r
(carriage return) to go back to the first column of the screen and overwrite the previously displayed time with the new one each time we want to display a new time. This useless \r
carriage return character (dating from old typewriters) is often a pain in the neck when dealing with Windows-generated files under Unix or Linux (or the other way around), I'm happy that I could find here some useful purpose for this pesky and usually useless character. The other thing is to set the $|
(aka $OUTPUT_AUTOFLUSH
) special variable to a true value (1) to force a flush after every print on the output handle (otherwise, the printed lines are buffered and the output gets messy). Also note that this program uses printf
with a formatting string to make sure that each number is printed over two characters (with a leading zero when needed). This program will run "forever", until you kill it with a Ctrl C
command. It would be easy to add a counter to stop it after a while, if needed.
So, job done? Yes, sure, we're displaying a digital clock. But the task specification suggests to feel free to be creative when displaying the digits. So, let's try to get a nicer output. We could probably use some graphical library such as Tk
, but I haven't used it for a fairly long time and I'm also not sure how to use it in Perl 6. We could also possibly use an HTML display, but I fear that would require to run a Web server, and I don't want to run into annoying environment problems. So I decided to simply display the time with ASCII art.
#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
my @digit_strings = (
' _ - - _ -_ - - _ - _ - _ - _ - _ - ',
'| | - | - _| -_| -|_| -|_ -|_ - | -|_| -|_| - O ',
'|_| - | -|_ -_| - | - _| -|_| - | -|_| - _| - O ',
' - - - - - - - - - - ');
my @digits = map { [split /-/, $_] } @digit_strings;
sub display_time {
my @pieces = split //, shift;
for my $line (0..3) {
for my $digit (@pieces) {
$digit = 10 if $digit eq ":";
print $digits[$line][$digit];
}
say "";
}
}
my $clear_screen = ($^O eq "cygwin" or $^O eq "linux") ? "clear" : "cls";
while (1) {
my $time_str = sprintf "%02d:%02d:%02d", (localtime)[2,1,0];
system $clear_screen;
display_time $time_str;
sleep 1;
}
__END__
Example of displayed time:
_ _ _
| |_| O | | |_ O | _|
| | O |_| |_| O | |_
Running the script from the Linux command line (or from a Windows cmd
terminal) clears the screen and and displays at the top of the screen the time as shown at the end of the script above.
A Digital Clock In Perl 6
We can use a Perl6 one-line as we did in Perl 5:
$ perl6 -e 'loop { my $d = DateTime.now; printf "\r%02d:%02d:%02d", $d.hour, $d.minute, $d.second; sleep 1;'
14:35:06
As for Perl 5, we're using the \r
carriage-return character to overwrite what was displayed previously each time we display a new time. And we don't need to do anything special in Perl 6 to make sure the printed strings are properly flushed.
Let's now try to port our ASCII art display to Perl 6 (I actually wrote the Perl 6 version before the Perl 5 version, but let's pretend we're porting the P5 version).
use v6;
my @digit_strings = (
' _ - - _ -_ - - _ - _ - _ - _ - _ - ',
'| | - | - _| -_| -|_| -|_ -|_ - | -|_| -|_| - O ',
'|_| - | -|_ -_| - | - _| -|_| - | -|_| - _| - O ',
' - - - - - - - - - - ');
my @digits = map { [split /\-/, $_] }, @digit_strings;
sub display_time (Str $time) {
my @pieces = $time.comb;
for 0..3 -> $line {
for @pieces <-> $digit {
$digit = 10 if $digit eq ":";
print @digits[$line][$digit];
}
say "";
}
}
my $clear_screen = ($*VM.osname ~~ m:i/cyg | lin/) ??
"clear" !! "cls";
loop {
my $d = DateTime.now;
my $time_str = sprintf "%02d:%02d:%02d",
$d.hour, $d.minute, $d.second;
shell $clear_screen;
display_time $time_str;
sleep 1;
}
=finish
Example of displayed time:
_ _ _ _ _
| | O | | |_| O | | _|
| | O |_| _| O |_| |_
Wrapping up
The next week Perl Weekly Challenge is due to start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on Sunday, October, 13. And, please, also spread the word about the Perl Weekly Challenge if you can.
Leave a comment