Virtual Spring Cleaning (part 4 of X) in which I release Archive::SevenZip

Archive::SevenZip had been lying around on my computer for a long time. 7-Zip has the great advantage of being able to read a plethora of archive formats, including ISO9660 image files. I mainly use the module to unzip and sort incoming downloads on my desktop machine, but I have also used it to extract files from ISO rips of CDs and DVDs.

The module tries to offer an API close to Path::Class, because my long-term goal is to have directories and archives appear identical to an application. Especially for distributing (say) themes for a website, it is highly convenient to have a complete theme in a tarball and have the application be able to read and process a theme no matter whether the theme lives in theme/mytheme/ or in theme/mytheme_20160401.tar.gz. Surprisingly, others have already made provisions to make such a scheme work. Template::Toolkit, for example, allows loading templates from just about anywhere by supplying the appropriate class.
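To illustrate the idea of "directory or archive, the application does not care", here is a minimal sketch that uses only core modules. It is not the Archive::SevenZip API: the helper theme_reader() and the file names are made up for this example, and Archive::Tar stands in for 7-Zip. The demo uses an uncompressed .tar so that it does not depend on IO::Zlib.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: returns a code reference that maps a relative file
# name to the file content, whether $source is a directory or a tar archive.
sub theme_reader {
    my ($source) = @_;

    if (-d $source) {
        return sub {
            my ($name) = @_;
            my $path = File::Spec->catfile($source, split m{/}, $name);
            open my $fh, '<:raw', $path or return undef;
            local $/;                      # slurp the whole file
            return scalar <$fh>;
        };
    }

    my $tar = Archive::Tar->new($source)
        or die "Cannot read archive '$source'\n";
    return sub {
        my ($name) = @_;
        return $tar->contains_file($name) ? $tar->get_content($name) : undef;
    };
}

# Demo: the same theme once as a directory and once as a tarball
my $dir       = tempdir(CLEANUP => 1);
my $theme_dir = File::Spec->catdir($dir, 'mytheme');
my $css       = "body { color: black }\n";

mkdir $theme_dir or die "Cannot create '$theme_dir': $!";
open my $fh, '>:raw', File::Spec->catfile($theme_dir, 'style.css')
    or die "Cannot write style.css: $!";
print {$fh} $css;
close $fh;

my $archive = File::Spec->catfile($dir, 'mytheme.tar');
my $tar     = Archive::Tar->new;
$tar->add_data('style.css', $css);
$tar->write($archive);

for my $source ($theme_dir, $archive) {
    my $read = theme_reader($source);
    print "$source: ", $read->('style.css');
}
```

The application only ever sees the code reference, so adding another storage backend would not touch the calling code.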

The other API that Archive::SevenZip somewhat tries to emulate is the API of Archive::Zip. While I'm no big fan of camelCase names, the API is established, and having a drop-in, API-compatible way to extend your application from reading just .zip files to also reading .7z, .rar and .tar.gz files seems nice. Full compatibility is not there yet, though.

Having my release process run on Debian uncovered some interesting implementation quirks of the 7z executable there. The 7z 9.20 executable refused to write its output to the same terminal where it spews its messages, yet it offers no option to silence its diagnostic and progress messages. So I had to make some last-minute internal changes, using IPC::Open3 on unixish systems to have the child send its STDERR to /dev/null.
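For readers who have not used IPC::Open3 this way, the following is a minimal sketch of the general technique, not the code that ships in Archive::SevenZip. The helper run_quiet() is hypothetical, and the child process is simply another perl so that the example runs without 7z installed. The ">&NULL" form for the error handle is the one shown in the IPC::Open3 documentation.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);
use File::Spec;

# Hypothetical helper: run a command, return its STDOUT and exit code,
# and throw away everything the command writes to STDERR.
sub run_quiet {
    my (@cmd) = @_;

    # A bareword handle, so that open3 can refer to it by name as ">&NULL"
    open(NULL, '>', File::Spec->devnull)
        or die "Cannot open the null device: $!";

    my ($child_in, $child_out);
    my $pid = open3($child_in, $child_out, '>&NULL', @cmd);
    close $child_in;                       # we send no input to the child

    my $output = do { local $/; <$child_out> };
    $output = '' unless defined $output;

    waitpid $pid, 0;
    my $exit = $? >> 8;
    close NULL;

    return ($output, $exit);
}

# Demo: the child writes noise to STDERR and a payload to STDOUT
my ($out, $exit) = run_quiet(
    $^X, '-e', 'print STDERR "progress noise\n"; print "payload\n"',
);
print $out;
```

Passing the command as a list avoids going through the shell, which also sidesteps quoting problems with archive names that contain spaces.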

After getting version 0.01 onto CPAN, I had enough tests and enough trust in my test suite and version control to rip out the original Windows code path for the IPC, which used list-open to talk to 7-Zip, and to use the same code path as the unixish part instead. That worked flawlessly.

There are still a lot of interesting problems. One especially murky corner is the handling of different filename encodings. On Windows, filenames reach Perl encoded in the system's "ANSI" code page, while other strings are usually encoded as Latin-1 or UTF-8. This often leads to situations where a filename as read from a file cannot be found on disk. Archive::SevenZip does not handle this situation at all yet.
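The mismatch is easy to reproduce with the core Encode module. The sketch below assumes that the "ANSI" code page is cp1252 (the Western European one); on other Windows installations it will be a different code page. The filename is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "Mueller.txt" with a real u-umlaut, as a Perl character string
my $name = "M\x{fc}ller.txt";

# Assumption: the "ANSI" code page is cp1252
my $ansi_bytes = encode('cp1252', $name);
my $utf8_bytes = encode('UTF-8',  $name);

print "cp1252 bytes: ", unpack('H*', $ansi_bytes), "\n";
print "UTF-8 bytes:  ", unpack('H*', $utf8_bytes), "\n";

# Comparing the raw bytes fails, so a lookup on disk would miss the file
my $bytes_equal = $ansi_bytes eq $utf8_bytes;
print "raw bytes equal: ", ($bytes_equal ? "yes" : "no"), "\n";

# Decoding each side with its own encoding recovers the same characters
my $same = decode('cp1252', $ansi_bytes) eq decode('UTF-8', $utf8_bytes);
print "decoded names equal: ", ($same ? "yes" : "no"), "\n";
```

The hard part in practice is knowing which encoding each side actually uses, which is exactly the information that tends to be missing.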

About Max Maischein

I'm the Treasurer for the Frankfurt Perlmongers e.V. I have organized Perl events including 9 German Perl Workshops and one YAPC::Europe.