Test Driven Development of Excel::Writer::XLSX Part I
For the last few months I have been porting Spreadsheet::WriteExcel to the new Excel 2007+ file format.
The older Excel file format was comprised of sequential binary records whilst the new file format is a collection of XML files in a zip container.
The newer module maintains the same API as Spreadsheet::WriteExcel but is in a different namespace. It is called Excel::Writer::XLSX.
Some of the test driven development aspects of writing Excel::Writer::XLSX have been interesting and I'd thought that I'd blog about them here.
At its simplest an Excel XLSX file contains the following elements:
____ [Content_Types].xml
|
|____ docProps
| |____ app.xml
| |____ core.xml
|
|____ xl
| |____ workbook.xml
| |____ worksheets
| | |____ sheet1.xml
| |
| |____ styles.xml
| |
| |____ theme
| | |____ theme1.xml
| |
| |____ _rels
| |____ workbook.xml.rels
|
|____ _rels
|____ .rels
The .xml files contain the data and the .rels files link them together. These are all zipped into a single .xlsx file.
Since the files are XML based I initially considered generating them with a Class to XML mapper. However, interactions between individual files and between elements within the files made simple mapping impossible.
In the end I settled on using XML::Writer and statically auto-generating methods for outputting individual XML elements.
So, for example in the Sheet1.xml
worksheet file shown above a typical element representing the page setup settings might look like this:
<pageSetup paperSize="9" orientation="landscape" />
I extract this from a sample file and use a small Perl program to auto-generate the code for creating the XML element. The output looks something like the following:
# Write the pageSetup element.
$self->_write_page_setup();
##########################################################################
#
# _write_page_setup()
#
# Write the <pageSetup> element.
#
sub _write_page_setup {
my $self = shift;
my $paper_size = 9;
my $orientation = 'landscape';
my @attributes = (
'paperSize' => $paper_size,
'orientation' => $orientation,
);
$self->{_writer}->emptyTag( 'pageSetup', @attributes );
}
More importantly it also auto-generates a test case for the method like this:
###########################################################################
#
# Tests for Excel::Writer::XLSX::Worksheet methods.
#
# reverse('©'), January 2011, John McNamara, jmcnamara@cpan.org
#
use lib 't/lib';
use TestFunctions '_new_worksheet';
use strict;
use warnings;
use Test::More tests => 1;
###########################################################################
#
# Tests setup.
#
my $expected;
my $got;
my $caption;
my $worksheet;
###########################################################################
#
# Test the _write_page_setup() method.
#
$caption = " \tWorksheet: _write_page_setup()";
$expected = '<pageSetup paperSize="9" orientation="landscape" />';
$worksheet = _new_worksheet(\$got);
$worksheet->_write_page_setup();
is( $got, $expected, $caption );
__END__
In the simplest cases the new method will work as expected and the test case will pass first time. In most cases the method will need some additional code to handle non-default cases and more tests will have to be added. However, as a minimum all new methods will have a test case.
If you are interested in seeing the final implementation of the test case have a look at sub_write_page_setup.t
and the code for the method that it tests write_page_setup()
(search down).
There are currently around 200 test files and 1100 test cases in the Excel::Writer::XLSX test suite and the test code base is around 18 KLOC for a code base of around 16 KLOC.
The methodology of generating test cases and code together has been quite successful and in less than 6 months I have been able to re-write 90% of the methods of Spreadsheet::WriteExcel which took over 10 years to write in the first place.
This productivity has been helped greatly by having a pre-defined and unchanging specification in the form of the Spreadsheet::WriteExcel API and existing documentation and examples. This in not insignificant since these elements probably comprised around 50% of the effort that went into Spreadsheet::WriteExcel.
Another advantage of the new module is that I am able to test complete files generated by Excel::Writer::XLSX against files generated by Excel.
I'll write about that in a separate post.
Hi John
I, and probably many thousands of other programmers, thank you for your work.
I've only used Tony Bowden's Spreadsheet::ParseExcel::Simple, based on your Spreadsheet::ParseExcel, but even that greatly impressed me with the quality and quantity of work which went into this set of modules.
In case your're wondering why I've - apparently - never praised your work before, I have. About 2 years ago I wrote a program to extract the names of all the (first-level) modules I use, and sent off a perlthanks msg about each. I heard nothing back about that, so I don't even know if they were delivered. And in the above case, it presumably went to Tony, if anyone.
Nevertheless, I'm grateful for your work, and that of all authors and testers.
PS
Ahhhh. I've just found that I don't have to log in twice to comment, if I log in first and /then/ try to comment. I'm going to click sumbit rather than preview, so as to not press my luck.
Cheers
Ron