Testing and Validating HTML Templates

Here’s a little trick I’ve been using for a while to help ensure that a large sprawling Catalyst application always generates valid HTML. The idea is to unit test all your templates: the trick is to make it really easy with a helper class:

package My::Test::Template;

use strict;
use warnings;

use parent q(Template);

use Hash::Merge::Simple;
use Path::Class qw(dir);
use HTML::Lint::Pluggable;
use Test::More;

sub validate {
    my ( $self, $output ) = @_;

    # Routine to validate HTML5 structure

    # If this looks like an HTML fragment, wrap it in minimal tags
    if ( $output !~ m{^<html[^>]*>}ims ) {

        $output = join(
            "\n",
            '<html><head><title>Title</title></head><body>',
            $output,
            '</body></html>'
        );
    }

    my $lint = HTML::Lint::Pluggable->new();

    $lint->load_plugins('HTML5');
    $lint->only_types(HTML::Lint::Error::STRUCTURE);
    $lint->parse($output);
    $lint->eof;

    my $message = 'output is valid HTML5';
    if ( $lint->errors ) {
        for my $error ( $lint->errors ) {
            warn $error->as_string, "\n";
        }
        fail($message);
    }
    else {
        pass($message);
    }

    return $output;

} ## end sub validate

sub _init {
    my ($self, $config) = @_;

    # Modify the _init() routine from Template:
    #
    #   * add an INPUT parameter so we can specify the template file or string
    #     to test.
    #
    #   * set the default template INCLUDE_PATH and other config options to be
    #     the same as used by our Catalyst view.
    #
    $self->{INPUT} = $config->{INPUT} or die "INPUT parameter is required";

    $config = Hash::Merge::Simple::merge(
        {
            INCLUDE_PATH => [
                dir( $ENV{'PWD'}, '/root/src' )->cleanup,
                dir( $ENV{'PWD'}, '/root/lib' )->cleanup,
            ],
            TRIM        => 1,
            PRE_CHOMP   => 1,
            POST_CHOMP  => 0,
            PRE_PROCESS => 'main.tt2',
            TIMER       => 0,
        },
        $config
    );

    $self = $self->SUPER::_init($config);

    return $self;
}

sub process {
    my ( $self, $vars ) = @_;

    # Modify the process() routine from Template:
    #
    #    * make process() use the INPUT key for the template variable
    #    * die on errors rather than returning an error code
    #    * return the result of successful processing
    #    * always run the validate routine on processing a new template

    my $output = '';
    $self->SUPER::process( $self->{INPUT}, $vars, \$output )
      or die $self->error;

    return $self->validate($output);

} ## end sub process

1;

With this helper class, I can easily unit test the following template:

<div class="subnav_holder">
    <ul class="subnav">
        <li><a href="/faq">FAQ</a></li>
        [% IF has_media %]
        <li><a href="/media">In the media<a></li>
        [% END %]
        <li><a href="/about">About Us</a></li>
        <li><a href="/contact">Contact Us</a></li>
        <li><a href="/legal">Legal</a></li>
    </ul>
</div>

like so:

use strict;
use warnings;

use Test::More;

use My::Test::Template;
my $tt = My::Test::Template->new( { INPUT => 'subnav.tt2', })
    or die "$Template::ERROR\n";

unlike( $tt->process(), qr{media}ms => "Don't display media link" );

like( $tt->process( { has_media => 1 } ), qr{media}ms => 'Display media link' );

done_testing();

When I run the test I get:

ok 1 - output is valid HTML5
ok 2 - Don't display media link
 (5:45) <a> at (5:13) is never closed
 (5:45) <a> at (5:42) is never closed
not ok 3 - output is valid HTML5
#   Failed test 'output is valid HTML5'
#   at root/src/subnav.t line 37.
ok 4 - Display media link
1..4
# Looks like you failed 1 test of 4.

which highlights the HTML validation error that gets exposed if I stash “has_media”.

I really have no excuse if my app generates invalid HTML.
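The fragment check at the top of validate() is worth trying by itself. The following is a minimal sketch that lifts that step into a standalone function; the name wrap_fragment is hypothetical and is not part of the helper class above. It uses only core Perl, so it runs without Template or HTML::Lint::Pluggable installed.

```perl
use strict;
use warnings;

# Hypothetical standalone copy of the wrapping step in validate().
sub wrap_fragment {
    my ($output) = @_;

    # /m lets ^ match at the start of any line, so a document whose
    # <html> tag follows a DOCTYPE line is still treated as complete.
    return $output if $output =~ m{^<html[^>]*>}ims;

    return join(
        "\n",
        '<html><head><title>Title</title></head><body>',
        $output,
        '</body></html>'
    );
}

# A fragment gets wrapped; a full document is returned untouched.
print wrap_fragment('<p>hello</p>'), "\n";
print wrap_fragment("<!DOCTYPE html>\n<html lang=\"en\"><body></body></html>"), "\n";
```

Note that a document with the DOCTYPE and the html tag on the same line would not match the pattern and would be wrapped a second time, so it may be worth keeping them on separate lines in full-page templates.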

Happy Hacking!

Kal

Hubble, Bubble, Toil and Trouble: Catalyst, Bootstrap and HTML::FormFu

I thought I would share a little trick I use to get these three complex and idiosyncratic frameworks to play nice with each other.

Catalyst and HTML::FormFu are a powerful combination that allows you to tie the form displayed by your view to the form processed by your controller. This direct link means: 1) your field names in your generated HTML will always match the field names used in form processing, 2) default and redisplay field values set by your controller will always match up with the values displayed by the view, and 3) any constraint or validation issues detected by the form processing logic in your controller can be directly associated with the fields displayed by your view. Without this link, keeping a form defined in our view in sync with the form defined by our controller is a large source of potential bugs.

The problematic part of our potion is the combination of HTML::FormFu and Bootstrap. HTML::FormFu is a very powerful and elegant framework for managing forms. It is also very complex and can take some real time and effort to master. It can be a real challenge to get the forms generated by HTML::FormFu to be marked up exactly the way you want. This is problematic when combined with display frameworks like Bootstrap, which can be very fussy about their required markup.

The trick to getting this all to work is best shown by example.

Let's start with the following HTML::FormFu form definition:

elements => [
    {
        name => 'min_amount',
        type => 'Text',
    },
    {
        name        => 'max_amount',
        type        => 'DollarAmount',
        constraints => [
            {
                type    => 'GreaterThan',
                others  => 'min_amount',
                message => 'the maximum amount must be greater than the minimum',
            },
        ],
    },
    {
        name  => 'search',
        type  => 'Submit',
    },
]

The corresponding form can be constructed and processed by a Catalyst controller in a number of ways. Ultimately we end up with a form object on the stash which stringifies to an HTML form. This can be used directly in our Template Toolkit view:

[% form %]

The resulting HTML, although lean, is going to be a challenge to style and layout with Bootstrap:

<form action="/my_form" method="post">
    <div class="text">
        <input name="min_amount" type="text" />
    </div>
    <div class="dollaramount">
        <input name="max_amount" type="text" />
    </div>
    <div class="submit">
        <input name="search" type="submit" />
    </div>
</form>

To be fair, HTML::FormFu does have a large number of facilities for modifying the generated markup. However, in my experience, they add a lot of complexity to an already complex framework soup. They can also require embedding a lot of display-related information in the controller, which is not ideal.

An alternative is to construct the form display piece-wise.

First, we strip most of the formatting logic from our HTML::FormFu form using the new ‘layout’ feature, and add some Bootstrap classes:

default_args => {
    elements => {
        Field => {
            layout => [ 'field' ],
            attrs  => { class => 'form-control' },
        },
        Submit => {
            attrs => { class => 'form-control btn btn-primary' },
        },
    },
}

(See HTML::FormFu::Role::Element::Field for the gory details of layout)

Now we can mark up our form so that it works with Bootstrap:

<form class="form" action="[% form.action %]">
    <div class="row">
        <div class="col-md-2">Lead Amount</div>
        <div class="form-group col-md-4">
            <label for="min_date" class="sr-only">Minimum</label>
            <div class="input-group">
                <div class="input-group-addon">$</div>

                [% form.get_element('min_amount').placeholder('Minimum') %]

            </div>
        </div>
        <div class="form-group col-md-4">
            <label for="max_date" class="sr-only">Maximum</label>
            <div class="input-group">
                <div class="input-group-addon">$</div>

                [% form.get_element('max_amount').placeholder('Maximum') %]

            </div>
        </div>

        <div class="form-group col-md-2">

            [% form.get_element('search').value('Search Leads') %]

        </div>
    </div>

    <div class="row">
        <div class="col col-md-10 col-md-offset-2 text-danger">

        [% FOR error IN form.get_errors %]
            [% error.message | html_line_break %]
        [% END %]

        </div>
    </div>

</form>

The trick here is to note that the form object stringifies recursively. If you pick out an element of the form object, it will stringify to the corresponding HTML field.

[% form.get_element('search') %]

You can also modify the element in place before the stringification takes place:

[% form.get_element('search').value('Search') %]

The result looks something like this:

form-display.png

Magic!

This gives us a much better separation of display and processing concerns. It is now much easier for me to pass off my template to a designer who can easily tweak language and worry about the finer points of styling and layout.
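The stringification trick can be imitated in plain Perl with the core overload pragma. The following is a minimal sketch under stated assumptions: Sketch::Form and Sketch::Field are hypothetical stand-ins invented for illustration, and this is not how HTML::FormFu is implemented internally. It only shows why picking out an element, modifying it, and then printing it works.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a form field; not an HTML::FormFu class.
package Sketch::Field {
    use overload '""' => \&render;

    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

    # The setter returns $self so it can be chained before stringification.
    sub placeholder {
        my ( $self, $text ) = @_;
        $self->{placeholder} = $text;
        return $self;
    }

    sub render {
        my ($self) = @_;
        my $html = qq{<input name="$self->{name}" type="text"};
        $html .= qq{ placeholder="$self->{placeholder}"}
          if defined $self->{placeholder};
        return $html . ' />';
    }
}

# Hypothetical stand-in for a form; not an HTML::FormFu class.
package Sketch::Form {
    use overload '""' => \&render;

    sub new { my ( $class, @fields ) = @_; return bless { fields => [@fields] }, $class }

    sub get_element {
        my ( $self, $name ) = @_;
        my ($field) = grep { $_->{name} eq $name } @{ $self->{fields} };
        return $field;
    }

    # join() stringifies each field, which calls Sketch::Field::render.
    sub render {
        my ($self) = @_;
        return join "\n", '<form>', @{ $self->{fields} }, '</form>';
    }
}

package main;

my $form = Sketch::Form->new(
    Sketch::Field->new( name => 'min_amount' ),
    Sketch::Field->new( name => 'max_amount' ),
);

# Modify one element in place, then let print stringify it.
print $form->get_element('min_amount')->placeholder('Minimum'), "\n";

# The change was made in place, so it also shows up in the whole form.
print "$form\n";
```

Because the setter changes the element in place, a placeholder set while rendering one part of the page is still there if the same element is rendered again later.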

Happy Hacking!

Kal

A Catalyst Service Bus (from scratch)

In the following I describe how to build (from scratch) a simple Catalyst application that acts as a service bus for a collection of other Catalyst applications.

Here I'm using the term service bus to describe an application that provides web services to other applications (rather than, say, a JavaScript-enabled web browser). This service bus acts as a central hub, taking requests from applications for tasks that sit outside their scope and either executing those tasks or passing them on to other applications.

service-bus.png

The following assumes you have a working Catalyst development environment (see www.catalystframework.org for instructions).

A Service Application

Our service bus will communicate with the other applications using simple HTTP requests and will pass data using JSON strings. Let's start by using the Catalyst helper script to make a Service application:

> catalyst.pl Service
...
> cd Service

This generates a skeleton Catalyst application under the Service directory. A lot of files are generated, but, for the moment, we are only interested in the following:

service.conf                     # configure our app
script/service_server.pl         # run our app
script/service_create.pl         # extend our app
lib/Service/Controller/Root.pm   # actions for paths under /

Our service bus is going to pass around JSON data, so let's create a JSON view:

> ./script/service_create.pl view JSON JSON

This creates the file lib/Service/View/JSON.pm. We want this to be our default view and for that view to generate JSON encoded strings from the response_data key in the stash. Edit service.conf and add the following to the end of the file:

default_view JSON

<View>
    <JSON>
        expose_stash response_data
    </JSON>
</View>

Now we will set up actions to handle some basic requests for testing the application. Edit lib/Service/Controller/Root.pm and add the following after the initial documentation boilerplate:

use JSON;

sub auto : Private {
    my ($self, $c) = @_;

    # JSON view requires response_data key to be well defined 
    $c->stash->{response_data} = { message => 'no message'};

    # decode and stash POST data
    if (my $data_str = $c->request->body_params->{data}) {
        $c->stash->{request_data} = decode_json($data_str);
    }
    return 1;
}

sub ping : Local {
    my ( $self, $c ) = @_;
    $c->stash->{response_data} = { message => 'pong'};
}

sub echo : Local {
    my ( $self, $c ) = @_;
    $c->stash->{response_data} = { 
        data    => $c->stash->{request_data},
        message => 'echoing sent data',
    };
}

We now have a minimally functional service bus, that supports two requests: ping and echo. We can run this Service application in a terminal with

> ./script/service_server.pl
...
HTTP::Server::PSGI: Accepting connections at http://0:3000/

and test it in another terminal using the GET and POST wrappers from LWP.

> GET http://localhost:3000/ping
{"message":"pong"}> 

> echo 'data={  "foo"  :  "bar"  }' | POST http://localhost:3000/echo
{"data":{"foo":"bar"},"message":"echoing sent data"}>

The debugging output in the first terminal shows how these requests are being processed. For those new to Catalyst, let's break down what's happening with the POST request:

  1. The Service application that we started is listening for HTTP requests at port 3000 on localhost.

  2. When we request the path /echo, the Catalyst dispatcher maps this to the echo() action in the Root controller.

  3. The dispatcher then constructs and executes the following chain of actions (registered subroutines) in the Root controller:

    auto() -> echo() -> end()
    
    1. auto() decodes the string in the data parameter and stashes that value under request_data.

    2. echo() sets the data key in the response_data stash to the same value as request_data.

    3. end() renders the current view and sets the HTTP response content to the result.

  4. The current view is the default view Service::View::JSON. This view encodes the value of the response_data key in the stash as a JSON string.

With the echo example, I added a little white space to the data in the POST request to highlight the fact that the passed JSON string gets decoded and re-encoded, rather than just being echoed.
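The chain of actions described above can be imitated in plain Perl. This is a minimal sketch, assuming a bare hash in place of the Catalyst stash and the core JSON::PP module in place of both JSON and the JSON view; the function names auto_action, echo_action, end_action and handle_echo are hypothetical. It is not how Catalyst dispatches, but it shows the decode and re-encode round trip.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys so the encoded output is stable.
my $json = JSON::PP->new->canonical;

# Stands in for auto(): set a default response, decode any POSTed data.
sub auto_action {
    my ( $stash, $params ) = @_;
    $stash->{response_data} = { message => 'no message' };
    if ( my $data_str = $params->{data} ) {
        $stash->{request_data} = $json->decode($data_str);
    }
    return 1;
}

# Stands in for echo(): copy the decoded request data into the response.
sub echo_action {
    my ($stash) = @_;
    $stash->{response_data} = {
        data    => $stash->{request_data},
        message => 'echoing sent data',
    };
    return;
}

# Stands in for end() plus the JSON view: encode response_data.
sub end_action {
    my ($stash) = @_;
    return $json->encode( $stash->{response_data} );
}

# Run the three steps in order against a fresh stash.
sub handle_echo {
    my ($params) = @_;
    my %stash;
    auto_action( \%stash, $params ) or return;
    echo_action( \%stash );
    return end_action( \%stash );
}

# The extra white space in the input does not survive the round trip.
print handle_echo( { data => '{  "foo"  :  "bar"  }' } ), "\n";
```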

Leave this Service application running in its terminal: we'll use it in the next example.

At this point the service bus is exposing a simple interface via the two public paths:

/ping
/echo

This can be extended to cover whatever operations are required of your service bus.

A Client for the Service Application

Now let's create a separate Catalyst application to talk to this service bus.

> catalyst.pl MyApp
...
> cd MyApp

First we'll create a model to handle the communication with the service bus:

> ./script/myapp_create.pl model Service

This creates the file ./lib/MyApp/Model/Service.pm. Edit that file and add the following after the documentation boilerplate:

use LWP::UserAgent;
use JSON;

has _ua  => ( 
    is => 'ro', 
    default => sub { LWP::UserAgent->new(timeout => 10) } 
);
has _url => ( is  => 'rw', isa => 'Str' );

sub BUILD {
    my ($self, $args) = @_;
    $self->_url($args->{url});
}

sub post {
    my ($self, $path, $data_ref) = @_;

    my $ua = $self->_ua->clone(); 
    $path = $self->_url . '/' . $path;

    my $response;
    if ($data_ref) {
        $data_ref = encode_json($data_ref);
        $response = $ua->post($path, { data => $data_ref });
    }
    else {
        $response = $ua->get($path);
    }

    if ($response->is_success) {
        if ($response->decoded_content) {
            return decode_json($response->decoded_content);
        }
        return;
    }

    warn "$path: " . $response->status_line;
    return;
}

Note the two private attributes. The _ua attribute stores an LWP user agent object that we will clone before we use it. This is so we don't have to worry about the object growing (by recording request history) over the lifetime of the application. The _url attribute is set up to allow us to easily configure the URL for the service bus that this application talks to. Edit myapp.conf and add the following:

<Model>
    <Service>
            url http://localhost:3000
    </Service>
</Model>

Now when MyApp wants to post a message to the Service application, it will use http://localhost:3000 as the base of the URL used in the HTTP request.
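When LWP::UserAgent's post() is given a hash reference, it sends the hash as a URL-encoded form body, which is why the service bus finds the JSON string under the data body parameter. The following is a rough sketch of what that body looks like. It uses the core HTTP::Tiny and JSON::PP modules in place of LWP and JSON purely so it runs without extra installs, and the function name form_body_for is hypothetical.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP;

# Hypothetical helper: build the form body that carries our JSON payload.
sub form_body_for {
    my ($data_ref) = @_;

    # Encode the Perl data structure as a JSON string (sorted keys).
    my $json = JSON::PP->new->canonical->encode($data_ref);

    # URL-encode it as the value of a single form field named "data".
    return HTTP::Tiny->new->www_form_urlencode( { data => $json } );
}

print form_body_for( { foo => 'bar' } ), "\n";
```

The JSON punctuation ends up percent-encoded in the body; the receiving application decodes the form field first and only then runs decode_json on the result.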

Lets create an action to test the communication between the two applications. Edit ./lib/MyApp/Controller/Root.pm and add the following routine:

sub ping_service_bus : Local {
    my ( $self, $c ) = @_;

    # set the response body directly rather than using a view
    if (my $response = $c->model('Service')->post('ping')) {
        $c->response->body( $response->{message} . "\n");
    }
    else {
        $c->response->body('Could not ping the service bus');
    }
    return;
}

To keep our example simple, we set the response body directly in the ping_service_bus action, rather than using a view.

We can run this application in another terminal, but we have to use a different port since port 3000 is already in use by the service bus.

> ./script/myapp_server.pl -p 3001
...
HTTP::Server::PSGI: Accepting connections at http://0:3001/

We can see the two applications communicating in yet another terminal with

> GET http://localhost:3001/ping_service_bus
pong

Extending the Interface

Now that we have a skeleton service bus, we can extend its public interface (the collection of requests it accepts) and the corresponding operations it performs. The details are very specific to your application, but we can give some examples that highlight the use of the Catalyst framework.

Suppose you have two types of applications that will connect to your service bus: a Sales application used by sales staff and an Inventory application used by warehouse staff. These applications were built independently, but it would be great if they could communicate: sales staff could know if products are already in stock; warehouse staff could be informed when stock needs to be ordered. This is something that could be achieved using a service bus. As the business grows, management might require aggregate information about both sides of the enterprise. You could build a Reports application that gathers that information via extensions you make to the service bus.

As the service bus grows to handle many different types of requests, it may make sense to organise them using namespaces. We could split our interface into requests relating to sales and requests relating to inventory. With Catalyst we do this by simply creating two new controllers:

> cd Service
> ./script/service_create.pl controller Sales
> ./script/service_create.pl controller Inventory

This creates the two files

lib/Service/Controller/Sales.pm       # actions for paths under /sales/
lib/Service/Controller/Inventory.pm   # actions for paths under /inventory/

With the default Catalyst namespacing, any requested path that begins with /sales/... is handled by actions in the Sales controller, and any path that begins with /inventory/ is handled by actions in the Inventory controller. Using namespaces this way, the set of paths your service bus accepts can evolve into a well structured API.

Suppose we want to allow third-party clients to update their product pricing. We have many third-party clients, but the process of updating pricing is going to be the same for each. In addition, we will want to restrict access to these updates to the appropriate party.

Catalyst chained actions allow us to create virtual paths that handle this situation well. First let's create a controller to handle all paths beginning with /pricing/:

> ./script/service_create.pl controller Pricing

Edit the file lib/Service/Controller/Pricing.pm and add the following actions:

# matches: /pricing/(brand)/...
sub brand : Chained('/') PathPart('pricing') CaptureArgs(1) {
    my ( $self, $c, $brand ) = @_;

    if ($brand !~ m{^\w+$}) {
        $c->error("bad brand: $brand");
        $c->detach();
    }

    $c->stash->{brand} = $brand;
    return;
}

# matches: /pricing/(brand)/publish
sub publish :Chained('brand') PathPart('publish') Args(0) {
    my ( $self, $c ) = @_;

    $c->stash->{response_data} = { 
        message => 'published pricing for ' . $c->stash->{brand},
    };
    return;
}

If we restart our Service application we can now send requests to publish pricing for different brands using different paths:

> GET http://localhost:3000/pricing/best_for_less/publish
{"message":"published pricing for best_for_less"}>

We can now use standard webserver access techniques to restrict access to paths beginning with /pricing/best_for_less/ to our 'Best for Less' clients.

With namespacing techniques like the above, we can construct versatile message hierarchies.
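To see which request paths the chained brand and publish actions accept, here is a minimal plain-Perl sketch of the same matching rule. The function brand_for_publish_path is hypothetical and is not Catalyst's dispatcher. It also folds two cases together: in the controller a badly formed brand raises an error, whereas here it simply fails to match.

```perl
use strict;
use warnings;

# Hypothetical helper: return the brand for /pricing/(brand)/publish,
# or nothing when the path does not match or the brand is malformed.
sub brand_for_publish_path {
    my ($path) = @_;

    # Capture exactly one path segment between /pricing/ and /publish.
    my ($brand) = $path =~ m{\A/pricing/([^/]+)/publish\z};

    # Same character rule as the brand action: word characters only.
    return unless defined $brand && $brand =~ m{\A\w+\z};

    return $brand;
}

for my $path (
    '/pricing/best_for_less/publish',
    '/pricing/bad-brand/publish',
    '/pricing/publish',
  )
{
    my $brand = brand_for_publish_path($path);
    printf "%-32s => %s\n", $path, defined $brand ? "brand $brand" : 'no match';
}
```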

Other ideas

We can do a lot more with this. The following are just some thoughts.

  1. Cron jobs

    We could write an action to send us an email if our pricing data has become stale, and set up a cron job to check for this at 9am weekdays as follows:

    0 9 * * 1-5 /usr/bin/GET http://service/sales/check_pricing > /dev/null
    
  2. Databases

    It is trivial to extend our Service application to access any number of databases. Two possibilities immediately occur.

    1. Storing elements of the service bus configuration in the database allowing us to change it on the fly.

    2. Using the service bus to marshall data between different databases.

  3. Security

    A service bus can be used as an information security layer: access is controlled at a single point and we can implement path based access rules.

  4. Scaling

    The service bus is just a web application and we can scale it in well-known ways. We can have any number of load-balanced Service applications or even put different instances in different locations.

  5. Message Queues

    The service bus is purely synchronous and is not a message queue. But a message queue could be a client of the service bus, or even a front end to it.

  6. Dancer

    I would be interested to see if something similar (and more light-weight) could be done with Dancer.