Trying to make confluence usable.

=head1 RESTfluence

I've tried to make this blog post copy/pastable as valid perl and valid markdown. So with luck it can be copy/pasted into an editor if you want to use this.

Confluence. I don't really like it, but the major thing it's got going for it is that it's not SharePoint. As I am spending the summer holidays doing some documentation at work, one of the things I wanted to do was make Confluence less hateful. So I cracked open the REST API to see how far I could get.

There used to be good tools built on the XML-RPC API, but Atlassian got rid of it not that long ago.

Progress I made was:

  • Got a list of all spaces, and all pages in each space.
  • Worked out how to obtain the content of a page.
  • Worked out how to change the content of a page (for when the time comes).

Where I got stuck:

  • Working out how to round-trip the Confluence markup to/from markdown.

The rest of this post describes the script I put together. It's not useful enough for me to put on the CPAN but it's worth putting up somewhere.

=cut

package RESTfluence;

=for post

This is a script at the moment, but it still has its own package name. I'm using Moo because it makes everything easy. You'll notice that I use lazy attributes (built on demand). Making lazy attributes so easy is the #1 thing I get out of modern perl. Remember, Moo turns on warnings and strict automatically.
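
As a rough illustration of what "built on demand" means, here is a minimal sketch that is separate from the script. The Demo package, its expensive attribute and the $builds counter are made up for this example; only has, is => 'lazy' and the _build_ naming convention come from Moo.

```perl
package Demo;
use Moo;

our $builds = 0;    # counts how often the builder runs (illustration only)

# is => 'lazy' means the value is computed the first time it is read.
# With no default given, Moo calls _build_expensive to produce it.
has expensive => ( is => 'lazy' );

sub _build_expensive {
    $Demo::builds++;
    return 'computed on first access';
}

package main;

my $demo   = Demo->new;          # the builder has not run yet
my $first  = $demo->expensive;   # the builder runs here
my $second = $demo->expensive;   # cached value, the builder is not called again
print "builder ran $Demo::builds time(s)\n";
```

The script below passes default => sub { ... } instead of writing a _build_ method, which should behave the same way for these purposes.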

=cut

use Moo; # provides ->new, $self and easy to declare lazy attributes

=for post

There's a small list of CPAN modules needed:

=cut

use URI;                    # I hate URI string manipulation
use Net::Netrc;             # Credentials store
use REST::Client;           # Some convenience wrappers on REST
use JSON::MaybeXS qw(JSON); # the only worthwhile CPAN json module ;)
use feature 'say';          # modern perl print $string . "\n";

=for post

We're going to want storage for the wiki URL and its credentials. For authentication, we're using a ~/.netrc file (chmod 0600) with an entry in this format, where the machine name is the same wiki URL the script is given:

machine https://myco.atlassian.net/wiki
  login me@myco.net
  password mypassword

=cut

has wiki => (
    is => 'ro',
    default => sub { $ENV{MY_CONFLUENCE} }, # e.g. https://myco.atlassian.net/wiki
);

has netrc => (
    is => 'lazy',
    default => sub {
        return Net::Netrc->lookup($_[0]->wiki);
    }
);

=for post

The other component is to provide a JSON parser, and an attribute that contains the REST::Client instance.

=cut

has json => (
    is => 'lazy',
    default => sub {
        JSON->new->pretty(1);
    }
);

has rest => (
    is => 'lazy',
    default => sub {
        my ($self) = @_;
        my $c = REST::Client->new;
        $c->getUseragent->default_headers->authorization_basic(
            $self->netrc->login, $self->netrc->password);
        $c->setHost($self->wiki . '/rest/api');
        return $c;
    }
);

=for post

run if ! caller is one of the best things ever.
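
For anyone who hasn't met the idiom: caller returns false when the file is being run directly and true when it was pulled in with require or use, so the same file works as both a script and a module. A minimal sketch, with a made-up Greeter package standing in for the real one:

```perl
package Greeter;
use strict;
use warnings;

sub run {
    my ($class, %args) = @_;
    my $name = $args{name} // 'world';
    return "hello, $name";
}

# At the top level of a file run as `perl Greeter.pm`, caller() is empty,
# so run() fires. When another file loads this one, caller() is true and
# nothing runs.
print __PACKAGE__->run(), "\n" unless caller;

1;
```

The payoff is that the methods can be loaded and tested from another file without kicking off a real run against the wiki.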

=cut

__PACKAGE__->run() if ! caller;

=for post

As I said, I got to the point where I could download content, so that's what run currently does.

=cut

sub run {
    my ($self, %args) = @_;
    $self = $self->new(%args);

    # my $all_pages = $self->get_all_pages();
    my $data = $self->get_page_content(title => q/Page Title HERE/, spaceKey => 'DKB');
    return $data;
}

=for post

I was going to add post, put and delete methods when necessary.

=cut

sub get {
    my ($self, $path, $args) = @_;
    $self->request('GET', $path, $args);
}

=for post

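This is the bit that drives the request. It does some very gentle path mangling (an array ref of segments is joined with slashes, and a leading slash is added if missing). It returns the parsed JSON if the status is 200, otherwise a data structure holding the response code and the raw response content.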
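
The path mangling is easier to see in isolation. This is a sketch only: normalise_path is a hypothetical standalone helper that mirrors what request does to its path argument, and the segments passed in are arbitrary examples.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the path handling in request():
# accept either a string or an array ref of segments, and add a
# leading slash when there isn't one already.
sub normalise_path {
    my ($path) = @_;
    $path = join '/', @$path if ref $path eq 'ARRAY';
    $path = "/$path" unless $path =~ m{^/};
    return $path;
}

print normalise_path('content'), "\n";              # /content
print normalise_path('/space'), "\n";               # /space
print normalise_path([ 'content', 12345 ]), "\n";   # /content/12345
```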

=cut

sub request {
    my ($self, $method, $path, $args) = @_;
    $args ||= [];
    $path = join '/', @$path if ref($path); # ARRAY of path segments
    $path = "/$path" unless $path =~ m{^/};
    $self->rest->$method($path, @$args);

    my $rc = $self->rest->responseCode;
    say STDERR "Response: $rc";
    my $res;
    if ($rc == 200) {
        $res = $self->json->decode($self->rest->responseContent);

    }
    else {
        $res = { response_code => $rc,
                 message => $self->rest->responseContent,
             };
    }
    return $res;
}

=for post

The query subroutine is a convenience wrapper around REST::Client's buildQuery, which turns a list of key/value pairs into a URL query string.
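
REST::Client describes buildQuery as a wrapper around URI's query_form, so the escaping can be tried without a client object. A small sketch using URI directly; the parameter values are the placeholders used elsewhere in this post:

```perl
use strict;
use warnings;
use URI;

# Build the query for a content lookup. Passing a list (rather than
# a hash variable) keeps the parameters in the order given.
my $uri = URI->new('/content');
$uri->query_form(
    title    => 'Page Title',
    spaceKey => 'DKB',
    expand   => 'body.storage',
);

print $uri->as_string, "\n";
```

The point of going through query_form is that the space in the page title gets escaped for you rather than pasted raw into the URL.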

=cut

sub query {
    my ($self, %query) = @_;
    return $self->rest->buildQuery(%query);
}

=for post

This is where I start working with the API. This method just lists all spaces in the wiki, and returns them.
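
The listing methods page through results with limit and start parameters. Here is a sketch of that loop that runs offline: fetch_page is a made-up stand-in that serves fake records in the same { results => [...] } shape the script reads, and fetch_all is a hypothetical name for the loop itself.

```perl
use strict;
use warnings;

# Stand-in for the wiki: 120 fake records. Not a real API call.
my @records = map { +{ id => $_ } } 1 .. 120;

sub fetch_page {
    my (%arg) = @_;
    my $last = $arg{start} + $arg{limit} - 1;
    $last = $#records if $last > $#records;
    return { results => [ @records[ $arg{start} .. $last ] ] };
}

sub fetch_all {
    my ($limit) = @_;
    my ($start, @all) = (0);
    while (1) {
        my $page    = fetch_page(start => $start, limit => $limit);
        my $results = $page->{results} || [];
        push @all, @$results;
        last if @$results < $limit;    # a short (or empty) page is the last one
        $start += $limit;
    }
    return \@all;
}

my $all = fetch_all(50);
print scalar(@$all), " records\n";     # 120 records
```

Stopping on a short page costs one extra request when the total is an exact multiple of the limit, which seems a fair price for not having to know the total up front.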

=cut

sub list_spaces {
    my ($self) = @_;

    my $offset = 0;
    my $limit = 50;
    my $all_spaces = [];

    while (1) {
        my $spaces = $self->get("/space?limit=$limit&start=$offset");
        # An error response has no results key, so stop rather than loop forever
        my $results = $spaces->{results} or last;
        push @$all_spaces, map {
            +{  name => $_->{name},
                key => $_->{key},
                id => $_->{id},
                api => $_->{_links}->{self},
                web => $self->wiki . $_->{_links}->{webui}   }
        } @$results;
        last if @$results < $limit; # a short page is the last page
        $offset += $limit;
    }
    return $all_spaces;
}

=for post

get_pages_for returns all the pages in a space. Each entry carries the page title, which (together with the space key) is what we need to obtain the page content.

=cut

sub get_pages_for {
    my ($self, $space) = @_;
    my $key = $space->{key};
    my $limit = 25;
    my $offset = 0;
    my $info = [];

    while (1) {
        my $pages = $self->get("content?spaceKey=$key&type=page&limit=$limit&start=$offset");
        say STDERR "Limit: $limit Offset: $offset Key: $key";
        # An error response has no results key, so stop instead of dying on undef
        my $res = $pages->{results} or last;
        push @$info, map { +{
              title => $_->{title},
              id => $_->{id},
              web => $self->wiki . $_->{_links}->{webui},
              edit => $self->wiki . $_->{_links}->{editui},
              api => $_->{_links}->{self} }
                       } @$res;
        last if @$res < $limit; # a short page is the last page
        $offset += $limit;
    }
    return $info;
}

=for post

get_all_pages just gets all pages in all spaces, attaching each space's page list under a pages key.

=cut

sub get_all_pages {
    my ($self) = @_;
    my $spaces = $self->list_spaces();
    foreach my $s (@$spaces) {
        $s->{pages} = $self->get_pages_for($s);
    }
    return $spaces;
}

=for post

get_page_content requires a hash:

( title => 'Page Title', spaceKey => 'whatever' )

Currently this returns Atlassian storage format, which seems to be a mix of XHTML and XML. I'd really like to be able to round-trip this to markdown so I can edit and create pages easily. But I'm currently at a loss as to how to proceed.
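
To show where the storage markup sits in the decoded response, here is a sketch against a hand-written sample. The JSON below is hypothetical and heavily trimmed to just the keys get_page_content reads, and it uses core JSON::PP so it runs without extra installs.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; the script itself uses JSON::MaybeXS

# Hypothetical, heavily trimmed response. Only the keys that
# get_page_content reads are included.
my $response = <<'JSON';
{
  "results": [
    {
      "id": "12345",
      "title": "Page Title",
      "body": { "storage": { "value": "<p>Hello <strong>wiki</strong></p>" } }
    }
  ]
}
JSON

my $data    = JSON::PP->new->decode($response);
my $storage = $data->{results}[0]{body}{storage}{value};
print "$storage\n";    # <p>Hello <strong>wiki</strong></p>
```

A lookup by title and space key comes back as a list, which is why the method takes the first element of results.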

=cut

sub get_page_content {
    my ($self, %location) = @_;
    my $q = $self->query(%location, expand => 'body.storage');
    my $res = $self->get("/content$q");
    return ref $res ?  $res->{results}->[0]->{body}->{storage}->{value} : undef;
}


1; # end package on truth :)

About kd

Australian perl hacker. Lead author of the Definitive Guide to Catalyst. Dabbles in javascript, social science and statistical analysis.