Testing for HTTP compression
How do I determine the content-encoding of a web page? It is a simple question, but a little searching did not turn up a simple answer. A Stack Overflow question led me to the solution but did not answer the question directly, so here I am writing it up. We will need to install these modules first:
cpan Compress::Zlib LWP::UserAgent
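One way to run such a check, sketched here on the assumption that it relies on HTTP::Message::decodable (HTTP::Message is installed alongside LWP). The variable name $can_accept is only illustrative:

```perl
use strict;
use warnings;
use HTTP::Message;

# decodable() lists the content encodings this installation can decode.
# In scalar context it returns them as one comma-separated string.
my $can_accept = HTTP::Message::decodable();

print "$can_accept\n";
```

The exact list depends on which compression modules are installed, so your output may differ from mine.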
A quick test shows which formats your client can accept; on my system the output is 'gzip, x-gzip, deflate, x-bzip2'. Now let's fetch a web page and see what we get.
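A minimal sketch of such a fetch, assuming LWP::UserAgent and an Accept-Encoding header built from HTTP::Message::decodable. The URL is a guess for illustration only (the address actually requested is not shown here), and the timeout value is arbitrary:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Message;

# Advertise every encoding this client is able to decode.
my $can_accept = HTTP::Message::decodable();

my $ua = LWP::UserAgent->new( timeout => 10 );

# Hypothetical URL, chosen only for illustration.
my $url = 'http://www.reddit.com/.json';

# Extra key/value pairs passed to get() are sent as request headers.
my $response = $ua->get( $url, 'Accept-Encoding' => $can_accept );

# Print the status, then every response header as "Name: value" lines.
print $response->status_line, "\n";
print $response->headers->as_string;
```

If the request fails, LWP still returns a response object describing the error, so the status line is worth printing before trusting the headers.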
Looking through all the headers we can see that content-encoding is gzip for reddit. Now instead of dumping all the headers we could simply look at $response->header('content-encoding') and have what we need.
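To show the single-header lookup without touching the network, here is a small sketch that builds a gzip-compressed HTTP::Response by hand. The body text and variable names are made up for the example; only header() and decoded_content() are the point:

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Build a gzip-compressed response by hand so the example runs offline.
my $body = '{"kind": "Listing"}';
my $compressed;
gzip( \$body => \$compressed ) or die "gzip failed: $GzipError";

my $response = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'application/json; charset=UTF-8',
        'Content-Encoding' => 'gzip',
    ],
    $compressed,
);

# Header names are case-insensitive; a missing header comes back as undef.
my $encoding = $response->header('content-encoding') // 'none';
print "content-encoding: $encoding\n";

# decoded_content() undoes the Content-Encoding for us.
my $decoded = $response->decoded_content;
print "$decoded\n";
```

The same two calls work on a response returned by LWP::UserAgent, which is where they are normally used.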
There is an interesting tidbit in the full headers of reddit:
======( $response->{_headers} [ 'delivery_formats.pl', line 19 ]======
bless({
"client-date" => "Sat, 20 Oct 2012 18:27:48 GMT",
"client-peer" => "165.254.27.97:80",
"client-response-num" => 1,
"connection" => "close",
"content-encoding" => "gzip",
"content-length" => 8346,
"content-type" => "application/json; charset=UTF-8",
"date" => "Sat, 20 Oct 2012 18:27:48 GMT",
"server" => "'; DROP TABLE servertypes; --",
"set-cookie" => "reddit_first=%7B%22firsttime%22%3A%20%22first%22%7D; Domain=reddit.com; expires=Thu, 31 Dec 2037 23:59:59 GMT; Path=/",
"vary" => "Accept-Encoding",
}, "HTTP::Headers")
Let's see: this is coming from the server "'; DROP TABLE servertypes; --". A SQL injection as the server name makes me smile. Obviously the reddit developers have read that xkcd comic. To protect your application against such an "attack", I would recommend reading bobby-tables.com, a guide to preventing SQL injection.
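As a rough illustration of the kind of fix that guide recommends, here is a sketch that stores the hostile Server value using a DBI placeholder. It assumes DBI and DBD::SQLite are installed; the in-memory database and the single-column table are invented for the example:

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database, used only so the sketch is self-contained.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 } );

# Hypothetical table, named after the one in the joke.
$dbh->do('CREATE TABLE servertypes (name TEXT)');

# The hostile value from the Server header.
my $server = q{'; DROP TABLE servertypes; --};

# The ? placeholder passes the value as data, never as SQL text.
my $sth = $dbh->prepare('INSERT INTO servertypes (name) VALUES (?)');
$sth->execute($server);

# The table still exists and holds the string exactly as received.
my ($stored) = $dbh->selectrow_array('SELECT name FROM servertypes');
print "stored: $stored\n";
```

Had the value been interpolated directly into the SQL string, the quote character would have ended the string literal early and let the rest of the value be read as SQL.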
Not only did this give me a good little chuckle, it also gave me a bit of a fright: I could very well have reviewed and approved code that is susceptible to such attacks!
I just realized that in my new role as the curator of a fairly large website's data analytics, based primarily on Hive (a SQL-like layer on top of Hadoop), I still need to watch for potential SQL injection attacks, even though my systems are several steps removed from the servers that receive and log this data!