App::PipeFilter for Top-N Reports

I'm testing some software at work by replaying pcap files against the application. I want to make sure the results in the database match what's in the original packet dump. There are hundreds of packet producers, so I want to focus on the top ten to make better use of my time.

I've written a utility to dump interesting packet data as streams of JSON objects, one per packet. Each object includes the source and destination IP and port, among other things.

% jcut -o src_ip -o dest_ip deleteme.json | head -3

{"dest_ip":"10.10.91.77","src_ip":"10.16.250.39"}
{"dest_ip":"10.10.91.77","src_ip":"10.0.250.80"}
{"dest_ip":"10.10.91.77","src_ip":"10.90.250.39"}

A "top N" report for any single field is trivial: extract the values for that field and pass them through sort(1), uniq(1), and tail(1).

% jcut -o src_ip deleteme.json | sort | uniq -c | sort -n | tail -10

 514 {"src_ip":"10.0.250.21"}
 544 {"src_ip":"10.0.250.100"}
 560 {"src_ip":"10.13.250.71"}
 565 {"src_ip":"10.40.250.7"}
 611 {"src_ip":"10.60.250.79"}
 628 {"src_ip":"10.0.50.6"}
 807 {"src_ip":"10.0.250.20"}
1223 {"src_ip":"10.10.250.239"}
2448 {"src_ip":"10.0.250.60"}
2508 {"src_ip":"10.0.250.30"}
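
For anyone who wants to try the counting stages without a packet dump, here is a minimal sketch that wraps them in a shell function and feeds it a few made-up records. The names top_n and sample are hypothetical helpers, and the addresses are documentation-range placeholders, not data from my capture.

```shell
#!/bin/sh
# top_n N: read lines on stdin, print the N most frequent with their counts.
# (Hypothetical helper; it is only the sort/uniq/sort/tail stages from above.)
top_n() {
  # sort       puts identical lines next to each other
  # uniq -c    collapses each run into "count line"
  # sort -n    orders by that leading count, ascending
  # tail -n N  keeps the N largest, biggest last
  sort | uniq -c | sort -n | tail -n "$1"
}

# sample: made-up stand-in for single-field output, one JSON object per line.
sample() {
  printf '%s\n' \
    '{"src_ip":"192.0.2.1"}' \
    '{"src_ip":"192.0.2.2"}' \
    '{"src_ip":"192.0.2.1"}' \
    '{"src_ip":"192.0.2.3"}' \
    '{"src_ip":"192.0.2.1"}' \
    '{"src_ip":"192.0.2.2"}'
}

sample | top_n 2
```

This should list 192.0.2.2 with a count of 2, then 192.0.2.1 with a count of 3. The padding in front of each count depends on which uniq(1) is installed. The first sort matters because uniq(1) only collapses adjacent duplicates.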

I'm also alpha testing json2tsv so I can import this into a spreadsheet. A happy side effect is that tab-separated columns are easier on the eyes.

% json2tsv -o src_ip deleteme.json | sort | uniq -c | sort -n | tail -10

 514 10.0.250.21
 544 10.0.250.100
 560 10.13.250.71
 565 10.40.250.7
 611 10.60.250.79
 628 10.0.50.6
 807 10.0.250.20
1223 10.10.250.239
2448 10.0.250.60
2508 10.0.250.30
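
One wrinkle for the spreadsheet case: uniq -c pads its counts with spaces, so the counted report itself is not tab-separated. The sketch below is one way to fix that with awk(1). counts_to_tsv is a hypothetical helper, not part of App::PipeFilter, and the input values are made up.

```shell
#!/bin/sh
# counts_to_tsv: turn "  count value" lines from uniq -c into "value<TAB>count".
# (Hypothetical helper, not an App::PipeFilter tool.)
counts_to_tsv() {
  awk '{
    count = $1                          # first field is the count
    sub(/^[ \t]*[0-9]+[ \t]+/, "")      # strip the padding and count, keep the value
    printf "%s\t%s\n", $0, count
  }'
}

# Made-up values standing in for one column of tab-separated output.
printf '%s\n' 192.0.2.1 192.0.2.2 192.0.2.1 |
  sort | uniq -c | sort -n | counts_to_tsv
```

Stripping the count with sub() instead of printing $2 keeps values that contain spaces intact.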

json2tsv is in App::PipeFilter's repository, and it's scheduled to appear in the next CPAN release.

About Rocco Caputo

Among other things I write software, a lot of which is in Perl.