App::PipeFilter for Top-N Reports
I'm testing some software at work by replaying pcap files against the application. I want to make sure the results in the database match what's in the original packet dump. There are hundreds of packet producers, so I want to focus on the top ten to make better use of my time.
I've written a utility to dump interesting packet data as streams of JSON objects, one per packet. Each object includes the source and destination IP and port, among other things.
% jcut -o src_ip -o dest_ip deleteme.json | head -3
{"dest_ip":"10.10.91.77","src_ip":"10.16.250.39"}
{"dest_ip":"10.10.91.77","src_ip":"10.0.250.80"}
{"dest_ip":"10.10.91.77","src_ip":"10.90.250.39"}
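A "top N" report for any single field is trivial. Extract the values for that field, then pass them to sort(1), uniq(1) and tail(1). The first sort groups identical values, uniq -c counts each group, sort -n orders the groups by count, and tail keeps the largest: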
% jcut -o src_ip deleteme.json | sort | uniq -c | sort -n | tail -10
514 {"src_ip":"10.0.250.21"}
544 {"src_ip":"10.0.250.100"}
560 {"src_ip":"10.13.250.71"}
565 {"src_ip":"10.40.250.7"}
611 {"src_ip":"10.60.250.79"}
628 {"src_ip":"10.0.50.6"}
807 {"src_ip":"10.0.250.20"}
1223 {"src_ip":"10.10.250.239"}
2448 {"src_ip":"10.0.250.60"}
2508 {"src_ip":"10.0.250.30"}
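A pipeline typed this often could be wrapped in a shell function. This is only a minimal sketch: top_n is a name made up for illustration and is not part of App::PipeFilter. It reads lines on standard input and prints the N most frequent ones with their counts, smallest first, in the same order as the report above.

```shell
# top_n is a hypothetical helper, not part of App::PipeFilter.
# Usage: some-command | top_n [N]
top_n() {
    n="${1:-10}"        # default to a top-ten report
    sort |              # group identical lines together
        uniq -c |       # collapse each group to "count line"
        sort -n |       # order by count, ascending
        tail -n "$n"    # keep the N largest counts
}
```

With that defined, the report would become:
% jcut -o src_ip deleteme.json | top_n 10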
I'm also alpha testing json2tsv so I can import this into a spreadsheet. A happy side effect is that tab-separated columns are easier on the eyes.
% json2tsv -o src_ip deleteme.json | sort | uniq -c | sort -n | tail -10
514 10.0.250.21
544 10.0.250.100
560 10.13.250.71
565 10.40.250.7
611 10.60.250.79
628 10.0.50.6
807 10.0.250.20
1223 10.10.250.239
2448 10.0.250.60
2508 10.0.250.30
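One wrinkle for the spreadsheet import: uniq -c separates the count from the value with a space, not a tab, so the report itself isn't quite tab-separated. A small awk step could fix that. This is a sketch, assuming uniq -c writes optional leading spaces, the count, and a single space before the original line; counts_to_tsv is a made-up name, not something shipped with App::PipeFilter.

```shell
# counts_to_tsv is a hypothetical helper, not part of App::PipeFilter.
# Turns "   514 value" lines from uniq -c into "514<TAB>value".
counts_to_tsv() {
    awk '{
        count = $1                  # the count is the first field
        sub(/^ *[0-9]+ /, "")       # strip padding, count and one space
        print count "\t" $0         # what remains is the original line
    }'
}
```

It would go at the end of the pipeline:
% json2tsv -o src_ip deleteme.json | sort | uniq -c | sort -n | tail -10 | counts_to_tsv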
json2tsv is in App::PipeFilter's repository, and it's scheduled to appear in the next CPAN release.