Almost Reinventing XPath for YAML

I think I started to reinvent XPath. I hate when that happens.

It started simply enough. I had a directory with 12,000 YAML files in it. I wanted to grep them, but I can't do that from the command line:

 -bash: /usr/bin/grep: Argument list too long

Although slightly annoying, this isn't a big deal. I can write a Perl script to do the job and get the files through opendir. I hardcoded the bits that I wanted to get out, and when I needed something else, I just changed the source.
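The error comes from the shell expanding the glob into more arguments than a single command can accept, so reading the directory inside the program sidesteps it. As a rough illustration of that idea, here is a minimal sketch in Python rather than Perl. It is not the script described here; the function name `grep_dir` and the `.yml` suffix filter are assumptions made for the example.

```python
import os
import re


def grep_dir(directory, pattern, suffix=".yml"):
    """Yield (filename, line_number, line) for every matching line.

    The directory is read entry by entry, so no shell glob is expanded
    and the argument-list limit never comes into play.
    """
    regex = re.compile(pattern)
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and files without the wanted suffix.
            if not (entry.is_file() and entry.name.endswith(suffix)):
                continue
            with open(entry.path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield entry.name, number, line.rstrip("\n")
```

Calling `grep_dir(".", "dist_file")` would walk the current directory and yield one tuple per matching line, in whatever order the operating system lists the files.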

I made a couple of these quick scripts before I realized I was going to be doing this a lot. I refactored it so I could specify on the command line the path to the YAML thingy I wanted. It was originally called extract, but I renamed it ypath to put it on GitHub.

 % ypath dist_info/dist_file *.yml

I could do it for multiple values:

 % ypath dist_info/dist_file,dist_info/module_list *.yml

This was all fine for a while. But then I had a case where I needed to extract an array element, so I needed to handle array indices:

 % ypath dist_info/dist_file,dist_info/module_list/2/md5 *.yml
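The path syntax so far (slash-separated steps, comma-separated paths, numeric steps for array elements) could be handled along these lines. This is a Python sketch over already-parsed data, not ypath's actual Perl code. The names `resolve` and `extract`, the sample document, and the choice of zero-based indices are all assumptions for illustration.

```python
def resolve(data, path):
    """Follow a slash-separated path through nested dicts and lists."""
    node = data
    for step in path.split("/"):
        if isinstance(node, list):
            node = node[int(step)]  # assumed zero-based array index
        else:
            node = node[step]       # any other step is treated as a hash key
    return node


def extract(data, spec):
    """Resolve each comma-separated path in spec, in order."""
    return [resolve(data, path) for path in spec.split(",")]


# Hypothetical document shaped like the paths in the examples above.
doc = {
    "dist_info": {
        "dist_file": "Foo-1.0.tar.gz",
        "module_list": [
            {"md5": "aaa"},
            {"md5": "bbb"},
            {"md5": "ccc"},
        ],
    }
}

print(extract(doc, "dist_info/dist_file,dist_info/module_list/2/md5"))
# prints ['Foo-1.0.tar.gz', 'ccc']
```

A missing key or an out-of-range index raises an exception in this sketch. A real tool would have to decide whether to skip such files or report them.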

Then, I wanted to handle all the elements of an array if I ran into one, so I wanted to throw in an @ globby sort of thing:

 % ypath dist_info/dist_file,dist_info/module_list/@/md5 *.yml

Or maybe all of the keys of a hash:

 % ypath dist_info/build_file/% *.yml

But, at that point I realized what was happening and didn't go on. I didn't really need those bits even though I thought they would be cool. Unless I really, really, really need it to do something else, that's where I'm stopping with it.

Not only that, surely someone else must have already made a much better tool to do the same thing, even if I can't find it.
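For the curious, the @ and % wildcards described above might look something like the following. This is a hypothetical Python sketch, not anything ypath implements. It assumes that @ visits every element of an array, that % visits the value under every key of a hash, and that a path matching nothing silently yields nothing. The name `find_all` and the sample data are made up for the example.

```python
def expand(node, steps):
    """Yield every value reached by following steps from node."""
    if not steps:
        yield node
        return
    step, rest = steps[0], steps[1:]
    if step == "@":
        # Assumed meaning: visit every element of an array.
        if isinstance(node, list):
            for item in node:
                yield from expand(item, rest)
    elif step == "%":
        # Assumed meaning: visit the value under every key of a hash.
        if isinstance(node, dict):
            for value in node.values():
                yield from expand(value, rest)
    elif isinstance(node, list):
        # A plain numeric step selects one array element (zero-based).
        if step.isdigit() and int(step) < len(node):
            yield from expand(node[int(step)], rest)
    elif isinstance(node, dict) and step in node:
        yield from expand(node[step], rest)


def find_all(data, path):
    """Return every value matched by a slash-separated path."""
    return list(expand(data, path.split("/")))


# Hypothetical document; the real files may be shaped differently.
doc = {
    "dist_info": {
        "module_list": [{"md5": "aaa"}, {"md5": "bbb"}],
        "build_file": {"Makefile.PL": 1, "Build.PL": 0},
    }
}

print(find_all(doc, "dist_info/module_list/@/md5"))  # prints ['aaa', 'bbb']
print(find_all(doc, "dist_info/build_file/%"))       # prints [1, 0]
```

Once a single path can match several values, the function has to return a list instead of one value, which is roughly the point where this starts to resemble XPath.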

4 Comments

I like it. Perhaps this weekend I'll hack on it a bit and turn it into an App::* module with an API for extracting YPaths. I can see a lot of uses for this kind of thing.

Just a note about the grep, "grep -r ." in the directory with the files would have worked.

But of course a Perl based solution is much better and faster :)

My thought was:

ls|xargs grep


About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).