Almost Reinventing XPath for YAML
I think I started to reinvent XPath. I hate when that happens.
It started simply enough. I had a directory with 12,000 YAML files in it. I wanted to grep them, but I couldn't do that from the command line:
-bash: /usr/bin/grep: Argument list too long
Although slightly annoying, this isn't a big deal. I can write a Perl script to do the job and get the files through opendir. I hardcoded the bits that I wanted to get out, and when I needed something else, I just changed the source.
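The reason reading the directory from inside a program sidesteps the error is that the shell never has to expand 12,000 filenames into one argument list. Here is a minimal sketch of that idea, written in Python rather than the Perl of the original script; the function name `grep_dir` and the `.yml` suffix filter are assumptions made for illustration only.

```python
import os
import re


def grep_dir(directory, pattern, suffix=".yml"):
    """Return (filename, line_number, line) for every matching line."""
    regex = re.compile(pattern)
    matches = []
    # The program reads the directory itself, so no long argument
    # list is ever handed to a command.
    with os.scandir(directory) as entries:
        names = sorted(
            e.name for e in entries
            if e.is_file() and e.name.endswith(suffix)
        )
    for name in names:
        path = os.path.join(directory, name)
        with open(path, encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if regex.search(line):
                    matches.append((name, lineno, line.rstrip("\n")))
    return matches
```

Calling `grep_dir(".", r"dist_file")` would then play the role of `grep dist_file *.yml` without hitting the argument limit.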
I made a couple of these quick scripts before I realized I was going to be doing this a lot. I refactored it so I could specify on the command line the path to the YAML thingy I wanted, and called it, well, it was called extract, but I renamed it ypath to put it on GitHub.
% ypath dist_info/dist_file *.yml
I could do it for multiple values:
% ypath dist_info/dist_file,dist_info/module_list *.yml
This was all fine for a while. But then I had a case where I needed to extract an array element, so I needed to handle array indices:
% ypath dist_info/dist_file,dist_info/module_list/2/md5 *.yml
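The path handling so far (hash keys, array indices, and comma-separated lists of paths) can be sketched roughly as below. This is not the ypath source, which is Perl; it is a Python illustration that assumes the YAML has already been parsed into nested dicts and lists, and the names `extract` and `extract_many` are hypothetical.

```python
def extract(data, path):
    """Follow a slash-separated path through nested dicts and lists.

    Returns None when any step of the path does not exist.
    """
    node = data
    for step in path.split("/"):
        if isinstance(node, list):
            # Assumption: a step applied to an array must be a
            # non-negative integer index within range.
            if not step.isdigit() or int(step) >= len(node):
                return None
            node = node[int(step)]
        elif isinstance(node, dict):
            if step not in node:
                return None
            node = node[step]
        else:
            # Hit a scalar before the path was used up.
            return None
    return node


def extract_many(data, spec):
    """Handle a comma-separated list of paths, one result per path."""
    return [extract(data, path) for path in spec.split(",")]
```

With a parsed document shaped like `{"dist_info": {"dist_file": "...", "module_list": [...]}}`, `extract_many(doc, "dist_info/dist_file,dist_info/module_list/2/md5")` mirrors the command line above.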
Then, I wanted to handle all the elements of an array if I ran into one, so I wanted to throw in an @ globby sort of thing:
% ypath dist_info/dist_file,dist_info/module_list/@/md5 *.yml
Or maybe all of the keys of a hash:
% ypath dist_info/build_file/% *.yml
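For the curious, here is one way the two wildcards might behave, again as a Python sketch over already-parsed data rather than anything taken from ypath. The semantics are guesses: `@` fans out over every element of an array, and `%` yields the key names of a hash (so it is mainly useful as the last step). The name `extract_all` is hypothetical.

```python
def extract_all(data, path):
    """Return every value matched by a path that may contain @ or %."""
    nodes = [data]
    for step in path.split("/"):
        next_nodes = []
        for node in nodes:
            if step == "@" and isinstance(node, list):
                # Assumed meaning of @: every element of the array.
                next_nodes.extend(node)
            elif step == "%" and isinstance(node, dict):
                # Assumed meaning of %: the key names of the hash.
                next_nodes.extend(node.keys())
            elif (isinstance(node, list) and step.isdigit()
                    and int(step) < len(node)):
                next_nodes.append(node[int(step)])
            elif isinstance(node, dict) and step in node:
                next_nodes.append(node[step])
            # Anything else simply fails to match and is dropped.
        nodes = next_nodes
    return nodes
```

Because each step maps a list of candidate nodes to a new list, a wildcard is just a step that can produce more than one result, which is exactly the point where this starts to look like XPath.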
But at that point I realized what was happening and didn't go on. I didn't really need those bits even though I thought they would be cool. Unless I really, really, really need it to do something else, that's where I'm stopping with it.
Not only that, surely someone else must have already made a much better tool to do the same thing, even if I can't find it.
I like it. Perhaps this weekend I'll hack on it a bit and turn it into an App::* module with an API for extracting YPaths. I can see a lot of uses for this kind of thing.
Florian just pointed me to App::DPath, which is the tool I would have started with if I knew what I was doing. It handles YAML, JSON, and much more.
Just a note about the grep: "grep -r PATTERN ." in the directory with the files would have worked.
But of course a Perl-based solution is much better and faster :)
My thought was:
ls|xargs grep