Day 5: Look ma, no 'argument list too long'! (App::rmhere)
About the series: perlancar's 2014 Advent Calendar: Introduction to a selection of 24 modules which I published in 2014. Table of contents.
If you spend enough time on a Unix/Linux shell, sooner or later you'll face the task of removing files from a directory that contains a lot of them (in my case, it's usually a Maildir full of spam/unread emails). The directory will contain so many files (50k, or even millions) that rm * fails with the annoying "Argument list too long" message, because the * wildcard is expanded by the shell into a multimegabyte list of filenames that exceeds the kernel's limit on the total size of arguments passed to a command (ARG_MAX). To empty such a directory you have to resort to tricks, like feeding the filenames to xargs in batches, or going up one level, deleting the containing directory, and recreating it afterwards.
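To make those tricks concrete, here's roughly what they look like (an illustration on my part; the directory name is a placeholder):

    # feed filenames to rm in manageable batches instead of one giant list
    find . -maxdepth 1 -type f -print0 | xargs -0 -n 1000 rm

    # or: remove the whole container directory and recreate it
    cd .. && rm -rf bigdir && mkdir bigdir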
So I wrote rmhere (distributed in App-rmhere). There are admittedly other, faster and smaller, scripts for this task (rmhere uses the Perinci::CmdLine CLI framework and thus has a bunch of dependencies). However, I picked the framework for its progress reporting feature. Because that's the other thing you'll discover when deleting files in a big directory: it takes a loong time. rmhere -P will display a nice progress bar on your terminal.
By default rmhere will delete all files in the current directory. To prevent silly accidents, if you don't specify the -f (force) option, it will prompt for confirmation before deleting each file. Most of the time you'll want rmhere -fP (or with an extra -R to do recursive deletion).
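For illustration, the typical invocations, using only the options described above:

    % rmhere -fP        # delete all files here: no per-file prompt, show a progress bar
    % rmhere -fP -R     # ditto, but also recurse into subdirectories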
Perl to the rescue: case study of deleting a large directory
Yup, I remember that post of Randal's. And the above is essentially what the rmhere script does, with some extra options and with progress reporting as the main raison d'être.
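If you don't remember it, the heart of that approach is an opendir/readdir/unlink loop, which deletes entries one at a time and so never builds a giant argument list. Here's a minimal sketch in that spirit (my own illustration, not rmhere's actual code), with a crude progress counter bolted on:

    use strict;
    use warnings;

    $| = 1;  # autoflush, so the progress line updates immediately

    opendir my $dh, "." or die "Can't open current directory: $!";
    my $deleted = 0;
    while (defined(my $entry = readdir $dh)) {
        next unless -f $entry;  # skip '.', '..', and subdirectories
        if (unlink $entry) {
            $deleted++;
            # crude progress report, updated every 1000 deletions
            printf "\r%d files deleted", $deleted if $deleted % 1000 == 0;
        } else {
            warn "Can't delete $entry: $!";
        }
    }
    closedir $dh;
    print "\r$deleted files deleted\n";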
BTW, another way to delete the contents of a huge directory (mentioned in rmhere's POD) is: "find . -maxdepth 1 -type f -delete" (GNU find wants -maxdepth before the -type test). This is probably rather speedy too.