APM: processing lots of files?

Sam Foster austin.pm at sam-i-am.com
Fri Apr 23 12:47:53 CDT 2004


So I'm still working on this one.
Just now I ran a script that crawled a directory structure to identify 
"empty" directories (directories that had only some boilerplate 
properties files and no actual data), which produced a list of around 5 
thousand matches.
It took a while.
Now I've taken that list, split it into 4 and given each piece to a 
rmtree script. I did this by cutting and pasting the lines into new text 
files, and creating new command prompts to start each instance of my 
script. This gives me 4 separate processes running in parallel, each 
tackling a part of the task.

What I'd like is a wrapper that does this for me. I give it the script 
filename, the filelist and perhaps the number of clones to create, and 
have it basically do the above for me.

But a system() call waits for the process to finish before continuing, so 
I'm not sure how to achieve this. I've looked at some forking code but 
I'll admit to being a little daunted.
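
For concreteness, here is a minimal sketch of the kind of wrapper described 
above, using fork and exec so that the parent does not block the way it 
would with system(). It is an illustration under assumptions, not a 
known-good answer: the names split_list and run_parallel, the chunk.* 
temp-file naming, and the idea that the worker script takes a list file as 
its only argument are all made up for this example. On Windows, Perl 
emulates fork with threads, so behaviour there may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deal the items round-robin into at most $n chunks (array refs).
sub split_list {
    my ($n, @items) = @_;
    $n = 1 if !$n || $n < 1;
    my @chunks = map { [] } 1 .. $n;
    push @{ $chunks[ $_ % $n ] }, $items[$_] for 0 .. $#items;
    return grep { @$_ } @chunks;    # drop chunks that ended up empty
}

# Write each chunk to a temp list file, fork one child per chunk, and
# have each child exec the worker script on its own list file.
# Returns the number of children that exited with a non-zero status.
sub run_parallel {
    my ($script, $n, @items) = @_;
    my %listfile_for;    # child pid => temp list file
    my $i = 0;
    for my $chunk (split_list($n, @items)) {
        my $listfile = "chunk.$$." . $i++ . ".txt";    # hypothetical naming
        open my $fh, '>', $listfile or die "Cannot write $listfile: $!";
        print {$fh} "$_\n" for @$chunk;
        close $fh or die "Cannot close $listfile: $!";

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: replace this process with the worker script.
            exec($^X, $script, $listfile) or die "exec failed: $!";
        }
        $listfile_for{$pid} = $listfile;    # parent carries straight on
    }

    # Parent: collect every child so none is left behind as a zombie.
    my $failed = 0;
    while ((my $pid = wait()) != -1) {
        $failed++ if $? != 0;
        unlink $listfile_for{$pid} if exists $listfile_for{$pid};
    }
    return $failed;
}

# Hypothetical usage: perl wrapper.pl worker.pl filelist.txt 4
if (@ARGV >= 2) {
    my ($script, $filelist, $n) = @ARGV;
    open my $in, '<', $filelist or die "Cannot read $filelist: $!";
    chomp(my @items = <$in>);
    close $in;
    my $failed = run_parallel($script, $n || 4, grep { length } @items);
    print "$failed worker(s) reported failure\n";
    exit($failed ? 1 : 0);
}
```

The parent never waits inside the loop, so all the children are started 
before any of them is collected. That is the difference from calling 
system() once per chunk.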

I also looked at Parallel::Jobs on CPAN and took a stab at using it, 
without success: the child processes weren't terminating, nor did 
they seem to be running in parallel.

Any pointers?

Sam


