[Melbourne-pm] Unix / Perl

Craig Sanders cas at taz.net.au
Sun Dec 16 12:46:42 PST 2007


On Sun, Dec 16, 2007 at 10:40:21PM +1100, Mirko Fluher wrote:
> Could someone help with this mixture of Unix and Perl.
> 
> foreach $_ (`tree -isuDRf /dir1/fileserver/dir2/documents`)
> {
>         next if(/^\/dir1/);
>         ($line1, $line2) = /(.*?)\/dir1\/fileserver\/dir2\/documents(\/.*$)/;
>         $line1 =~ tr/[]//d;
>         $_ = $line1;
>         ($userid, $size, $month, $day, $year) = /^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/;
>         print "$userid,$size,$month,$day,$year,$line2\n";
> }
> # to run this prog.
> # ./make_index_docu_csv.pl|sort +1 -rnt, > /dir1/fileserver/dir2/documents/documents_index.csv
> 
> I would like to have the whole thing in perl ... the problem I have at the moment is that 'tree'
> truncates userid  .. :(

you could use IO::All - it has directory recursion and file-statting
abilities.

start from something like:

---cut here---
#! /usr/bin/perl

use strict;
use warnings;

use IO::All;
use Date::Format;

my $dir = io('/dir1/fileserver/dir2/documents');

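# All() recurses through every subdirectory; sort largest first
foreach (sort {$b->size <=> $a->size} $dir->All) {
  # fall back to the numeric uid if it has no passwd entry
  my $username = getpwuid($_->uid) || $_->uid;
  my $date = time2str('%Y-%m-%d', $_->mtime);

  printf "%s,%s,%s,%s\n", $username, $_->size, $date, $_->name;
}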

---cut here---

Note: i have used YYYY-MM-DD as the date output format. this ISO format
is unambiguous, and it sorts correctly even as plain text, which
"mm,dd,yyyy" does not. if you really want "mm,dd,yyyy" (not recommended)
then change the format string given to time2str. better yet, if you use
one of the CSV modules (see next paragraph) then extract the month, day,
year into separate variables and print them as separate fields.
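
a minimal sketch of splitting a timestamp into separate fields, using
only the core localtime function. the helper name ymd_fields is made up
for this example, and the current time stands in for a file's mtime:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: split an epoch timestamp (e.g. a file's mtime)
# into zero-padded year, month and day strings, in local time.
sub ymd_fields {
    my ($epoch) = @_;
    my (undef, undef, undef, $mday, $mon, $year) = localtime($epoch);
    # localtime counts months from 0 and years from 1900
    return (sprintf('%04d', $year + 1900),
            sprintf('%02d', $mon + 1),
            sprintf('%02d', $mday));
}

# the current time is used here as a stand-in for $_->mtime
my ($year, $month, $day) = ymd_fields(time);
print join(',', $month, $day, $year), "\n";
```

in the loop above you would pass $_->mtime instead of time, and hand the
three values to whatever writes the CSV row.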


you probably also want to use DBD::CSV or Text::CSV_XS instead of just a
simple printf to make sure you produce correctly formatted CSV output
rather than just comma-separated (which will work in most cases, but not
if there are commas in any of the filenames)
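
as a rough sketch of what that looks like with Text::CSV_XS (the rows
below are made-up sample data, and csv_line is just a name invented for
this example), combine() does the quoting and string() returns the
finished line:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# binary => 1 lets fields contain arbitrary bytes (odd filenames)
my $csv = Text::CSV_XS->new({ binary => 1 })
    or die "cannot create CSV object: " . Text::CSV_XS->error_diag;

# hypothetical helper: turn a list of fields into one CSV line
sub csv_line {
    my @fields = @_;
    $csv->combine(@fields)
        or die "cannot encode row: " . $csv->error_input;
    return $csv->string;
}

# made-up sample rows: user, size, date, filename
my @rows = (
    [ 'alice', 1024, '2007-12-16', '/docs/report, final.txt' ],
    [ 'bob',   2048, '2007-12-15', '/docs/plain.txt' ],
);

print csv_line(@$_), "\n" for @rows;
```

the filename containing a comma comes out wrapped in double quotes, so
it still reads back as a single field.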


there's a very valuable lesson about perl here: there will almost always
be a perl module in CPAN to do exactly what you need... and it will do
it properly, so use it rather than implementing your own quick-and-dirty
partial solution.


craig

-- 
craig sanders <cas at taz.net.au>

BOFH excuse #33:

piezo-electric interference

