[Omaha.pm] More 10m hackery - date format conversion in flat files for DB load

Jay Hannah jay at jays.net
Tue May 17 15:12:12 PDT 2005


Problem:

You have a database unload file like this:

 > cat in
qweq|qwedqd|12/03/2003|12321|12/01/2002|XXX|
qweq|qwedqd|12/03/2003|12321|12/01/2002|XXX|
...etc...

But you need those pesky date fields to look like this so MS-SQL will 
load it (bcp):

qweq|qwedqd|2003-12-03 00:00:00|12321|2002-12-01 00:00:00|XXX
qweq|qwedqd|2003-12-03 00:00:00|12321|2002-12-01 00:00:00|XXX
...etc...


Solutions:

A hard coded solution to convert the first date only:
=======================================================
#!/usr/bin/perl

open (IN, "tndns_dly_arr.unl");
while (<IN>) {
    chomp;
    @l = split /\|/;
    @d = split /\//, $l[2];
    $l[2] = sprintf("%04d-%02d-%02d 00:00:00", @d[2,0,1]);
    print join "|", @l;
    print "|\n";
}
close IN;
=======================================================


An argument based version that accepts an arbitrary array indicating 
which fields should be converted:
=======================================================
 > cat j.pl

while (<STDIN>) {
    chomp;
    $line = $_;
    foreach (@ARGV) {
       $line = conv_date($line, $_);
    }
    print "$line\n";
}

sub conv_date {
    my ($line, $pos) = @_;
    @l = split /\|/, $line;
    @d = split /\//, $l[$pos];
    $l[$pos] = sprintf("%04d-%02d-%02d 00:00:00", @d[2,0,1]);
    my $ret = join "|", @l;
    return $ret;
}
=======================================================

You'd use the param version like this:

 > cat in | perl j.pl 2 4
qweq|qwedqd|2003-12-03 00:00:00|12321|2002-12-01 00:00:00|XXX
qweq|qwedqd|2003-12-03 00:00:00|12321|2002-12-01 00:00:00|XXX
 >


(
   Then load your converted data into MS-SQL w/ bcp:
   bcp dss.dbo.tndns_dly_arr in t.unl -U sa -c -t "|" -r "|\n" -m 1000000
)

j



More information about the Omaha-pm mailing list