[Pdx-pm] UNIX NL versus MS(DOS) CRLF
Thomas J Keller
kellert at ohsu.edu
Tue Aug 31 14:36:51 CDT 2004
David Cross gave a line-ending conversion filter in his "Data Munging with
Perl" book. I modified it to my taste:
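#!/usr/bin/perl
## adapted by TJK from David Cross "Data Munging with Perl"
## uses the ASCII control chars to force conversion of line endings for
## use on other platforms by "retro" programs
use strict;
use warnings;

my %conv = ( Mac  => "\cM",     # carriage return
             Unix => "\cJ",     # line feed
             Win  => "\cM\cJ",  # both
           );

# Reject a wrong argument count or an unknown format name
(@ARGV == 2 and exists $conv{$ARGV[0]} and exists $conv{$ARGV[1]})
    or die "Usage: $0 SOURCE TARGET < infile > outfile\n"
         . "Converts line endings of STDIN. SOURCE and TARGET are each one of Mac, Win, Unix.\n";
my ($src, $tgt) = map { $conv{$_} } @ARGV;

# Work on raw bytes so Perl's own newline translation stays out of the way
binmode STDIN;
binmode STDOUT;

$/ = $src;                    # read one source-format line at a time
while (<STDIN>) {
    s/\Q$src\E\z/$tgt/;       # swap the trailing line ending
    print;
}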
EXAMPLE:
$ perl line_ending_filter Unix Win < ./text.txt > ./text.doc
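If the source format is not known in advance (Rod's question in the quoted message below), one rough approach is to count the line endings in a sample of the file and pass the winner as the filter's first argument. This is only a sketch: guess_format is a made-up helper name, the 4096-byte sample size is arbitrary, and a file with mixed endings will simply get the majority answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: guess which of the filter's format names
# (Mac, Unix, Win) fits a chunk of raw bytes. Returns undef if the
# chunk contains no line endings at all.
sub guess_format {
    my ($bytes) = @_;
    my $crlf = () = $bytes =~ /\cM\cJ/g;        # CR LF pairs
    my $cr   = () = $bytes =~ /\cM(?!\cJ)/g;    # CR not followed by LF
    my $lf   = () = $bytes =~ /(?<!\cM)\cJ/g;   # LF not preceded by CR
    return unless $crlf || $cr || $lf;
    return 'Win' if $crlf >= $cr && $crlf >= $lf;
    return $cr > $lf ? 'Mac' : 'Unix';
}

# Given a file name, sample its first 4096 bytes and print the guess
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Can't open $ARGV[0]: $!";
    binmode $fh;                                # raw bytes, no translation
    defined read($fh, my $sample, 4096) or die "Can't read $ARGV[0]: $!";
    close $fh;
    my $fmt = guess_format($sample);
    print defined $fmt ? $fmt : 'unknown', "\n";
}
```

Saved under some name of your choosing, it takes the file to inspect as its only argument and prints Mac, Unix, Win or unknown.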
On Aug 31, 2004, at 11:40 AM, Roderick A. Anderson wrote:
> Tom Phoenix wrote:
>
>> On Tue, 31 Aug 2004, Roderick A. Anderson wrote:
>>
>>
>>> I'm using Mark Overmeer's Mail::Box collection to process spam folders
>>> (mailboxen) and any time the process (re)writes the file I end up with
>>> NLs instead of CRLFs. This causes all kinds of heartache and
>>> discontent with the users as they can't pop the messages back using
>>> Outlook.
>>>
>>>
>>
>> It sounds like you're using code which wasn't written to be portable to
>> non-Unix machinery. If you're willing to fix it, you could perhaps hack it
>> to use $\ , which could be set at the time the file is opened. I haven't
>> looked at the code, though; it might be easier to search for uses of \n in
>> the code and replace each one with $LINE_ENDING or some such.
>>
>>
>
> Thanks Tom,
>
> Well I think the code is portable but my implementation isn't that good.
>
> The files are on a Windows machine.
> The directory (folder) is shared (exported).
> The share is mounted on a Linux box.
> The program is running on the Linux box.
>
> Is there some way for the module to determine this? Well, besides
> reading several lines of the file looking for CRLF or CR (Mac) versus
> NL. I'm not sure how the files are being opened, whether with just an
> open() or some other lower-level method. The module collection is a
> monster but using it means I can point fingers when things don't work
> quite right. (Then help debug where I can -- like now. :-)
> Right now I'm looking for a quick fix and will be trying Joe's
> suggestion until I can get it done up right.
>
>> If you don't want to change the code, you could post-process the mailbox
>> file to put the CRLFs back in. That's going to be slower, though, compared
>> to fixing the module.
>>
>>
> I'd rather not mess with others' work.
>
>> Speaking of portability, be sure to use binmode if you're going to handle
>> line ends directly, so that Perl won't garble your files.
>>
>> Good luck with it!
>>
>>
> Thanks again,
>
>
> Rod
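Tom's two suggestions quoted above, setting $\ when the file is opened and calling binmode, might be combined along these lines. This is a minimal sketch and not how the Mailbox collection actually writes its files: write_lines is a made-up helper, and it assumes the lines handed to it have already been chomped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: write already-chomped lines to $path, ending
# each one with $eol (for example "\cM\cJ" for CRLF).
sub write_lines {
    my ($path, $eol, @lines) = @_;
    open my $out, '>', $path or die "Can't write $path: $!";
    binmode $out;        # keep Perl's I/O layer from translating newlines
    local $\ = $eol;     # print appends this after every call in this scope
    print {$out} $_ for @lines;
    close $out or die "Can't close $path: $!";
    return;
}

# Example call (file name and contents are made up):
# write_lines('out.txt', "\cM\cJ", 'first line', 'second line');
```

Because $\ is set with local, the CRLF ending applies only inside the helper and any other print calls in the program keep their usual behaviour.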