[Pdx-pm] UNIX NL versus MS(DOS) CRLF

Thomas J Keller kellert at ohsu.edu
Tue Aug 31 14:36:51 CDT 2004


David Cross gave a line end conversion filter in his "Data Munging with 
Perl" book. I modified it to my taste:
#!/usr/bin/perl -w
## adapted by TJK from David Cross "Data Munging with Perl"
## uses the ASCII control chars to force conversion of line endings for
## use on other platforms by "retro" programs
use strict;

(@ARGV == 2) or die "Usage: Converts line endings of STDIN. Needs two "
	. "args, source and target formats (Mac, Win, Unix).\n";

my ($src, $tgt) = @ARGV;

my %conv = (	Mac => "\cM",		#carriage-return
				Unix => "\cJ",		#line-feed
				Win => "\cM\cJ"		#both
			);
				
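# reject unknown format names instead of silently converting with undef
for ($src, $tgt) {
	exists $conv{$_} or die "Unknown format '$_' (expected Mac, Win or Unix)\n";
}
$src = $conv{$src};
$tgt = $conv{$tgt};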

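# work on raw bytes so Perl does not translate line endings on its own
binmode STDIN;
binmode STDOUT;

# read one record per source line ending
$/ = $src;

while (<STDIN>) {
	s/\Q$src\E\z/$tgt/;	# swap the trailing line ending, if present
	print;
}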

EXAMPLE:
$ perl line_ending_filter Unix Win < ./text.txt > ./text.doc

On Aug 31, 2004, at 11:40 AM, Roderick A. Anderson wrote:

> Tom Phoenix wrote:
>
>> On Tue, 31 Aug 2004, Roderick A. Anderson wrote:
>>
>>
>>> I'm using Mark Overmeer's Mailbox collection to process spam folders
>>> (mailboxen) and any time the process (re)writes the file I end up with
>>> NLs instead of CRLFs.  This causes all kinds of heart-ache and
>>> discontent with the users as they can't pop the messages back using
>>> Outlook.
>>>
>>>
>>
>> It sounds like you're using code which wasn't written to be portable to
>> non-Unix machinery. If you're willing to fix it, you could perhaps hack it
>> to use $\ , which could be set at the time the file is opened. I haven't
>> looked at the code, though; it might be easier to search for uses of \n in
>> the code and replace each one with $LINE_ENDING or some such.
>>
>>
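For what it's worth, here is a minimal sketch of the $\ idea. The helper name write_with_ending is made up for illustration, and this is not how the Mailbox modules write their files; it only shows that $\ can be set locally around the prints, so nothing else in the program is affected.

```perl
use strict;
use warnings;

# Hypothetical helper: write already-chomped lines to $path,
# terminating each one with $ending.
sub write_with_ending {
    my ($path, $ending, @lines) = @_;
    open my $out, '>', $path or die "Cannot open $path: $!";
    binmode $out;           # raw bytes, no CRLF translation layer
    local $\ = $ending;     # appended after every print in this scope
    print {$out} $_ for @lines;
    close $out or die "Cannot close $path: $!";
}
```

Calling write_with_ending('out.txt', "\cM\cJ", 'one', 'two') should leave two CRLF-terminated lines in out.txt, whatever platform the program runs on.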
>
> Thanks Tom,
>
> Well I think the code is portable but my implementation isn't that good.
>
>     The files are on a Windows machine.
>     The directory (folder) is shared (exported).
>     The share is mounted on a Linux box.
>     The program is running on the Linux box.
>
> Is there some way for the module to determine this?  Well, besides
> reading several lines of the file looking for CRLF or CR (Mac) versus
> NL.  I'm not sure whether the files are being opened with just an
> open() or some other lower-level method.  The module collection is a
> monster but using it means I can point fingers when things don't work
> quite right. (Then help debug where I can -- like now. :-)
>    Right now I'm looking for a quick fix and will be trying Joe's
> suggestion until I can get it done up right.
>
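I don't know of a way around looking at the file itself, but the sniffing can be short. The sketch below is only a heuristic, and guess_line_ending is a made-up name: it reads the first 4096 bytes raw and reports the first convention that matches. The return values are the same names my filter takes as arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: guess a file's line-ending convention from
# its first 4096 bytes. Returns 'Win', 'Mac', 'Unix', or nothing
# if the sample contains no line ending at all.
sub guess_line_ending {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    binmode $in;            # look at the bytes as stored
    my $buf = '';
    defined read($in, $buf, 4096) or die "Cannot read $path: $!";
    close $in;
    return 'Win'  if $buf =~ /\cM\cJ/;   # CR LF pair
    return 'Mac'  if $buf =~ /\cM/;      # CR with no CR LF pair in the sample
    return 'Unix' if $buf =~ /\cJ/;      # LF only
    return;
}
```

A file with mixed endings, or a CR that happens to be the last byte of the sample, can fool it, so treat the answer as a guess.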
>> If you don't want to change the code, you could post-process the mailbox
>> file to put the CRLFs back in. That's going to be slower, though, compared
>> to fixing the module.
>>
>>
> I'd rather not mess with others' work.
>
>> Speaking of portability, be sure to use binmode if you're going to handle
>> line ends directly, so that Perl won't garble your files.
>>
>> Good luck with it!
>>
>>
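The flip side of binmode is that Perl can also do the translation for you. As a sketch (write_crlf is a made-up name, and I haven't checked whether the Mailbox modules let you pass layers through), opening the output with the :crlf PerlIO layer makes every "\n" go out as CR LF, even on a Linux box:

```perl
use strict;
use warnings;

# Hypothetical helper: write already-chomped lines as CRLF-terminated text.
sub write_crlf {
    my ($path, @lines) = @_;
    # The :crlf layer converts each "\n" to CR LF on output.
    open my $out, '>:crlf', $path or die "Cannot open $path: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Cannot close $path: $!";
}
```

The code keeps printing plain "\n" and the layer handles the rest, so nothing needs to know about $LINE_ENDING. The same layer on a read handle turns CR LF pairs back into "\n".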
> Thanks again,
>
>
> Rod



