[Pdx-pm] data munging: line-endings

Michael G Schwern schwern at pobox.com
Tue Oct 14 21:08:11 CDT 2003


On Tue, Oct 14, 2003 at 11:48:09AM -0700, Thomas Keller wrote:
> On Tuesday, October 14, 2003, at 11:14  AM, Colin Kuskie wrote:
> >s/\s*$//;
> This doesn't help for the files I need to process. "$/" needs to be 
> correct for correct line-by-line processing.
> I've been running them through the following filter, but it requires 
> that I know ahead of time that the input file has DOS type 
> line-endings. I guess that's not too onerous, but if we change the 
> application generating these files, the line-endings may change. But I 
> guess I'll see that immediately and be able to change the specification 
> to the filter script.

Detection's pretty simple.  Read in some text using read() and look for a 
line ending.

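	open(FILE, '<', $filename) or die "Can't open $filename: $!";
	binmode FILE;	# keep \r\n intact where the I/O layer would translate it
	read(FILE, $text, 1024);
	close FILE;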

	# the parens capture the match; without them $ending is just 1
	($ending) = $text =~ /([\r\n]{1,2})/;

The main edge case is a blank line right after the first line ending, which
gives you "\n\n" or "\r\r".  In that case you just remove the duplicate.

	$ending =~ s/([\r\n])\1/$1/;

Now you can set $/ to $ending.
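
Putting the pieces together, here is a minimal sketch of the whole thing. The
sub name detect_line_ending and the 1024-byte sample size are just
illustrative choices, and it assumes the first line ending shows up within
that first chunk.

```perl
use strict;
use warnings;

# Hypothetical helper: guess a file's line ending from its first 1024 bytes.
# Returns "\r\n", "\n" or "\r", or nothing if no line ending was seen.
sub detect_line_ending {
    my ($filename) = @_;

    open(my $fh, '<', $filename) or die "Can't open $filename: $!";
    binmode $fh;    # keep \r\n intact where the I/O layer would translate it
    read($fh, my $text, 1024);
    close $fh;

    return unless defined $text;

    # Grab the first run of one or two line-ending characters.
    my ($ending) = $text =~ /([\r\n]{1,2})/;
    return unless defined $ending;

    # A blank line gives "\n\n" or "\r\r"; collapse it to a single character.
    $ending =~ s/([\r\n])\1/$1/;
    return $ending;
}

# Usage: read the file named on the command line, line by line.
if (@ARGV) {
    my $file   = shift @ARGV;
    my $ending = detect_line_ending($file);

    # Fall back to "\n" if nothing was detected (an assumption, not a rule).
    local $/ = defined $ending ? $ending : "\n";

    open(my $in, '<', $file) or die "Can't open $file: $!";
    binmode $in;
    while (my $line = <$in>) {
        chomp $line;    # chomp removes whatever $/ is set to
        print "$line\n";
    }
    close $in;
}
```

One caveat with this sketch: if the 1024-byte chunk happens to end exactly
between a "\r" and its "\n", it will report "\r".  Reading a bit more, or
checking that the match isn't at the very end of $text, would cover that.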


-- 
Michael G Schwern        schwern at pobox.com  http://www.pobox.com/~schwern/
If the women don't find you handsome, they should at least find you handy.
    -- Red Green


