[Omaha.pm] Reading a Unix vs a DOS text file

Rob Townley rob.townley at gmail.com
Sat Mar 28 22:16:05 PDT 2009


On Thu, Mar 26, 2009 at 1:14 PM, Mike Hostetler <hostetlerm at gmail.com> wrote:
>
>
> On Thu, Mar 26, 2009 at 11:38 AM, Jay Hannah <jay at jays.net> wrote:
>>
>> Mike Hostetler wrote:
>>>
>>> So . . . is there a way I could get Perl to detect a file that uses CRLF
>>> as its line terminator?  That way, I could use binmode on that file, and
>>> use ASCII on the rest.  Or does someone have a better suggestion?
>>>
>>
>> binmode is for binary files.
>>
>> CRLF (\r\n Windows) and LF (\n *nix) are for text files.
>>
>> I think you are confusing those two issues. They are not related. When
>> reading a file you could
>>
>> while (<IN>) {
>>  s/[\r\n]+$//;
>>
>> to remove those characters regardless of which format any given file was
>> written in.
>>
>> Does that help?   :)
>
> That does help.
>
> I mentioned binmode because I saw several references to that, but your
> solution is better.
>
> I knew someone would know more than me.  I just moonlight in Perl. :)
>
>
> --
> Mike Hostetler
> http://mike.hostetlerhome.com/
>
>
>
> _______________________________________________
> Omaha-pm mailing list
> Omaha-pm at pm.org
> http://mail.pm.org/mailman/listinfo/omaha-pm
>

From what I remember, \n works well across platforms. I have had
trouble with socket programming across platforms, but that was a
decade ago.

I would start with the parameters passed to perl to start the scripts.
There is a switch (probably -l; see "perldoc perlrun") that tries to
make it unnecessary to use chomp, among other things.  I may have used
it to read files from DOS, convert records to a standard, and output a
special line-ending character.
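
A rough sketch of that kind of conversion done inside a script rather
than through a command-line switch: strip whatever line ending each
record arrived with, then print it with one chosen terminator. The
helper name normalize_records and the "|\n" terminator are made up for
illustration, and the in-memory filehandles stand in for real files.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from $in, strip any trailing CR and/or
# LF from each one, and write it to $out followed by $terminator.
sub normalize_records {
    my ($in, $out, $terminator) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        $line =~ s/[\r\n]+\z//;    # handles both DOS and Unix endings
        print {$out} $line, $terminator;
        $count++;
    }
    return $count;
}

# Demo data: the first two records end in CRLF, the last in LF only.
my $dos_and_unix = "alpha\r\nbeta\r\ngamma\n";
my $converted    = '';

open my $in,  '<', \$dos_and_unix or die "cannot open input: $!";
open my $out, '>', \$converted    or die "cannot open output: $!";

my $records = normalize_records($in, $out, "|\n");

close $in;
close $out;
```

The regex is the same idea Jay posted, so the script does not need to
know in advance which kind of file it was handed.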

Secondly, read "perldoc open" and check which layers the open pragma
sets for IN and OUT, as they may be different.
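
For what it's worth, a layer can also be set per filehandle instead of
through the pragma. A minimal sketch, assuming a perl built with PerlIO:
the :crlf layer maps CRLF to LF on input, so a plain chomp then works on
a DOS file. The temp file and its contents are only there for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small DOS-style file byte for byte (:raw avoids any translation).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "one\r\ntwo\r\n";
close $tmp or die "cannot close $path: $!";

# Read it back through the :crlf layer, which maps CRLF to LF on input.
# (The pragma form would be something like: use open IN => ':crlf';)
open my $fh, '<:crlf', $path or die "cannot open $path: $!";
my @lines = <$fh>;
close $fh;

chomp @lines;    # plain chomp is enough now; no stray \r is left behind
```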

Third, I would print the following variables (see "perldoc perlvar")
and read "perldoc perlport" to shed some light on the different places
a newline can be set.

# IO::Handle->input_record_separator(EXPR)
# $INPUT_RECORD_SEPARATOR
# $RS
# $/

# IO::Handle->output_record_separator(EXPR)
# $OUTPUT_RECORD_SEPARATOR
# $ORS
# $\

The output record separator for the print operator. If defined, this
value is printed after the last of print's arguments. Default is
undef. (Mnemonic: you set $\ instead of adding "\n" at the end of the
print. Also, it's just like $/, but it's what you get "back" from
Perl.)
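
A small sketch of the two separators working together, assuming the
input is known to be a DOS file: $/ is set to CRLF so chomp removes both
characters, and $\ is set to "\n" so every print gets a Unix line ending.
The sample data and variable names are invented, and in-memory
filehandles stand in for real files.

```perl
use strict;
use warnings;

my $dos_text = "red\r\ngreen\r\n";
my $unix_out = '';

open my $in,  '<', \$dos_text or die "cannot open input: $!";
open my $out, '>', \$unix_out or die "cannot open output: $!";

my @records;
{
    local $/ = "\r\n";    # input record separator: a record ends at CRLF
    local $\ = "\n";      # output record separator: appended by each print
    while (my $rec = <$in>) {
        chomp $rec;       # chomp removes whatever $/ currently holds
        push @records, $rec;
        print {$out} $rec;
    }
}

close $in;
close $out;
```

Using local keeps the changed separators from leaking into the rest of
the program once the block ends.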

