[pgh-pm] system() not displaying scalar text
Benjamin R. Haskell
pm at benizi.com
Tue Sep 4 13:29:09 PDT 2007
On Tue, 4 Sep 2007, Matthew T. Engel wrote:
> I think the problem is that I am trying to use the system function to echo
> text to the terminal. Something like:
First off, in your example, there's no need to 'cat' the files into your
script. You can just pass them as extra parameters, since you're using
'<>', which uses '@ARGV' as its source for filenames to open.
Secondly, why are you using 'system' here:
    system("echo $_");
instead of 'print'? Any of these would do:
    print "$_\n";
    print $_, $/;
    print;    # with the '-l' switch ;)
If all you really want to do is echo data to the terminal, 'print' is what
you should use.
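A minimal sketch of the 'print' approach, to show why it is safer for this kind of input. The likely reason for the blank lines is that 'system' hands the command to the OS as a C string, which ends at the first NUL byte, so 'echo' never sees the text. The sub name 'echo_lines' and the in-memory sample data are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Copy every line from one handle to another without involving a shell,
# so NUL bytes and shell metacharacters pass through untouched.
# (echo_lines is a hypothetical helper name.)
sub echo_lines {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;
        print {$out} "$line\n";
    }
    return;
}

# Demonstration on in-memory data: the UTF-16BE bytes for "hi\n".
my $utf16 = "\0h\0i\0\n";
open my $in, '<', \$utf16 or die "open: $!";
my $copy = '';
open my $out, '>', \$copy or die "open: $!";
echo_lines($in, $out);
close $out;
print $copy eq $utf16 ? "bytes preserved\n" : "bytes changed\n";
```

In a real script you would call echo_lines(\*STDIN, \*STDOUT) instead of using the in-memory handles.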
> #!/usr/bin/perl
> while (<>) {
>     chomp;
>     system("echo $_");
> }
>
> If I run the script via
>
>     $ cat ascii_text_file | ./above_script.pl
>
> everything works fine: it essentially cats the input file. However, if I run
>
>     $ cat unicode_text_file | ./above_script.pl
>
> I get blank lines where the echoed data should be.
>
> I think the second file is Unicode because running
> "od -c unicode_text_file" shows \0 in front of all the characters, and if
> I open the same file in vi it shows ^@ before every character.
>
> I would like to be able to use Unicode and ASCII text files
> interchangeably. Please advise. Thank you very much in advance.
If you're worried about being able to use both "ASCII" (as in
"7-bit ASCII") files and Unicode (as in "UTF-8" or "UTF-16"; UCS-2 is the
fixed-width, two-bytes-per-character subset of UTF-16), you shouldn't have
any problems whatsoever if you just 'print' it to the terminal, since by
default perl passes the bytes through unchanged.
If by "ASCII" you really mean one of the 8-bit encodings (as in "ISO-8859-*"
[*=1,2,etc.]), then you'll *have* to do something more complicated.
In order for perl to deal with ISO-8859-* text that uses anything outside of
7-bit ASCII, you must let it know how to interpret the byte stream. (Or,
my recommendation: *convert it*.)
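As a rough sketch of the "convert it" route, the core Encode module can decode bytes from a named 8-bit encoding and re-encode them as UTF-8. The sub name 'to_utf8' and the sample string are hypothetical, and the source encoding has to be known in advance, because nothing in the bytes themselves says which ISO-8859 variant they are.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a byte string from a named 8-bit encoding to UTF-8 bytes.
# (to_utf8 is a hypothetical helper name; the caller supplies the encoding.)
sub to_utf8 {
    my ($bytes, $from) = @_;
    my $chars = decode($from, $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);      # characters -> UTF-8 bytes
}

my $latin1 = "caf\xE9";                  # "cafe" with e-acute, in ISO-8859-1
my $utf8   = to_utf8($latin1, 'ISO-8859-1');
printf "%d byte(s) in, %d byte(s) out\n", length $latin1, length $utf8;
# prints "4 byte(s) in, 5 byte(s) out"
```

The byte count grows because the one-byte 0xE9 becomes the two-byte UTF-8 sequence 0xC3 0xA9; plain 7-bit ASCII characters come out unchanged.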
From your description, it sounds like your unicode_text_file is in UTF-16
(or UCS-2), where each character is normally two bytes. Usually, to
distinguish UTF-16LE from UTF-16BE (little-/big-endian), there's a BOM
(byte-order mark, U+FEFF) at the start of the file. I think 'iconv', a very
useful program for converting character sets, will handle that.
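If you'd rather stay inside perl, here is a sketch of the same idea with the core Encode module: look at the first two bytes for a BOM, then let Encode's 'UTF-16' decoder (which expects a BOM and strips it) do the conversion. The sub names and the sample bytes are made up for illustration, and the check assumes the data really is UTF-16; it does not tell a UTF-16LE BOM apart from a UTF-32LE one. On the command line, "iconv -f UTF-16 -t UTF-8 file" should do the equivalent job.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Guess the UTF-16 byte order from a leading BOM, if any.
# Returns 'UTF-16BE', 'UTF-16LE', or undef when no BOM is present.
# (Hypothetical helper; assumes the input is UTF-16 in the first place.)
sub utf16_byte_order {
    my ($bytes) = @_;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return;
}

# Convert BOM-prefixed UTF-16 bytes to UTF-8 bytes.
# Encode's 'UTF-16' decoder reads the BOM and removes it from the result.
sub utf16_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('UTF-16', $bytes));
}

my $sample = "\xFE\xFF\0h\0i\0\n";      # BOM + "hi\n", big-endian
print utf16_byte_order($sample), "\n";  # prints "UTF-16BE"
print utf16_to_utf8($sample);           # prints "hi" and a newline
```

For a file without a BOM you would have to name the byte order yourself, e.g. decode('UTF-16BE', $bytes).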
Another useful resource might be a 'Super Search' on PerlMonks for
'Unicode' and 'UTF-8' or 'UTF-16'.
Best,
Ben