[sf-perl] gsm hentai convert
Bart Alberti
bart at solozone.com
Tue Mar 14 15:12:52 PST 2006
David Graff wrote:
>bart at solozone.com said:
>
>
>>Just to convert (which iconv does not do) ISO-8859 or UTF-8 to gsm
>>338, what do you suggest, since I do not need multi-byte support? I
>>have downloaded the Encode tarball from CPAN. Is this a one-liner?
>>
>>
>
> perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)";
> print while (<>)' < utf8.data > gsm.data
>
>
>For 8859 input, the binmode on STDIN would be ":encoding(iso-8859-1)"
>(or some other final digit, if your data uses some other 8859 page).
>
> Dave Graff
>
>
>
I do see from the 7-bit template reference that I am using characters
that cannot be sent cleanly to gsm, as you can see below. Is there some
easy way (I am about to write a shell script using 'tr' and its OCTAL
values to do this) to convert accented characters (as in LaTeX \"e and so
forth) to their equivalents without diacriticals (stripped letters if they
do not make it)?
Note the 'panic' statement below, which I have never seen before. It
appeared when I erroneously tried this using 8859 data, on a different
input, while the command line said utf-8.
Bart Alberti
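
One possible way to do the stripping asked about above, as a minimal
sketch and not a tested solution for this data: decompose each character
with Unicode::Normalize (a core module) and delete the combining marks,
so that U+00EB becomes a plain "e". The helper name strip_diacritics is
hypothetical. Note the assumptions: this also strips accents from letters
that gsm0338 can represent, and it does nothing for characters that have
no base-letter decomposition (CJK characters, for instance), which would
still fail to map.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: reduce accented letters to their base letters.
sub strip_diacritics {
    my ($text) = @_;
    # Canonical decomposition: U+00EB becomes "e" followed by U+0308.
    my $decomposed = NFD($text);
    # Delete the nonspacing combining marks left by the decomposition.
    $decomposed =~ s/\p{Mn}//g;
    return $decomposed;
}
```

To use it as a filter, keep the two binmode calls from Dave's one-liner
and print strip_diacritics($_) for each input line instead of printing
the line unchanged.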
------------------------------------------------->>>>>>>>>>>>>>>>>>
bart at kissling:~> perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)";
print while (<>)' < books.utf8 > gsm.list.books
"
panic: sv_setpvn called with negative strlen at -e line 1, <> line 887.
"\x{9837}" does not map to gsm0338, <> line 887.
panic: sv_setpvn called with negative strlen, <> line 887.
And with correct calling:
perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)";
print while (<>)' < books.utf8 > gsm.books
"\x{00ed}" does not map to gsm0338 at -e line 1, <> line 56.
"\x{00eb}" does not map to gsm0338 at -e line 1, <> line 238.
"\x{00ed}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00e1}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00eb}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00f4}" does not map to gsm0338 at -e line 1, <> line 397.
"\x{00c7}" does not map to gsm0338 at -e line 1, <> line 554.
"\x{00eb}" does not map to gsm0338, <> line 563.