[sf-perl] gsm hentai convert

Bart Alberti bart at solozone.com
Tue Mar 14 15:12:52 PST 2006


David Graff wrote:

>bart at solozone.com said:
>  
>
>>Just to convert (which iconv does not do) ISO-8859 or UTF-8 to GSM
>>03.38, what do you suggest? Since I do not need multi-byte support, I
>>have downloaded the Encode tarball from CPAN. Is this a one-liner? 
>>    
>>
>
> perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)";
>    print while (<>)'  < utf8.data > gsm.data
>
>
>For 8859 input, the binmode on STDIN would be ":encoding(iso-8859-1)"
>(or some other final digit, if your data uses some other 8859 page).
>
>	Dave Graff
>
>  
>
I see from the 7-bit template reference that I am using characters that 
cannot be sent cleanly to GSM, as you can see below. Is there some easy 
way to reduce accented characters (in LaTeX terms, \"e and so forth) to 
their equivalents without diacriticals, i.e. stripped letters if they do 
not make it? I am about to write a shell script using 'tr' and its octal 
values to do this.
Note the 'panic' statement below, which I have never seen before. It 
appeared when I erroneously ran this on a different, 8859-encoded input 
while the command line said utf-8.

Bart Alberti
------------------------------------------------->>>>>>>>>>>>>>>>>>
bart at kissling:~> perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)";
print while (<>)'  < books.utf8 > gsm.list.books

"
panic: sv_setpvn called with negative strlen at -e line 1, <> line 887.
"\x{9837}" does not map to gsm0338, <> line 887.
panic: sv_setpvn called with negative strlen, <> line 887.
And with the correct call:

perl -e 'binmode STDIN,":utf8"; binmode STDOUT,":encoding(gsm0338)"; 
print while (<>)'  < books.utf8 > gsm.books
"\x{00ed}" does not map to gsm0338 at -e line 1, <> line 56.
"\x{00eb}" does not map to gsm0338 at -e line 1, <> line 238.
"\x{00ed}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00e1}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00eb}" does not map to gsm0338 at -e line 1, <> line 371.
"\x{00f4}" does not map to gsm0338 at -e line 1, <> line 397.
"\x{00c7}" does not map to gsm0338 at -e line 1, <> line 554.
"\x{00eb}" does not map to gsm0338, <> line 563.



