[tpm] find, manipulate, then output

Antonio Sun antoniosun at lavabit.com
Mon May 28 19:01:41 PDT 2012


On Mon, May 28, 2012 at 12:55 PM, Peter Vereshagin <peter at vereshagin.org> wrote:

> AS> > AS> >> perl -ne 'print $2 . ", ". $1. "\n" while(/.../)'
> AS> > AS> >>
> AS> > AS> >> But I really can't work out the rest now.
> AS> > AS> >> Please help.
> AS> >
> AS> > Sure.
> AS> >
> AS> > perl -Mstrict -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2;
> AS> >   while ( my $str = <> ) { $blob .= $str;
> AS> >     if ( $blob =~ m/<(first|last)-name[^>]*>([^>]*)</ ) {
> AS> >       my ( $kind => $name ) = ( $1 => $2 ); $name =~ s/^\s+|\s+$//g;
> AS> >       if ( $kind eq "first" ) { $fname = $name; } else { $lname = $name; }
> AS> >       $blob = ""; }
> AS> >     if ( $fname and $lname ) { print "$lname, $fname\n";
> AS> >       ( $fname => $lname ) = map {""} 0 .. 1; } }'
> AS> >
> AS>
> AS> Thank you Peter for your reply.
> AS>
> AS> I think the reason your script is far more complicated than I expected
> AS> is that you are handling one line at a time. I believe that if we were
> AS> to slurp the whole file in and have '.' match newlines as well, it
>
> why worry about '.' ? Did I miss a thing?
>
> AS> can be greatly simplified.
>
> ok I was counting on SAX rather than DOM in terms of XML parsers. At the
> least you didn't specify if the input is small enough for slurping, so
> memory is my concern, as always.
>
> And ... you needed a one-liner, right? As for me, one-liners have to be
> written quickly rather than just be simple. It's not always the same in
> Perl especially.
>
> AS> Again, I don't have my "perl notes" at hands, so I can't prove it now.
>
> Read the File::Slurp and perlre documentation.
>
> AS> Thanks anyway.
>
> You're welcome. Here is handling the blob as a whole:
>
> perl -Mstrict -Mautodie -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2;
>   $blob .= $_ while <>;
>   while ( $blob =~ s/<(first|last)-name[^>]*>([^>]*)<// ) {
>     my ( $kind => $name ) = ( $1 => $2 ); $name =~ s/^\s+|\s+$//g;
>     if ( $kind eq "first" ) { $fname = $name; } else { $lname = $name; }
>     if ( $fname and $lname ) { print "$lname, $fname\n";
>       ( $fname => $lname ) = map {""} 0 .. 1; } }'
>

Thanks everyone for your help.

I guess I shouldn't have chosen XML as the example. It was the one I could
find and borrow without having to cook up my own test data. My focus was on
working with the matched strings and outputting the processed content, but
everyone seems to have been carried away by the XML.

Anyway, thanks again, everyone, for your help.

FYI,

> AS> I believe that if we were to slurp the whole file in and have '.' match
> AS> newlines as well, it can be greatly simplified.

This is what I meant, if we focus on working with the matched strings and
outputting the processed content:

$ perl -n0777e 'print "$2, $1\n" while
m{<first-name>\s*(.*?)\s*</first-name>\s*<last-name>\s*(.*?)\s*</last-name>}gs' \
  test.txt
Franklin, Benjamin
Melville, Herman

Thanks