[tpm] find, manipulate, then output

Peter Vereshagin peter at vereshagin.org
Mon May 28 09:55:22 PDT 2012


Hello.

2012/05/28 12:09:56 -0400 Antonio Sun <antoniosun at lavabit.com> => To Peter Vereshagin :
AS> On Mon, May 28, 2012 at 11:51 AM, Peter Vereshagin <peter at vereshagin.org> wrote:
AS> 
AS> > AS> >> perl -ne 'print $2 . ", ". $1. "\n" while(/.../)'
AS> > AS> >>
AS> > AS> >> But I really can't work out the rest now.
AS> > AS> >> Please help.
AS> >
AS> > Sure.
AS> >
AS> > perl -Mstrict -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2; while (
AS> > my $str = <> ) { $blob .= $str; if ( $blob =~
AS> > m/<(first|last)-name[^>]*>([^>]*)</ ) { my ( $kind => $name ) = ( $1 => $2
AS> > ); $name =~ s/^\s+|\s+$//g; if ( $kind eq "first" ) { $fname = $name; }
AS> > else { $lname = $name; }  $blob = ""; } if ( $fname and $lname ) { print
AS> > "$lname, $fname\n"; ( $fname => $lname ) = map {""} 0 .. 1; } }'
AS> >
AS> 
AS> Thank you Peter for your reply.
AS> 
AS> I think the reason that your script is far more complicated than what I
AS> thought is because that you are handling one line at a time. I believe that
AS> if we to slurp the whole file in and have '.' match new lines as well, it

Why worry about '.'? Did I miss something? The regexp uses negated character classes like [^>]*, and those match newlines already.

AS> can be greatly simplified.

OK. To borrow XML parser terms, I was thinking SAX (process the stream as it arrives) rather than DOM (load everything first). You didn't say whether the input is small enough to slurp, so memory is my concern, as always.

And you needed a one-liner, right? To me, a one-liner has to be quick to write rather than merely simple. Those are not always the same thing, in Perl especially.

AS> Again, I don't have my "perl notes" at hands, so I can't prove it now.

Read the File::Slurp documentation and perlre.

AS> Thanks anyway.

You're welcome. Here is a version that handles the blob as a whole:

perl -Mstrict -Mautodie -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2; $blob .= $_ while <>;  while ( $blob =~ s/<(first|last)-name[^>]*>([^>]*)<// ) { my ( $kind => $name ) = ( $1 => $2 ); $name =~ s/^\s+|\s+$//g; if ( $kind eq "first" ) { $fname = $name; } else { $lname = $name; } if ( $fname and $lname ) { print "$lname, $fname\n"; ( $fname => $lname ) = map {""} 0 .. 1; } }'

--
Peter Vereshagin <peter at vereshagin.org> (http://vereshagin.org) pgp: A0E26627 

