<br><br><div class="gmail_quote">On Mon, May 28, 2012 at 12:55 PM, Peter Vereshagin <span dir="ltr"><<a href="mailto:peter@vereshagin.org" target="_blank">peter@vereshagin.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div id=":1r">AS> > AS> >> perl -ne 'print $2 . ", ". $1. "\n" while(/.../)'<br>
<div class="im">AS> > AS> >><br>
AS> > AS> >> But I really can't work out the rest now.<br>
</div>AS> > AS> >> Please help.<br>
AS> ><br>
AS> > Sure.<br>
AS> ><br>
AS> > perl -Mstrict -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2; while (<br>
AS> > my $str = <> ) { $blob .= $str; if ( $blob =~<br>
AS> > m/<(first|last)-name[^>]*>([^>]*)</ ) { my ( $kind => $name ) = ( $1 => $2<br>
AS> > ); $name =~ s/^\s+|\s+$//g; if ( $kind eq "first" ) { $fname = $name; }<br>
AS> > else { $lname = $name; } $blob = ""; } if ( $fname and $lname ) { print<br>
AS> > "$lname, $fname\n"; ( $fname => $lname ) = map {""} 0 .. 1; } }'<br>
AS> ><br>
AS><br>
AS> Thank you Peter for your reply.<br>
AS><br>
AS> I think the reason that your script is far more complicated than what I<br>
AS> thought is that you are handling one line at a time. I believe that<br>
AS> if we were to slurp the whole file in and have '.' match newlines as well, it<br>
<br>
Why worry about '.'? Did I miss something?<br>
<br>
AS> can be greatly simplified.<br>
<br>
OK, I was thinking along the lines of SAX rather than DOM, to borrow the XML parser terms. You didn't say whether the input is small enough to slurp, so memory is my concern, as always.<br>
<br>
And ... you needed a one-liner, right? To me, a one-liner has to be quick to write rather than merely simple. The two are not always the same, especially in Perl.<br>
<br>
AS> Again, I don't have my "perl notes" at hand, so I can't verify it right now.<br>
<br>
Read the File::Slurp documentation and perlre.<br>
<br>
AS> Thanks anyway.<br>
<br>
You're welcome. Here is a version that handles the blob as a whole:<br>
<br>
perl -Mstrict -Mautodie -wE 'my ( $blob, $fname, $lname ) = map {""} 0 .. 2; $blob .= $_ while <>; while ( $blob =~ s/<(first|last)-name[^>]*>([^>]*)<// ) { my ( $kind => $name ) = ( $1 => $2 ); $name =~ s/^\s+|\s+$//g; if ( $kind eq "first" ) { $fname = $name; } else { $lname = $name; } if ( $fname and $lname ) { print "$lname, $fname\n"; ( $fname => $lname ) = map {""} 0 .. 1; } }'<div class="yj6qo ajU">
<div id=":4h" class="ajR" tabindex="0"></div></div></div></blockquote></div><br>Thanks everyone for your help. <br><br>I guess I shouldn't have chosen XML as the
example. It was simply what I could find and borrow without having to cook up my own test data. My focus was on working with the matched strings and printing
the processed content, but everyone seems to have been carried away by the XML. <br><br>Anyway, thanks again, everyone, for your help.<br><br>FYI, <br><br><blockquote style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">
AS> I believe that<br>
AS> if we were to slurp the whole file in and have '.' match newlines as well, it<br></blockquote><blockquote style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">
<div> AS> can be greatly simplified.<br>
</div></blockquote>
<br>This is what I meant. If we focus on working with the matched strings and printing
the processed content:<br><br>$ perl -n0777e 'print "$2, $1\n" while m{<first-name>\s*(.*?)\s*</first-name>\s*<last-name>\s*(.*?)\s*</last-name>}gs' test.txt<br>Franklin, Benjamin<br>
Melville, Herman<br><br>Thanks<br>