[Philadelphia-pm] Perl one liner, regex capture group problem

Stan Schwertly stan at schwertly.com
Fri Oct 28 08:03:15 PDT 2011


I wrote this one-liner for a friend and wanted to replace the tail end
(the egrep and cut stages) with Perl. It pulls the image sources out of a
web page:

command: curl -so- http://www.wikihow.com/Make-Easy-Homemade-Biscuits | egrep -o "src='.*[^js]'" | cut -c 5-
abbreviated output:
'http://pad2.whstatic.com/images/thumb/3/31/Gfrollsonplate_198.jpg/-crop-44-33-40px-Gfrollsonplate_198.jpg'
'http://pad1.whstatic.com/skins/WikiHow/images/corner_sprite.png'
'http://pad3.whstatic.com/images/thumb/7/71/Bread-rolls-2126.jpg/-crop-44-33-44px-Bread-rolls-2126.jpg'
'http://pad1.whstatic.com/skins/WikiHow/images/corner_sprite.png'

I tried to replace it with the following command, but it doesn't seem to be
respecting the capture group:

command: curl -so- http://www.wikihow.com/Make-Easy-Homemade-Biscuits | perl -nE "say $1 if /src='(\S+(?:png|jpg))'/"
abbreviated output:
                                <a href='http://www.wikihow.com/Make-Pineapple-Biscuits'><img class='rounders2_img' alt='' src='http://pad2.whstatic.com/images/thumb/b/be/Pineapple-tart.jpg/-crop-44-33-36px-Pineapple-tart.jpg' />

                                <img class='rounders2_sprite' alt='' src='http://pad1.whstatic.com/skins/WikiHow/images/corner_sprite.png'/>

It's printing the matched line, but doesn't populate $1 correctly. What
should I change?

BR
Stan Schwertly