[Kc] HTTP Links
Eric Wilhelm
scratchcomputing at gmail.com
Fri Aug 4 09:03:23 PDT 2006
# from djgoku
# on Friday 04 August 2006 08:44 am:
> What I wanted to do was to parse the HTML for *.pdf links, then use
> File::Fetch to get the PDFs. There might have been a module for this,
> but
If you want a programming exercise, go ahead and write it. If you want
a *learning* exercise, learn to search CPAN. Probably the most
important thing to learn about Perl is how to not write code.
http://search.cpan.org/search?query=html+links&mode=all
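That search turns up HTML::LinkExtor, among others. A minimal sketch of
the "don't write the parser yourself" approach, using HTML::LinkExtor
(the sample page content and filenames here are made up for
illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# A stand-in page (hypothetical content, not the actual page in question).
my $html = <<'HTML';
<a href="/docs/report.pdf">report</a>
<a href="/about.html">about</a>
<a href="/papers/talk.PDF">talk</a>
HTML

# Collect hrefs from <a> tags that end in .pdf (case-insensitive).
my @pdf_links;
my $parser = HTML::LinkExtor->new(sub {
    my ($tag, %attr) = @_;
    return unless $tag eq 'a' and defined $attr{href};
    push @pdf_links, $attr{href} if $attr{href} =~ /\.pdf\z/i;
});
$parser->parse($html);
$parser->eof;

print "$_\n" for @pdf_links;
```

From there, each entry in @pdf_links could be handed to File::Fetch (or
LWP::Simple's getstore) to download the files.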
> not sure how it would act on HTTP links in multiline comments, so I
> thought I would just create something.
Why are you trying to parse links within the comments? There are only
three of them on that page, and they appear to be commented out for a
reason. If you're trying to solve the general case of finding links
hidden in comments, then use an HTML parser to grab the comments and a
regular expression to look inside them (you can't count on anything in
the comments being valid HTML). Of course, anything automated should
ignore the comments, but let me know what you come up with so I can add
comments to my pages that will crash it :-)
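A rough sketch of that parser-plus-regex combination, using
HTML::Parser's comment handler (the page content and URL here are
hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# A made-up page with a link hidden in a comment.
my $html = <<'HTML';
<p><a href="/visible.pdf">visible</a></p>
<!-- old link: http://example.com/hidden.pdf (disabled) -->
HTML

my @hidden;
my $parser = HTML::Parser->new(
    comment_h => [
        sub {
            my ($text) = @_;
            # The comment body need not be valid HTML, so a plain
            # regex pulls out anything that looks like a PDF URL.
            push @hidden, $1 while $text =~ m{(https?://\S+?\.pdf)\b}gi;
        },
        'text',
    ],
);
$parser->parse($html);
$parser->eof;

print "$_\n" for @hidden;
```

Note the handler only fires for comments, so the visible link is never
touched; the regex runs only over text the parser has already
identified as comment content.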
--Eric
--
Turns out the optimal technique is to put it in reverse and gun it.
--Steven Squyres (on challenges in interplanetary robot navigation)
---------------------------------------------------
http://scratchcomputing.com
---------------------------------------------------