OxPM: Search and Extract

Kevin.ADM-Gibbs at Alcan.Com
Thu Dec 5 05:18:56 CST 2002


Julian,

If you need to download the web page, have a look at the LWP module.
The LWP cookbook (lwpcook) that ships with the module gives examples of
how to use it.
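
As a rough illustration of the download step, here is a minimal sketch
using LWP::UserAgent. The sub name fetch_page and the 10 second timeout
are my own choices for the example, not anything from your description,
so adjust to taste.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch a page and return its HTML as a string.
# Dies with the HTTP status line if the request does not succeed.
sub fetch_page {
    my ($url) = @_;

    # The timeout value is an assumption made for this sketch.
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);

    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;

    # decoded_content undoes any transfer encoding and charset.
    return $response->decoded_content;
}
```

You would call it as my $html = fetch_page($url); with whatever URL
you want to search.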

Once you have the file, you can use the HTML::Parser module to pick out
the tags (<p>) you are interested in.  You'll then need a regular
expression to determine whether the text contains your keyword.
Alternatively, you could use regular expressions to do the whole thing,
but that could be trickier.
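
The parse-then-match idea could look something like the sketch below.
It is only one way to do it: the sub name paragraphs_matching is made
up for the example, the match is case-insensitive by my assumption, and
it returns the text of each paragraph with inner tags dropped and
entities decoded rather than the original markup.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: return the text of every <p> element in $html
# whose text contains $keyword (case-insensitive, an assumption).
sub paragraphs_matching {
    my ($html, $keyword) = @_;
    my @hits;
    my $in_p   = 0;
    my $buffer = '';

    # Close the current paragraph, keeping it if the keyword is present.
    my $finish = sub {
        push @hits, $buffer if $in_p && $buffer =~ /\Q$keyword\E/i;
        $in_p = 0;
    };

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq 'p';
            $finish->();    # a new <p> implicitly closes an open one
            $in_p   = 1;
            $buffer = '';
        }, 'tagname' ],
        end_h  => [ sub { $finish->() if $_[0] eq 'p' }, 'tagname' ],
        # Text may arrive in several chunks, so accumulate it.
        text_h => [ sub { $buffer .= $_[0] if $in_p }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;
    $finish->();            # handle a final <p> with no closing tag
    return @hits;
}
```

Each returned string can then be wrapped in <p>...</p> again for the
results page.  Because the entities have been decoded, the text should
be re-encoded (HTML::Entities can do that) before it is printed as HTML.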

Cheers,

Kev.



"Julian Martin" <julianmartin at ntlworld.com>
Sent by: owner-oxford-pm-list at pm.org
05/12/2002 11:00
Please respond to oxford-pm-list

To:       <oxford-pm-list at happyfunball.pm.org>
cc:
Subject:  OxPM: Search and Extract

Hi
I would like to search some HTML pages for a keyword, extract each
<p>blah, blah......keyword........blah</p> that contains it, and then put
those paragraphs into a results page. Any pointers would be great! I have
the Perl Cookbook for reference but cannot find anything like this in it.
Thanks

Julian.
