[pm-h] perl application: I'm in over my head

Wed Feb 7 09:13:47 PST 2007

* G. Wade Johnson <gwadej at anomaly.org> [070207 09:43]:
> On Wed, 7 Feb 2007 05:09:08 -0600
> "Russell L. Harris" <rlharris at oplink.net> wrote:
> 
>> I have a small programming problem which is a bit over my head.  
> 
> Hopefully, we can help.

Those are words I love to hear! 

The reason that I was inclined to try Perl is that I have an example
in which a programmer used a Perl script to patch RSS feed information
into an HTML header.  But I have not yet written anything in Perl.

> Fortunately, the above part is easy. You can either code a list of
> files or (preferred) pass the names on the command line of the
> script.

Passing names won't work for the files which hyperlatex generates
automatically with sequential numbers:

    TG_1.html
    TG_2.html
    ...
    TG_73.html
    ...

because the number of files changes whenever a section is added to or
removed from the document.  But the approach you mention below:

    perl -i.bak script.pl *.html

should be fine, so long as the script doesn't abort if it encounters
an HTML file in which there is no meta tag (there may be a few such
files).
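
For reference, a minimal sketch of the in-place loop, assuming the
whole file is read into an array before anything is written back.  The
names edit_in_place and transform are hypothetical; transform is only a
placeholder that returns the lines unchanged, which is also what should
happen to a file that has no meta tag.

```perl
use strict;
use warnings;

# Hypothetical placeholder for the real edit: returns the lines unchanged.
# A file with no meta tag should pass through the real version the same way.
sub transform { return @_ }

# Edit each named file in place, keeping the original as "<name>.bak".
# Setting $^I inside the script has the same effect as "perl -i.bak".
sub edit_in_place {
    my @files = @_;
    return unless @files;          # with an empty @ARGV, <> would read STDIN
    local ($^I, @ARGV) = ('.bak', @files);
    my @lines;
    while (my $line = <>) {
        push @lines, $line;
        if (eof) {                 # eof without parentheses: end of current file
            my @new = transform(@lines);
            print @new;            # print goes to the file being rewritten
            @lines = ();
        }
    }
    return;
}

edit_in_place(@ARGV);
```

Saved as script.pl, this could be run as "perl script.pl *.html"; the
-i.bak switch on the command line is then optional, because the script
sets the backup extension itself.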

> Perl has a special mode triggered by the -i option that edits an input
> file 'in-place'. 
...

OK, I found the -i option in the second edition of the O'Reilly book,
"Learning Perl", which I have here.

> I would suggest reading the file into an array of lines. You can use
> regular expressions on each line to find the lines of interest:
> 
>   - line containing "</title>"
>     - That saves you from the point where someone inevitably puts the
>       end tag on a different line.
>   - line containing the meta description tag
>   - line containing the meta keywords tag

I think I can do this, using the RSS patch script as an example.
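
The three searches listed above might be sketched as follows.  This is
not the RSS patch script; the sub name find_lines_of_interest and the
sample HTML are made up for illustration, and the patterns assume that
each meta tag sits on one line with the name attribute first.

```perl
use strict;
use warnings;

# Hypothetical helper: return, for each kind of line, the index of the
# first matching line, or undef when no line matches.
sub find_lines_of_interest {
    my @lines = @_;
    my %pattern = (
        title       => qr{</title>}i,
        description => qr{<meta\s+name\s*=\s*["']?description["']?}i,
        keywords    => qr{<meta\s+name\s*=\s*["']?keywords["']?}i,
    );
    my %index;
    for my $key (keys %pattern) {
        # List assignment keeps only the first matching index.
        ($index{$key}) = grep { $lines[$_] =~ $pattern{$key} } 0 .. $#lines;
    }
    return %index;
}

# Made-up sample: a description tag but no keywords tag.
my @sample = (
    "<html><head>\n",
    "<meta name=\"description\" content=\"Demo page\">\n",
    "<title>Demo</title>\n",
    "</head></html>\n",
);
my %found = find_lines_of_interest(@sample);
for my $key (sort keys %found) {
    my $where = defined $found{$key} ? "line index $found{$key}" : "not found";
    print "$key: $where\n";
}
```

Matching on "</title>" rather than "<title>" follows the suggestion
quoted above, so the end of the title is found even when the end tag
sits on a later line than the start tag.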

> Use the splice operator to remove the meta lines from the array.
> Use the splice operator to insert the meta lines after the title
> line.

The index of the second edition of "Learning Perl" says nothing about
the splice operator.  I suppose I need to drive out to Half-Price Books
and search for a copy of the fourth edition, which appears to be the
latest edition.
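
For reference, splice is a Perl built-in: splice(@array, $offset,
$length) removes and returns elements, and splice(@array, $offset, 0,
@new) inserts without removing anything.  A minimal sketch of the two
steps quoted above, assuming each meta tag occupies a single line; the
sub name move_meta_after_title and the sample page are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: move the meta description/keywords lines so that
# they follow the line containing "</title>".  The lines come back
# unchanged when there is no meta line or no title line.
sub move_meta_after_title {
    my @lines = @_;
    my $is_meta = qr{<meta\s+name\s*=\s*["']?(?:description|keywords)\b}i;

    # Indexes of the meta lines, in document order.
    my @meta_at = grep { $lines[$_] =~ $is_meta } 0 .. $#lines;
    return @lines unless @meta_at;

    # Remove from the bottom up so the remaining indexes stay valid.
    my @meta;
    for my $i (reverse @meta_at) {
        unshift @meta, splice(@lines, $i, 1);
    }

    # Look for the title line only after the removals have shifted things.
    my ($title_at) = grep { $lines[$_] =~ m{</title>}i } 0 .. $#lines;
    return @_ unless defined $title_at;

    # Insert the saved meta lines right after the title line.
    splice(@lines, $title_at + 1, 0, @meta);
    return @lines;
}

# Made-up sample page with the meta lines ahead of the title.
my @page = (
    "<head>\n",
    "<meta name=\"description\" content=\"Demo page\">\n",
    "<meta name=\"keywords\" content=\"perl, html\">\n",
    "<title>Demo</title>\n",
    "</head>\n",
);
my @fixed = move_meta_after_title(@page);
print @fixed;
```

Removing the meta lines before locating the title line avoids having to
adjust the title index by hand after the array shrinks.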

> Write out all lines to replace the old file.
> 
> By using this approach and the -i option, you can process all of the
> files in a directory with:
> 
>   perl -i.bak script.pl *.html
> 
> Or you can list individual files on the command line.

All the HTML files in the directory are generated by hyperlatex, so
the "*.html" approach should work and be foolproof.

> If you still need more help, (and someone else doesn't answer earlier)
> I can try to fill in more of the details this evening, after work.

I shall be eager to receive whatever detailed assistance you or others
may offer.  And thanks ever so much for the reply!

At this point it appears that I need to start digging through the
O'Reilly book "Learning Perl".  So I don't expect to have anything
running before the weekend, unless someone decides to lend me a hand.

Regards,

RLH