[pm-h] perl application: I'm in over my head
Russell L. Harris
rlharris at oplink.net
Wed Feb 7 09:13:47 PST 2007
* G. Wade Johnson <gwadej at anomaly.org> [070207 09:43]:
> On Wed, 7 Feb 2007 05:09:08 -0600
> "Russell L. Harris" <rlharris at oplink.net> wrote:
>
>> I have a small programming problem which is a bit over my head.
>
> Hopefully, we can help.
Those are words I love to hear!
The reason that I was inclined to try Perl is that I have an example
in which a programmer used a Perl script to patch RSS feed information
into an HTML header. But I have not yet written anything in Perl.
> Fortunately, the above part is easy. You can either code a list of
> files or (preferred) pass the names on the command line of the
> script.
Passing names won't work for the files which hyperlatex generates
automatically with sequential numbers:
    TG_1.html
    TG_2.html
    ...
    TG_73.html
    ...
because the number of files changes whenever a section is added to or
removed from the document. But the approach you mention below:
    perl -i.bak script.pl *.html
should be fine, so long as the script doesn't abort if it encounters
an HTML file in which there is no meta tag (there may be a few such
files).
> Perl has a special mode triggered by the -i option that edits an input
> file 'in-place'.
...
OK, I found the -i option in the second edition of the O'Reilly book,
"Learning Perl", which I have here.
> I would suggest reading the file into an array of lines. You can use
> regular expressions on each line to find the lines of interest:
>
> - line containing "</title>"
>   - That saves you from the point where someone inevitably puts the
>     end tag on a different line.
> - line containing the meta description tag
> - line containing the meta keywords tag
I think I can do this, using the RSS patch script as an example.
> Use the splice operator to remove the meta lines from the array.
> Use the splice operator to insert the meta lines after the title
> line.
The index of the second edition of "Learning Perl" says nothing about
the splice operator. I suppose I need to drive out to Half-Price Books
and search for a copy of the fourth edition, which appears to be the
latest edition.
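For reference, splice is a Perl built-in (it is documented in perlfunc), and
the two uses suggested above, removing elements and inserting elements, can
be tried on a toy list. This is only a minimal sketch; the list contents are
made up for illustration and have nothing to do with the hyperlatex output.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the lines of a file.
my @lines = ("a\n", "b\n", "c\n", "d\n");

# Remove 1 element starting at index 2; splice returns what it removed.
my @removed = splice(@lines, 2, 1);    # @lines is now a, b, d

# Insert at index 1 while removing 0 elements.
splice(@lines, 1, 0, @removed);        # @lines is now a, c, b, d

print @lines;
```

The third argument is the number of elements to remove, so a value of 0
turns splice into a pure insert.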
> Write out all lines to replace the old file.
>
> By using this approach and the -i option, you can process all of the
> files in a directory with:
>
>     perl -i.bak script.pl *.html
>
> Or you can list individual files on the command line.
All the HTML files in the directory are generated by hyperlatex, so
the "*.html" approach should work and be foolproof.
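The steps described above (collect the lines, find the meta lines and the
"</title>" line, splice, write everything back) might be sketched roughly as
follows. Several things here are assumptions and not taken from the thread:
the helper name move_meta_after_title is hypothetical, and the pattern
assumes each meta tag sits on a single line and looks like
<meta name="description" ...> or <meta name="keywords" ...>, which may not
match what hyperlatex actually emits. A file with no meta tag, or with no
"</title>" line, is written back unchanged, so the script does not abort.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of one HTML file and returns them
# with the meta description/keywords lines moved to just after the line
# containing "</title>".
sub move_meta_after_title {
    my @lines = @_;

    # Assumed tag format; adjust to match the real hyperlatex output.
    my $meta_re = qr/<meta\s+name\s*=\s*"(?:description|keywords)"/i;

    # Walk from the last index down so that removing a line does not
    # shift the indexes still to be visited.
    my @meta;
    for my $i (reverse 0 .. $#lines) {
        unshift @meta, splice(@lines, $i, 1) if $lines[$i] =~ $meta_re;
    }

    # Index of the first remaining line that contains the closing title tag.
    my ($title_idx) = grep { $lines[$_] =~ m{</title>}i } 0 .. $#lines;

    # Nothing to move, or nowhere to put it: return the input untouched.
    return @_ unless @meta && defined $title_idx;

    splice(@lines, $title_idx + 1, 0, @meta);    # insert, removing nothing
    return @lines;
}

# Do nothing when no file names are given (avoids waiting on STDIN).
if (@ARGV) {
    my @buffer;
    while (my $line = <>) {
        push @buffer, $line;
        if (eof) {    # end of the current file, not of all input
            # Under -i, print writes to the file that was just read.
            print move_meta_after_title(@buffer);
            @buffer = ();
        }
    }
}
```

Saved as script.pl, this would be run as in the quoted command:
perl -i.bak script.pl *.html. Each original is kept with a .bak suffix, so a
wrong guess about the tag format can be undone by restoring the backups.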
> If you still need more help, (and someone else doesn't answer earlier)
> I can try to fill in more of the details this evening, after work.
I shall be eager to receive whatever detailed assistance you or others
may offer. And thanks ever so much for the reply!
At this point it appears that I need to start digging through the
O'Reilly book "Learning Perl". So I don't expect to have anything
running before the weekend, unless someone decides to lend me a hand.
Regards,
RLH