[pm-h] perl application: I'm in over my head

Wed Feb 7 05:22:41 PST 2007

On Wed, 7 Feb 2007 05:09:08 -0600
"Russell L. Harris" <rlharris at oplink.net> wrote:

> I have a small programming problem which is a bit over my head.  

Hopefully, we can help.

> The problem involves web pages which I am generating with hyperlatex.
> It has just been brought to my attention that I need to add
> "description" and "keywords" meta tags to my web site; but I have not
> been able to do so directly with hyperlatex.  I am able to use
> hyperlatex to create the appropriate meta tags for each HTML file, but
> the tags are created in the body of the file, rather than in the head;
> thus, my need is to relocate the tags.  

Okay.

> It seems to me that the following is a reasonable approach:
> 
> TASK: 
> 
>     When creating HTML files from a LaTeX source file, the "\xml{tag}"
>     command of hyperlatex creates a meta tag in the body of the file.
>     However, the "description" and "keywords" meta tags belong in
>     the head of the file, and there appears to be no way (short of
>     hacking hyperlatex) to force hyperlatex to write a meta tag to the
>     head of the file.
> 
>     Consequently, once the HTML file has been created, it is necessary
>     to move the "description" and "keywords" meta tags from the body
>     to the head.  
> 
>     A Perl routine invoked by a "make" file appears to be a good way
>     to accomplish this task.
> 
> PROCEDURE:
> 
> For a fixed list of HTML files with assorted names:
> 
>     name1.html
>     name2.html
>     anothername1.html
>     anothername2.html
> 
> and a variable list of HTML files which are systematically-named:
> 
>     TG_1.html
>     TG_2.html
>     TG_3.html
>     ...
>     TG_73.html
>     TG_74.html
>     ...

Fortunately, the above part is easy. You can either hard-code the list
of files in the script or (preferred) pass the names on the script's
command line.
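
As a rough sketch of those two options: the helper name
files_to_process is made up for illustration, and the fallback list
simply repeats the file names from your message. Names given on the
command line take priority; otherwise the fixed list is combined with
whatever TG_<number>.html files exist in the current directory.

```perl
use strict;
use warnings;

# Hypothetical helper: build the list of files to process.
# Names passed in (normally @ARGV) win; otherwise fall back to the
# fixed list plus every TG_<number>.html in the current directory.
sub files_to_process {
    my @args = @_;
    return @args if @args;

    my @fixed    = qw(name1.html name2.html anothername1.html anothername2.html);
    my @numbered = grep { /^TG_\d+\.html$/ } glob('TG_*.html');
    return (@fixed, @numbered);
}

print "$_\n" for files_to_process(@ARGV);
```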

Perl has a special mode, triggered by the -i option, that edits an
input file 'in place'. Basically, it takes its input from the file and
writes the output back to the same file (optionally creating a backup
file). I've used this to good effect when modifying website files in
the past.
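
The same mode can be switched on from inside a script by setting the
$^I variable, which is what -i does behind the scenes. Here is a
minimal sketch, assuming you want ".bak" backups; the helper name
edit_in_place and the whitespace-trimming edit are only placeholders
to show where the real work would go.

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing spaces and tabs from every line
# of the named files, editing each file in place and keeping a ".bak"
# copy of the original.
sub edit_in_place {
    my @files = @_;
    return unless @files;      # with an empty @ARGV, <> would read STDIN

    local @ARGV = @files;      # <> reads the files named in @ARGV
    local $^I   = '.bak';      # same effect as -i.bak on the command line

    while (my $line = <>) {
        $line =~ s/[ \t]+$//;  # placeholder edit
        print $line;           # print goes to the file being rewritten
    }
}

edit_in_place(@ARGV);
```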

> (1) Search for the tag: 
> 
>     <meta name="description" ... >
> 
> (2) If the tag is found, move the tag from the body of the HTML file
> to the head of the HTML file, inserting it immediately following the
> line of the title tag: 
> 
>     <title> ... </title>
> 
> (3) Search for the tag: 
> 
>     <meta name="keywords" ... >
> 
> (4) If the tag is found, move the tag from the body of the HTML file
> to the head of the HTML file, inserting it immediately following the
> line of the title tag:
> 
>     <title> ... </title>
> 
> (5) Proceed to the next file.
> 
> (6) Once all files in the two lists have been processed, exit.

That looks like a basically sound approach. I would suggest reading
the file into an array of lines. You can use regular expressions on
each line to find the lines of interest:

  - line containing "</title>"
    - Matching only the end tag still works when someone inevitably
      puts it on a different line from the start tag.
  - line containing the meta description tag
  - line containing the meta keywords tag

Use the splice function to remove the meta lines from the array.
Use splice again to insert the meta lines after the title line.

Write out all lines to replace the old file.
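
The steps above might look something like the following. Treat it as a
sketch rather than a finished script: the name move_meta_lines is made
up, and it assumes that hyperlatex writes each meta tag on a line of
its own and that the attribute is written exactly as name="...".

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of one HTML file and returns
# them with the description/keywords meta lines moved to just after
# the line containing </title>.
# Assumption: each meta tag occupies a whole line by itself.
sub move_meta_lines {
    my @lines = @_;

    # Pull the meta lines out, walking backwards so that splice does
    # not shift the indices that are still to be visited.
    my @meta;
    for my $i (reverse 0 .. $#lines) {
        if ($lines[$i] =~ /<meta\s+name="(?:description|keywords)"/i) {
            unshift @meta, splice(@lines, $i, 1);
        }
    }
    return @lines unless @meta;

    # Find the first line holding </title>.
    my ($title) = grep { $lines[$_] =~ m{</title>}i } 0 .. $#lines;
    return @_ unless defined $title;   # no title line: return input as-is

    # Insert the meta lines right after the title line.
    splice(@lines, $title + 1, 0, @meta);
    return @lines;
}
```

Returning the original lines when no </title> is found is a cautious
choice on my part: it leaves an unexpected file alone instead of
dropping its meta tags.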

By using this approach and the -i option, you can process all of the
files in a directory with:

  perl -i.bak script.pl *.html

Or you can list individual files on the command line.
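
One detail to watch when several files are named: -i only takes
effect for files read through the <> operator, and reading everything
at once with @lines = <> would run all of the files together. Checking
eof inside the loop lets you handle one file at a time. A minimal
sketch, with a made-up driver name (rewrite_each_file) and a
do-nothing placeholder where the real per-file edit would go:

```perl
use strict;
use warnings;

# Hypothetical driver: for each named file, gather its lines into an
# array, hand the array to $transform, and write the result back to
# the same file.
sub rewrite_each_file {
    my ($transform, @files) = @_;
    return unless @files;          # with an empty @ARGV, <> would read STDIN

    local @ARGV = @files;
    local $^I   = '.bak';          # assumption: keep backups, as with -i.bak

    my @lines;
    while (my $line = <>) {
        push @lines, $line;
        if (eof) {                 # eof without parentheses: end of this file
            my @out = $transform->(@lines);
            print @out;            # still goes to the file just read
            @lines = ();           # start fresh for the next file
        }
    }
}

# Placeholder transform that returns the lines unchanged.
rewrite_each_file(sub { @_ }, @ARGV);
```

If -i.bak is already given on the command line, the $^I line is
redundant but harmless.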

If you still need more help (and no one else answers first), I can
try to fill in more of the details this evening, after work.

G. Wade
-- 
Those who live by the sword get shot by those who don't.