From gwadej at anomaly.org Thu Feb 1 16:58:28 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Thu, 1 Feb 2007 18:58:28 -0600 Subject: [pm-h] Fwd: European Perl Hackathon Message-ID: <20070201185828.045b8c5c@sovvan> From: Ann Barcomb Subject: [Hackathons] Announcement: European Perl Hackathon Date: Thu, 1 Feb 2007 16:22:13 +0100 (CET) To: hackathons at pm.org You are invited to attend the European Perl Hackathon in Arnhem, the Netherlands, from 2 - 4 March, 2007. Familiarity with the featured projects is not required; you need only bring a laptop and a willingness to join in. Although there is no fee to attend the hackathon, you are required to pay for your own accommodation and transportation. However, it is possible to book a room at the venue location when you register for the hackathon, at the price of 74 Euros for two nights plus breakfast. Space is limited to 30 participants, and registration is required. Reservations for accommodations made through the hackathon must be made by 9 February; reservations for the event itself must be made no later than 22 February. For more information about the event, please refer to http://conferences.yapceurope.org/hack2007nl Feel free to circulate this notice. From gwadej at anomaly.org Tue Feb 6 19:55:00 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Tue, 6 Feb 2007 21:55:00 -0600 Subject: [pm-h] February Meeting Message-ID: <20070206215500.52caca8d@sovvan> The February meeting of Houston.pm is in one week, on Feb. 13. We will be meeting at 1111 Fannin, downtown, starting between 6pm and 6:30pm as we have for the last few months. This month, Robert Boone will present an Intro to Catalyst. Hope to see you all there. G. Wade -- Who knows what email lurks in the hearts of men? From rlharris at oplink.net Wed Feb 7 03:09:08 2007 From: rlharris at oplink.net (Russell L. 
Harris) Date: Wed, 7 Feb 2007 05:09:08 -0600 Subject: [pm-h] perl application: I'm in over my head Message-ID: <20070207110908.GA15658@cromwell.tmiaf> I have a small programming problem which is a bit over my head. The problem involves web pages which I am generating with hyperlatex. It has just been brought to my attention that I need to add "description" and "keywords" meta tags to my web site; but I have not been able to do so directly with hyperlatex. I am able to use hyperlatex to create the appropriate meta tags for each HTML file, but the tags are created in the body of the file, rather than in the head; thus, my need is to relocate the tags. It seems to me that the following is a reasonable approach: TASK: When creating HTML files from a LaTeX source file, the "\xml{tag}" command of hyperlatex creates a meta tag in the body of the file. However, the the "description" and "keywords" meta tags belong in the head of the file, and there appears to be no way (short of hacking hyperlatex) to force hyperlatex to write a meta tag to the head of the file. Consequently, once the HTML file has been created, it is necessary to move the "description" and "keywords" meta tags from the body to the head. A Perl routine invoked by a "make" file appears to be a good way to accomplish this task. PROCEDURE: For a fixed list of HTML files with assorted names: name1.html name2.html anothername1.html anothername2.html and a variable list of HTML files which are systematically-named: TG_1.html TG_2.html TG_3.html ... TG_73.html TG_74.html ... (1) Search for the tag: (2) If the tag is found, move the tag from the body of the HTML file to the head of the HTML file, inserting it immediately following the line of the title tag: ... (3) Search for the tag: (4) If the tag is found, move the tag from the body of the HTML file to the head of the HTML file, inserting it immediately following the line of the title tag: ... (5) Proceed to the next file. 
(6) Once all files in the two lists have been processed, exit. RLH From gwadej at anomaly.org Wed Feb 7 05:22:41 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Wed, 7 Feb 2007 07:22:41 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <20070207110908.GA15658@cromwell.tmiaf> References: <20070207110908.GA15658@cromwell.tmiaf> Message-ID: <20070207072241.78846fda@sovvan> On Wed, 7 Feb 2007 05:09:08 -0600 "Russell L. Harris" wrote: > I have a small programming problem which is a bit over my head. Hopefully, we can help. > The problem involves web pages which I am generating with hyperlatex. > It has just been brought to my attention that I need to add > "description" and "keywords" meta tags to my web site; but I have not > been able to do so directly with hyperlatex. I am able to use > hyperlatex to create the appropriate meta tags for each HTML file, but > the tags are created in the body of the file, rather than in the head; > thus, my need is to relocate the tags. Okay. > It seems to me that the following is a reasonable approach: > > TASK: > > When creating HTML files from a LaTeX source file, the "\xml{tag}" > command of hyperlatex creates a meta tag in the body of the file. > However, the the "description" and "keywords" meta tags belong in > the head of the file, and there appears to be no way (short of > hacking hyperlatex) to force hyperlatex to write a meta tag to the > head of the file. > > Consequently, once the HTML file has been created, it is necessary > to move the "description" and "keywords" meta tags from the body > to the head. > > A Perl routine invoked by a "make" file appears to be a good way > to accomplish this task. > > PROCEDURE: > > For a fixed list of HTML files with assorted names: > > name1.html > name2.html > anothername1.html > anothername2.html > > and a variable list of HTML files which are systematically-named: > > TG_1.html > TG_2.html > TG_3.html > ... > TG_73.html > TG_74.html > ... 
Fortunately, the above part is easy. You can either code a list of
files or (preferred) pass the names on the command line of the script.

Perl has a special mode triggered by the -i option that edits an input
file 'in-place'. Basically, it takes its input from the file and writes
the output back to the file (optionally creating a backup file). I've
used this to good effect modifying website files in the past.

> (1) Search for the tag:
>
>
>
> (2) If the tag is found, move the tag from the body of the HTML file
> to the head of the HTML file, inserting it immediately following the
> line of the title tag:
>
> ...
>
> (3) Search for the tag:
>
>
>
> (4) If the tag is found, move the tag from the body of the HTML file
> to the head of the HTML file, inserting it immediately following the
> line of the title tag:
>
> ...
>
> (5) Proceed to the next file.
>
> (6) Once all files in the two lists have been processed, exit.

That looks like a basically sound approach.

I would suggest reading the file into an array of lines. You can use
regular expressions on each line to find the lines of interest:

 - line containing "</title>"
   - That saves you from the point where someone inevitably puts the
     end tag on a different line.
 - line containing the meta description tag
 - line containing the meta keywords tag

Use the splice operator to remove the meta lines from the array.

Use the splice operator to insert the meta lines after the title line.

Write out all lines to replace the old file.

By using this approach and the -i option, you can process all of the
files in a directory with:

  perl -i.bak script.pl *.html

Or you can list individual files on the command line.

If you still need more help, (and someone else doesn't answer earlier)
I can try to fill in more of the details this evening, after work.

G. Wade
-- 
Those who live by the sword get shot by those who don't.

From rlharris at oplink.net Wed Feb 7 09:13:47 2007
From: rlharris at oplink.net (Russell L. 
Harris) Date: Wed, 7 Feb 2007 11:13:47 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <20070207072241.78846fda@sovvan> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> Message-ID: <20070207171347.GA3615@cromwell.tmiaf> * G. Wade Johnson [070207 09:43]: > On Wed, 7 Feb 2007 05:09:08 -0600 > "Russell L. Harris" wrote: > >> I have a small programming problem which is a bit over my head. > > Hopefully, we can help. Those are words I love to hear! The reason that I was inclined to try Perl is that I have an example in which a programmer used a Perl script to patch RSS feed information into a HTML header. But I have not yet written anything in Perl. > Fortunately, the above part is easy. You can either code a list of > files or (preferred) pass the names on the command line of the > script. Passing names won't work for the files which hyperlatex generates automatically with sequential numbers: TG_1.html TG_2.html ... TG_73.html ... because the number of files changes whenever a section is added to or removed from the document. But the approach you mention below: perl -i.bak script.pl *.html should be fine, so long as the script doesn't abort if it encounters an HTML file in which there is no meta tag (there may be a few such files). > Perl has a special mode triggered by the -i option that edits an input > file 'in-place'. ... OK, I found the -i option in the second edition of the O'Reilly book, "Learning Perl", which I have here. > I would suggest reading the file into an array of lines. You can use > regular expressions on each line to find the lines of interest: > > - line containing "" > - That saves you from the point where someone inevitably puts the > end tag on a different line. > - line containing the meta description tag > - line containing the meta keywords tag I think I can do this, using the RSS patch script as an example. > Use the splice operator to remove the meta lines from the array. 
> Use the splice operator to insert the meta lines after the title > line. The index of the second edition of "Learning Perl" says nothing about the spice operator. I suppose I need to drive out to Half-Price Books and search for a copy of the fourth edition, which appears to be the latest edition. > Write out all lines to replace the old file. > > By using this approach and the -i option, you can process all of the > files in a directory with: > > perl -i.bak script.pl *.html > > Or you can list individual files on the command line. All the HTML files in the directory are generated by hyperlatex, so the "*.html" approach should work and be foolproof. > If you still need more help, (and someone else doesn't answer earlier) > I can try to fill in more of the details this evening, after work. I shall be eager to receive whatever detailed assistance you or others may offer. And thanks ever so much for the reply! At this point it appears that I need to start digging through the O'Reilly book "Learning Perl". So I don't expect to have anything running before the weekend, unless someone decides to lend me a hand. Regards, RLH From will.willis at gmail.com Wed Feb 7 13:05:55 2007 From: will.willis at gmail.com (Will Willis) Date: Wed, 7 Feb 2007 15:05:55 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <20070207171347.GA3615@cromwell.tmiaf> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070207171347.GA3615@cromwell.tmiaf> Message-ID: <6ee1e6090702071305j10a60369vebd8a59bcb9aecff@mail.gmail.com> On 2/7/07, Russell L. Harris wrote: > * G. Wade Johnson [070207 09:43]: > > On Wed, 7 Feb 2007 05:09:08 -0600 > > "Russell L. Harris" wrote: > > Use the splice operator to remove the meta lines from the array. > > Use the splice operator to insert the meta lines after the title > > line. > > The index of the second edition of "Learning Perl" says nothing about > the spice operator. 
I suppose I need to drive out to Half-Price Books > and search for a copy of the fourth edition, which appears to be the > latest edition. > For online perl documentation, including information on splice(), visit this website, http://perldoc.perl.org/ Documentation may also be on your computer, try `perldoc perldoc` to get your feet wet then take a look at `perldoc perlfunc` and look for splice. Good luck! -Will From rlharris at oplink.net Wed Feb 7 15:20:30 2007 From: rlharris at oplink.net (Russell L. Harris) Date: Wed, 7 Feb 2007 17:20:30 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <6ee1e6090702071305j10a60369vebd8a59bcb9aecff@mail.gmail.com> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070207171347.GA3615@cromwell.tmiaf> <6ee1e6090702071305j10a60369vebd8a59bcb9aecff@mail.gmail.com> Message-ID: <20070207232030.GA16687@cromwell.tmiaf> * Will Willis [070207 15:07]: > For online perl documentation, including information on splice(), > visit this website, http://perldoc.perl.org/ > > Documentation may also be on your computer, try `perldoc perldoc` to > get your feet wet then take a look at `perldoc perlfunc` and look for > splice. Thanks, Will. I'm running Debian Etch. 'perldoc perldoc' and 'perldoc perlfunc' work fine, as does 'perldoc -f splice', so that takes away the urgency for a trip to the bookstore (though I still intend to obtain a copy of the fourth edition of "Learning Perl"; there's hardly anything better than an O'Reilly book). Is there an easy way to send the perldoc output to the printer so that the pages are properly formatted -- something similar to the way formatted man pages may be printed with: man -t perldoc | lpr I tried perldoc -t perldoc | lpr but there is no formatting. I work much better with a piece of paper on the desk, rather than reading from the screen. 
RLH From john at nixnuts.net Wed Feb 7 15:43:55 2007 From: john at nixnuts.net (John Lightsey) Date: Wed, 07 Feb 2007 17:43:55 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <20070207232030.GA16687@cromwell.tmiaf> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070207171347.GA3615@cromwell.tmiaf> <6ee1e6090702071305j10a60369vebd8a59bcb9aecff@mail.gmail.com> <20070207232030.GA16687@cromwell.tmiaf> Message-ID: <1170891835.22158.1.camel@localhost.localdomain> On Wed, 2007-02-07 at 17:20 -0600, Russell L. Harris wrote: > Is there an easy way to send the perldoc output to the printer so that > the pages are properly formatted -- something similar to the way > formatted man pages may be printed with: > > man -t perldoc | lpr > > I tried > > perldoc -t perldoc | lpr > > but there is no formatting. I work much better with a piece of paper > on the desk, rather than reading from the screen. Try perldoc -n groff perldoc | lpr John From rlharris at oplink.net Wed Feb 7 15:44:51 2007 From: rlharris at oplink.net (Russell L. Harris) Date: Wed, 7 Feb 2007 17:44:51 -0600 Subject: [pm-h] perl application: I'm in over my head In-Reply-To: <20070207232030.GA16687@cromwell.tmiaf> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070207171347.GA3615@cromwell.tmiaf> <6ee1e6090702071305j10a60369vebd8a59bcb9aecff@mail.gmail.com> <20070207232030.GA16687@cromwell.tmiaf> Message-ID: <20070207234451.GB16687@cromwell.tmiaf> * Russell L. 
Harris [070207 17:21]: > > Is there an easy way to send the perldoc output to the printer so that > the pages are properly formatted -- something similar to the way > formatted man pages may be printed with: > > man -t perldoc | lpr Answering my own question, this works for modules: perldoc -n groff -T perldoc | lpr and this works for functions: perldoc -n groff -T -f splice | lpr RLH From gwadej at anomaly.org Wed Feb 7 16:53:16 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Wed, 7 Feb 2007 18:53:16 -0600 Subject: [pm-h] Fw: UG News--February is Web Design and Development Month at O'Reilly Message-ID: <20070207185316.30483e60@sovvan> Begin forwarded message: Date: Wed, 07 Feb 2007 11:56:09 -0800 From: "Marsee Henon" To: gwadej at anomaly.org Subject: UG News--February is Web Design and Development Month at O'Reilly Hi, Can you share the following with your members if you think they might be interested? It's Web Design and Development Month here at O'Reilly and we just put together a special resource page dedicated to web development essentials including books, PDF Short Cuts, articles, and author events: http://www.oreilly.com/go/webdev Don't forget your members can receive 35% off any of these titles when they use discount code DSUG on our site. There's also free ground shipping in the US on orders over $29.95. Happy FebWeb, Marsee ================================================================ O'Reilly 1005 Gravenstein Highway North Sebastopol, CA 95472 http://ug.oreilly.com/ http://ug.oreilly.com/creativemedia/ ================================================================ -- Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius, and a lot of courage, to move in the opposite direction. -- Albert Einstein From rlharris at oplink.net Fri Feb 9 09:42:45 2007 From: rlharris at oplink.net (Russell L. 
Harris) Date: Fri, 9 Feb 2007 11:42:45 -0600 Subject: [pm-h] relocate metatags in hyperlatex-generated HTML In-Reply-To: <20070207072241.78846fda@sovvan> References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> Message-ID: <20070209174245.GA29184@cromwell.tmiaf> This message begins a new thread titled "relocate metatags in hyperlatex-generated HTML" which continues the thread "perl application: I'm in over my head". The problem is how to relocate metatags in hyperlatex-generated HTML, moving the tags from the body of the file to the head of the file. I've just finished reading the 4th edition of the O'Reilly Llama book, "Learning Perl"; and I'm still "in over my head". > On Wed, 7 Feb 2007 05:09:08 -0600 > "Russell L. Harris" wrote: > >> (1) Search for the tag: >> >> >> >> (2) If the tag is found, move the tag from the body of the HTML file >> to the head of the HTML file, inserting it immediately following the >> line of the title tag: >> >> ... >> >> (3) Search for the tag: >> >> >> >> (4) If the tag is found, move the tag from the body of the HTML file >> to the head of the HTML file, inserting it immediately following the >> line of the title tag: >> >> ... * G. Wade Johnson [070207 09:43]: > > I would suggest reading the file into an array of lines. You can use > regular expressions on each line to find the lines of interest. > > Use the splice operator to remove the meta lines from the array. > > Use the splice operator to insert the meta lines after the title line. > > Write out all lines to replace the old file. > > By using this approach and the -i option, you can process all of the > files in a directory with: > > perl -i.bak script.pl *.html * G. Wade Johnson [070207 21:56]: > > ---------------------------------------------------- > #!/usr/bin/perl -i.bak > > use strict; > use warnings; > > # slurp the whole file as a single string. 
> undef $/;
>
> while(<>)
> {
>     # split the file into a list of lines, losing the newline
>     # in the process
>     my @lines = split( /\r?\n/, $_ );
>
>     # process the lines here
>
>     # If you don't print these out, the new file will be empty.
>     print join( "\n", @lines );
> }
> -----------------------------------------------------

OK; I have this framework running. A new file is being generated and
the old file is saved with the ".bak" extension.

But I haven't figured out how to use regular expression matching to
obtain the offsets and lengths needed by "splice".

For the "processing", it appears to me that I must:

(1) find within the array @lines the offset of the line following
the </title> tag; this is the insertion point

(2) find within the array @lines the offset and the length of the
"keywords" metatag, which is the second of the two tags

(3) call splice to remove the "keywords" metatag from the array
@lines

(4) insert the "keywords" metatag at the insertion point in the
array @lines

(5) find within the array @lines the offset and the length
of the "description" metatag, which is the first of the two tags

(6) call splice to remove the "description" metatag from the array
@lines

(7) insert the "description" metatag at the insertion point in the
array @lines

By moving the second tag before moving the first, the offset of the
insertion point does not change.

In the Llama book, in a footnote in chapter 9 ("Processing Text with
Regular Expressions") is the following warning: "...you can't
correctly parse HTML with simple regular expressions. If you need to
work with HTML or a similar markup language, use a module that's made
to handle the complexities." What am I to make of this warning?

RLH

From john at nixnuts.net Fri Feb 9 16:41:36 2007
From: john at nixnuts.net (John Lightsey)
Date: Fri, 09 Feb 2007 18:41:36 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070209174245.GA29184@cromwell.tmiaf>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf>
Message-ID: <1171068096.6553.45.camel@localhost.localdomain>

On Fri, 2007-02-09 at 11:42 -0600, Russell L. Harris wrote:
...
> In the Llama book, in a footnote in chapter 9 ("Processing Text with
> Regular Expressions") is the following warning: "...you can't
> correctly parse HTML with simple regular expressions. If you need to
> work with HTML or a similar markup language, use a module that's made
> to handle the complexities." What am I to make of this warning?

I don't have a copy of the book so I don't know the exact context, but
I assume he's referring to the fact that HTML is like Perl in the sense
that you can write HTML that behaves identically in a variety of ways.

text text text text text text

All those do the same basic thing.. even

text

So using a regular expression to parse HTML is just brittle. You can't
possibly account for all of the legal variations of HTML syntax. The
more robust alternative is to use something like HTML::TokeParser or
HTML::TreeBuilder to do most of the work.

OTOH, if you're just grabbing two misplaced tags from the body and
inserting them before </head>, you can put the entire document in one
scalar and make the change with a one line substitution.

$html =~ s/(<\/head>.*)(<meta[^>]+>)(.*)(<meta[^>]+>)/$2$4$1$3/is;

Using one of the more robust methods of processing HTML is likely
overkill.

John

From gwadej at anomaly.org Fri Feb 9 21:21:43 2007
From: gwadej at anomaly.org (G. 
Wade Johnson)
Date: Fri, 9 Feb 2007 23:21:43 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070209174245.GA29184@cromwell.tmiaf>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf>
Message-ID: <20070209232143.691472b5@sovvan>

On Fri, 9 Feb 2007 11:42:45 -0600 "Russell L. Harris" wrote:

> This message begins a new thread titled
>
> "relocate metatags in hyperlatex-generated HTML"
>
> which continues the thread
>
> "perl application: I'm in over my head".
>
> The problem is how to relocate metatags in hyperlatex-generated HTML,
> moving the tags from the body of the file to the head of the file.
>
> I've just finished reading the 4th edition of the O'Reilly Llama book,
> "Learning Perl"; and I'm still "in over my head".

Sounds like a good start.

[snip]

> OK; I have this framework running. A new file is being generated and
> the old file is saved with the ".bak" extension.
>
> But I haven't figured out how to use regular expression matching to
> obtain the offsets and lengths needed by "splice".
>
> For the "processing", it appears to me that I must:
>
> (1) find within the array @lines the offset of the line following
> the </title> tag; this is the insertion point
>
> (2) find within the array @lines the offset and the length of the
> "keywords" metatag, which is the second of the two tags
>
> (3) call splice to remove the "keywords" metatag from the array
> @lines
>
> (4) insert the "keywords" metatag at the insertion point in the
> array @lines
>
> (5) find within the array @lines the offset and the length
> of the "description" metatag, which is the first of the two tags
>
> (6) call splice to remove the "description" metatag from the array
> @lines
>
> (7) insert the "description" metatag at the insertion point in the
> array @lines

This is a good description of what you need to do.

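As a rough sketch of steps (1) through (7), and only under the
assumption that each metatag sits alone on a single line somewhere after
the title line, the splice calls might look like the following. The
helper name move_meta_lines and the sample page are made up for
illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not code from the thread): move
# one-line "description"/"keywords" meta tags to just after the line
# that holds the closing title tag.
sub move_meta_lines {
    my @lines = @_;

    # Index of the first line containing the closing title tag.
    my ($title_idx) = grep { $lines[$_] =~ m{</title>}i } 0 .. $#lines;
    return @lines unless defined $title_idx;

    # Indexes of the meta lines that appear after the title line.
    my @meta_idx = grep {
        $_ > $title_idx
            && $lines[$_] =~ m{<meta\s+name="(?:description|keywords)"}i
    } 0 .. $#lines;

    # Remove from last to first so the earlier indexes stay valid;
    # unshift keeps the tags in their original order.
    my @moved;
    for my $i ( reverse @meta_idx ) {
        unshift @moved, splice( @lines, $i, 1 );
    }

    # A zero-length splice inserts without removing anything.
    splice( @lines, $title_idx + 1, 0, @moved );
    return @lines;
}

my @page = (
    '<html><head>',
    '<title>Demo</title>',
    '</head><body>',
    '<meta name="description" content="A demo page">',
    '<p>Text</p>',
    '<meta name="keywords" content="demo, perl">',
    '</body></html>',
);
print "$_\n" for move_meta_lines(@page);
```

Removing the later tag first is the same trick as moving the second tag
before the first: nothing in front of the insertion point shifts, so the
title index stays usable. A file with no meta tags passes through
unchanged.
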
> By moving the second tag before moving the first, the offset of the
> insertion point does not change.

Not a bad approach, although there's a slightly easier way.

> In the Llama book, in a footnote in chapter 9 ("Processing Text with
> Regular Expressions") is the following warning: "...you can't
> correctly parse HTML with simple regular expressions. If you need to
> work with HTML or a similar markup language, use a module that's made
> to handle the complexities." What am I to make of this warning?

The "right" way to solve the problem would be to use an HTML parser.
But, in this case it would be overkill and would require you to learn a
lot more before you could get started (as John has already pointed
out).

In this particular case, you have the code that is generating the meta
links, you are not working with HTML from the wild. If you were looking
at the general case of extracting something from HTML, you would need
the big guns. You may still want to go that route later when you are
more comfortable with the language and the problem.

Now, let's get back to the quick and dirty solution.

Let's start with some assumptions, let me know if I fail one of them.

1. The </title> end tag is not broken across a line boundary (pretty
   safe).

2. The 'keywords' metatag is all on one line.

3. There is nothing else on the line with the 'keywords' metatag.

4. The 'description' metatag is all on one line.

5. There is nothing else on the line with the 'description' metatag.

If any of the above is not true, we will need to get a little more
complicated.

So let's work out the simple case. The core of the processing is a loop
over the lines by index. Most of the time in Perl it is better to loop
over an array with a foreach loop over the elements themselves, but in
this case we want to save off the indexes of the elements we find
interesting.
If you have a C, C++, Java, or JavaScript background, you should
recognize the loop

    my $lines_count = @lines;
    for(my $index = 0;$index < $lines_count;++$index)
    {
        #Test lines here
    }

If the assumptions given above hold, the tests are relatively
straightforward. They would look something like:

    if($lines[$index] =~ m{</title>})
    {
        # save $index somewhere for later.
    }

The regex match can use delimiters other than / if you supply the
optional 'm'. This helps especially when parsing HTML/XML looking text,
because otherwise you have to escape every '/' that's in the real
expression.

Repeat as needed for the other pieces. Then use splice to remove the
ones you want to move and insert after the title index.

G. Wade
-- 
One OS to rule them all, One OS to find them,
One OS to bring them all and in the darkness bind them,
In the land of Redmond, where the Windows lie.

From rlharris at oplink.net Sat Feb 10 06:57:51 2007
From: rlharris at oplink.net (Russell L. Harris)
Date: Sat, 10 Feb 2007 08:57:51 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070209232143.691472b5@sovvan>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan>
Message-ID: <20070210145751.GA22013@cromwell.tmiaf>

* G. Wade Johnson [070209 23:28]:
> Now, let's get back to the quick and dirty solution.
>
> Let's start with some assumptions, let me know if I fail one of them.
>
> 1. The </title> end tag is not broken across a line boundary (pretty
> safe).
>
> 2. The 'keywords' metatag is all on one line.
>
> 3. There is nothing else on the line with the 'keywords' metatag.
>
> 4. The 'description' metatag is all on one line.
>
> 5. There is nothing else on the line with the 'description' metatag.
>
> If any of the above is not true, we will need to get a little more
> complicated.

The only assumptions which fail are numbers 2 and 4. However, there is
a work-around.

The problem is that I am using XEmacs, and, when working on a LaTeX
document, I normally use "auto-fill" mode, which automatically breaks
the line and adds a newline character in the LaTeX source. This results
in a metatag which typically spans two or more lines.

Interestingly, when composing the title tag, I can append the "%"
character to the end of the line wherever the line break happens to
fall, with the result that hyperlatex creates HTML source in which the
title tag is entirely on a single (very long) line. But this ploy does
not work when composing a metatag.

The work-around: I can turn off auto-fill mode when composing a
metatag, and thereby create a metatag which is contained entirely on a
single line in the HTML source. But if I forget to turn off auto-fill
mode, or if I happen to go back and modify the metatag when auto-fill
mode is in effect, then XEmacs breaks the line.

So, because of Murphy's law ("Undesirable things which can happen
almost invariably do."), it would be much better to provide for the
case in which a metatag spans two or more lines.

RLH

From gwadej at anomaly.org Sat Feb 10 08:42:15 2007
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Sat, 10 Feb 2007 10:42:15 -0600
Subject: [pm-h] Perl workshop
Message-ID: <20070210104215.74ef4087@sovvan>

This is forwarded from the main Perl Mongers list.

The Copenhagen Perl Mongers will host the Fifth Nordic Perl Workshop on
April 28-29, 2007. Submit proposals for papers, offer sponsorship, or
volunteer to help. The price for two days of the workshop with lunch
included is 500 DKK (about $US90). Presentations are held mostly in
English. And, as usual, the workshop fee is waived for speakers, so
submit a talk! Hope to see you in Copenhagen!

http://www.perlworkshop.dk/2007/
http://www.perlworkshop.dk/2007/cfp.html
http://www.perlworkshop.dk/2007/sponsors.html

-- 
A 'language' is a dialect with an army.

From gwadej at anomaly.org Sun Feb 11 14:05:45 2007
From: gwadej at anomaly.org (G. 
Wade Johnson)
Date: Sun, 11 Feb 2007 16:05:45 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070210145751.GA22013@cromwell.tmiaf>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan> <20070210145751.GA22013@cromwell.tmiaf>
Message-ID: <20070211160545.39fd1a26@sovvan>

On Sat, 10 Feb 2007 08:57:51 -0600 "Russell L. Harris" wrote:

> * G. Wade Johnson [070209 23:28]:
> > Now, let's get back to the quick and dirty solution.
> >
> > Let's start with some assumptions, let me know if I fail one of
> > them.
> >
> > 1. The </title> end tag is not broken across a line boundary (pretty
> > safe).
> >
> > 2. The 'keywords' metatag is all on one line.
> >
> > 3. There is nothing else on the line with the 'keywords' metatag.
> >
> > 4. The 'description' metatag is all on one line.
> >
> > 5. There is nothing else on the line with the 'description' metatag.
> >
> > If any of the above is not true, we will need to get a little more
> > complicated.
>
> The only assumptions which fail are numbers 2 and 4. However, there
> is a work-around.

Since we can't get each of the tags of interest on a single line, let's
take a completely different approach. Again, this would not be
recommended for more comprehensive processing. We are beginning to
reach the point where one of the HTML processing modules might be a
good idea.

That said, let's try a new approach. We'll change the main loop to:

-------------------------------
#!/usr/bin/perl -i.bak

use strict;
use warnings;

# slurp the whole file as a single string.
undef $/;

while(my $file = <>)
{
    # processing steps here.
    my $metatags = q{};

    # If you don't print these out, the new file will be empty.
    print $file;
}
-------------------------------

In this case, we will need to process the entire file as a single
string.
This also means that we need to handle regular expressions slightly differently to deal with linebreaks in the text.

The main tool you'll use here is the substitute operator s///. We will use it in two ways:

1. Extract meta tags from the file
2. Insert a string after the title.

The first task will consist of the following code for each tag you wish to move.

    if($file =~ s{(]*>)}{}sm)
    {
        $metatags .= "$1\n";
    }

This replaces the keyword metatag with nothing, removing it from the string. We also capture the metatag with the parens, making it available in the variable $1.

To insert the $metatags string after the title, use something like

    $file =~ s{()}{$1\n$metatags}sm;

This will probably end up with some extra blank lines where the metatags were removed and inserted, but cleaning it up shouldn't be too hard.

G. Wade
--
There are 2 possible outcomes: If the result confirms the hypothesis, then you've made a measurement. If the result is contrary to the hypothesis, then you've made a discovery. -- Enrico Fermi

From rlharris at oplink.net Sun Feb 11 17:55:30 2007
From: rlharris at oplink.net (Russell L. Harris)
Date: Sun, 11 Feb 2007 19:55:30 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070211160545.39fd1a26@sovvan>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan> <20070210145751.GA22013@cromwell.tmiaf> <20070211160545.39fd1a26@sovvan>
Message-ID: <20070212015530.GA7234@cromwell.tmiaf>

* G. Wade Johnson [070211 16:07]:
> let's try a new approach.
...
> my $metatags = q{};

At this I am stumped.

$metatags is a scalar variable; the name is plural, because (as is apparent from the concatenation assignment operator below) it is to contain all the metatags to be relocated, with the newline character as the delimiter.

q appears to be a quoting operator, with delimiters { and } .
So, in this line of code, it appears that metatags is declared and is initialized to the null string.

I am puzzled as to why it is necessary to initialize the variable to the null string, unless the concatenation operator fails when attempting to concatenate a string to an undefined scalar.

I also am puzzled as to why the curly braces are used as delimiters for the null string.

Finally, I am curious as to why it would not be sufficient merely to write:

    my $metatags = ''

RLH

From gwadej at anomaly.org Sun Feb 11 20:45:29 2007
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Sun, 11 Feb 2007 22:45:29 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070212015530.GA7234@cromwell.tmiaf>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan> <20070210145751.GA22013@cromwell.tmiaf> <20070211160545.39fd1a26@sovvan> <20070212015530.GA7234@cromwell.tmiaf>
Message-ID: <20070211224529.3f02ac81@sovvan>

On Sun, 11 Feb 2007 19:55:30 -0600 "Russell L. Harris" wrote:

> * G. Wade Johnson [070211 16:07]:
> > let's try a new approach.
> ...
> > my $metatags = q{};
>
> At this I am stumped.
>
> $metatags is a scalar variable; the name is plural, because (as is
> apparent from the concatenation assignment operator below) it is to contain
> all the metatags to be relocated, with the newline character as the
> delimiter.

Good.

>
> q appears to be a quoting operator, with delimiters { and } .

q is a generalized quote operator. It is followed by a delimiter that marks the ends of the strings. It's especially useful with paired delimiters like {} and <>.

> So, in this line of code, it appears that metatags is declared and is
> initialized to the null string.

Agreed.
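The behaviour of q is easy to check directly. Here is a small sketch with made-up variable names (none of them come from the script in this thread) showing that q{} and '' give the same empty string, that other paired delimiters behave the same way, and that q does not interpolate variables while its sibling qq does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# q{} and '' both produce the empty string.
my $empty_q      = q{};
my $empty_single = '';

# Any paired delimiter works, which is handy when the text itself
# contains quote characters.
my $braces = q{it's "quoted" text};
my $parens = q(it's "quoted" text);
my $angles = q<it's "quoted" text>;

# q behaves like single quotes (no interpolation);
# qq behaves like double quotes (interpolation).
my $name    = 'metatags';
my $literal = q{$name};
my $interp  = qq{$name};

print "q{} equals ''\n" if $empty_q eq $empty_single;
print "all three delimiter pairs agree\n"
    if $braces eq $parens and $parens eq $angles;
print "q gives:  $literal\n";
print "qq gives: $interp\n";
```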
> I am puzzled as to why it is necessary to initialize the variable to
> the null string, unless the concatenation operator fails when
> attempting to concatenate a string to an undefined scalar.

It doesn't fail; the .= operator treats an undefined scalar as an empty string, and Perl exempts .= from the "uninitialized value" warning. Initializing it helps to document (for the next guy to read it) that we intend this to be a string and not something else.

> I also am puzzled as to why the curly braces are used as delimiters
> for the null string.

This I got from the book "Perl Best Practices". The expression q{} is a bit more visually distinct than '' or "" which look different for different fonts. In some cases, '' looks more like a double-quote and can therefore be confusing.

Other than that, q{} is identical to ''.

> Finally, I am curious as to why it would not be sufficient merely to
> write:
>
> my $metatags = ''

It would be sufficient. It's just a style thing.

G. Wade
--
As a software development model, Anarchy does not scale well. -- Dave Welch

From gwadej at anomaly.org Mon Feb 12 05:32:45 2007
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Mon, 12 Feb 2007 07:32:45 -0600
Subject: [pm-h] February Meeting
Message-ID: <20070212073245.5cccf14b@sovvan>

Remember, the February meeting of Houston.pm is this Tuesday in the basement at 1111 Fannin downtown.

http://maps.google.com/maps?f=q&hl=en&q=1111+fannin+77002&ie=UTF8&om=1&z=19&ll=29.754784,-95.365013&spn=0.001106,0.002682&t=h

People will be in the lobby to let people in between about 6pm and 6:30pm. Parking on the street is free after 6.

Robert Boone is presenting an introduction to Catalyst.

See you there.

G. Wade
--
To vacillate or not to vacillate, that is the question ... or is it?

From rlharris at oplink.net Tue Feb 13 16:10:53 2007
From: rlharris at oplink.net (Russell L.
Harris)
Date: Tue, 13 Feb 2007 18:10:53 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070211224529.3f02ac81@sovvan>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan> <20070210145751.GA22013@cromwell.tmiaf> <20070211160545.39fd1a26@sovvan> <20070212015530.GA7234@cromwell.tmiaf> <20070211224529.3f02ac81@sovvan>
Message-ID: <20070214001053.GA8468@cromwell.tmiaf>

Success! The following code does the job of relocating the meta tags:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#!/usr/bin/perl -i.bak
# 20070213 2325 gmt

use strict;
use warnings;

undef $/;

while(my $file = <>)
{
    my $metatags = q{};

    if($file =~ s{(]*>)}{}sm)
    {
        $metatags .= "$1\n";
    }

    if($file =~ s{(]*>)}{}sm)
    {
        $metatags .= "$1\n";
    }

    $file =~ s{()}{$1\n$metatags}sm;

    print $file;
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

However, looking at the code, I see that I don't understand the working of the diamond operator in the context of the while structure.

Following is the explanation which I have been able to piece together regarding the diamond operator:

The diamond operator is the line-input operator; when empty ( <> ), the diamond operator reads from the ARGV filehandle, which reads the array of filenames from the Perl command line. Each invocation of <> returns as a string from the filehandle the next record of the file (as determined by the input record separator), until the end of the file is reached, at which point <> returns undef.

It appears that each of the three pattern-matching operators within the while structure is able to read the file from start to finish. So it appears that, in this context, the while structure is acting simply as a "file open" mechanism, rather than a mechanism which steps through a file record by record.
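The three substitutions can also be tried on a plain string, with no file or diamond operator involved at all. The following is only a sketch under stated assumptions: the HTML inside the archived patterns was lost, so the patterns used here (a meta tag with name="keywords" or name="description", and the closing title tag) are guesses at what was intended, and the sample page is hypothetical. The description tag is deliberately broken across two lines, as auto-fill mode would leave it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical page: both meta tags sit in the body, and the
# description tag spans two lines.
my $file = <<'END_HTML';
<html>
<head>
<title>Sample page</title>
</head>
<body>
<meta name="description" content="A short
description split by auto-fill">
<meta name="keywords" content="perl, hyperlatex">
<p>Body text.</p>
</body>
</html>
END_HTML

my $metatags = q{};

# ASSUMED pattern: the original was stripped from the archive.
# [^>]* also matches newlines, so a tag split over lines is caught.
if ($file =~ s{(<meta name="keywords"[^>]*>)}{}sm) {
    $metatags .= "$1\n";
}

# ASSUMED pattern, same shape as above.
if ($file =~ s{(<meta name="description"[^>]*>)}{}sm) {
    $metatags .= "$1\n";
}

# ASSUMED anchor: insert the collected tags after the closing title tag.
$file =~ s{(</title>)}{$1\n$metatags}sm;

print $file;
```

Each s{}{} here works on the string held in $file; nothing is read from disk while the substitutions run. As noted earlier in the thread, blank lines are left behind where the tags were removed.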
And what happens if I should invoke the script with the command line:

    $ ./metamove.pl *.html

rather than

    $ ./metamove.pl just_one_file.html

? The documentation I thus far have found appears to say that <> reads a series of files as if they were a single contiguous file.

RLH

From gwadej at anomaly.org Tue Feb 13 20:09:57 2007
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Tue, 13 Feb 2007 22:09:57 -0600
Subject: [pm-h] relocate metatags in hyperlatex-generated HTML
In-Reply-To: <20070214001053.GA8468@cromwell.tmiaf>
References: <20070207110908.GA15658@cromwell.tmiaf> <20070207072241.78846fda@sovvan> <20070209174245.GA29184@cromwell.tmiaf> <20070209232143.691472b5@sovvan> <20070210145751.GA22013@cromwell.tmiaf> <20070211160545.39fd1a26@sovvan> <20070212015530.GA7234@cromwell.tmiaf> <20070211224529.3f02ac81@sovvan> <20070214001053.GA8468@cromwell.tmiaf>
Message-ID: <20070213220957.72f066d4@sovvan>

On Tue, 13 Feb 2007 18:10:53 -0600 "Russell L. Harris" wrote:

> Success! The following code does the job of relocating the meta tags:

Congrats.

> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[snip]
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> However, looking at the code, I see that I don't understand the
> working of the diamond operator in the context of the while structure.
>
> Following is the explanation which I have been able to piece together
> regarding the diamond operator:
>
> The diamond operator is the line-input operator; when
> empty ( <> ), the diamond operator reads from the ARGV filehandle,
> which reads the array of filenames from the Perl command line.
> Each invocation of <> returns as a string from the filehandle the
> next record of the file (as determined by the input record
> separator), until the end of the file is reached, at which point
> <> returns undef.
>
> It appears that each of the three pattern-matching operators within
> the while structure is able to read the file from start to finish. So it
> appears that, in this context, the while structure is acting simply as
> a "file open" mechanism, rather than a mechanism which steps through a
> file record by record.

Not quite. The diamond operator is reading a string from the file up to the input record separator. However, this code contains a small piece of magic right before the while...the line

    undef $/;

The $/ variable contains the input record separator. By undefining it, we have told Perl we want to read until the end of the file. This is sometimes called "slurp mode".

So, the result is that each call to <> returns the entire file as a string. With multiple files on the command line, the while loop will process one file per pass, which is exactly what you want.

>
> And what happens if I should invoke the script with the command line:
>
> $ ./metamove.pl *.html
>
> rather than
>
> $ ./metamove.pl just_one_file.html
>
> ? The documentation I thus far have found appears to say that <>
> reads a series of files as if they were a single contiguous file.

Sort of. <> returns each record of the first file. Then, when the first file is finished, it moves on to the second file, etc.

In slurp mode, the entire first file is a record. Then we move to the second file and treat it as one record, etc.

G. Wade
--
"The avalanche has already started. It is too late for the pebbles to vote." -- Ambassador Kosh, "Believers"

From gwadej at anomaly.org Thu Feb 22 05:09:27 2007
From: gwadej at anomaly.org (G.
Wade Johnson)
Date: Thu, 22 Feb 2007 07:09:27 -0600
Subject: [pm-h] Fw: [pm_groups] YAPC::Europe 2007 - Call for Participation
Message-ID: <20070222070927.5fa7e99e@sovvan>

Begin forwarded message:

Date: Thu, 22 Feb 2007 12:16:41 +0100
From: Thomas Klausner
To: pm_groups at pm.org
Subject: [pm_groups] YAPC::Europe 2007 - Call for Participation

(please forward to your local groups and all other potentially interested people...)

Call for Participation - YAPC::Europe 2007 in Vienna
====================================================

Vienna.pm is officially announcing the call for participation for Yet Another Perl Conference Europe 2007. This year's conference theme is "Social Perl".

Location
--------

The conference will be held in Vienna, Austria, from 29th to 31st August 2007 at the Vienna University of Economics and Business Administration.

Star guests
-----------

We found sponsors to invite some famous international Perl hackers. Thanks to nfotex for inviting Larry (and Gloria) Wall, to Anonymous Donor for getting Damian Conway from Australia to Austria (it's quite expensive to get the 'al' out of Australia...) and to geizhals.at for inviting Audrey Tang and Mark Jason Dominus.

Schedule
--------

The final schedule will be announced on 22nd of July 2007. Would-be speakers please see the Call for Papers available on the conference website for more information on key dates and talks.

Costs
-----

* Regular attendance: 100 EUR
* Students and Early Birds: 80 EUR
* Business/Sponsor Tariff: 200 EUR

Regular attendance costs 100 Euro, and 80 Euros for students. Early birds only pay 80 Euros (if paid until 31st March 2007). There is also a voluntary business/sponsor tariff at 200 Euros, which is an easy way to sponsor YAPC::Europe 2007 and Perl in general.
You will not only get three days packed with interesting talks and people, but also a goodie bag, a conference t-shirt, an invitation to the attendees' dinner and the unique opportunity to see renowned members of the Perl community with orange mohawks.

As YAPC::Europe is a community-driven conference, we're not in it for the profit. But should we make one, all money will be used for funding further Perl 5|6 development, future YAPC::Europe conferences and for advancing Perl usage / the Perl community in Austria.

How to register
---------------

To register for YAPC::Europe 2007 go to our website:

http://vienna.yapceurope.org

Click on the 'New user'-Link in the navbar and fill out the subsequent form.

Accommodation & Getting to Vienna
--------------------------------

Please note that you should *definitely* book your hotel room as soon as possible. While you will be able to get a room later, the hotels near the venue will fill up. So to prevent long trips through the city or paying more than you want to, book your hotel room!

You can find more information on accommodation in Vienna, and how to get to Vienna by plane, train, car etc at the conference website.

Contact
-------

For more information please see the YAPC Europe 2007 website:

http://vienna.yapceurope.org

If you have any questions, do not hesitate to send an email to vienna2007 at yapceurope.org

The organisers will get back to you as soon as possible.

Thomas Klausner, on behalf of Vienna.pm

--
#!/usr/bin/perl http://domm.zsi.at
for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}

--
Request pm.org Technical Support via support at pm.org

pm_groups mailing list
pm_groups at pm.org
http://mail.pm.org/mailman/listinfo/pm_groups

--
A 'language' is a dialect with an army.