From gwadej at anomaly.org Fri May 4 18:23:45 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Fri, 4 May 2007 20:23:45 -0500 Subject: [pm-h] Next Tuesday Night's Meeting Message-ID: <20070504202345.54a5939b@sovvan> I have to apologize again. I had lost track of the date. The date of the May meeting is next Tuesday, and I have not found anyone to present. Does anyone want to present? Do we want to do a social meeting instead? Do we cancel again? Any opinions would be helpful. G. Wade -- Only a Sith deals in absolutes. -- Obi-Wan Kenobi, "Revenge of the Sith" From robo4288 at gmail.com Fri May 4 19:03:59 2007 From: robo4288 at gmail.com (Robert Boone) Date: Fri, 4 May 2007 21:03:59 -0500 Subject: [pm-h] Next Tuesday Night's Meeting In-Reply-To: <20070504202345.54a5939b@sovvan> References: <20070504202345.54a5939b@sovvan> Message-ID: <435624390705041903o20c6541aja97249c6cb2060b4@mail.gmail.com> If no one has anything a social meeting would be good. On 5/4/07, G. Wade Johnson wrote: > I have to apologize again. I had lost track of the date. The date of > the May meeting is next Tuesday, and I have not found anyone to present. > > Does anyone want to present? > > Do we want to do a social meeting instead? > > Do we cancel again? > > Any opinions would be helpful. > > G. Wade > -- > Only a Sith deals in absolutes. > -- Obi-Wan Kenobi, "Revenge of the Sith" > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston > Website: http://houston.pm.org/ > From gwadej at anomaly.org Sun May 6 16:36:30 2007 From: gwadej at anomaly.org (G. 
Wade Johnson) Date: Sun, 6 May 2007 18:36:30 -0500 Subject: [pm-h] Next Tuesday Night's Meeting In-Reply-To: <435624390705041903o20c6541aja97249c6cb2060b4@mail.gmail.com> References: <20070504202345.54a5939b@sovvan> <435624390705041903o20c6541aja97249c6cb2060b4@mail.gmail.com> Message-ID: <20070506183630.66707655@sovvan> On Fri, 4 May 2007 21:03:59 -0500 "Robert Boone" wrote: > If no one has anything a social meeting would be good. Anyone got a suggestion on where? G. Wade -- C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do, it blows away your whole leg. -- Bjarne Stroustrup From gwadej at anomaly.org Mon May 7 18:30:37 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Mon, 7 May 2007 20:30:37 -0500 Subject: [pm-h] Next Tuesday Night's Meeting In-Reply-To: <20070506183630.66707655@sovvan> References: <20070504202345.54a5939b@sovvan> <435624390705041903o20c6541aja97249c6cb2060b4@mail.gmail.com> <20070506183630.66707655@sovvan> Message-ID: <20070507203037.70ca6ef4@sovvan> On Sun, 6 May 2007 18:36:30 -0500 "G. Wade Johnson" wrote: > On Fri, 4 May 2007 21:03:59 -0500 > "Robert Boone" wrote: > > > If no one has anything a social meeting would be good. > > Anyone got a suggestion on where? Since we haven't gotten any suggestions and no presentation has materialized, I'm going to suggest we'll have a social meeting tomorrow (Tuesday) at the same place as last time. So, I'll be at the Bennighan's on 59 at Kirby. That's inside the loop, southwest of downtown. I'll try to reserve a table under "Wade" again. See you there. G. Wade -- Only a Sith deals in absolutes. -- Obi-Wan Kenobi, "Revenge of the Sith" From gwadej at anomaly.org Mon May 7 19:49:16 2007 From: gwadej at anomaly.org (G. 
Wade Johnson) Date: Mon, 7 May 2007 21:49:16 -0500 Subject: [pm-h] Next Tuesday Night's Meeting In-Reply-To: <20070507203037.70ca6ef4@sovvan> References: <20070504202345.54a5939b@sovvan> <435624390705041903o20c6541aja97249c6cb2060b4@mail.gmail.com> <20070506183630.66707655@sovvan> <20070507203037.70ca6ef4@sovvan> Message-ID: <20070507214916.69a8d431@sovvan> On Mon, 7 May 2007 20:30:37 -0500 "G. Wade Johnson" wrote: > On Sun, 6 May 2007 18:36:30 -0500 > "G. Wade Johnson" wrote: > > > On Fri, 4 May 2007 21:03:59 -0500 > > "Robert Boone" wrote: > > > > > If no one has anything a social meeting would be good. > > > > Anyone got a suggestion on where? > > Since we haven't gotten any suggestions and no presentation has > materialized, I'm going to suggest we'll have a social meeting > tomorrow (Tuesday) at the same place as last time. > > So, I'll be at the Bennighan's on 59 at Kirby. That's inside the loop, > southwest of downtown. I'll try to reserve a table under "Wade" again. > > See you there. I guess it would be a good idea to say I'll be there around 6:30. -- Don't kill him!! If you kill him, he won't learn nothin'! -- The Riddler, "Batman Forever" From gwadej at anomaly.org Tue May 8 20:04:29 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Tue, 8 May 2007 22:04:29 -0500 Subject: [pm-h] YAPC website Message-ID: <20070508220429.7c116284@sovvan> As promised at the social meeting tonight, here's the URL for the YAPC::NA conference. http://conferences.mongueurs.net/yn2007/ G. Wade -- Reality is just a convenient measure of complexity. -- Alvy Ray Smith From gwadej at anomaly.org Tue May 8 20:06:51 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Tue, 8 May 2007 22:06:51 -0500 Subject: [pm-h] Device::USB for Windows Message-ID: <20070508220651.5cfb9d45@sovvan> I got a patch from someone who has built Device::USB for Windows. At some point I will try to get the PPD on-line. If you need it sooner, let me know and I can send it to you. G. 
Wade -- The computer should be doing the hard work. That's what it's paid to do, after all. -- Larry Wall From rlharris at oplink.net Thu May 10 00:13:27 2007 From: rlharris at oplink.net (Russell L. Harris) Date: Thu, 10 May 2007 02:13:27 -0500 Subject: [pm-h] shopping for used router - wired Message-ID: <20070510071327.GA9822@cromwell.tmiaf> As a result of a conversation at the Tuesday meeting regarding power consumption, I now am shopping around for a firmware firewall/router/DHCP server for my home network, to replace an old Pentium-II machine running SmoothWall Express 2.0. The Cisco/Linksys brand was recommended to me. I have an ADSL connection to the Internet. I have an ADSL modem and an external 10/100 Ethernet switch, and no plans for wireless. So if I end up purchasing a new router, I likely would choose a no-frills unit such as the Linksys BEFSR11. Perhaps you have upgraded to a wireless firewall/router and have in the closet collecting dust an old wired router which would be suitable for my application. RLH From gwadej at anomaly.org Thu May 10 05:21:14 2007 From: gwadej at anomaly.org (G. Wade Johnson) Date: Thu, 10 May 2007 07:21:14 -0500 Subject: [pm-h] shopping for used router - wired In-Reply-To: <20070510071327.GA9822@cromwell.tmiaf> References: <20070510071327.GA9822@cromwell.tmiaf> Message-ID: <20070510072114.2e9e0171@sovvan> On Thu, 10 May 2007 02:13:27 -0500 "Russell L. Harris" wrote: > As a result of a conversation at the Tuesday meeting regarding power > consumption, I now am shopping around for a firmware > firewall/router/DHCP server for my home network, to replace an old > Pentium-II machine running SmoothWall Express 2.0. > > The Cisco/Linksys brand was recommended to me. > > I have an ADSL connection to the Internet. I have an ADSL modem and > an external 10/100 Ethernet switch, and no plans for wireless. > > So if I end up purchasing a new router, I likely would choose a > no-frills unit such as the Linksys BEFSR11. 
> > Perhaps you have upgraded to a wireless firewall/router and have in > the closet collecting dust an old wired router which would be > suitable for my application. As I said at the meeting, I have a Netgear RP 614 that I no longer have need of. G. Wade -- There are two ways to write error-free programs; only the third one works. -- Alan Perlis From will.willis at gmail.com Thu May 10 08:35:51 2007 From: will.willis at gmail.com (Will Willis) Date: Thu, 10 May 2007 10:35:51 -0500 Subject: [pm-h] shopping for used router - wired In-Reply-To: <20070510072114.2e9e0171@sovvan> References: <20070510071327.GA9822@cromwell.tmiaf> <20070510072114.2e9e0171@sovvan> Message-ID: <6ee1e6090705100835q56c2e0f8rdbba57d3f64cd010@mail.gmail.com> Russell, Sounds like Wade has what you're looking for, but for any others with working equipment just collecting dust; Offer it up on Houston's Freecycle mailing list, see: http://freecycle.org/. I've donated printers, monitors, and other things there, as well as being the recipient of such things as baby car seats and a baby's highchair. It's a wonderful program and a perfect place for those looking for specific things, or looking to clean out their closets. -Will On 5/10/07, G. Wade Johnson wrote: > On Thu, 10 May 2007 02:13:27 -0500 > "Russell L. Harris" wrote: > > > As a result of a conversation at the Tuesday meeting regarding power > > consumption, I now am shopping around for a firmware > > firewall/router/DHCP server for my home network, to replace an old > > Pentium-II machine running SmoothWall Express 2.0. > > > > The Cisco/Linksys brand was recommended to me. > > > > I have an ADSL connection to the Internet. I have an ADSL modem and > > an external 10/100 Ethernet switch, and no plans for wireless. > > > > So if I end up purchasing a new router, I likely would choose a > > no-frills unit such as the Linksys BEFSR11. 
> > > > Perhaps you have upgraded to a wireless firewall/router and have in > > the closet collecting dust an old wired router which would be > > suitable for my application. > > As I said at the meeting, I have a Netgear RP 614 that I no longer have > need of. > > G. Wade > -- > There are two ways to write error-free programs; only the third one > works. -- Alan Perlis > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston > Website: http://houston.pm.org/ >
From raprice at gmail.com Mon May 14 13:14:36 2007 From: raprice at gmail.com (Richard Price) Date: Mon, 14 May 2007 15:14:36 -0500 Subject: [pm-h] Question about using HTML::TableExtract Message-ID: <8d8e05eb0705141314i26558006mb4baf240c070ef49@mail.gmail.com> I am an intermediate perl user. I taught myself Perl by reading "Learning Perl," with some online tutorials and I have some other reference texts. I can generally do what I need to with with Perl, but my code is far from elegant. I understand the very basics of object-oriented programming in Perl, but I generally need sample code to get started with modules from cpan. I am a professor at Rice University and have found Perl to be invaluable for extracting data for my research, especially the regular expression capabilities of Perl.
I have been unable to attend any of the monthly meetings, but hope to in the future. For my current project, I am trying to extract historical financial statement data from www.marketwatch.com. The url is http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0. I use WWW::Mechanize to download the webpage and then I use HTML::TableExtract to extract the text that I need. I want to transpose the table at depth=1, count=1 after extracting it so that each year is a row and each variable is a column. I have not been able to find any documentation on how to extract a column from a table using HTML::TableExtract. The following simple program downloads the data using WWW::Mechanize and extracts the table with HTML::TableExtract and prints the output of each row. #!/usr/bin/perl use HTML::TableExtract; use WWW::Mechanize; use strict; my $marketwatch = WWW::Mechanize->new( autocheck => 1 ); $marketwatch->get(" http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0 "); chomp(my $html = $marketwatch->content); my $table = HTML::TableExtract->new(keep_html=>0, depth => 1, count => 1, br_translate => 0 ); $table->parse($html); foreach my $row ($table->rows) { print join("\t", @$row), "\n"; } I am not able to figure out how to use the columns method. My intuition makes me think it should be something like the following (but my intuition is wrong): foreach my $column ($table->columns) { print join("\t", @$column), "\n"; } The error message I get says: Can't locate object method "columns" via package "HTML::TableExtract". The documentation doesn't shed much light (for me anyway). I can see in the code of the module that the columns method belongs to HTML::TableExtract::Table, but I can't figure out how to use it. I appreciate any help. For an experienced programmer, I am sure this is trivial, but I am the closest thing to a programmer in my department, and I don't really have anyone around me that I can get help from. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.pm.org/mailman/private/houston/attachments/20070514/7227d031/attachment.html From robo4288 at gmail.com Mon May 14 14:24:53 2007 From: robo4288 at gmail.com (Robert Boone) Date: Mon, 14 May 2007 16:24:53 -0500 Subject: [pm-h] Question about using HTML::TableExtract In-Reply-To: <8d8e05eb0705141314i26558006mb4baf240c070ef49@mail.gmail.com> References: <8d8e05eb0705141314i26558006mb4baf240c070ef49@mail.gmail.com> Message-ID: <435624390705141424o39e5a0f8keb7c5cc96dddad29@mail.gmail.com> On 5/14/07, Richard Price wrote: > I am an intermediate perl user. I taught myself Perl by reading "Learning > Perl," with some online tutorials and I have some other reference texts. I > can generally do what I need to with with Perl, but my code is far from > elegant. I understand the very basics of object-oriented programming in > Perl, but I generally need sample code to get started with modules from > cpan. I am a professor at Rice University and have found Perl to be > invaluable for extracting data for my research, especially the regular > expression capabilities of Perl. I have been unable to attend any of the > monthly meetings, but hope to in the future. > > For my current project, I am trying to extract historical financial > statement data from www.marketwatch.com. The url is > http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0. > I use WWW::Mechanize to download the webpage and then I use > HTML::TableExtract to extract the text that I need. I want to transpose the > table at depth=1, count=1 after extracting it so that each year is a row and > each variable is a column. I have not been able to find any documentation > on how to extract a column from a table using HTML::TableExtract. > > The following simple program downloads the data using WWW::Mechanize and > extracts the table with HTML::TableExtract and prints the output of each > row. 
> > #!/usr/bin/perl > > use HTML::TableExtract; > use WWW::Mechanize; > use strict; > > my $marketwatch = WWW::Mechanize->new( autocheck => 1 ); > $marketwatch->get("http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0 > "); > > chomp(my $html = $marketwatch->content); > > my $table = HTML::TableExtract->new(keep_html=>0, depth => > 1, count => 1, br_translate => 0 ); > $table->parse($html); > > foreach my $row ($table->rows) { > print join("\t", @$row), "\n"; > } > > I am not able to figure out how to use the columns method. My intuition > makes me think it should be something like the following (but my intuition > is wrong): > > foreach my $column ($table->columns) { > print join("\t", @$column), "\n"; > } > > The error message I get says: Can't locate object method "columns" via > package "HTML::TableExtract". The documentation doesn't shed much light > (for me anyway). I can see in the code of the module that the columns > method belongs to HTML::TableExtract::Table, but I can't figure out how to > use it. > > I appreciate any help. For an experienced programmer, I am sure this is > trivial, but I am the closest thing to a programmer in my department, and I > don't really have anyone around me that I can get help from. > > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston > Website: http://houston.pm.org/ > It looks like you need to call method columns from a HTML::TableExtract::Table object and not a HTML::TableExtract object. 
From the docs and your email maybe something like this could get you started: my $table = HTML::TableExtract->new(keep_html=>0, depth => 1, count => 1, br_translate => 0 ); $table->parse($html); my $t = $table->table(1,1); foreach my $row ($t->columns) { print join("\t", @$row), "\n"; } From raprice at gmail.com Tue May 15 12:04:50 2007 From: raprice at gmail.com (Richard Price) Date: Tue, 15 May 2007 14:04:50 -0500 Subject: [pm-h] Houston Digest, Vol 30, Issue 6 In-Reply-To: References: Message-ID: <8d8e05eb0705151204n69a2f169r8f4b979db5ff0cad@mail.gmail.com> > > > I am an intermediate perl user. I taught myself Perl by reading > "Learning > > Perl," with some online tutorials and I have some other reference > texts. I > > can generally do what I need to with with Perl, but my code is far from > > elegant. I understand the very basics of object-oriented programming in > > Perl, but I generally need sample code to get started with modules from > > cpan. I am a professor at Rice University and have found Perl to be > > invaluable for extracting data for my research, especially the regular > > expression capabilities of Perl. I have been unable to attend any of > the > > monthly meetings, but hope to in the future. > > > > For my current project, I am trying to extract historical financial > > statement data from www.marketwatch.com. The url is > > > http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0 > . > > I use WWW::Mechanize to download the webpage and then I use > > HTML::TableExtract to extract the text that I need. I want to transpose > the > > table at depth=1, count=1 after extracting it so that each year is a row > and > > each variable is a column. I have not been able to find any > documentation > > on how to extract a column from a table using HTML::TableExtract.
> > > > The following simple program downloads the data using WWW::Mechanize > and > > extracts the table with HTML::TableExtract and prints the output of each > > row. > > > > #!/usr/bin/perl > > > > use HTML::TableExtract; > > use WWW::Mechanize; > > use strict; > > > > my $marketwatch = WWW::Mechanize->new( autocheck => 1 ); > > $marketwatch->get(" > http://www.marketwatch.com/tools/quotes/financials.asp?symb=ABSD&sid=0&report=2&freq=0 > > "); > > > > chomp(my $html = $marketwatch->content); > > > > my $table = HTML::TableExtract->new(keep_html=>0, depth => > > 1, count => 1, br_translate => 0 ); > > $table->parse($html); > > > > foreach my $row ($table->rows) { > > print join("\t", @$row), "\n"; > > } > > > > I am not able to figure out how to use the columns method. My > intuition > > makes me think it should be something like the following (but my > intuition > > is wrong): > > > > foreach my $column ($table->columns) { > > print join("\t", @$column), "\n"; > > } > > > > The error message I get says: Can't locate object method "columns" via > > package "HTML::TableExtract". The documentation doesn't shed much light > > (for me anyway). I can see in the code of the module that the columns > > method belongs to HTML::TableExtract::Table, but I can't figure out how > to > > use it. > > > > I appreciate any help. For an experienced programmer, I am sure this > is > > trivial, but I am the closest thing to a programmer in my department, > and I > > don't really have anyone around me that I can get help from. > > > > _______________________________________________ > > Houston mailing list > > Houston at pm.org > > http://mail.pm.org/mailman/listinfo/houston > > Website: http://houston.pm.org/ > > > > It looks like you need to call method columns from a > HTML::TableExtract::Table object and not a HTML::TableExtract object. 
> > From the docs and your email maybe something like this could get you > started: > > my $table = HTML::TableExtract->new(keep_html=>0, depth => 1, count => > 1, br_translate => 0 ); > $table->parse($html); > > my $t = $table->table(1,1); > > foreach my $row ($t->columns) { > print join("\t", @$row), "\n"; > } Thanks. This works perfectly and saved me hours! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.pm.org/mailman/private/houston/attachments/20070515/ace4ea5e/attachment.html From rlharris at oplink.net Sat May 19 19:25:44 2007 From: rlharris at oplink.net (Russell L. Harris) Date: Sat, 19 May 2007 21:25:44 -0500 Subject: [pm-h] reading a file as a string Message-ID: <20070520022544.GA8023@cromwell.tmiaf> Is there a preferred approach for copying an entire file into a string variable, while preserving the record delimiters (the newline character)? I have found two examples; is either of them a good approach? open (FILE,$filename) || die "Cannot open '$filename': $!"; undef $/; my $file_as_string = <FILE>; open (FILE,$filename) || die "Cannot open '$filename': $!"; my $file_as_string = join '', <FILE>; From andy at petdance.com Sat May 19 19:31:14 2007 From: andy at petdance.com (Andy Lester) Date: Sat, 19 May 2007 21:31:14 -0500 Subject: [pm-h] reading a file as a string In-Reply-To: <20070520022544.GA8023@cromwell.tmiaf> References: <20070520022544.GA8023@cromwell.tmiaf> Message-ID: <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> On May 19, 2007, at 9:25 PM, Russell L. Harris wrote: > Is there a preferred approach for copying an entire file into a string > variable, while preserving the record delimiters (the newline > character)? > > I have found two examples; is either of them a good approach?
> > open (FILE,$filename) || die "Cannot open '$filename': $!"; > undef $/; > my $file_as_string = <FILE>; > > > open (FILE,$filename) || die "Cannot open '$filename': $!"; > my $file_as_string = join '', <FILE>; Of those two, choose the former. The second one reads all the lines into an array, and then glomps together a big string. The first one just reads into a string. Do it this way: my $file_as_string = do { open( my $fh, $filename ) or die "Can't open $filename: $!"; local $/ = undef; <$fh>; }; This lets you localize the $/ so that it gets set back outside the scope of the block. Otherwise, you might try to read from a file somewhere else and not know that you changed $/. Here's another way: use File::Slurp qw( read_file ); my $file_as_string = read_file( $filename ); xoxo, Andy -- Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance From rlharris at oplink.net Sat May 19 20:38:43 2007 From: rlharris at oplink.net (Russell L. Harris) Date: Sat, 19 May 2007 22:38:43 -0500 Subject: [pm-h] reading a file as a string In-Reply-To: <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> References: <20070520022544.GA8023@cromwell.tmiaf> <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> Message-ID: <20070520033843.GB8023@cromwell.tmiaf> * Andy Lester [070519 21:35]: > > On May 19, 2007, at 9:25 PM, Russell L. Harris wrote: > > > Is there a preferred approach for copying an entire file into a string > > variable, while preserving the record delimiters (the newline > > character)? > > > > I have found two examples; is either of them a good approach? > > > > open (FILE,$filename) || die "Cannot open '$filename': $!"; > > undef $/; > > my $file_as_string = <FILE>; > > > > > > open (FILE,$filename) || die "Cannot open '$filename': $!"; > > my $file_as_string = join '', <FILE>; > > Of those two, choose the former. The second one reads all the lines > into an array, and then glomps together a big string. The first one > just reads into a string.
> > Do it this way: > > my $file_as_string = do { > open( my $fh, $filename ) or die "Can't open $filename: $!"; > local $/ = undef; > <$fh>; > }; > > This lets you localize the $/ so that it gets set back outside the > scope of the block. Otherwise, you might try to read from a file > somewhere else and not know that you changed $/. > > Here's another way: > > use File::Slurp qw( read_file ); > my $file_as_string = read_file( $filename ); Thanks for the quick response, Andy. After G. Wade's mentoring regarding the diamond operator, I dreamed up the first approach: undef $/; my $file_as_string = <FILE>; The second approach is something I ran across in the 4th edition of "Learning Perl". My ultimate goal is to modify about a hundred document files by tacking on a new head and a new tail to each. The largest document file is about 500 Kbytes; the head and tail each are less than a kilobyte. Here is the Perl script which I propose to use: $^I = ".bak"; my $newhead = "newhead"; open(NEWHEAD,$newhead) || die "failed to open input file $newhead :$!"; undef $/; my $headstring = <NEWHEAD>; close(NEWHEAD) || die "failed to close input file $newhead : $!"; my $newtail = "newtail"; open(NEWTAIL,$newtail) || die "failed to open input file $newtail :$!"; undef $/; my $tailstring = <NEWTAIL>; close(NEWTAIL) || die "failed to close input file $newtail : $!"; my $bodystring = ''; my $newdocument = ''; undef $/; while ($bodystring = <>) { $newdocument .= $headstring; $newdocument .= $bodystring; $newdocument .= $tailstring; print "$newdocument"; $newdocument = ''; } I have tested the script on short dummy documents, but I wished to make sure that I am not overlooking something which could corrupt the document files. RLH From rlharris at oplink.net Sat May 19 22:43:39 2007 From: rlharris at oplink.net (Russell L.
Harris) Date: Sun, 20 May 2007 00:43:39 -0500 Subject: [pm-h] reading a file as a string In-Reply-To: <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> References: <20070520022544.GA8023@cromwell.tmiaf> <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> Message-ID: <20070520054339.GC8023@cromwell.tmiaf> * Andy Lester [070519 21:35]: > Do it this way: > > my $file_as_string = do { > open( my $fh, $filename ) or die "Can't open $filename: $!"; > local $/ = undef; > <$fh>; > }; Andy, I am confused by the use of "my $fh" in the open command. By what mechanism is $fh assigned a value? RLH From andy at petdance.com Sat May 19 22:51:24 2007 From: andy at petdance.com (Andy Lester) Date: Sun, 20 May 2007 00:51:24 -0500 Subject: [pm-h] reading a file as a string In-Reply-To: <20070520054339.GC8023@cromwell.tmiaf> References: <20070520022544.GA8023@cromwell.tmiaf> <9F0E615D-FE3B-4A7D-9B37-99D214C0BB18@petdance.com> <20070520054339.GC8023@cromwell.tmiaf> Message-ID: On May 20, 2007, at 12:43 AM, Russell L. Harris wrote: >> my $file_as_string = do { >> open( my $fh, $filename ) or die "Can't open $filename: $!"; >> local $/ = undef; >> <$fh>; >> }; > > I am confused by the use of "my $fh" in the open command. By what > mechanism is $fh assigned a value? The "my $fh" declares a lexical scalar, and the open() assigns a filehandle to it. The old way: open( FILEHANDLE, $filename ) The new way: open( my $fh, '<', $filename ) The problem with the old FILEHANDLE method is that it's a global variable. It has no scope. If you call a subroutine that also operates on FILEHANDLE, say by closing the file, and then returns, you're going to be sad. Lexical filehandles let you keep filehandles within a scope, where they belong. When they go out of scope, they're automatically closed, too. Also, start using the 3-arg open. The middle argument gives an explicit instruction on how to open the file, for input or output.
xoxo, Andy -- Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance
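For anyone following the thread, here is a minimal sketch of how the advice above (lexical filehandles, 3-arg open, a localized $/) might be applied to the head/body/tail script posted earlier. It is only one possible arrangement, not something anyone in the thread wrote. The sub names slurp and wrap_file are hypothetical; the "newhead" and "newtail" filenames and the ".bak" suffix are taken from the earlier script. Note that it swaps the $^I in-place mechanism for an explicit rename to a .bak copy, which is an assumption made to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string, newlines included.
sub slurp {
    my ($filename) = @_;
    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
    local $/ = undef;    # slurp mode; $/ is restored when the sub returns
    my $contents = <$fh>;
    close($fh) or die "Can't close $filename: $!";
    return $contents;
}

# Hypothetical helper: wrap one document in the new head and tail,
# keeping the original as "$filename.bak" (explicit rename, not $^I).
sub wrap_file {
    my ( $filename, $head, $tail ) = @_;
    my $body = slurp($filename);
    rename( $filename, "$filename.bak" )
        or die "Can't back up $filename: $!";
    open( my $out, '>', $filename ) or die "Can't write $filename: $!";
    print {$out} $head, $body, $tail;
    close($out) or die "Can't close $filename: $!";
    return;
}

# Only does work when document filenames are given on the command line.
if (@ARGV) {
    my $head = slurp('newhead');
    my $tail = slurp('newtail');
    wrap_file( $_, $head, $tail ) for @ARGV;
}
```

Run it as "perl wrap.pl doc1 doc2 ..." (the script name is made up) from the directory holding newhead and newtail. Because $/ is localized inside slurp, nothing else in the program sees the changed record separator, and each filehandle is confined to the sub that opened it.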