From jwarner at texas.net Mon Nov 1 21:00:18 2010
From: jwarner at texas.net (John Warner)
Date: Mon, 1 Nov 2010 23:00:18 -0500
Subject: APM: Parsing HTML
Message-ID: <000001cb7a42$77454650$65cfd2f0$@net>

All,

It's been a while since I did any Perl programming and I could use a pointer. I work as a lab admin at Dell where one of my job duties is to order equipment for the various teams I support. The process works like this: we have the teams configure a system in a shopping cart at Dell.com then submit the shopping cart to the lab admins. We, the lab admins, take the information from the cart (quantity, SKUs, and descriptions), do a whole bunch of manual manipulation and then paste the processed info into a tool we use for internal ordering.

The problem:

I have been using Win32::Watir to interact with dell.com to navigate to the SKUs in a shopping cart. I have thus far been unable to get data in a format I can use from the shopping cart. I get some of the HTML but not the stuff in the frame with the SKUs when I implement HTML::Parser start. I can get the information I am after with the Parser text function but without any kind of separation that I could write a useful regex to separate.

my $url = "http://ecomm.dell.com/dellstore/basket_retrieve.aspx?c=us&cs=04&l=en&s=bsd&itemtype=CFG&cart_id=1013663916825&toEmail=john_warner at dell.com";

my $ie = Win32::Watir->new( visible => 1, maximize => 0);
print "Pointing to IE to found URL\n";
$ie->goto($url);

print "Clicking \"Detail View\" link in basket\n";
$ie->getLink('linktext:', qr/Detail View/)->Click;

print "Showing details of cart\n";
$ie->getLink('linktext:', qr/Show Details/)->Click;

my $ordertext = $ie->text;
#my $ordertext = $ie->html;
print $ordertext;

#do useful processing here.

The crux of my problem (I think) is that I don't know what type of data (array, hash, etc) that $ie->html or $ie->text returns. Perhaps if I knew that I could make headway on processing.

Thanks for your time!

John Warner
jwarner at texas.net
H: 512.251.1270
C: 512.426.3813

From e.ellington at gmail.com Tue Nov 2 14:21:22 2010
From: e.ellington at gmail.com (Eric Ellington)
Date: Tue, 2 Nov 2010 16:21:22 -0500
Subject: APM: Parsing HTML
In-Reply-To: <000001cb7a42$77454650$65cfd2f0$@net>
References: <000001cb7a42$77454650$65cfd2f0$@net>
Message-ID:

I do not know the answer to your question, but I have used WWW::Mechanize in the past with much success. I know the maintainer monitors the Chicago PM boards; he is a nice guy and answers questions.

Eric

--
Eric Ellington
e.ellington at gmail.com

From montgomery.conner at gmail.com Tue Nov 2 17:00:09 2010
From: montgomery.conner at gmail.com (Montgomery Conner)
Date: Tue, 2 Nov 2010 19:00:09 -0500
Subject: APM: Parsing HTML
In-Reply-To:
References: <000001cb7a42$77454650$65cfd2f0$@net>
Message-ID:

The standard way to check the structure of some arbitrary data in Perl is with Data::Dumper, Perl's de facto data serialization library. Somewhere in your program 'use Data::Dumper;'... then later 'print Dumper( $ordertext );'.

Hope that helps.

On Tue, Nov 2, 2010 at 4:21 PM, Eric Ellington wrote:
> I do not know the answer to your question, but I have used
> WWW::Mechanize in the past with much success. I know the maintainer
> monitors the Chicago PM boards; he is a nice guy and answers questions.
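
The Data::Dumper suggestion in this thread can be sketched roughly as follows. This is only an illustration, not something posted to the list: the sample string stands in for whatever $ie->text returns (hypothetical data, since the real cart text is not shown), and it assumes the value turns out to be a plain string rather than a reference. ref() is added alongside Dumper as a quick type check.

```perl
use strict;
use warnings;
use Data::Dumper;

# Stand-in for the value returned by $ie->text. The content is made up
# for illustration; the real cart text will look different.
my $ordertext = "Qty 2\nSKU 123-4567 Example item\nQty 1\nSKU 765-4321 Another item\n";

# ref() returns an empty string for a plain scalar and a type name such
# as 'ARRAY' or 'HASH' for a reference.
my $kind = ref($ordertext) || 'plain scalar (string)';
print "Type: $kind\n";

# Useqq makes Dumper print control characters such as \n visibly, which
# helps when looking for separators to split on.
$Data::Dumper::Useqq = 1;
print Dumper($ordertext);

# Assuming it is a string: split on line endings and keep non-blank lines,
# so each line can then be matched with a regex.
my @lines = grep { /\S/ } split /\r?\n/, $ordertext;
print scalar(@lines), " non-empty lines\n";
```

If ref() reports 'ARRAY' or 'HASH' instead, the Dumper output shows the nested structure, and the split step would need to be applied to the individual elements rather than to the value as a whole.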