APM: Parsing HTML
John Warner
jwarner at texas.net
Mon Nov 1 21:00:18 PDT 2010
All,
It's been a while since I did any Perl programming and I could use a
pointer. I work as a lab admin at Dell where one of my job duties is to
order equipment for the various teams I support. The process works like
this: we have the teams configure a system in a shopping cart at Dell.com,
then submit the shopping cart to the lab admins. We, the lab admins, take
the information from the cart (quantity, SKUs, and descriptions), do a whole
bunch of manual manipulation and then paste the processed info into a tool
we use for internal ordering.
The problem:
I have been using Win32::Watir to drive dell.com and navigate to the SKUs
in a shopping cart. So far I have been unable to get the cart data out in a
format I can use. When I implement the HTML::Parser start callback, I get
some of the HTML but not the contents of the frame that holds the SKUs. I
can get the information I am after with the Parser text callback, but it
comes back with no separators between fields, so I can't write a useful
regex to split it up.
use strict;
use warnings;
use Win32::Watir;

my $url = "http://ecomm.dell.com/dellstore/basket_retrieve.aspx?c=us&cs=04&l=en&s=bsd&itemtype=CFG&cart_id=1013663916825&toEmail=john_warner at dell.com";
my $ie = Win32::Watir->new( visible => 1, maximize => 0);
print "Pointing IE to URL\n";
$ie->goto($url);
print "Clicking \"Detail View\" link in basket\n";
$ie->getLink('linktext:', qr/Detail View/)->Click;
print "Showing details of cart\n";
$ie->getLink('linktext:', qr/Show Details/)->Click;
my $ordertext = $ie->text;
#my $ordertext = $ie->html;
print $ordertext;
#do useful processing here.
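
For the processing step, a rough sketch of one approach: if the cart rows turn out to be ordinary table markup, HTML::Parser's start, text, and end handlers can be combined so the text of each cell stays separate instead of running together. The sample HTML, the SKU values, and the assumption that the data sits in plain tr/td elements are all made up for illustration; the real cart page may be structured quite differently.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample markup; the real cart page will differ.
my $html = <<'HTML';
<table>
<tr><td>2</td><td>SKU-0001</td><td>Example item A</td></tr>
<tr><td>1</td><td>SKU-0002</td><td>Example item B</td></tr>
</table>
HTML

my (@rows, @cells);
my $in_cell = 0;    # true while the parser is inside a <td>

my $parser = HTML::Parser->new(
    api_version => 3,
    # A new row resets the cell list; a new cell starts an empty string.
    start_h => [ sub {
        my ($tag) = @_;
        if    ($tag eq 'tr') { @cells = () }
        elsif ($tag eq 'td') { $in_cell = 1; push @cells, '' }
    }, 'tagname' ],
    # Text is appended to the current cell only, so fields stay separate.
    text_h => [ sub {
        my ($text) = @_;
        $cells[-1] .= $text if $in_cell;
    }, 'dtext' ],
    # Closing a row stores a copy of its cells.
    end_h => [ sub {
        my ($tag) = @_;
        if    ($tag eq 'td') { $in_cell = 0 }
        elsif ($tag eq 'tr') { push @rows, [@cells] if @cells }
    }, 'tagname' ],
);

$parser->parse($html);
$parser->eof;

# One tab-separated line per row: quantity, SKU, description.
print join("\t", @$_), "\n" for @rows;
```

If $ie->html returns the page source as a string, it could be passed to parse() in place of the sample; this does not address the missing frame content.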
The crux of my problem (I think) is that I don't know what type of data
(array, hash, etc.) $ie->html or $ie->text returns. Perhaps if I knew
that, I could make headway on the processing.
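
One way to find out is to ask Perl directly with the built-in ref() and Scalar::Util's blessed(). The helper below, describe(), is a hypothetical name used only for this sketch, and the sample values stand in for whatever $ie->text or $ie->html actually hands back.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# describe() is a hypothetical helper: it reports what kind of value it got.
sub describe {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return 'object of class ' . blessed($value) if blessed($value);
    return ref($value) . ' reference' if ref($value);
    return 'plain scalar (string or number)';
}

# Stand-in values; in the real script this would be describe($ie->text).
print describe("some page text"), "\n";
print describe([ 1, 2 ]),         "\n";
print describe({ sku => 1 }),     "\n";
```

If the result is a plain scalar, it can be processed with split or a regex; if it is a reference, it needs dereferencing first.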
Thanks for your time!
John Warner
jwarner at texas.net
H: 512.251.1270
C: 512.426.3813