APM: Parsing HTML

Montgomery Conner montgomery.conner at gmail.com
Tue Nov 2 17:00:09 PDT 2010


The standard way to check the structure of some arbitrary data in Perl is
with Data::Dumper, Perl's de facto data serialization library.

Somewhere in your program, 'use Data::Dumper;'... then later,
'print Dumper($ordertext);'.
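
As a minimal sketch of what that looks like: the sample string below is made
up and only stands in for whatever $ie->text returns. I am assuming that is a
plain string, which ref() lets you verify, since it returns an empty string
for a non-reference and 'ARRAY', 'HASH', etc. for a reference.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the value returned by $ie->text;
# the contents here are invented purely for illustration.
my $ordertext = "Qty 2\nSKU 123-4567\nSome description";

# ref() is empty for a plain scalar and names the type for a reference.
my $kind = ref($ordertext) || 'plain scalar';
print "ordertext is a: $kind\n";

# Useqq makes Dumper show control characters such as newlines as escapes,
# which helps when hunting for separators to split on.
local $Data::Dumper::Useqq = 1;
print Dumper($ordertext);

# For comparison, the same data as an array reference, one line per element.
my $lines = [ split /\n/, $ordertext ];
print Dumper($lines);
```

If the dump shows newlines or tabs between the fields, those are probably the
separators to split on before reaching for a regex.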


Hope that helps.


On Tue, Nov 2, 2010 at 4:21 PM, Eric Ellington <e.ellington at gmail.com> wrote:

> I do not know the answer to your question, but I have used
> WWW::Mechanize in the past with much success. I know the maintainer
> monitors the Chicago PM boards; he is a nice guy and answers questions.
>
>
> Eric
>
>
>
> On Mon, Nov 1, 2010 at 11:00 PM, John Warner <jwarner at texas.net> wrote:
> > All,
> >
> >
> >
> > It’s been a while since I did any Perl programming and I could use a
> > pointer.  I work as a lab admin at Dell where one of my job duties is to
> > order equipment for the various teams I support.  The process works like
> > this:  we have the teams configure a system in a shopping cart at
> > Dell.com then submit the shopping cart to the lab admins.  We, the lab
> > admins, take the information from the cart (quantity, SKUs, and
> > descriptions), do a whole bunch of manual manipulation and then paste the
> > processed info into a tool we use for internal ordering.
> >
> >
> >
> > The problem:
> >
> > I have been using Win32::Watir to interact with dell.com to navigate to
> > the SKUs in a shopping cart.  I have thus far been unable to get data
> > from the shopping cart in a format I can use.  When I implement the
> > HTML::Parser start handler, I get some of the HTML but not the stuff in
> > the frame with the SKUs.  I can get the information I am after with the
> > Parser text handler, but without any kind of separation that I could
> > write a useful regex against.
> >
> >
> >
> > use Win32::Watir;
> >
> > # Single quotes, so that @dell in the URL is not interpolated as an array
> > my $url =
> >   'http://ecomm.dell.com/dellstore/basket_retrieve.aspx?c=us&cs=04&l=en&s=bsd&itemtype=CFG&cart_id=1013663916825&toEmail=john_warner@dell.com';
> >
> >
> >
> > my $ie = Win32::Watir->new( visible => 1, maximize => 0 );
> >
> > print "Pointing IE to URL\n";
> > $ie->goto($url);
> >
> > print "Clicking \"Detail View\" link in basket\n";
> > $ie->getLink('linktext:', qr/Detail View/)->Click;
> >
> > print "Showing details of cart\n";
> > $ie->getLink('linktext:', qr/Show Details/)->Click;
> >
> > my $ordertext = $ie->text;
> > #my $ordertext = $ie->html;
> > print $ordertext;
> >
> > # do useful processing here…
> >
> >
> >
> > The crux of my problem (I think) is that I don’t know what type of data
> > (array, hash, etc.) $ie->html or $ie->text returns.  Perhaps if I knew
> > that, I could make headway on processing…
> >
> >
> >
> > Thanks for your time!
> >
> >
> >
> > John Warner
> >
> > jwarner at texas.net
> >
> > H:  512.251.1270
> >
> > C:  512.426.3813
> >
> >
> >
> > _______________________________________________
> > Austin mailing list
> > Austin at pm.org
> > http://mail.pm.org/mailman/listinfo/austin
> >
>
>
>
> --
> Eric Ellington
> e.ellington at gmail.com