[kw-pm] xml

Arguile arguile at lucentstudios.com
Sun Feb 22 11:29:13 CST 2004


On Sun, 2004-02-22 at 11:10, Daniel R. Allen wrote:
> ...So you mean that communicating application-to-application over the web
> is made easier by wrapping things up in another layer of encapsulation
> than by parsing the webpage output and submitting standard forms?

Simply put: yes.

I'll address both the above and other common problems with this type of
service.

        Web scraping is horribly prone to breakage. A single change to
        the display structure of the page and your whole service can
        break down.
        
        Traditionally, writing network applications involved writing
        your own protocol.
        
        Binary data exchange formats are normally positional (i.e. by
        byte offset), so adding extra fields or moving things about
        breaks existing implementations. Not only that, but they
        _require_ proper documentation unless you like spending hours
        reverse engineering things.
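
To make the positional problem concrete, here is a minimal sketch in
Python. The record layout, field names and values are all made up for
illustration; they don't come from any real protocol. A reader written
against the old layout keeps running after a field is inserted, but it
quietly returns garbage because the offsets have shifted.

```python
import struct

# Hypothetical layouts, invented for this example.
# v1: 2-byte account id, then a 4-byte balance in cents (little-endian).
V1 = struct.Struct("<HI")
# v2: a 1-byte currency code is inserted between the two fields.
V2 = struct.Struct("<HBI")


def read_balance_v1(packet: bytes) -> int:
    """Reader written against v1: it expects the balance at byte offset 2."""
    _account, balance = V1.unpack_from(packet, 0)
    return balance


old_packet = V1.pack(7, 125000)
new_packet = V2.pack(7, 1, 125000)

print(read_balance_v1(old_packet))  # the balance that was sent
print(read_balance_v1(new_packet))  # a different number: offsets shifted
```

Note that nothing raises an error in the second call, which is exactly
what makes this kind of breakage unpleasant to track down.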

Those are the main points XML (and friends) address.

        Elements can be added anywhere in the document without affecting
        existing applications. Provided the hierarchy itself isn't
        changed, elements can be re-ordered at whim as well. There are
        absolutely no presentation considerations involved (which is
        what HTML should have been).
        
        XML is just plain text, so HTTP can be used (though with true
        RPC you have statelessness to consider).
        
        Position is irrelevant if the hierarchy is preserved. XML is
        easily machine parsable and nominally human parsable, so to some
        degree it's self-documenting. A good DTD/Schema is still
        (strongly) suggested but not required.
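
A rough sketch of the "add and re-order without breaking anything"
point, in Python with the standard xml.etree.ElementTree module. The
<account> document and its element names are hypothetical, invented
only for this example. The reader looks elements up by name, so extra
elements and a different order make no difference to it.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents, invented for this example.
V1_DOC = """<account>
  <id>7</id>
  <balance>1250.00</balance>
</account>"""

# A later version: two new elements added, and the old ones re-ordered.
# The hierarchy (everything directly under <account>) is unchanged.
V2_DOC = """<account>
  <currency>CAD</currency>
  <balance>1250.00</balance>
  <branch>Main</branch>
  <id>7</id>
</account>"""


def read_balance(document: str) -> str:
    """Look the element up by name, not by its position in the document."""
    root = ET.fromstring(document)
    return root.findtext("balance")


print(read_balance(V1_DOC))
print(read_balance(V2_DOC))
```

Moving <balance> inside some new wrapper element would change the
hierarchy, and this reader would then fail to find it.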

> Just to play devil's advocate- even if it would be easier to use, my bank
> hasn't published an XML-RPC interface so I can do things my own way, and I
> bet they never will.  If I want to interact with them, or with most other
> websites, I need to either write the parsing tools myself, or find
> somebody who's already done it (such as with Mail::Webmail::Yahoo or
> WWW::Search).  Who's to say that CPAN authors won't do a better job of it
> than the banks and credit-card companies and car-rental companies and
> governments (who probably don't even care?)

Web scraping is still needed for the reasons you give. But do petition
your bank or other institution to provide a proper XML DTD/Schema, as it
is the better solution.

> Just being devil's advocate here. :-)

Myself as well. I really don't like many of the uses people have
proposed for XML. As with anything, though: give someone a hammer and
every problem becomes a nail.

RPC for example, is not a nail. There is a nail in it, as you have to
exchange the data in some form, but there's much more to it than that.
And HTTP is almost never the answer to the rest of that problem.

Speaking from many, many wasted hours^H^H^H^H^Hweeks reverse engineering
binary data packets without any documentation -- I can only imagine how
the Samba team feels -- XML can be a great tool when it's used properly.
Sure, it's overly verbose and processor intensive, but it saves
programmers tons of time and frustration, which these days is much more
valuable than a few clock cycles and some bandwidth (after gzipping it's
not too bad on that score).
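
On the gzip point, a small Python sketch using the standard gzip
module. The <transaction> records are made up for illustration, and
because they are far more repetitive than real data the ratio printed
here will flatter XML; treat it as a demonstration that markup
compresses well, not as a benchmark.

```python
import gzip

# Hypothetical, deliberately repetitive records invented for this example.
RECORD = (
    "<transaction><id>{0}</id><amount>10.00</amount>"
    "<status>ok</status></transaction>"
)
document = (
    "<transactions>"
    + "".join(RECORD.format(i) for i in range(1000))
    + "</transactions>"
)

raw = document.encode("utf-8")
packed = gzip.compress(raw)

# Byte counts before and after compression.
print(len(raw), len(packed))
```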


> On Sun, 22 Feb 2004, lloyd carr wrote:
> 
> > > Apart from the acronyms, web services are published protocols for
> > > applications to talk to each other via the web.  They let your perl
> > > programs talk directly to google and amazon for example.
> > >
> > > Why use these instead of WWW::Mechanize or other webpage-parsing
> > > modules?... Good question...
> >
> > UGH Daniel! Why use a service in place of HTML scraping?!
> > Why use XML when HTML will do just fine?!? :-(
> > Why use XML-RPC in place of CGI?!
> >
> > As Daniel demonstrated in his excellent talk on modules, of which his beer
> > coasters are an excellent example, it is a great good to hide the
> > complexity and specifics of you implementation as inside a module. Dare I
> > say that is an even greater good, that in addition to hiding the
> > complexity and specifics of your implementation, you make it possible that
> > the client and service need not reside on the same machine or be written
> > in the same language or be running on the same OS!
> >
> > The hype and acronym
> > soup may collapse under it's own weight, as it should, but I can still see
> > many applications in our heterogeneous networked world.
> >
> > The web in web service is perhaps misleading, the WWW of browsing and
> > surfing is only the smallest fraction of where this technology could be used.
> >
> > -Lloyd
> > _______________________________________________
> > kw-pm mailing list
> > kw-pm at mail.pm.org
> > http://mail.pm.org/mailman/listinfo/kw-pm
> >
> 
-- 
Arguile <arguile at lucentstudios.com>



