[kw-pm] Re: paying for well-formed content

Stewart C. Russell scruss at sympatico.ca
Tue Jun 3 10:34:53 CDT 2003


Daniel R. Allen wrote:
> 
> ... the OED, Oxford English Dictionary, which pioneered some of the
> complicated markup work that led to XML.

I suppose you could say that. Oxford hold huge archives in what is essentially SGML without the DTD, and have done for years. XML pioneer Tim Bray worked with Oxford on getting their data into an easily worked format.

> They basically translated 4100
> pages of highly structured dictionary ...

highly lexically, or typographically structured (depending on how old the data is), that is. Any attempt to make a complex dictionary conform to a DTD will fail, as research by Susan Armstrong in Geneva shows. And my bitter experience at a typesetting house in Markham (dictionary data; SQL; just say no) backs that up.

> ... equally well-structured electronic format.

what a fun job transforming lexical databases to output! I've spent about five years of my life doing it. Mostly with Perl, too.

> The third edition
> is rumoured to have the word "Perl" in it, even.

losers. I was there years ago: <http://tinyurl.com/dctt>, and have the mail from Tom C. to prove it (somewhere).

A well-deserved casualty of the dotcom burst was xrefer.com, who paid money to license reference content but supplied it for free to browsers. They ain't doing as well as they wuz, to no-one's great surprise.

 Stewart




