SPUG: X-Y scatter-plotting with annotated datapoints

Jeremy Kahn kahn at cpan.org
Thu Aug 15 16:43:31 CDT 2002


SPUGsters --

So I've got this complex data set: phonetic symbols, each with a couple of
timing and pitch values (in milliseconds and Hertz, respectively). (This
is data emitted by a text-processing engine that comes up with a plausible
prosody for text-to-speech synthesis.) Sample (and entirely bogus) data is
at the bottom of this note.

What I'd like is an X-Y plot with time (in ms) on the X axis and pitch (in
Hz) on the Y axis, where each "target" is a point on that plot.  Ideally,
the dots should be connected along the timeline (in their X ordering).

More important than the connectedness, though, is that the targets be
associated with the phonetic symbols they are subordinate to.  These data
are pretty much worthless as a series of points if I can't synchronize
those points to the phonetic symbols.  I can think of two nice ways to
demonstrate the synchronization:

  * attach the phone symbol to each point
  * plot the phone symbols as regions along the X axis

But I've gotten stuck looking for a tool that can build me this graph.  I
experimented with Excel's graphing abilities -- no dice, as far as I could
tell.  Either I could get the X-Y plot with no way to attach the phone
symbols, or I could get the phone symbols, but evenly spaced along the X
(time!) axis despite their widely varying timings.

I once solved a problem like this using SAS, but I don't have it now and I
don't have the budget.

So, I thought, I'll turn to Perl, since everything is easier in Perl.
(It is, isn't it?) I found Martien Verbruggen's GD::Graph, which is very
spiffy, but doesn't have a way to do this all neatly packaged. Before I
plunge in to write an extension to GD::Graph (or -- please, no -- write it
directly in GD!), does anybody have any ideas or advice?  Has anybody
solved a problem like this before using Perl?

--Jeremy


Data follows.
Note that the format below is not what I'm using; I've got it in XML, so
it's completely parsed.  I am *not* looking for help parsing the data; this
is just to give some sense of the kind of data I have.

Input text: "she had"

__DATA__

ms    Hz    phonetic-symbol
---------------------------
0     120   S
20    118   (also S)
30    110   i
70    125   (also i)
90    120   h
120   110   (also h)
170   108   A
200   105   (also A)
210   100   d
245    95   (also d)
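
To make the two synchronization ideas above concrete, here is a minimal
sketch of the data each one would need, using the bogus sample table.  It
is written in Python purely for illustration and does no drawing at all.
The names TARGETS, point_labels and phone_regions are hypothetical, and
the region boundaries are an assumption: each phone's region simply runs
from its first target to its last, since the sample carries no separate
phone boundaries.

```python
from itertools import groupby

# Sample targets from the table above: (time in ms, pitch in Hz, phone).
# Rows marked "(also S)" etc. are listed under the phone they belong to.
TARGETS = [
    (0, 120, "S"), (20, 118, "S"),
    (30, 110, "i"), (70, 125, "i"),
    (90, 120, "h"), (120, 110, "h"),
    (170, 108, "A"), (200, 105, "A"),
    (210, 100, "d"), (245, 95, "d"),
]


def point_labels(targets):
    """Option 1: one (x, y, label) triple per target, sorted by time."""
    return sorted(targets)


def phone_regions(targets):
    """Option 2: one (phone, start_ms, end_ms) span per consecutive run.

    Assumption: a region spans the first to the last target of a phone.
    """
    regions = []
    for phone, run in groupby(sorted(targets), key=lambda t: t[2]):
        times = [ms for ms, _hz, _phone in run]
        regions.append((phone, times[0], times[-1]))
    return regions


if __name__ == "__main__":
    for phone, start, end in phone_regions(TARGETS):
        print(f"{phone}: {start}-{end} ms")
```

Either list could then be handed to whatever plotting tool ends up doing
the drawing: the first as text annotations at each (x, y), the second as
labelled bands along the X axis.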



 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
     POST TO: spug-list at pm.org       PROBLEMS: owner-spug-list at pm.org
      Subscriptions; Email to majordomo at pm.org:  ACTION  LIST  EMAIL
  Replace ACTION by subscribe or unsubscribe, EMAIL by your Email-address
 For daily traffic, use spug-list for LIST ;  for weekly, spug-list-digest
     Seattle Perl Users Group (SPUG) Home Page: http://seattleperl.org



