From perl at minty.org  Sun Jan  7 07:29:50 2007
From: perl at minty.org (Murray)
Date: Sun, 7 Jan 2007 15:29:50 +0000
Subject: [Edinburgh-pm] wednesday
Message-ID: <20070107152948.GO16833@minty.org>


On the 10th January, in the year:

1475, Stephen III of Moldavia defeats the Ottoman Empire.
1810, Marriage of Napoleon and Josephine is annulled.
1861, Florida secedes from the US during the American Civil War.
1863, The first section of the London Underground Railway opens.
1920, League of Nations holds its first meeting, ending World War I.
2001, Wikipedia starts.

Seems like an excellent date for the first 2007 tipple of Edinburgh PM.  

Guildford Arms. 7pm?

http://en.wikipedia.org/wiki/January_10
http://www.guildfordarms.com/

From robrwo at gmail.com  Sun Jan  7 07:54:31 2007
From: robrwo at gmail.com (Robert Rothenberg)
Date: Sun, 07 Jan 2007 15:54:31 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <45A117B7.4010406@gmail.com>

On 07/01/07 15:29 Murray wrote:
> On the 10th January, in the year:
> 
> 1475, Stephen III of Moldavia defeats the Ottoman Empire.
> 1810, Marriage of Napoleon and Josephine is annulled.
> 1861, Florida secedes from the US during the American Civil War.
> 1863, The first section of the London Underground Railway opens.
> 1920, League of Nations holds its first meeting, ending World War I.
> 2001, Wikipedia starts.
> 

Some other interesting anniversaries:

49 BC - Julius Caesar crosses the Rubicon, signaling the start of civil war.
1776 - Thomas Paine publishes Common Sense
1870 - John D. Rockefeller incorporates Standard Oil.
1927 - The film Metropolis by Fritz Lang premieres.
1946 - First General Assembly of the United Nations opens in London.
1990 - Time Warner is formed from the merger of Time Inc. and Warner
       Communications Inc.
2000 - America Online announces an agreement to buy Time Warner for
       $162 billion, the largest corporate merger in history.

and birthdays of interesting people

Rasputin (1869)
Donald Knuth (1938)
Rod Stewart (1945)
George Foreman (1949)

> Seems like an excellent date for the first 2007 tipple of Edinburgh PM.  

I'll be out of town on Wednesday.

Rob

From rory at employees.org  Sun Jan  7 23:51:47 2007
From: rory at employees.org (Rory Macdonald)
Date: Mon, 08 Jan 2007 07:51:47 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <1168242707.18790.16.camel@fiji>

On Sun, 2007-01-07 at 15:29 +0000, Murray wrote:
> On the 10th January, in the year:
> 
> 1475, Stephen III of Moldavia defeats the Ottoman Empire.
> 1810, Marriage of Napoleon and Josephine is annulled.
> 1861, Florida secedes from the US during the American Civil War.
> 1863, The first section of the London Underground Railway opens.
> 1920, League of Nations holds its first meeting, ending World War I.
> 2001, Wikipedia starts.
> 
> Seems like an excellent date for the first 2007 tipple of Edinburgh PM.  
> 
> Guildford Arms. 7pm?

Family commitment, so won't be there.

Rory

From perl at aaroncrane.co.uk  Mon Jan  8 02:57:47 2007
From: perl at aaroncrane.co.uk (Aaron Crane)
Date: Mon, 8 Jan 2007 10:57:47 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <20070108105747.GA6492@aaroncrane.co.uk>

Murray writes:
> On the 10th January
> Guildford Arms. 7pm?

I should be there.

-- 
Aaron Crane

From rory at employees.org  Tue Jan 16 14:06:44 2007
From: rory at employees.org (Rory Macdonald)
Date: Tue, 16 Jan 2007 22:06:44 +0000
Subject: [Edinburgh-pm] Summer workshop query
Message-ID: <1168985205.18790.63.camel@fiji>

Hi,

It has been suggested in other circles(1) that it would be a good idea
to have a small series of summer workshops in the run up to YAPC::EU.

The motivation it seems is to provide a series of events which together
could afford to bring a couple of extra overseas perl 'stars' to their
workshops and then to YAPC::EU. The enticement would not just be
financial but would offer a little more variety in terms of places to
visit and audiences to present to.

It's an interesting suggestion, so I thought I'd sound y'all out on this.

I've already queried whether actual/potential sponsors may see this as a
threat to the attraction of the main event. Feedback from appropriate
parties is being sought.

General feedback is welcome, as is any interest in organising such an
event for Edinburgh (presentations by/for ed.pm has been mentioned a
couple of times before).

Rory

1 - PM groups leaders mailing list.

From robrwo at gmail.com  Tue Jan 16 16:03:52 2007
From: robrwo at gmail.com (Robert Rothenberg)
Date: Wed, 17 Jan 2007 00:03:52 +0000
Subject: [Edinburgh-pm] Summer workshop query
In-Reply-To: <1168985205.18790.63.camel@fiji>
References: <1168985205.18790.63.camel@fiji>
Message-ID: <45AD67E8.3070409@gmail.com>


I like the idea.

Note that there seem to be many people at Edinburgh Uni who use Perl for
bio-informatics-related work.  Perhaps we can get their interest?  (I'll
forward your message to some people I know....)

Rob


On 16/01/07 22:06 Rory Macdonald wrote:
> Hi,
> 
> It has been suggested in other circles(1) that it would be a good idea
> to have a small series of summer workshops in the run up to YAPC::EU.
> 
> The motivation it seems is to provide a series of events which together
> could afford to bring a couple of extra overseas perl 'stars' to their
> workshops and then to YAPC::EU. The enticement would not just be
> financial but would offer a little more variety in terms of places to
> visit and audiences to present to.
> 
> It's an interesting suggestion, so I thought I'd sound y'all out on this.
> 
> I've already queried whether actual/potential sponsors may see this as a
> threat to the attraction of the main event. Feedback from appropriate
> parties is being sought.
> 
> General feedback is welcome, as is any interest in organising such an
> event for Edinburgh (presentations by/for ed.pm has been mentioned a
> couple of times before).
> 
> Rory


From nickwoolley at yahoo.co.uk  Wed Jan 17 15:59:29 2007
From: nickwoolley at yahoo.co.uk (Nick Woolley)
Date: Wed, 17 Jan 2007 23:59:29 +0000
Subject: [Edinburgh-pm] Summer workshop query
In-Reply-To: <45AD67E8.3070409@gmail.com>
References: <1168985205.18790.63.camel@fiji> <45AD67E8.3070409@gmail.com>
Message-ID: <45AEB861.8020000@yahoo.co.uk>

As Rob R says, there are certainly other Perl users lurking in Edinburgh 
who might see this as a reason to come along. As for presentations, I'm 
not sure if I have anything very interesting and Perl-related to say at 
the moment, but of course others might.  I could give a hand generally tho.

N


From rory at employees.org  Thu Jan 25 13:39:46 2007
From: rory at employees.org (Rory Macdonald)
Date: Thu, 25 Jan 2007 21:39:46 +0000
Subject: [Edinburgh-pm] Regex performance article
Message-ID: <1169761186.3639.5.camel@fiji>

Hi,

A friend passed this link on to me and thought you might also find it
interesting reading.

http://swtch.com/~rsc/regexp/regexp1.html

Rory

From perl at aaroncrane.co.uk  Sun Jan 28 10:31:53 2007
From: perl at aaroncrane.co.uk (Aaron Crane)
Date: Sun, 28 Jan 2007 18:31:53 +0000
Subject: [Edinburgh-pm] Regex performance article
In-Reply-To: <1169761186.3639.5.camel@fiji>
References: <1169761186.3639.5.camel@fiji>
Message-ID: <20070128183153.GA30752@aaroncrane.co.uk>

Rory Macdonald writes:
> http://swtch.com/~rsc/regexp/regexp1.html

That was very interesting; thanks for posting the link, Rory.

I don't think the approach Russ Cox describes is necessarily easy and/or
useful to fit into Perl, though.

There are two circumstances where the try-and-backtrack NFA execution
algorithm for regexes (that is, the algorithm used by Perl and PCRE and many
other tools) causes O(2^n) (exponential-time) performance.  One is when the
regex uses backreferences.  There are no known backreference algorithms that
avoid O(2^n) execution in all cases, so that's not surprising.  The other is
when the regex uses nested quantifiers -- things like /(a*)+/.
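
To see roughly where the O(2^n) comes from, take a close relative of that
pattern, /(a+)+b/ (a variation chosen for illustration, not the regex
above), and a subject of n "a" characters with no "b".  A naive backtracker
may try every way of cutting the run of "a"s into groups before it gives up.
The hypothetical helper below only counts those cuts; it is written in
Python rather than Perl, it does not run a regex engine, and real engines
may prune some of this work.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ways_to_split(n: int) -> int:
    """Count the ways to cut a run of n 'a's into consecutive non-empty groups.

    Each cut is one way that (a+)+ could consume the whole run.
    """
    if n == 0:
        return 1  # nothing left to cut: exactly one (empty) way
    # Pick the length k of the first group, then cut up the remainder.
    return sum(ways_to_split(n - k) for k in range(1, n + 1))


for n in (5, 10, 20):
    print(n, ways_to_split(n))  # 16, 512, 524288: doubles with each extra 'a'
```

The count is 2^(n-1), so each extra character doubles the worst-case work
for a matcher that explores every cut.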

The thing that these two situations have in common is that they're fairly
rare in practice.  Backreferences are almost never useful, and when they
are, it's usually for something whose execution can't be O(2^n), like /\b
(\w+) \s+ \1 \b/x for finding duplicate words.  And nested quantifiers seem
to be used mainly in pathological test cases for regex engines.  The few
real-world situations they're needed for are things like this:

  / " (?: [^"\\]+ | \\ .)* " /x

which also can't take O(2^n) time.  It is admittedly possible to write such
things in ways that can exhibit O(2^n) performance, but I don't find that
significantly more interesting than the fact that you can write buggy code
that contains an infinite loop.  (For the record, you can also write regexes
that match the same strings, but are even more efficient; I have production
code containing

  / " (?> [^"\\]* ) (?> (?> \\ . [^"\\]* )* ) " /x

which can be executed a lot more quickly by Perl's engine.)
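
As a small sanity check of the regexes above, here is a sketch in Python
(whose re module is also a backtracking engine).  The sample strings are
made up.  The (?> ... ) groups are left out of the second quoted-string
pattern because Python only gained them in 3.11, so this shows the
"unrolled" shape only, not the production regex itself.

```python
import re

# Duplicate-word finder: \1 must repeat whatever group 1 captured.
DUPLICATE_WORD = re.compile(r"\b(\w+)\s+\1\b")

# Quoted string, straightforward form (same shape as the first regex above).
QUOTED_SIMPLE = re.compile(r'"(?:[^"\\]+|\\.)*"')

# "Unrolled" form (shape of the second regex, without the atomic groups).
QUOTED_UNROLLED = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"')

# Made-up sample: two complete quoted strings, then an unterminated one.
sample = r'say "hello" and "a \"nested\" quote" but not "unterminated'

simple_hits = QUOTED_SIMPLE.findall(sample)
unrolled_hits = QUOTED_UNROLLED.findall(sample)

print(simple_hits == unrolled_hits)  # both forms accept the same strings here
print(DUPLICATE_WORD.search("this is is a test").group(1))  # the repeated word
```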

Russ mentions this point, but sees the issue as "a choice between an
implementation with a predictable, consistent, fast running time on all
inputs or one that usually runs quickly but can take years of CPU time (or
more) on some inputs".  That's a fair comment -- it would be nice not to
have to spend mental effort on constructing regexes in ways that avoid
misuse of nested quantifiers -- but it's also a simplification of the
situation.

First, as I'm sure Russ knows, asymptotic time complexity isn't the only
factor in how well a piece of code performs.  For small problem sizes (and
often, real-world problems _are_ small), the lower-order terms can hurt a
lot.  We never describe an algorithm as taking O(k_1 n^2 + k_2 n + k_3) time,
because that's asymptotically equivalent to O(n^2) time complexity.  But for
small values of n, the k_x constants can dominate the actual runtime.
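
A toy calculation makes the point.  The two cost models below are invented
purely for illustration (they are not measurements of any real engine): a
quadratic algorithm with tiny constants against a linear one with large
constants.  The sketch, in Python, finds the input size at which the
quadratic one starts to lose.

```python
def quadratic_cost(n: int) -> int:
    # Hypothetical "asymptotically slow" algorithm with tiny constants.
    return n * n


def linear_cost(n: int) -> int:
    # Hypothetical "asymptotically fast" algorithm with large constants.
    return 50 * n + 200


def crossover() -> int:
    """Smallest n at which the quadratic model becomes the more expensive."""
    n = 1
    while quadratic_cost(n) <= linear_cost(n):
        n += 1
    return n


print(crossover())  # 54: below that size the "slow" algorithm is cheaper
```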

I recently encountered an excellent example of this, related precisely to
this issue of regex performance.  Another part of the production code
mentioned above needs to determine whether any of a set of 339 literal
strings can be found in another string, and we need to answer that question
millions of times per day.  We were building a regex from the literal
strings, separating each one with "|".  Profiling revealed that executing
this regex was a performance bottleneck.  That made sense: at each position
in the string being matched, Perl's regex engine has to try each of the
alternatives in turn.  On the other hand, a DFA execution algorithm, as
described by Russ, can keep track of all of the alternatives in parallel (at
each match position).
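
The two strategies in the paragraph above can be sketched in Python as
follows.  The function names and the three sample literals are
hypothetical, and the literals are assumed to be non-empty.  The first
function builds the "|"-separated regex; the second is a deliberately
simple stand-in for the DFA idea, advancing every partially matched literal
together in a single pass over the text.  It is not a real DFA
construction.

```python
import re


def build_alternation(literals):
    """Join literal strings into one regex, escaping any metacharacters."""
    return re.compile("|".join(re.escape(s) for s in literals))


def contains_any_parallel(literals, text):
    """Single pass over text, tracking all partial matches at once."""
    fresh = {(i, 0) for i in range(len(literals))}  # a match may start anywhere
    active = set()  # (index of literal, number of characters matched so far)
    for ch in text:
        next_active = set()
        for i, done in active | fresh:
            lit = literals[i]
            if lit[done] == ch:
                if done + 1 == len(lit):
                    return True  # some literal has been matched completely
                next_active.add((i, done + 1))
        active = next_active
    return False


literals = ["foo", "bar", "a.b"]  # made-up sample data
print(bool(build_alternation(literals).search("xxa.bxx")))
print(contains_any_parallel(literals, "xxa.bxx"))
```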

With that in mind, I looked round for a suitable off-the-shelf DFA engine to
st-- I mean, to borrow.  I found TRE http://laurikari.net/tre/ (one of the
"efficient implementations" Russ mentions on http://swtch.com/~rsc/regexp/),
and it's already in Debian, so it would be easy for us to use.  Some simple
proof-of-concept XS code later, I benchmarked the same regex on the Perl
engine and on TRE.

Perl's asymptotically-slow engine was about twice as fast as the
asymptotically-fast TRE engine.  So, I gave up on DFAs for this task, and
wrote something even faster instead:

  http://search.cpan.org/~arc/Text-Match-FastAlternatives-0.04/
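
The following is not that module's code.  It is a Python sketch of one
general technique for the same "does any of these literals occur?"
question: a trie walked from each start position.  All names are
hypothetical and the literals are assumed to be non-empty.

```python
END = None  # key marking "a literal ends at this node"


def build_trie(literals):
    """Build a nested-dict trie with one path per literal."""
    root = {}
    for lit in literals:
        node = root
        for ch in lit:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def trie_contains_any(root, text):
    """Return True if any literal in the trie occurs somewhere in text."""
    for start in range(len(text)):
        node = root
        pos = start
        while True:
            if END in node:
                return True  # walked a complete literal
            if pos == len(text) or text[pos] not in node:
                break  # no literal continues this way; try the next start
            node = node[text[pos]]
            pos += 1
    return False


root = build_trie(["foo", "bar"])  # made-up sample data
print(trie_contains_any(root, "xxbar"), trie_contains_any(root, "xxba"))
```

Each start position costs at most one dictionary lookup per character of
the longest literal, however many literals there are.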

It's entirely possible that there are DFA implementations that perform at
least as well as Perl's regex engine, but TRE is at least an existence proof
that picking a better algorithm isn't necessarily sufficient for good
performance.

The second problem with Russ's position is that backtracking NFA algorithms
typically have more features than DFA or DFA-conversion implementations.

  - Capturing parentheses have been considered hard to implement in DFA
    engines, though Russ says that "Thompson-style algorithms can be adapted
    to track submatch boundaries without giving up efficient performance",
    and certainly the Tcl regex implementation (a hybrid DFA/NFA engine, as
    far as I can tell) offers them.

  - There are no known algorithms for backreferences that always avoid
    O(2^n) matching time.

  - NFA implementations are beginning to offer recursive regex invocation.
    I'm not a mathematician, but I'm pretty sure that you can't construct a
    DFA from a regex that recursively invokes parts of itself.

  - Perl lets you embed code in regexes, and that code may have arbitrary
    side-effects.  Further, the side-effects are unpicked when Perl
    backtracks past the code.  Again, I'm no mathematician, but I think this
    makes it very hard to use a DFA implementation for embedded code blocks.

There seems to be a trade-off between expressiveness in regex syntax, versus
fast worst-case execution time.  Backtracking NFA implementations come down
firmly in favour of expressiveness.  Traditional DFA implementations prefer
to minimise asymptotic time complexity.  Hybrid implementations get to pick
and choose.

My own preference is for expressiveness, on balance.  That's in line with my
preference for Perl over more minimal languages; good code in a more
expressive language is typically easier to understand than good code in a
less expressive language.

That said, Russ's article is definitely interesting, and perhaps it will
move people to try out the algorithms he suggests in widely-used systems
like Perl or PCRE.

And this does seem to be a good time to go about doing that.  Perl 5.10 will
have many significant enhancements to its regex engine, including facilities
that make it much easier to build and use pluggable regex engines.  It's
quite easy to imagine Perl continuing to offer its current expressive but
sometimes slow regex engine as the default, but letting you switch to a DFA
engine for individual regexes where that would make sense.

-- 
Aaron Crane

From nickwoolley at yahoo.co.uk  Tue Jan 30 08:16:13 2007
From: nickwoolley at yahoo.co.uk (Nick Woolley)
Date: Tue, 30 Jan 2007 16:16:13 +0000
Subject: [Edinburgh-pm] [OT] wrapping image captions in HTML sensibly
Message-ID: <45BF6F4D.7010400@yahoo.co.uk>

Hello,

I don't suppose anyone knows the CSS required to get HTML like this to 
behave? I'm googling without much luck so far.

  <div class="figure">
   <p><img src="blah.jpg" alt="arbitrarily sized image"></p>
   <p>This is a caption which unwrapped is wider than the image and
   we want it to be wrapped to the width of the image without expanding
   the parent div.</p>
  </div>

So I want it to display (as described within) in a cross platform way, 
i.e. I want figure div elements like this to be able to flush right or 
left and have the text flow around them as normal. I don't know what 
size the images will be in advance, so I can't hard code that.

However, by default the caption text will inflate the div to the width 
of the unwrapped caption, or the max width of the parent block, and that 
just looks really rubbish.

Adding this CSS seems to do the trick in Firefox, but not IE7 (I can't 
currently test IE6).

div.figure {
   display: table; width: 0;
}

Finally, I don't want to change the HTML if avoidable because it's 
automatically generated by something I don't want to muck with.  This 
also means I don't want to insert javascript.

It sounds like a fairly common problem, so perhaps someone here has already
solved it?

Cheers,

Nick


From asmith9983 at gmail.com  Wed Jan 31 06:04:29 2007
From: asmith9983 at gmail.com (Andrew Smith)
Date: Wed, 31 Jan 2007 14:04:29 +0000
Subject: [Edinburgh-pm] united media site:brevity
Message-ID: <45C0A1ED.5040205@gmail.com>

Hi,
I saw this attachment today and laughed.
I'm sure our Aaron will see the funny side too, as he is a good guy.

I don't consider my day to have begun until I've read Dilbert.
--
Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: brevity070131.gif
Type: image/gif
Size: 26339 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/edinburgh-pm/attachments/20070131/fea6b16c/attachment-0001.gif