From perl at minty.org Sun Jan 7 07:29:50 2007
From: perl at minty.org (Murray)
Date: Sun, 7 Jan 2007 15:29:50 +0000
Subject: [Edinburgh-pm] wednesday
Message-ID: <20070107152948.GO16833@minty.org>

On the 10th January, in the year:

1475, Stephen III of Moldavia defeats the Ottoman Empire.
1810, Marriage of Napoleon and Josephine is annulled.
1861, Florida secedes from the US during the American Civil War.
1863, The first section of the London Underground Railway opens.
1920, League of Nations holds its first meeting, ending World War I.
2001, Wikipedia starts.

Seems like an excellent date for the first 2007 tipple of Edinburgh PM.

Guildford Arms. 7pm?

http://en.wikipedia.org/wiki/January_10
http://www.guildfordarms.com/

From robrwo at gmail.com Sun Jan 7 07:54:31 2007
From: robrwo at gmail.com (Robert Rothenberg)
Date: Sun, 07 Jan 2007 15:54:31 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <45A117B7.4010406@gmail.com>

On 07/01/07 15:29 Murray wrote:
> On the 10th January, in the year:
>
> 1475, Stephen III of Moldavia defeats the Ottoman Empire.
> 1810, Marriage of Napoleon and Josephine is annulled.
> 1861, Florida secedes from the US during the American Civil War.
> 1863, The first section of the London Underground Railway opens.
> 1920, League of Nations holds its first meeting, ending World War I.
> 2001, Wikipedia starts.
>

Some other interesting anniversaries:

49 BC - Julius Caesar crosses the Rubicon, signaling the start of civil war.
1776 - Thomas Paine publishes Common Sense.
1870 - John D. Rockefeller incorporates Standard Oil.
1927 - The film Metropolis by Fritz Lang premieres.
1946 - First General Assembly of the United Nations opens in London.
1990 - Time Warner is formed from the merger of Time Inc. and Warner
       Communications Inc.
2000 - America Online announces an agreement to buy Time Warner for $162
       billion, the largest corporate merger in history.
and birthdays of interesting people:

Rasputin (1869)
Donald Knuth (1938)
Rod Stewart (1945)
George Foreman (1949)

> Seems like an excellent date for the first 2007 tipple of Edinburgh PM.

I'll be out of town on Wednesday.

Rob

From rory at employees.org Sun Jan 7 23:51:47 2007
From: rory at employees.org (Rory Macdonald)
Date: Mon, 08 Jan 2007 07:51:47 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <1168242707.18790.16.camel@fiji>

On Sun, 2007-01-07 at 15:29 +0000, Murray wrote:
> On the 10th January, in the year:
>
> 1475, Stephen III of Moldavia defeats the Ottoman Empire.
> 1810, Marriage of Napoleon and Josephine is annulled.
> 1861, Florida secedes from the US during the American Civil War.
> 1863, The first section of the London Underground Railway opens.
> 1920, League of Nations holds its first meeting, ending World War I.
> 2001, Wikipedia starts.
>
> Seems like an excellent date for the first 2007 tipple of Edinburgh PM.
>
> Guildford Arms. 7pm?

Family commitment, so won't be there.

Rory

From perl at aaroncrane.co.uk Mon Jan 8 02:57:47 2007
From: perl at aaroncrane.co.uk (Aaron Crane)
Date: Mon, 8 Jan 2007 10:57:47 +0000
Subject: [Edinburgh-pm] wednesday
In-Reply-To: <20070107152948.GO16833@minty.org>
References: <20070107152948.GO16833@minty.org>
Message-ID: <20070108105747.GA6492@aaroncrane.co.uk>

Murray writes:
> On the 10th January
> Guildford Arms. 7pm?

I should be there.

-- 
Aaron Crane

From rory at employees.org Tue Jan 16 14:06:44 2007
From: rory at employees.org (Rory Macdonald)
Date: Tue, 16 Jan 2007 22:06:44 +0000
Subject: [Edinburgh-pm] Summer workshop query
Message-ID: <1168985205.18790.63.camel@fiji>

Hi,

It has been suggested in other circles(1) that it would be a good idea
to have a small series of summer workshops in the run-up to YAPC::EU.

The motivation, it seems, is to provide a series of events which together
could afford to bring a couple of extra overseas Perl 'stars' to their
workshops and then to YAPC::EU. The enticement would not just be
financial but would offer a little more variety in terms of places to
visit and audiences to present to.

It's an interesting suggestion, so I thought I'd sound y'all out on this.

I've already queried whether actual/potential sponsors may see this as a
threat to the attraction of the main event. Feedback from appropriate
parties is being sought.

General feedback is welcome, as is any interest in organising such an
event for Edinburgh (presentations by/for ed.pm have been mentioned a
couple of times before).

Rory

1 - PM groups leaders mailing list.

From robrwo at gmail.com Tue Jan 16 16:03:52 2007
From: robrwo at gmail.com (Robert Rothenberg)
Date: Wed, 17 Jan 2007 00:03:52 +0000
Subject: [Edinburgh-pm] Summer workshop query
In-Reply-To: <1168985205.18790.63.camel@fiji>
References: <1168985205.18790.63.camel@fiji>
Message-ID: <45AD67E8.3070409@gmail.com>

I like the idea.

Note that there seem to be many people at Edinburgh Uni who use Perl for
bio-informatics-related work. Perhaps we can get their interest? (I'll
forward your message to some people I know....)

Rob

On 16/01/07 22:06 Rory Macdonald wrote:
> Hi,
>
> It has been suggested in other circles(1) that it would be a good idea
> to have a small series of summer workshops in the run-up to YAPC::EU.
>
> The motivation, it seems, is to provide a series of events which together
> could afford to bring a couple of extra overseas Perl 'stars' to their
> workshops and then to YAPC::EU. The enticement would not just be
> financial but would offer a little more variety in terms of places to
> visit and audiences to present to.
>
> It's an interesting suggestion, so I thought I'd sound y'all out on this.
>
> I've already queried whether actual/potential sponsors may see this as a
> threat to the attraction of the main event. Feedback from appropriate
> parties is being sought.
>
> General feedback is welcome, as is any interest in organising such an
> event for Edinburgh (presentations by/for ed.pm have been mentioned a
> couple of times before).
>
> Rory

From nickwoolley at yahoo.co.uk Wed Jan 17 15:59:29 2007
From: nickwoolley at yahoo.co.uk (Nick Woolley)
Date: Wed, 17 Jan 2007 23:59:29 +0000
Subject: [Edinburgh-pm] Summer workshop query
In-Reply-To: <45AD67E8.3070409@gmail.com>
References: <1168985205.18790.63.camel@fiji> <45AD67E8.3070409@gmail.com>
Message-ID: <45AEB861.8020000@yahoo.co.uk>

As Rob R says, there are certainly other Perl users lurking in Edinburgh
who might see this as a reason to come along.

As for presentations, I'm not sure if I have anything very interesting
and Perl-related to say at the moment, but of course others might. I
could give a hand generally, though.

N

From rory at employees.org Thu Jan 25 13:39:46 2007
From: rory at employees.org (Rory Macdonald)
Date: Thu, 25 Jan 2007 21:39:46 +0000
Subject: [Edinburgh-pm] Regex performance article
Message-ID: <1169761186.3639.5.camel@fiji>

Hi,

A friend passed this link on to me, and I thought you might also find it
interesting reading.

http://swtch.com/~rsc/regexp/regexp1.html

Rory

From perl at aaroncrane.co.uk Sun Jan 28 10:31:53 2007
From: perl at aaroncrane.co.uk (Aaron Crane)
Date: Sun, 28 Jan 2007 18:31:53 +0000
Subject: [Edinburgh-pm] Regex performance article
In-Reply-To: <1169761186.3639.5.camel@fiji>
References: <1169761186.3639.5.camel@fiji>
Message-ID: <20070128183153.GA30752@aaroncrane.co.uk>

Rory Macdonald writes:
> http://swtch.com/~rsc/regexp/regexp1.html

That was very interesting; thanks for posting the link, Rory.

I don't think the approach Russ Cox describes is necessarily easy and/or
useful to fit into Perl, though.

There are two circumstances where the try-and-backtrack NFA execution
algorithm for regexes (that is, the algorithm used by Perl and PCRE and
many other tools) causes O(2^n) (exponential-time) performance. One is
when the regex uses backreferences. There are no known backreference
algorithms that avoid O(2^n) execution in all cases, so that's not
surprising. The other is when the regex uses nested quantifiers --
things like /(a*)+/.

The thing that these two situations have in common is that they're
fairly rare in practice. Backreferences are almost never useful, and
when they are, it's usually for something whose execution can't be
O(2^n), like /\b (\w+) \s+ \1 \b/x for finding duplicate words. And
nested quantifiers seem to be used mainly in pathological test cases for
regex engines. The few real-world situations they're needed for are
things like this:

    / " (?: [^"\\]+ | \\ .)* " /x

which also can't take O(2^n) time. It is admittedly possible to write
such things in ways that can exhibit O(2^n) performance, but I don't
find that significantly more interesting than the fact that you can
write buggy code that contains an infinite loop.

(For the record, you can also write regexes that match the same strings,
but are even more efficient; I have production code containing

    / " (?> [^"\\]* ) (?> (?> \\ .
          [^"\\]* )* ) " /x

which can be executed a lot more quickly by Perl's engine.)

Russ mentions this point, but sees the issue as "a choice between an
implementation with a predictable, consistent, fast running time on all
inputs or one that usually runs quickly but can take years of CPU time
(or more) on some inputs". That's a fair comment -- it would be nice not
to have to spend mental effort on constructing regexes in ways that
avoid misuse of nested quantifiers -- but it's also a simplification of
the situation.

First, as I'm sure Russ knows, asymptotic time complexity isn't the only
factor in how well a piece of code performs. For small problem sizes
(and often, real-world problems _are_ small), the lower-order terms can
hurt a lot. We never describe an algorithm as taking
O(k_1 n² + k_2 n + k_3) time, because that's asymptotically equivalent
to O(n²) time complexity. But for small values of n, the k_x constants
can dominate the actual runtime.

I recently encountered an excellent example of this, related precisely
to this issue of regex performance. Another part of the production code
mentioned above needs to determine whether any of a set of 339 literal
strings can be found in another string, and we need to answer that
question millions of times per day. We were building a regex from the
literal strings, separating each one with "|". Profiling revealed that
executing this regex was a performance bottleneck. That made sense: at
each position in the string being matched, Perl's regex engine has to
try each of the alternatives in turn. On the other hand, a DFA execution
algorithm, as described by Russ, can keep track of all of the
alternatives in parallel (at each match position).

With that in mind, I looked round for a suitable off-the-shelf DFA
engine to st-- I mean, to borrow. I found TRE

    http://laurikari.net/tre/

(one of the "efficient implementations" Russ mentions on
http://swtch.com/~rsc/regexp/), and it's already in Debian, so it would
be easy for us to use.
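
The two strategies contrasted above can be sketched side by side. This is only an illustration under stated assumptions: it uses Python's `re` module (also a backtracking engine) in place of Perl, the three keywords are hypothetical stand-ins for the 339 production strings, and the trie walk is a generic way of following all alternatives at once from each start position. It is not a description of how TRE works, nor of any particular CPAN module.

```python
import re


def build_alternation(literals):
    """Compile one regex that matches any literal, joined with "|"."""
    # re.escape keeps metacharacters such as "." and "|" literal.
    return re.compile("|".join(re.escape(s) for s in literals))


def build_trie(literals):
    """Build a nested-dict trie; the key "" marks the end of a literal."""
    root = {}
    for word in literals:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True  # "" can never collide with a one-character key
    return root


def trie_contains_any(trie, text):
    """Return True if any literal stored in the trie occurs in text."""
    for start in range(len(text)):
        node = trie
        pos = start
        # One walk from `start` follows every literal sharing this prefix,
        # instead of retrying each alternative separately.
        while True:
            if "" in node:
                return True
            if pos == len(text) or text[pos] not in node:
                break
            node = node[text[pos]]
            pos += 1
    return False


# Hypothetical keywords, chosen to include regex metacharacters.
KEYWORDS = ["foo.bar", "baz", "a|b"]
PATTERN = build_alternation(KEYWORDS)
TRIE = build_trie(KEYWORDS)
```

Both `PATTERN.search(text) is not None` and `trie_contains_any(TRIE, text)` answer the same yes/no question. Which one is faster in practice depends on the constants discussed above, so it needs measuring rather than assuming.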

Some simple proof-of-concept XS code later, I benchmarked the same regex
on the Perl engine and on TRE. Perl's asymptotically-slow engine was
about twice as fast as the asymptotically-fast TRE engine.

So, I gave up on DFAs for this task, and wrote something even faster
instead:

    http://search.cpan.org/~arc/Text-Match-FastAlternatives-0.04/

It's entirely possible that there are DFA implementations that perform
at least as well as Perl's regex engine, but TRE is at least an
existence proof that picking a better algorithm isn't necessarily
sufficient for good performance.

The second problem with Russ's position is that backtracking NFA
algorithms typically have more features than DFA or DFA-conversion
implementations.

  - Capturing parentheses have been considered hard to implement in DFA
    engines, though Russ says that "Thompson-style algorithms can be
    adapted to track submatch boundaries without giving up efficient
    performance", and certainly the Tcl regex implementation (a hybrid
    DFA/NFA engine, as far as I can tell) offers them.

  - There are no known algorithms for backreferences that always avoid
    O(2^n) matching time.

  - NFA implementations are beginning to offer recursive regex
    invocation. I'm not a mathematician, but I'm pretty sure that you
    can't construct a DFA from a regex that recursively invokes parts of
    itself.

  - Perl lets you embed code in regexes, and that code may have
    arbitrary side-effects. Further, the side-effects are unpicked when
    Perl backtracks past the code. Again, I'm no mathematician, but I
    think this makes it very hard to use a DFA implementation for
    embedded code blocks.

There seems to be a trade-off between expressiveness in regex syntax,
versus fast worst-case execution time. Backtracking NFA implementations
come down firmly in favour of expressiveness. Traditional DFA
implementations prefer to minimise asymptotic time complexity. Hybrid
implementations get to pick and choose.

My own preference is for expressiveness, on balance.
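
As a small illustration of the expressiveness side of that trade-off, here is the duplicate-word pattern from earlier in this message, which depends on a backreference. It is a minimal sketch that assumes Python's `re` module as a stand-in for Perl's engine; the helper name `find_duplicate_words` is made up for the example.

```python
import re

# \1 must match exactly the text captured by group 1, so the pattern
# finds a word followed by whitespace and the very same word again.
DUPLICATE_WORD = re.compile(r"\b(\w+)\s+\1\b")


def find_duplicate_words(text):
    """Return each word that is immediately repeated in text."""
    return [m.group(1) for m in DUPLICATE_WORD.finditer(text)]
```

For instance, `find_duplicate_words("Paris in the the spring")` picks out "the", while "the theme" is left alone because the trailing `\b` rejects a match that stops inside a longer word. The comparison is case-sensitive, so "The the" would not be reported.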
That's in line with my preference for Perl over more minimal languages;
good code in a more expressive language is typically easier to
understand than good code in a less expressive language.

That said, Russ's article is definitely interesting, and perhaps it will
move people to try out the algorithms he suggests in widely-used systems
like Perl or PCRE.

And this does seem to be a good time to go about doing that. Perl 5.10
will have many significant enhancements to its regex engine, including
facilities that make it much easier to build and use pluggable regex
engines. It's quite easy to imagine Perl continuing to offer its current
expressive but sometimes slow regex engine as the default, but letting
you switch to a DFA engine for individual regexes where that would make
sense.

-- 
Aaron Crane

From nickwoolley at yahoo.co.uk Tue Jan 30 08:16:13 2007
From: nickwoolley at yahoo.co.uk (Nick Woolley)
Date: Tue, 30 Jan 2007 16:16:13 +0000
Subject: [Edinburgh-pm] [OT] wrapping image captions in HTML sensibly
Message-ID: <45BF6F4D.7010400@yahoo.co.uk>

Hello,

I don't suppose anyone knows the CSS required to get HTML like this to
behave? I'm googling without much luck so far.

arbitrarily sized image

This is a caption which unwrapped is wider than the image and we want it to be wrapped to the width of the image without expanding the parent div.

So I want it to display (as described within) in a cross-platform way,
i.e. I want figure div elements like this to be able to flush right or
left and have the text flow around them as normal. I don't know what
size the images will be in advance, so I can't hard-code that.

However, by default the caption text will inflate the div to the width
of the unwrapped caption, or the max width of the parent block, and that
just looks really rubbish.

Adding this CSS seems to do the trick in Firefox, but not IE7 (I can't
currently test IE6).

    div.figure {
        display: table;
        width: 0;
    }

Finally, I don't want to change the HTML if avoidable because it's
automatically generated by something I don't want to muck with. This
also means I don't want to insert JavaScript.

It sounds like a fairly common problem, so perhaps someone here has
already solved it?

Cheers,
Nick

From asmith9983 at gmail.com Wed Jan 31 06:04:29 2007
From: asmith9983 at gmail.com (Andrew Smith)
Date: Wed, 31 Jan 2007 14:04:29 +0000
Subject: [Edinburgh-pm] united media site:brevity
Message-ID: <45C0A1ED.5040205@gmail.com>

Hi

I saw this attachment today and laughed. I'm sure our Aaron will see the
funny side too, as he is a good guy.

I don't consider my day to have begun until I've read Dilbert.

-- 
Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: brevity070131.gif
Type: image/gif
Size: 26339 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/edinburgh-pm/attachments/20070131/fea6b16c/attachment-0001.gif