[tpm] Threadbare

Sun Dec 12 08:13:17 PST 2010

We use OS processes rather than threads. We're running Perl on Windows, and although it can emulate fork(), the price is a significant performance hit for little architectural advantage.

We handle scalability through a task management framework, which wraps different types of OS processes into a model where processes can be started in parallel and the manager waits until all have completed before continuing. Parallel and sequential blocks can be nested. (I used the communicating sequential processes model as a base for this.) Data can be shared between processes: I used the ludicrously simple approach of a SQLite database with exclusive transactions to coordinate, and the file system to share large amounts of data. It's worth stating that this framework differs from what UNIX, Windows, and Perl offer, in particular in trying to make processes that split join again effectively.
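
A rough sketch of that model, in Python rather than Perl, and not the actual framework: the names Task, Sequence, Parallel, run, and demo, the one-row "counter" table, and the polling loop are all assumptions made for illustration. Leaf tasks are OS processes, blocks nest, the manager only continues once a block has fully joined, and workers coordinate through a SQLite database using BEGIN EXCLUSIVE. Error handling (non-zero exit codes, lock timeouts) is left out.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile
import time

# Hypothetical worker script: bumps a shared counter inside an exclusive
# transaction, so concurrent workers cannot interleave their read and write.
WORKER = """
import sqlite3, sys
conn = sqlite3.connect(sys.argv[1], timeout=30, isolation_level=None)
conn.execute("BEGIN EXCLUSIVE")
(n,) = conn.execute("SELECT n FROM counter").fetchone()
conn.execute("UPDATE counter SET n = ?", (n + 1,))
conn.execute("COMMIT")
conn.close()
"""


class Task:
    """Leaf node: one OS process, started on the first poll."""

    def __init__(self, argv):
        self.argv = argv
        self.proc = None

    def poll(self):
        if self.proc is None:
            self.proc = subprocess.Popen(self.argv)
        return self.proc.poll() is not None  # True once the process has exited


class Sequence:
    """Runs children one after another."""

    def __init__(self, *children):
        self.children = list(children)
        self.index = 0

    def poll(self):
        while self.index < len(self.children):
            if not self.children[self.index].poll():
                return False
            self.index += 1
        return True


class Parallel:
    """Starts all children together; done only when every child is done."""

    def __init__(self, *children):
        self.children = list(children)

    def poll(self):
        # Build the full list first so every child is polled (and started).
        return all([child.poll() for child in self.children])


def run(block, interval=0.05):
    """Manager loop: single-threaded, waits until the whole block has joined."""
    while not block.poll():
        time.sleep(interval)


def demo(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS counter (n INTEGER)")
    conn.execute("DELETE FROM counter")
    conn.execute("INSERT INTO counter VALUES (0)")
    conn.commit()
    conn.close()

    def worker():
        return Task([sys.executable, "-c", WORKER, db_path])

    # One worker, then three branches in parallel (one of them a nested
    # sequence of two workers), then a final worker after the join.
    plan = Sequence(
        worker(),
        Parallel(worker(), worker(), Sequence(worker(), worker())),
        worker(),
    )
    run(plan)

    conn = sqlite3.connect(db_path)
    (n,) = conn.execute("SELECT n FROM counter").fetchone()
    conn.close()
    return n


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "coord.db")
    print(demo(path))
```

If every worker commits, the counter ends up equal to the number of workers in the plan (six here). The manager itself never uses threads: it just polls its child processes, which is one simple way to get the "split, then join again" behaviour without relying on fork() or a thread library.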

Some of this was driven by reasons other than threading in Perl. Our system can be very memory-hungry, and Windows Perls were not 64-bit friendly at the time, so we had a limit of around 1.4 GB on memory for the process, including all threads. This was too low, and a second reason for using OS processes was to spread the memory load. A fair number of our processes are also written in Java, so we'd need to cross language boundaries too.

Others may correct me: it seemed to me that Perl threading evolved towards a model that could provide a fork() emulation for Windows. While fork emulation for Windows might be useful, I can live without it. I'd rather have had very light (no memory copying, use at your own risk) OS threads in Perl, which would actually have been more useful in cases where we needed to spread CPU load. And I'm not sure that adding threading with a significant performance hit is a reasonable tradeoff. Threading shouldn't affect the performance of code that doesn't do threading stuff.

Ours is a web application, and we did consider threading for the front-end, handling requests in Catalyst, which would be a very good application for prefork-style parallelism. Unfortunately, on Windows, we were using FastCGI, which will never forward more than a single request to a helper process at a time, so in practice there would be no performance advantage to threading, just the same performance hit.

To conclude, on a Windows Perl at least, the main advantage of threading is that it gives you fork, and that helps CPAN tests pass better. The price is reduced performance across the board, and we'd have had to use OS processes anyway because the 32-bit memory limits were too low. So we built a non-threaded Windows Perl and went with that and a home-brewed task coordination and management component.

--S
--
Stuart Watt
stuart at morungos.com

On 2010-12-11, at 2:29 PM, arocker at vex.net wrote:

> 
> In order not to embarrass myself (or Perl) at the GTALUG meeting on
> Tuesday, I'm researching the agenda, and I need help with this item:
> 
> * How does your language handle scalability issues?
>  * Applications that require many concurrent threads of execution?
>    * How does the language interact with threading?
>    * Does it offer other models for managing concurrent processing?
> 
> I've never had to deal with multiple concurrent processes, and frankly
> take a dim view of applications trying to do the OS's job for it. Has
> anybody any experience in this area that I can quote?
> 
