[Phoenix-pm] Perl threads?

Thu Jan 26 11:49:04 PST 2006

Hi billn,

Yes. Strong ones. Having to mark everything :shared as it is declared is a
major pain, because when you build data structures to be passed from thread
to thread, you don't know where the data is going to originate. Trying to use
threads, I spend obscene amounts of time tracking down the values that didn't
make it through the queue and figuring out where they're declared... or you can
mark everything :shared. You can't just create anonymous arrays and hashes
with [ ] and { } because they aren't shared. You have to go share them.
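
A minimal sketch of that sharing problem, using only the core threads,
threads::shared and Thread::Queue modules on a thread-enabled perl. The
hash name, keys and values are made up for illustration, and shared_clone()
is a later addition to threads::shared that may not exist in older releases.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my %job :shared;    # hypothetical structure handed between threads

# A plain [ ] is not shared, so storing it in a shared hash dies
# ("Invalid value for shared scalar").
my $ok = eval { $job{args} = [ 1, 2, 3 ]; 1 };
print "plain [ ] rejected: $@" unless $ok;

# Share the empty container first, then fill it.
$job{args} = &share( [] );
push @{ $job{args} }, 1, 2, 3;

# Or deep-copy a whole nested structure into shared memory in one call.
my $nested = shared_clone( { name => 'resize', sizes => [ 64, 128 ] } );

my $queue  = Thread::Queue->new;
my $worker = threads->create( sub {
    my $item = $queue->dequeue;               # blocks until an item arrives
    return join ',', @{ $item->{sizes} };
} );

$queue->enqueue($nested);
my $seen = $worker->join;
print "worker saw sizes: $seen\n";
```

The &share( [] ) form bypasses share()'s prototype so that it accepts an
anonymous reference. Everything nested inside has to be shared the same
way, which is where the bookkeeping described above comes from.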

"Fast" is relative. Creating a thread actually allocates a large amount
of thread-local-storage -- enough to accommodate every variable that
might at some point be allocated. Building Perl with threaded support
slows down the interpreter all the time, even when threads aren't in use,
as many data structures must be locked, and many internal values must
be allocated and addressed per-thread, which introduces more overhead
in terms of CPU and RAM.

Coro lets you easily bounce control back and forth between two contexts,
passing data both ways if you like, using the argument/return construct
rather than awkward queues and locks. 

Coro only allocates a little bit of C stack (16k, I think the default is),
a few AVs to use to store internal data structures, and whatever lexicals
actually get created in a new context.

Since Coro only changes context at well specified times, it isn't necessary
to constantly lock Perl data structures.
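
A minimal sketch of that, assuming the Coro module from CPAN is installed.
The worker() helper and the names 'a' and 'b' are hypothetical. Two contexts
append to an ordinary array with no :shared attribute and no lock(), and
control only moves at the explicit cede calls (cede is Coro's call for
giving up control voluntarily).

```perl
use strict;
use warnings;
use Coro;    # CPAN module, not part of core Perl

my @log;     # an ordinary lexical: no :shared, no lock()

# Hypothetical helper: start a context that records three steps.
sub worker {
    my ($name) = @_;
    return async {
        for my $step ( 1 .. 3 ) {
            push @log, "$name$step";    # never interrupted mid-update
            cede;                       # the only place control can move
        }
    };
}

my @coros = ( worker('a'), worker('b') );
$_->join for @coros;    # main context waits while the workers run

print "@log\n";
```

Because a switch can only happen at cede (or at a blocking call made through
one of the Coro-aware modules), the push never races with the other context.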

It does take less time to change contexts using threads than to change
Coro contexts, but Coro contexts are changed only when needed (when IO would
otherwise block, when explicitly requested with yield, etc.), rather than at
100 Hz or whatever the threading implementation likes, so Coro is likely to
wind up being faster anyway even in this regard.

On the other hand, only Event, Coro::Handle, and Coro::Socket and their
subclasses automatically switch contexts rather than block, so,
depending on the application, more work will be required to make Coro
actually work correctly -- you will often have to extract the filehandle
from whatever object (such as Tk) and then do an Event wait on it.

I'd boil it down to this: use threads if you have a CPU bound task that
needs to be closely coupled with the user interface (GUI, daemon, 
network socket, whatever). If it doesn't need to be closely coupled,
then you'd already be forking, of course. If the other tasks aren't
CPU bound, use Coro instead. It sucks far less.

But, on the other hand, using Coro merely as a replacement for threads
would be selling it short. You should use Coro to structure the logic in
your program and willy nilly give context to any subroutine or codeblock
that can be written more concisely, more elegantly, or more intuitively
using it. Then "multiprogramming" will seem more a simple matter of
structuring program logic than anything so horrid and heinous as "threading".

Buy my book. Damnit. 

-scott

P.S.: check out forks.pm -- it's an implementation of the thread API,
so Thread::Queue etc all work, but built on top of fork and IPC.
If you need threads because of CPU bound tasks but there isn't
too much variable sharing and locking, I'd go this route.
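
A minimal sketch of that route, assuming the forks module from CPAN is
installed. sum_range() and the number ranges are made up for illustration.
The code is written against the ordinary threads API, and only the "use"
line differs from a threaded version.

```perl
use strict;
use warnings;
use forks;    # CPAN module; provides the threads API on top of fork

# Hypothetical CPU-bound task: sum a range of integers.
sub sum_range {
    my ( $lo, $hi ) = @_;
    my $sum = 0;
    $sum += $_ for $lo .. $hi;
    return $sum;
}

# Each "thread" here is really a child process.
my @workers;
push @workers, threads->create( \&sum_range, 1,    5000 );
push @workers, threads->create( \&sum_range, 5001, 10000 );

my $total = 0;
for my $thr (@workers) {
    my ($part) = $thr->join;    # result comes back from the child
    $total += $part;
}
print "total: $total\n";
```

Each worker has its own copy of the program's data, so this fits best when
the tasks share little and mostly hand back a result at the end.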

P.P.S.:

Buy my damned book, damnit.

Bill Nash <billn at odyssey.billn.net> wrote:
> 
> I hear tell that it's fairly fast and stable in modern versions. Any 
> opinions?
> 
> - billn