From djgoku at gmail.com Thu Nov 1 20:50:49 2012
From: djgoku at gmail.com (djgoku at gmail.com)
Date: Thu, 1 Nov 2012 22:50:49 -0500
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
Message-ID: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>

I haven't started writing code yet, but wanted input on where to start.

What I want to do is read in an input file, splice the input into
configurable amount of chunks. Create a number of workers also configurable
that go off and do work returning results or maybe even saving results to a
database. After finishing wait for more work until all work is done and the
last worker is finished and exit.

Input (variable number of inputs, but each line/row will be in a queue
that the worker will get work from):
4,2,1,5,6
1,2,5,6,8,3
9,9,1,3,5,4
7,3,5,2,8

Output is sorted rows in ascending order for each line:
1,2,4,5,6
1,2,3,5,6,8
1,3,4,5,9,9
2,3,5,7,8

Jonathan Otsuka

From peter at peknet.com Thu Nov 1 21:13:47 2012
From: peter at peknet.com (Peter Karman)
Date: Thu, 01 Nov 2012 23:13:47 -0500
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>
Message-ID: <5093487B.2010002@peknet.com>

djgoku at gmail.com wrote on 11/1/12 10:50 PM:
> I haven't started writing code yet, but wanted input on where to start.
>
> What I want to do is read in an input file, splice the input into
> configurable amount of chunks. Create a number of workers also configurable
> that go off and do work returning results or maybe even saving results to a
> database. After finishing wait for more work until all work is done and the
> last worker is finished and exit.
>

lots of ways to approach this, depending on how you want to define "worker."

Gearman (and its ilk) is one way.

For less-heavy infrastructure, I like Parallel::Iterator.

--
Peter Karman . http://peknet.com/ .
peter at peknet.com

From garrett.goebel at gmail.com Thu Nov 1 22:37:23 2012
From: garrett.goebel at gmail.com (Garrett Goebel)
Date: Fri, 2 Nov 2012 01:37:23 -0400
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>
Message-ID:

Questions...

What is the expected range regarding the number of numbers to be sorted on
each line?

How many processors are available?

What would serve as a primary key for database storage?

If the number of numbers is small... you may want to look at using a
quicksort. Otherwise mergesort.

On Thu, Nov 1, 2012 at 11:50 PM, wrote:
> I haven't started writing code yet, but wanted input on where to start.
>
> What I want to do is read in an input file, splice the input into
> configurable amount of chunks. Create a number of workers also configurable
> that go off and do work returning results or maybe even saving results to a
> database. After finishing wait for more work until all work is done and the
> last worker is finished and exit.
>
> Input (variable number of inputs, but each line/row will be in a queue
> that the worker will get work from):
> 4,2,1,5,6
> 1,2,5,6,8,3
> 9,9,1,3,5,4
> 7,3,5,2,8
>
> Output is sorted rows in ascending order for each line:
> 1,2,4,5,6
> 1,2,3,5,6,8
> 1,3,4,5,9,9
> 2,3,5,7,8
>
> Jonathan Otsuka
> _______________________________________________
> kc mailing list
> kc at pm.org
> http://mail.pm.org/mailman/listinfo/kc
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From djgoku at gmail.com Fri Nov 2 07:18:46 2012
From: djgoku at gmail.com (djgoku at gmail.com)
Date: Fri, 2 Nov 2012 09:18:46 -0500
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To:
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com>
Message-ID: <64C75F0B-7A0D-4CDC-9342-1927101E069E@gmail.com>

On Nov 2, 2012, at 12:37 AM, Garrett Goebel wrote:

> Questions...
>
> What is the expected range regarding the number of numbers to be sorted on each line?

Sorry, this is just an example. The same basic parts will be present in my
program.

> How many processors are available?

It depends: 1-2 processors, with 2-4 cores per processor.

> What would serve as a primary key for database storage?

Line number in this example could serve as the primary key.

> If the number of numbers is small... you may want to look at using a quicksort. Otherwise merge sort.

This is still just an example of what I want to do, not an actual
example. I was trying to think of some sort of work that could be done on a
list of numbers.

One thing to add: what I am looking for is to create something (class based)
that has those configurable bits, where I just override or provide the worker
function and all the rest is executed by my class until everything is
finished.

Jonathan Otsuka

From djgoku at gmail.com Sat Nov 3 20:37:59 2012
From: djgoku at gmail.com (djgoku at gmail.com)
Date: Sat, 3 Nov 2012 22:37:59 -0500
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To: <5093487B.2010002@peknet.com>
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com> <5093487B.2010002@peknet.com>
Message-ID:

On Nov 1, 2012, at 11:13 PM, Peter Karman wrote:

> djgoku at gmail.com wrote on 11/1/12 10:50 PM:
>> I haven't started writing code yet, but wanted input on where to start.
>>
>> What I want to do is read in an input file, splice the input into
>> configurable amount of chunks.
Create a number of workers also configurable
>> that go off and do work returning results or maybe even saving results to a
>> database. After finishing wait for more work until all work is done and the
>> last worker is finished and exit.
>>
>
> lots of ways to approach this, depending on how you want to define "worker."
>
> Gearman (and its ilk) is one way.

Reading: http://www.slideshare.net/andy.sh/gearman-and-perl

The Gearman setup sounds interesting, but is really overkill for what I want.

> For less-heavy infrastructure, I like Parallel::Iterator.

I have a threads/Thread::Queue hack "working". But I want most of the code to
be reusable so I am not copying and pasting code around. Really the only
thing that will ever change is a worker() function; most of the other stuff
will be static.

Jonathan Otsuka

From davidnicol at gmail.com Mon Nov 5 10:17:54 2012
From: davidnicol at gmail.com (David Nicol)
Date: Mon, 5 Nov 2012 12:17:54 -0600
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To:
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com> <5093487B.2010002@peknet.com>
Message-ID:

an effective way to chunk and parallelize, using the OS instead of the
language, is to write work units into a directory using DirDB, and fork many
workers (or launch them independently) that consume and delete the work
units. Everything gets its own process. You can have more control over
locking if you use sqlite for the IPC; you can keep everything in memory
instead of disk using anonymous pipes and select.
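
A minimal sketch of that last in-memory option, assuming plain fork and pipe
with no modules and, for brevity, a blocking read of each pipe in turn rather
than select. run_workers and sort_line are hypothetical names invented for
this example; only the worker function would change from job to job:

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffer STDOUT so forked children never inherit pending output

# Hypothetical helper (not from any CPAN module): fork $n workers, give each a
# round-robin share of @$items, and collect "index<TAB>result" lines back over
# one anonymous pipe per worker. Results must be single-line strings.
sub run_workers {
    my ($n, $items, $worker) = @_;
    my @readers;
    for my $w (0 .. $n - 1) {
        pipe(my $rd, my $wr) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {                    # child process
            close $rd;
            close $_ for @readers;          # read ends inherited from earlier workers
            for (my $i = $w; $i < @$items; $i += $n) {
                print {$wr} "$i\t", $worker->($items->[$i]), "\n";
            }
            close $wr;
            exit 0;
        }
        close $wr;                          # parent keeps only the read end
        push @readers, $rd;
    }
    my @results;
    for my $rd (@readers) {                 # read each pipe until its worker closes it
        while (my $line = <$rd>) {
            chomp $line;
            my ($i, $res) = split /\t/, $line, 2;
            $results[$i] = $res;            # restore the original row order
        }
        close $rd;
    }
    1 while wait() != -1;                   # reap every child
    return @results;
}

# The only job-specific part: sort one comma-separated row numerically.
sub sort_line {
    my ($line) = @_;
    return join ',', sort { $a <=> $b } split /,/, $line;
}

my @rows = ('4,2,1,5,6', '1,2,5,6,8,3', '9,9,1,3,5,4', '7,3,5,2,8');
print "$_\n" for run_workers(2, \@rows, \&sort_line);
```

With the rows from the original post this prints the four sorted lines in
input order. Work is divided up front rather than pulled from a shared queue,
so a slow row holds up only its own worker's share.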

manager node's code looks something like

use DirDB;
tie %Q, DirDB => 'QueueDir';
while(<>){
    # build the key once: PID, "wu", sequence number. ("$$wu" in a string
    # would dereference $wu, and a second ++$counter would flag the wrong unit.)
    my $key = $$ . 'wu' . ++$counter;
    %{$Q{$key}} = ExpressLineAsWorkUnitPairs($_);
    $Q{$key}{READY} = 1;    # set last, so workers never see a half-written unit
};

worker code looks something like

use DirDB;
tie %Q, DirDB => 'QueueDir';
fork;fork;fork; # now you've got 8 workers
for(;;){
    @WUs = keys %Q;
    for (@WUs){
        $Q{$_}{READY} or next;                    # skip units still being written
        mkdir "QueueDir/$_/GOTIT", 0777 or next;  # this will succeed once
        DoWorkUnit(%{$Q{$_}});
        delete $Q{$_};
    };
    sleep (2+rand 5);    # random back-off before rescanning the queue
}

Run the manager on new input as it appears, and the workers will consume it.

From davidnicol at gmail.com Mon Nov 5 10:25:14 2012
From: davidnicol at gmail.com (David Nicol)
Date: Mon, 5 Nov 2012 12:25:14 -0600
Subject: [Kc] Threads/fork/Event-based programming oh my (anyevent/coro)
In-Reply-To:
References: <28C27A31-1041-4D3C-96AB-1070512F917C@gmail.com> <5093487B.2010002@peknet.com>
Message-ID:

On Mon, Nov 5, 2012 at 12:17 PM, David Nicol wrote:

> $Q{$_}{READY} or next;
> mkdir "QueueDir/$_/GOTIT", 0777 or next; # this will succeed once
>

sorry, it's probably better to move the work unit into a per-worker queue
directory, which should also be atomic.

rename "QueueDir/$_", "WorkerQs/$$/$_" or next;
...

these things are easy to do, and easy to do reusably, which is why there are
so many available.

From djgoku at gmail.com Mon Nov 12 12:48:28 2012
From: djgoku at gmail.com (Jonathan Otsuka)
Date: Mon, 12 Nov 2012 14:48:28 -0600
Subject: [Kc] PM - meeting 11/13/2012
Message-ID: <972F5066-36BF-470D-9203-C98527268C7D@gmail.com>

Who wants to meet tomorrow night at 75th Street Brewery?

Jonathan Otsuka

From amoore at mooresystems.com Mon Nov 12 12:50:51 2012
From: amoore at mooresystems.com (Andrew Moore)
Date: Mon, 12 Nov 2012 14:50:51 -0600
Subject: [Kc] PM - meeting 11/13/2012
In-Reply-To: <972F5066-36BF-470D-9203-C98527268C7D@gmail.com>
References: <972F5066-36BF-470D-9203-C98527268C7D@gmail.com>
Message-ID:

You bet! I wonder if they still have any of their pumpkin beer left.

-Andy

On Mon, Nov 12, 2012 at 2:48 PM, Jonathan Otsuka wrote:
> Who wants to meet tomorrow night at 75th street brewery?
>
> Jonathan Otsuka