From jobs-noreply at seattleperl.org Tue Jul 9 13:22:38 2013
From: jobs-noreply at seattleperl.org (SPUG Jobs)
Date: Tue, 9 Jul 2013 13:22:38 -0700 (PDT)
Subject: SPUG: JOB: Scientific Programmer (Perl) in UW Microbiology
Message-ID:

I am leaving to attend medical school, vacating a position I have held for the last 8+ years developing a web-based scientific application for HIV research at the UW. We are looking for an experienced programmer to take over the project ASAP.

A link to the full description, including instructions to apply, is below the summary, and I would be happy to answer any individual questions as well (bmaust at the standard UW domain).

This is a great opportunity for someone with strong technical skills to contribute meaningfully to medical science with a significant public health impact.

required skill-set:
- Extensive experience with writing object-oriented Perl for the web with Catalyst, SQL, HTML/CSS/JS
- Familiarity with C/C++, Python, Java
- Apache (mod_perl), PostgreSQL

desired skills:
- experience in molecular biology or life sciences

hiring details:
- union-exempt permanent W-2
- standard UW benefits including 403(b) match, tuition exemption, highly subsidized health insurance

location:
- South Lake Union, Seattle with partial telecommute possible (majority on-site time required)

company's product or service:
- The lab focuses on HIV research and is part of the UW Department of Microbiology.

full job description:
https://uwhires.admin.washington.edu/eng/candidates/default.cfm?szCategory=jobprofile&szOrderID=96986

--
Brandon Maust
Research Consultant
Mullins Lab, UW Microbiology

From m3047 at m3047.net Tue Jul 23 10:19:19 2013
From: m3047 at m3047.net (Fred Morris)
Date: Tue, 23 Jul 2013 10:19:19 -0700
Subject: SPUG: print statement taking ridiculously long
Message-ID: <201307231019.19113.m3047@m3047.net>

We've got a script.
It produces a huge string after much thrashing:

$big_string = "60 Megs or so";

Then it tries to write it to a file:

open OUT, ">$file";
print OUT $big_string;
close OUT;

Well that print statement takes over half an hour with one core running at 100% CPU! For that matter we've discovered that taking the length of $big_string takes an inordinate amount of time as well. The job doesn't appear to be swapping.

This is happening on the Debian build of perl 5.14. It runs (the whole script) in 5 minutes or so on other (older) versions of perl on far more modest hardware.

Thoughts?

(BTW, anybody interested in some sort of perl party/meetup in Tacoma?)

--

Fred Morris

From sthoenna at gmail.com Tue Jul 23 10:32:55 2013
From: sthoenna at gmail.com (Yitzchak Scott-Thoennes)
Date: Tue, 23 Jul 2013 10:32:55 -0700
Subject: SPUG: print statement taking ridiculously long
In-Reply-To: <201307231019.19113.m3047@m3047.net>
References: <201307231019.19113.m3047@m3047.net>
Message-ID:

How old is "(older)"?

Does File::Slurp::write_file( $file, \$big_string ) do any better?

From andrew at sweger.net Tue Jul 23 10:32:55 2013
From: andrew at sweger.net (Andrew Sweger)
Date: Tue, 23 Jul 2013 10:32:55 -0700 (PDT)
Subject: SPUG: print statement taking ridiculously long
In-Reply-To: <201307231019.19113.m3047@m3047.net>
Message-ID:

On Tue, 23 Jul 2013, Fred Morris wrote:

> Well that print statement takes over half an hour with one core running at
> 100% CPU! For that matter we've discovered that taking the length of
> $big_string takes an inordinate amount of time as well. The job doesn't
> appear to be swapping.

Well there's the problem: it's only using one core. You need to make your script threaded so that it can parallelelize across all your cores. That'll make it run more faster.

> (BTW, anybody interested in some sort of perl party/meetup in Tacoma?)

Isn't everyone in PDX this week? (I mean, I'm not, of course. But I thought everyone else was.)
I'm guessing that $big_string is built up by appending/concatenating repeatedly. I'm also guessing ysth will know the cause. Oh, too late.

--
Andrew B. Sweger -- The great thing about multitasking is that several things can go wrong at once.

From tsibley at cpan.org Tue Jul 23 10:34:49 2013
From: tsibley at cpan.org (Thomas Sibley)
Date: Tue, 23 Jul 2013 10:34:49 -0700
Subject: SPUG: print statement taking ridiculously long
In-Reply-To: <201307231019.19113.m3047@m3047.net>
References: <201307231019.19113.m3047@m3047.net>
Message-ID: <51EEBEB9.8040707@cpan.org>

On 07/23/2013 10:19 AM, Fred Morris wrote:
> We've got a script. It produces a huge string after much thrashing:
>
> $big_string = "60 Megs or so";
>
> Then it tries to write it to a file:
>
> open OUT, ">$file";
> print OUT $big_string;
> close OUT;
>
> Well that print statement takes over half an hour with one core running at
> 100% CPU! For that matter we've discovered that taking the length of
> $big_string takes an inordinate amount of time as well. The job doesn't
> appear to be swapping.
>
> This is happening on the Debian build of perl 5.14. It runs (the whole script)
> in 5 minutes or so on other (older) versions of perl on far more modest
> hardware.
>
> Thoughts?

I'd take a look at your disk IO. (Are you writing to a network FS? Is your disk dying?)
tom at whaam ~ $ cat long-string
use strict;
use warnings;

my $big_string = "a" x (60 * 1024**2);

open my $out, ">", "/tmp/big" or die $!;
print { $out } $big_string;
close $out or die $!;

tom at whaam ~ $ perlbrew exec -- time --format "took %es" perl long-string
perl-5.10.1
==========
took 0.20s

perl-5.12.5
==========
took 0.17s

perl-5.14.1
==========
took 0.17s

perl-5.16.3
==========
took 0.16s

perl-5.18
==========
took 0.17s

perl-5.8.3
==========
took 0.19s

perl-5.8.8
==========
took 0.19s

From m3047 at m3047.net Tue Jul 23 10:39:14 2013
From: m3047 at m3047.net (Fred Morris)
Date: Tue, 23 Jul 2013 10:39:14 -0700
Subject: SPUG: print statement taking ridiculously long
In-Reply-To:
References:
Message-ID: <201307231039.14171.m3047@m3047.net>

On Tuesday 23 July 2013 10:32, Andrew Sweger wrote:
> I'm guessing that $big_string is built up by appending/concatenating
> repeatedly.

How did you guess? ;-)

Yes, the wonders of the Fibonacci series... We changed that before discovering that the print statement was where the real problem was. It was actually stringifying XML::Generator output.

Hrmmm... come to think of it I wonder if $big_string is *really* a string, or is really that XML::Generator output object which behaves like a string? Hrmmm...

--

Fred

From steve at baylis.org Tue Jul 23 10:55:06 2013
From: steve at baylis.org (Steve Baylis)
Date: Tue, 23 Jul 2013 10:55:06 -0700
Subject: SPUG: print statement taking ridiculously long
In-Reply-To: <201307231039.14171.m3047@m3047.net>
References: <201307231039.14171.m3047@m3047.net>
Message-ID:

Are you sure the problem is the version of Perl and not something hardware related? You say it runs much faster in older versions of Perl on lesser hardware; what about older versions of Perl on the same hardware?

Run:

time head -c 60000000 /dev/urandom > file

If that's relatively fast then your issue is likely in Perl.
If that's also very slow then you likely have something going on in your storage tier with that particular hardware.

-Steve

On Tue, Jul 23, 2013 at 10:39 AM, Fred Morris wrote:

> On Tuesday 23 July 2013 10:32, Andrew Sweger wrote:
> > I'm guessing that $big_string is built up by appending/concatenating
> > repeatedly.
>
> How did you guess? ;-)
>
> Yes, the wonders of the Fibonacci series... We changed that before discovering
> that the print statement was where the real problem was. It was actually
> stringifying XML::Generator output.
>
> Hrmmm... come to think of it I wonder if $big_string is *really* a string, or
> is really that XML::Generator output object which behaves like a string?
> Hrmmm...
>
> --
>
> Fred
>
> _____________________________________________________________
> Seattle Perl Users Group Mailing List
> POST TO: spug-list at pm.org
> SUBSCRIPTION: http://mail.pm.org/mailman/listinfo/spug-list
> MEETINGS: 3rd Tuesdays
> WEB PAGE: http://seattleperl.org/
>

From m3047 at m3047.net Tue Jul 23 10:57:44 2013
From: m3047 at m3047.net (Fred Morris)
Date: Tue, 23 Jul 2013 10:57:44 -0700
Subject: SPUG: print statement taking ridiculously long
In-Reply-To: <201307231039.14171.m3047@m3047.net>
References: <201307231039.14171.m3047@m3047.net>
Message-ID: <201307231057.44893.m3047@m3047.net>

On Tuesday 23 July 2013 10:39, I wrote:
> [...] really that XML::Generator output object which behaves like a string?
> Hrmmm...

Yup, it's a ref. Colleague opines that they ought to call it XML::Generator::pretty::slow.

Anyway, I think we're on the right track. Thanks all!

--

Fred
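
For anyone landing on this thread later, here is a rough sketch of the check the thread converged on: find out whether a scalar is really a plain string or an object that only behaves like one, and if it is an object, stringify it once before calling print or length on it. This assumes the slowness comes from overloaded stringification. SlowString and describe() are hypothetical names made up for illustration; SlowString is a stand-in, not XML::Generator, and nothing here reflects that module's internals.

```perl
use strict;
use warnings;
use overload ();                  # for overload::Method
use Scalar::Util qw(blessed);
use File::Temp qw(tempfile);

# Hypothetical stand-in for an object that "behaves like a string".
# It only mimics overloaded stringification; it is NOT XML::Generator.
{
    package SlowString;
    use overload '""' => \&as_string, fallback => 1;

    our $calls = 0;               # counts how often stringification runs

    sub new {
        my ($class, @parts) = @_;
        return bless { parts => [@parts] }, $class;
    }

    sub as_string {
        my ($self) = @_;
        $calls++;
        return join '', @{ $self->{parts} };
    }
}

# Report what a scalar really holds (hypothetical helper).
sub describe {
    my ($value) = @_;
    return 'plain string' unless ref $value;
    return 'unblessed ' . ref($value) . ' reference' unless blessed $value;
    return overload::Method($value, '""')
        ? 'object with overloaded stringification (' . ref($value) . ')'
        : 'object without overloaded stringification (' . ref($value) . ')';
}

my $big_string = SlowString->new('<doc>', ('x') x 1000, '</doc>');
print describe($big_string), "\n";

# Stringify exactly once, then work only with the flat copy.
my $flat = "$big_string";
print describe($flat), "\n";

# Write the flat copy using a lexical filehandle and checked close.
my ($out, $file) = tempfile(UNLINK => 1);
print {$out} $flat;
close $out or die "close $file: $!";
```

Running describe() on the suspect variable right before the print is a cheap way to confirm or rule out the "it's a ref" theory before looking at disks or perl versions.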