From mark at purdue.edu Sat Nov 8 20:39:19 2014
From: mark at purdue.edu (Mark Senn)
Date: Sat, 08 Nov 2014 23:39:19 -0500
Subject: [Purdue-pm] Perl 5 exercise
Message-ID: <10209.1415507959@pier.ecn.purdue.edu>

The first half of this exercise shows the results of the computation.
In the second half of the exercise replace "THIS" with ten characters
that more clearly express the code in the first half of the exercise.

#!/usr/new/bin/perl

my $line = 'Al <adam at a.com>, Bob B. Bob <bob at e.gov>, Bono <bono at u2.com>';

$_ = $line;
s/^.+?<//;
s/>$//;
my @email = split />.+?</;

print "first try\n";
map { print "($_)\n" } @email;

$_ = $line;
# Replace THIS with ten characters that will produce the same answer as
# above, and in my opinion more clearly expresses what you're trying to do.
@email = THIS;

print "second try\n";
map { print "($_)\n" } @email;

Subject: [Purdue-pm] REMINDER: Purdue Perl Mongers meets TOMORROW!
Message-ID: <5469f130.ca3d320a.1023.ffffce34@mx.google.com>

Greetings!

This is an automated message to announce that Purdue Perl Mongers will
meet TOMORROW at 11:30am in WSLR 116.

Generally, we discuss computing, dynamic languages, Purdue and the
overlap of those three (plus whatever else comes up) until Noon, then
have one or more more-formal presentations. We don't always discuss
Perl -- we have had presentations on Python, Javascript, HTML4,
Selenium, and many other topics -- so if you are not an experienced
Perl user, there's still a place for you.

Check our Twitter feed (@purduepm), our Google+ Community
(https://plus.google.com/communities/100780979348959606696) or our web
page (http://pm.purdue.org) for more information.

From gizmo at purdue.edu Mon Nov 17 11:59:16 2014
From: gizmo at purdue.edu (Joe Kline)
Date: Mon, 17 Nov 2014 14:59:16 -0500
Subject: [Purdue-pm] REMINDER: Purdue Perl Mongers meets TOMORROW!
In-Reply-To: <5469f130.ca3d320a.1023.ffffce34@mx.google.com>
References: <5469f130.ca3d320a.1023.ffffce34@mx.google.com>
Message-ID: <546A5394.6040609@purdue.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I had some stuff come up that I'll need to take care of during our
regular meeting time.

I was going to discuss my thoughts on the recent Pittsburgh Perl
Workshop. A number of the talks are already online at:

https://www.youtube.com/channel/UCautJ6yqxYAHjUYYAdLc0pw

That's about half the talks; the rest should be uploaded at some later date.

joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iEYEARECAAYFAlRqU5MACgkQb0mzA2gRTpnLZgCfSJNwGJHaFP3rgJtokpV+xyPJ 6O4AnigLCY1mibdqzo1G5ZOiTTM5AFlg =Ob1E -----END PGP SIGNATURE----- From jacoby at purdue.edu Mon Nov 17 13:14:33 2014 From: jacoby at purdue.edu (Dave Jacoby) Date: Mon, 17 Nov 2014 16:14:33 -0500 Subject: [Purdue-pm] Alert: Room Change Message-ID: <546A6539.2010405@purdue.edu> Another group was scheduled into another room, one that was ill-suited to their purpose. So, they went to Rick and made a deal. 1) We will meet one floor down, in WSLR B008, which is more-or-less one floor down from our regular meeting room. That is Bee Zero Zero Eight. 2) Jimmy John's will be provided. See you there! Same Camel-Time! (Slightly) Different Camel-Location! -- Dave Jacoby Developer, Purdue Genomics Core Lab http://web.ics.purdue.edu/~djacoby/ 421 days using standing desk From derrick at csociety.org Mon Nov 17 13:16:51 2014 From: derrick at csociety.org (derrick) Date: Mon, 17 Nov 2014 16:16:51 -0500 Subject: [Purdue-pm] Alert: Room Change In-Reply-To: <546A6539.2010405@purdue.edu> References: <546A6539.2010405@purdue.edu> Message-ID: <546A65C3.50603@csociety.org> I probably won't be there, had a meeting scheduled till 12:30. I'll try to stop by after if it doesnt run over. dsk On 11/17/14 16:14, Dave Jacoby wrote: > Another group was scheduled into another room, one that was ill-suited > to their purpose. So, they went to Rick and made a deal. > > 1) We will meet one floor down, in WSLR B008, which is more-or-less one > floor down from our regular meeting room. That is Bee Zero Zero Eight. > > 2) Jimmy John's will be provided. > > See you there! Same Camel-Time! (Slightly) Different Camel-Location! 
> From jacoby at purdue.edu Mon Nov 17 13:20:06 2014 From: jacoby at purdue.edu (Dave Jacoby) Date: Mon, 17 Nov 2014 16:20:06 -0500 Subject: [Purdue-pm] Alert: Room Change In-Reply-To: <546A65C3.50603@csociety.org> References: <546A6539.2010405@purdue.edu> <546A65C3.50603@csociety.org> Message-ID: <546A6686.1000102@purdue.edu> On 11/17/2014 4:16 PM, derrick wrote: > I probably won't be there, had a meeting scheduled till 12:30. I'll try > to stop by after if it doesnt run over. > > dsk That'd be great. We'll try to save you a sub, but who knows... -- Dave Jacoby Developer, Purdue Genomics Core Lab http://web.ics.purdue.edu/~djacoby/ 421 days using standing desk From jacoby.david at gmail.com Tue Nov 18 08:40:57 2014 From: jacoby.david at gmail.com (Dave Jacoby) Date: Tue, 18 Nov 2014 11:40:57 -0500 Subject: [Purdue-pm] Meeting Message-ID: Am I going to have to eat all this Jimmy John's myself? Sent from my pocket supercomputer -------------- next part -------------- An HTML attachment was scrubbed... URL: From bradley.d.andersen at gmail.com Tue Nov 18 08:48:18 2014 From: bradley.d.andersen at gmail.com (Bradley Andersen) Date: Tue, 18 Nov 2014 11:48:18 -0500 Subject: [Purdue-pm] Meeting In-Reply-To: References: Message-ID: If I were not boycotting JJ since May, I would ask you to email me a sandwich :) On Tue, Nov 18, 2014 at 11:40 AM, Dave Jacoby wrote: > Am I going to have to eat all this Jimmy John's myself? > > Sent from my pocket supercomputer > > _______________________________________________ > Purdue-pm mailing list > Purdue-pm at pm.org > http://mail.pm.org/mailman/listinfo/purdue-pm > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacoby.david at gmail.com Tue Nov 18 10:01:13 2014 From: jacoby.david at gmail.com (Dave Jacoby) Date: Tue, 18 Nov 2014 13:01:13 -0500 Subject: [Purdue-pm] Google Testing Talk Suggested by Derrick Message-ID: Netflix introduces the Simian Army. 
https://www.youtube.com/watch?v=xkP70Zhhix4

-- 
David Jacoby    jacoby.david at gmail.com

From jacoby.david at gmail.com Tue Nov 18 10:03:25 2014
From: jacoby.david at gmail.com (Dave Jacoby)
Date: Tue, 18 Nov 2014 13:03:25 -0500
Subject: [Purdue-pm] Netflix explains the Simian Army
Message-ID:

http://techblog.netflix.com/2011/07/netflix-simian-army.html

-- 
David Jacoby    jacoby.david at gmail.com

From mark at ecn.purdue.edu Tue Nov 18 10:43:00 2014
From: mark at ecn.purdue.edu (Mark Senn)
Date: Tue, 18 Nov 2014 13:43:00 -0500
Subject: [Purdue-pm] Perl 5 exercise
In-Reply-To: <10209.1415507959@pier.ecn.purdue.edu>
References: <10209.1415507959@pier.ecn.purdue.edu>
Message-ID: <35338.1416336180@pier.ecn.purdue.edu>

> The first half of this exercise shows the results of the computation.
> In the second half of the exercise replace "THIS" with ten characters
> that more clearly express the code in the first half of the
> exercise.
>
> #!/usr/new/bin/perl
>
> my $line = 'Al <adam at a.com>, Bob B. Bob <bob at e.gov>, Bono <bono at u2.com>';
>
> $_ = $line;
> s/^.+?<//;
> s/>$//;
> my @email = split />.+?</;
>
> print "first try\n";
> map { print "($_)\n" } @email;
>
> $_ = $line;
> # Replace THIS with ten characters that will produce the same answer as
> # above, and in my opinion more clearly expresses what you're trying to do.
> @email = THIS;
>
> print "second try\n";
> map { print "($_)\n" } @email;

Running this program

    #!/usr/new/bin/perl

    $_ = 'Al <adam at a.com>, Bob B. Bob <bob at e.gov>, Bono <bono at u2.com>';
    my @email = /<(.+?)>/g;
    map { print "($_)\n" } @email;

prints

    (adam at a.com)
    (bob at e.gov)
    (bono at u2.com)

The /<(.+?)>/g dissected a character at a time:

    /    start matching (use, for example, "m/" in Perl 6)
    <    match a "<" verbatim
    (    start remembering what matched
    .    match any character
    +    one or more times
    ?    don't do greedy matching, only match up to the next ">",
         not to the last ">"
    )    stop remembering what matched
    >    match a ">" verbatim
    /    end matching
    g    do the match globally so we get all the email addresses

-mark

From markleightonfisher at gmail.com Wed Nov 19 03:26:29 2014
From: markleightonfisher at gmail.com (Mark Leighton Fisher)
Date: Wed, 19 Nov 2014 06:26:29 -0500
Subject: [Purdue-pm] Netflix explains the Simian Army
In-Reply-To:
References:
Message-ID: <546C7E65.1050403@gmail.com>

Oooh! Lots of good stuff to chew on in this article. One of my
favorites of the past few years. (Also helps explain the occasional
Netflix glitches to my family.)

Mark

On 11/18/2014 1:03 PM, Dave Jacoby wrote:
> http://techblog.netflix.com/2011/07/netflix-simian-army.html
>
> --
> David Jacoby    jacoby.david at gmail.com

-- 
=================
Mark Leighton Fisher    markleightonfisher at gmail.com

From westerman at purdue.edu Wed Nov 19 05:41:22 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Wed, 19 Nov 2014 08:41:22 -0500 (EST)
Subject: [Purdue-pm] Loading in unused modules
In-Reply-To: <1174048348.79663.1416402903239.JavaMail.root@mailhub020.itcs.purdue.edu>
Message-ID: <1897048252.79733.1416404482620.JavaMail.root@mailhub020.itcs.purdue.edu>

At yesterday's meeting we talked a bit about the performance hit from
loading in modules that are not actually used. Perhaps a common
occurrence of this is doing a:

    use Data::Dumper ;

And then never actually using the Dumper routine. Quite a few of the
CPAN modules depend on 'Data::Dumper' ...

http://deps.cpantesters.org/depended-on-by.pl?dist=Data-Dumper-2.154

... but one has to wonder just exactly how many of those really use DD
or just have leftover DD code in them.

Anyway the question is whether there is a performance hit, and how many
lines of a module get read in if it is not being used. Looking at DD and
using NYTProf on a very simple "hello world" program:

    Without 'use Data::Dumper' ... ~50 ms
    With 'use Data::Dumper'    ... ~75 ms; 50 statements from Data::Dumper are executed

So obviously a hit. As one might expect -- a file has to be read in and
the lines in the file parsed, if for no other reason than to figure out
what routines are exported.

Of course in the overall scheme of things a 25 ms increase is not that
much, although if the module is loaded a lot of times the cost could
add up.

Until recently we had 'use Data::Dumper' but no actual use of it in our
database initialization module. However since our module also calls the
YAML and DBI modules the extra 25 ms is not that much. Running a very
simple program that does a simple SELECT statement has a run time of
around 300 ms. So the unneeded 'use Data::Dumper' is adding 8% to the
run time, and undoubtedly a lot less for more complex programs.

-- 
Rick Westerman westerman at purdue.edu
Bioinformatics specialist at the Genomics Facility.
Phone: (765) 494-0505 FAX: (765) 496-7255
Department of Horticulture and Landscape Architecture
625 Agriculture Mall Drive
West Lafayette, IN 47907-2010
Physically located in room S049, WSLR building

From westerman at purdue.edu Thu Nov 20 11:52:58 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Thu, 20 Nov 2014 14:52:58 -0500 (EST)
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <2125930307.84618.1416512958195.JavaMail.root@mailhub020.itcs.purdue.edu>
Message-ID: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu>

From all I have read the following program should work:

    #!/bin/env PERL5OPT=-T perl

    warn 'Taint mode is '.(${^TAINT} ? 'on' : 'off');

    # End

But when I run it, at least on the RCAC and on the Genomics systems, the
program just hangs. Take away PERL5OPT and it works.
Put in '-w' instead of '-T' and it fails in the same manner. Quoting does not seem to matter. Just running from the command line: /bin/env PERL5OPT=-T perl program.file Works fine. I am mystified. Anyone have an idea? Thanks, -- Rick Westerman westerman at purdue.edu Bioinformatics specialist at the Genomics Facility. Phone: (765) 494-0505 FAX: (765) 496-7255 Department of Horticulture and Landscape Architecture 625 Agriculture Mall Drive West Lafayette, IN 47907-2010 Physically located in room S049, WSLR building From gizmo at purdue.edu Thu Nov 20 16:50:26 2014 From: gizmo at purdue.edu (Joe Kline) Date: Thu, 20 Nov 2014 19:50:26 -0500 Subject: [Purdue-pm] Problem with she-bang and PERL5OPT In-Reply-To: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> Message-ID: <546E8C52.40305@purdue.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 When you are ssh'd in is PERL5OPT defined? Is it defined in your shell config file? Hmmm....all I can think of off the top of my head. Also, make sure that env is in /bin in your path. joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEARECAAYFAlRujEsACgkQb0mzA2gRTpmyWgCeIiO4X6A3IxR7TaeMoAHWLTOH mW8AnRHxPzrXtLhVoWlgqLTfUhArRBSS =ZXxP -----END PGP SIGNATURE----- From mark at ecn.purdue.edu Thu Nov 20 18:04:37 2014 From: mark at ecn.purdue.edu (Mark Senn) Date: Thu, 20 Nov 2014 21:04:37 -0500 Subject: [Purdue-pm] Problem with she-bang and PERL5OPT In-Reply-To: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> Message-ID: <18297.1416535477@pier.ecn.purdue.edu> Rick Westerman wrote on 2014-11-20 at 14:52 >From all I have read the following program should work: | #!/bin/env PERL5OPT=-T perl | | warn 'Taint mode is '.(${^TAINT} ? 
'on' : 'off');
|
| # End
|
| But when I run it, at least on the RCAC and on the Genomics systems, the
| program just hangs. Take away PERL5OPT and it works. Put in '-w'
| instead of '-T' and it fails in the same manner. Quoting does not seem
| to matter. Just running from the command line:
|
| /bin/env PERL5OPT=-T perl program.file
|
| Works fine.
|
| I am mystified. Anyone have an idea?

From
http://stackoverflow.com/questions/2528959/how-do-i-set-the-taint-mode-in-a-perl-script-with-a-usr-bin-env-perl-sheba

    You cannot actually specify a variable in a shebang with
    /usr/bin/env. Doing so will cause env to execve itself in an infinite
    loop, never even getting to the command requested. I tested this against
    both Linux and FreeBSD. -- Zed

Rick, I got the same results with these scripts that you got with yours

    #!/bin/env /bin/cat
    hello there

and

    #!/bin/env TEMPVAR=hello /bin/cat
    hello there

(I wanted to try it with something simpler than perl.)

-mark

From westerman at purdue.edu Thu Nov 20 19:27:25 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Thu, 20 Nov 2014 22:27:25 -0500
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <18297.1416535477@pier.ecn.purdue.edu>
References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu>
Message-ID: <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu>

Mark: Your link from stack overflow is exactly the one I got my program
idea from. The best answer -- marked up 10 times -- was from March 2010
so I thought that it would be correct. It is interesting that the "Zed"
you quoted, who said the solution doesn't work, posted his comment only
5 hours ago. This stack overflow link is something to keep an eye on to
see if someone counters "Zed".

Others: Yes, I know that setting PERL5OPT outside the program will carry
through. That isn't possible in my scenario -- executing Perl programs
via Apache (the only real reason to use taint in the first place) --
unless we make all programs use taint. If someone has a suggestion on
how to run individual web programs using taint I am all ears. Or if
people think that I should turn on taint for all of our web programs --
not a bad idea -- then speak up. Just be aware that Dave is already
gnashing his teeth over my recent suggestions; I'd hate to see it get
worse. :-)

As for /bin/env vs. /usr/bin/env: on RCAC the latter is a link to the
former. Ergo at least on the RCAC system the former is the "real"
program. Perhaps for Linux as a whole this is not true, but I have what
I have, and portability per se is not a big concern for us aside from
Solaris to Linux. I will keep the /bin/env vs. /usr/bin/env issue in mind.

-- 
Rick Westerman westerman at purdue.edu

From westerman at purdue.edu Thu Nov 20 19:33:11 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Thu, 20 Nov 2014 22:33:11 -0500
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <18297.1416535477@pier.ecn.purdue.edu>
References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu>
Message-ID:

> If someone has a suggestion on how to run individual web programs using taint I am all ears.

One suggestion that I did come across is to create a "taintperl"
program. I may ask Doug to do that.

Or, as I said, I am open to votes on whether to turn on taint for all of
our web programs. Good idea? Or bad, since it may cause massive program
failures?

-- 
Rick Westerman westerman at purdue.edu

From westerman at purdue.edu Thu Nov 20 19:52:50 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Thu, 20 Nov 2014 22:52:50 -0500
Subject: [Purdue-pm] Yet more on: Problem with she-bang and PERL5OPT
In-Reply-To: <18297.1416535477@pier.ecn.purdue.edu>
References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu>
Message-ID: <9AA7920D-1402-41C8-822F-F7E245C738C1@purdue.edu>

After reading more on she-bangs in general (and not just she-bangs with
Perl and arguments using env) it appears that on some systems env
accepts arguments on the she-bang line and on others it does not. And
indeed on my Mac system using "PERL5OPT=-T perl" works fine. However, as
Doug warned, "/bin/env" is not portable -- it is not found on my Mac. So
I really need to start using "/usr/bin/env". In other words the
following works fine:

----
#!/usr/bin/env PERL5OPT=-T perl

warn 'Taint mode is '.(${^TAINT} ? 'on' : 'off');

# End
----

[Returns: Taint mode is on at ./r.pl line 3. ]

But unfortunately it is not portable to RCAC and RedHat systems. :-(

From mark at ecn.purdue.edu Fri Nov 21 05:24:11 2014
From: mark at ecn.purdue.edu (Mark Senn)
Date: Fri, 21 Nov 2014 08:24:11 -0500
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu>
References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu> <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu>
Message-ID: <27188.1416576251@pier.ecn.purdue.edu>

Rick Westerman wrote on 2014-11-20 at 22:27:
| If someone has a suggestion on how to run individual web programs using
| taint I am all ears.

SUMMARY

Use mod_perl. See http://modperlbook.org/html/6-5-2-2-Taint-mode.html

DETAILS

From http://perl.apache.org/start

    Accelerate your existing dynamic content

    The standard Apache::Registry module can provide 100x speedups for
    your existing CGI scripts and reduce the load on your server at the
    same time. A few changes to the web server's config is all that is
    required to run your existing CGI scripts at lightning speed. more »

Which links to http://perl.apache.org/start/tips/registry.html
(excerpt here)

    Existing CGI scripts will run much faster under mod_perl. And converting
    existing CGI scripts to run under mod_perl is easy.

    For example, here's an existing CGI script called hello.cgi.

        #!/usr/local/bin/perl -w
        use strict;
        use CGI;
        my $q = CGI->new;
        print $q->header,
              $q->start_html,
              $q->h1('Hello World!'),
              $q->end_html;

    This script can now be run as-is under Apache::Registry by using the
    following configuration in httpd.conf:

        SetHandler perl-script
        PerlHandler Apache::Registry
        Options ExecCGI

    That's basically it. Your scripts do need to be well coded, but there's
    even the Apache::PerlRun module to help with those "less clean"
    programs.

    So how much faster do scripts run under Apache::Registry? Obviously, it
    depends on the script, but the hello.cgi script above ran at 7.3
    requests per second as a CGI script and 243.0 requests per second with
    Apache::Registry.

    For more information on running CGI scripts under mod_perl please
    see the CGI to mod_perl Porting section of The Guide.

From http://modperlbook.org/html/6-5-2-2-Taint-mode.html

    Since the -T switch can't be turned on from within Perl (this is because
    when Perl is running, it's already too late to mark all external data as
    tainted), mod_perl provides the PerlTaintCheck directive to turn on
    taint checks globally. Enable this mode with:

        PerlTaintCheck On

    anywhere in httpd.conf (though it's better to place it as early as
    possible for clarity).

I was a technical editor for Sams Publishing's ``mod_perl Developer's
Cookbook''. (I use logical punctuation---see
http://www.slate.com/articles/life/the_good_word/2011/05/the_rise_of_logical_punctuation.html
.)
The book's website is at http://www.modperlcookbook.org .

(Technical editors read text, run examples, give feedback on how to most
clearly express ideas, check table of contents, check indices, etc.---if
you're interested in money don't be a technical editor---it takes so
long to do a good job and pays so little you'll make more money working
for McDonald's.)

When I used mod_perl over ten years ago I was very impressed with the software.

-mark From mark at purdue.edu Fri Nov 21 05:39:16 2014 From: mark at purdue.edu (Mark Senn) Date: Fri, 21 Nov 2014 08:39:16 -0500 Subject: [Purdue-pm] mod_perl 2 information Message-ID: <30699.1416577156@pier.ecn.purdue.edu> The ``mod_perl 2 User's Guide'' covers mod_perl 2 and was written in 2007. (I didn't have anything to do with the production of this book.) (My general rule is see if I can figure out stuff using documentation on the net before considering buying a book. I don't buy paper books, only PDF---it saves space, makes updating to new editions easier, and makes accessing the book from other locations easier.) >From http://perl.apache.org/outstanding/index.html Listed here are success stories from people using mod_perl; also, world-wide statistics of mod_perl usage -mark From westerman at purdue.edu Fri Nov 21 06:01:54 2014 From: westerman at purdue.edu (Rick Westerman) Date: Fri, 21 Nov 2014 09:01:54 -0500 Subject: [Purdue-pm] Problem with she-bang and PERL5OPT In-Reply-To: <27188.1416576251@pier.ecn.purdue.edu> References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu> <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu> <27188.1416576251@pier.ecn.purdue.edu> Message-ID: <11182C02-ACE2-4BD4-B2BB-40D21EB32E41@purdue.edu> Thanks for the info on mod_perl Mark. I agree that we should migrate to mod_perl. We?ve been meaning to do that for years but since our performance is usually not that bad (except for when we code poorly) it has been low on the list and high on the ?let?s not break things? fear ? despite it reportedly being a safe thing to do. Unfortunately mod_perl does not allow individual programs to run in ?taint? mode so it is not an answer to my question of how to run programs in non-taint mode. However I?ll take your endorsement of mod_perl to be a vote in favor of running ?taint? globally. So far 1:for, 0:against. ===== Oh, I haven?t mentioned how I run ?taint? 
in my web-based programs. I do so by specifying explicitly the perl path. I.e., no use of ?/usr/bin/env perl?. But this means the program has to be changed to use newer versions of perl and is also vulnerable to its version of perl disappearing from the system. Something we recently ran into thus my recent questions. Dave, on the other hand, doesn?t use ?taint? so he can use /usr/bin/env. Since ?taint? ? similar to ?strict? and ?warnings? and even unit testing ? is just a crutch to help proper coding there is not an absolute need for it. -- Rick Westerman westerman at purdue.edu > On Nov 21, 2014, at 8:24 AM, Mark Senn wrote: > > Rick Westerman wrote on 2014-11-21 at 22:27: > | If someone has a suggestion on how to run individual web programs using > | taint I am all ears. > > SUMMARY > > Use mod_perl. See http://modperlbook.org/html/6-5-2-2-Taint-mode.html > > DETAILS > > From http://perl.apache.org/start > > Accelerate your existing dynamic content > > The standard Apache::Registry module can provide 100x speedups for > your existing CGI scripts and reduce the load on your server at the > same time. A few changes to the web server's config is all that is > required to run your existing CGI scripts at lightning speed. more ? > > Which links to http://perl.apache.org/start/tips/registry.html > (excerpt here) > > Existing CGI scripts will run much faster under mod_perl. And converting > existing CGI scripts to run under mod_perl is easy. > > For example, here's an existing CGI script called hello.cgi. > > #!/usr/local/bin/perl -w > use strict; > use CGI; > my $q = CGI->new; > print $q->header, > $q->start_html, > $q->h1('Hello World!'), > $q->end_html; > > This script can now be run as-is under Apache::Registry by using the > following configuration in httpd.conf: > > > SetHandler perl-script > PerlHandler Apache::Registry > Options ExecCGI > > > That's basically it. 
Your scripts do need to be well coded, but there's > even the Apache::PerlRun module to help with those "less clean" > programs. > > So how much faster do scripts run under Apache::Registry? Obviously, it > depends on the script, but the hello.cgi script above ran at 7.3 > requests per second as a CGI script and 243.0 requests per second with > Apache::Registry. > > For more information on running CGI scripts under mod_perl please > see the CGI to mod_perl Porting section of The Guide. > > From http://modperlbook.org/html/6-5-2-2-Taint-mode.html > > Since the -Tswitch can't be turned on from within Perl (this is because > when Perl is running, it's already too late to mark all external data as > tainted), mod_perl provides the PerlTaintCheck directive to turn on > taint checks globally. Enable this mode with: > > PerlTaintCheck On > > anywhere in httpd.conf (though it's better to place it as early as > possible for clarity). > > I was a technucial editor for Sams Publishing's ``mod_perl Developer's > Cookbook''. (I use logical punctuation---see > http://www.slate.com/articles/life/the_good_word/2011/05/the_rise_of_logical_punctuation.html > .) > The book's website is at http://www.modperlcookbook.org . > > (Technical editors read text, run examples, give feedback on how to most > clearly express ideas, check table of contents, check indices, etc.--if > you're interested in money don't be a technical editor---it takes so > long to do a good job and pays so little you'll make more money working > for McDonald's.) > > When I used mod_perl over ten years ago I was very impressed with the software. 
> > -mark From gizmo at purdue.edu Fri Nov 21 06:27:01 2014 From: gizmo at purdue.edu (Joe Kline) Date: Fri, 21 Nov 2014 09:27:01 -0500 Subject: [Purdue-pm] Problem with she-bang and PERL5OPT In-Reply-To: <11182C02-ACE2-4BD4-B2BB-40D21EB32E41@purdue.edu> References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu> <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu> <27188.1416576251@pier.ecn.purdue.edu> <11182C02-ACE2-4BD4-B2BB-40D21EB32E41@purdue.edu> Message-ID: <546F4BB5.9010704@purdue.edu> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 While mod_perl is nice I think a lot of modern perl web devs have been moving away from it. I think mostly to decouple the apache/perl dependency (updating one requires doing something with the other). But, getting plack/psgi, nginx on RCAC might be difficult. Let alone Mojolicious, Dancer, Catalyst or the like. The down side of how far behind RHEL6 is. We haven't done much with RHEL7. joe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEARECAAYFAlRvS6oACgkQb0mzA2gRTpk0VACfRs0WXtn5WagMkMcUN6a721E1 MbUAmwXy3gIPxkxl+13Y1p0YIZVRRY8g =cvE/ -----END PGP SIGNATURE----- From yatcilla at purdue.edu Fri Nov 21 06:51:27 2014 From: yatcilla at purdue.edu (Doug Yatcilla) Date: Fri, 21 Nov 2014 09:51:27 -0500 Subject: [Purdue-pm] Problem with she-bang and PERL5OPT In-Reply-To: <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu> References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu> <18297.1416535477@pier.ecn.purdue.edu> <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu> Message-ID: <20141121145127.GC13461@purdue.edu> On Thu Nov 20 22:27:25 2014, Rick Westerman wrote: > Others: Yes, I know that setting PERL5OPT outside the program will > carry through. That isn?t possible in my scenario ? executing Perl > programs via Apache (the only real reason to use taint in the first > place) unless we make all programs use taint. 
If someone has a
> suggestion on how to run individual web programs using taint I am
> all ears.

Rick,

You are mistaken. You can use a wrapper script (with or without
apache) to set environment variables that will be passed to perl or
other scripts.

Let your perl script be /opt/scripts/job1
Let your Apache CGI directory be /opt/cgi-bin

Here is /opt/cgi-bin/taint: (not tested)
----------------------------------------------------------------------
#!/bin/bash
# wrapper script for running perl with taint mode enabled

# set any arbitrary shell environment vars
# or assume sane ones are inherited from apache config
PATH=/usr/bin:/any/other/safe/paths
PERL5OPT=-T

# directory containing perl programs
cgiroot=/opt/scripts
script=$cgiroot/$1

# run the script which is passed as an argument from apache
if [[ -r $script ]]; then
    perl -T $script
else
    echo "$0: script $script not found"
    exit 1;
fi
----------------------------------------------------------------------

To invoke, use web address: http://my.server.com/cgi-bin/taint/job1

In the taint script given above, the actual perl script is invoked
with "perl -T" so the shell script didn't need to set the PERL5OPT
variable, nor does the /opt/scripts/job1 perl script even need to
have its execute bit set or have a #! as the first line.

The rationale for keeping the actual scripts outside the apache CGI
directory is so someone cannot avoid the wrapper script and invoke the
script directly from a web address (and bypass taint mode.)
Nonetheless, you might as well put in a check in your perl script to
stop if taint mode isn't enabled.

Getting back to perl, I wonder why you can't just turn on taint mode
with a "use taint;" directive along the lines of "use warnings;". I
read that it is "too late" to enable it once the program starts, but
don't understand why.

That seems to be what this module provides:
http://search.cpan.org/~sharyanto/tainting-0.01/
But, it appears to be a proof of concept.
Another, older, probably abandoned module along similar lines is:
http://search.cpan.org/~rhandom/Taint-Runtime-0.03/
Its documentation specifically mentions the use case of migrating lots
of apache cgi scripts one at a time to using taint mode, which is
exactly what you appear to want to do.

-Doug

From westerman at purdue.edu Fri Nov 21 07:39:17 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Fri, 21 Nov 2014 10:39:17 -0500 (EST)
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <20141121145127.GC13461@purdue.edu>
Message-ID: <1530035043.86609.1416584357580.JavaMail.root@mailhub020.itcs.purdue.edu>

----- Original Message -----
> On Thu Nov 20 22:27:25 2014, Rick Westerman
> wrote:
> > Others: Yes, I know that setting PERL5OPT outside the program will
> > carry through. That isn't possible in my scenario -- executing Perl
> > programs via Apache (the only real reason to use taint in the first
> > place) unless we make all programs use taint. If someone has a
> > suggestion on how to run individual web programs using taint I am
> > all ears.
>
> Rick,
>
> You are mistaken. You can use a wrapper script (with or without
> apache) to set environment variables that will be passed to perl or
> other scripts.

Just exactly what I said ... or at least what I meant to say. I know
that setting PERL5OPT (or any env variable) outside my program and
having it passed to my program is possible. Using a script to run my
program is exactly that. Setting vars on the command line before
running my program is exactly that. Setting vars inside Apache is
exactly that. At least three different methods.

What is not possible (on RedHat/RCAC, though it works on MacOS) is
setting the variables on the shebang line. I am now convinced of that.
Irritating, but now to implement a work-around.

Doug: If you want to install a 'taintperl' program in 5.20's path I'll
use that. Simply taking the given program and executing it with
'perl -T' should be sufficient.
Or I can do the install if you prefer.

If I put '#!/usr/bin/env taintperl' on the shebang line then even if
the program is in the 'cgi-bin' it will be taint-protected. No need to
move it outside the directory. Not having to move it makes the
transition of other programs to 'taint' that much easier.

Thanks!

As for your comment about 'use taint;' similar to 'use strict' -- yeah,
I wonder about that myself. There is some sort of deep
internal-to-Perl-compiling timing issue here that I don't understand.

As for Taint-Runtime-0.03: it indeed hasn't been developed for a while
(since 2005). Does that mean it is obsolete? Or just so perfectly done
that it does not need improving? Or is it just overkill? Certainly the
comments in the documentation all address my concerns; i.e., migrating
one program at a time, not being able to use '-T' from the shebang
line.

Anyway implementing a 'taintperl' program seems more straight-forward.

-- Rick

>
> Let your perl script be /opt/scripts/job1
> Let your Apache CGI directory be /opt/cgi-bin
>
> Here is /opt/cgi-bin/taint: (not tested)
> ----------------------------------------------------------------------
> #!/bin/bash
> # wrapper script for running perl with taint mode enabled
>
> # set any arbitrary shell environment vars
> # or assume sane ones are inherited from apache config
> PATH=/usr/bin:/any/other/safe/paths
> PERL5OPT=-T
>
> # directory containing perl programs
> cgiroot=/opt/scripts
> script=$cgiroot/$1
>
> # run the script which is passed as an argument from apache
> if [[ -r $script ]]; then
>     perl -T $script
> else
>     echo "$0: script $script not found"
>     exit 1;
> fi
> ----------------------------------------------------------------------
>
> To invoke, use web address: http://my.server.com/cgi-bin/taint/job1
>
> In the taint script given above, the actual perl script is invoked
> with "perl -T" so the shell script didn't need to set the PERL5OPT
> variable, nor does the /opt/scripts/job1 perl script even need to
> have
its execute bit set or have a #! as the first line.
>
> The rationale for keeping the actual scripts outside the apache CGI
> directory is so someone cannot avoid the wrapper script and invoke the
> script directly from a web address (and bypass taint mode.)
> Nonetheless, you might as well put in a check in your perl script to
> stop if taint mode isn't enabled.
>
> Getting back to perl, I wonder why you can't just turn on taint mode
> with a "use taint;" directive along the lines of "use warnings;". I
> read that it is "too late" to enable it once the program starts, but
> don't understand why.
>
> That seems to be what this module provides:
> http://search.cpan.org/~sharyanto/tainting-0.01/
> But, it appears to be a proof of concept.
>
> Another, older, probably abandoned module along similar lines is:
> http://search.cpan.org/~rhandom/Taint-Runtime-0.03/
> Its documentation specifically mentions the use case of migrating lots
> of apache cgi scripts one at a time to using taint mode, which is
> exactly what you appear to want to do.
>
> -Doug

--
Rick Westerman
westerman at purdue.edu

Bioinformatics specialist at the Genomics Facility.
Phone: (765) 494-0505 FAX: (765) 496-7255
Department of Horticulture and Landscape Architecture
625 Agriculture Mall Drive
West Lafayette, IN 47907-2010
Physically located in room S049, WSLR building

From westerman at purdue.edu Fri Nov 21 07:42:53 2014
From: westerman at purdue.edu (Rick Westerman)
Date: Fri, 21 Nov 2014 10:42:53 -0500 (EST)
Subject: [Purdue-pm] Frameworks, was: Problem with she-bang and PERL5OPT
In-Reply-To: <546F4BB5.9010704@purdue.edu>
Message-ID: <1831293055.86626.1416584573261.JavaMail.root@mailhub020.itcs.purdue.edu>

We had a short informal discussion on frameworks at the last Mongers'
meeting. We can get anything we want installed on RCAC. The 'use
module' method works nicely.
However, basically, none of us have wanted (or had the time) to
overcome the initial and often steep learning curve of a framework in
order to save us (potentially) hassles in the future.

----- Original Message -----
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> While mod_perl is nice I think a lot of modern perl web devs have been
> moving away from it. I think mostly to decouple the apache/perl
> dependency (updating one requires doing something with the other).
>
> But, getting plack/psgi, nginx on RCAC might be difficult. Let alone
> Mojolicious, Dancer, Catalyst or the like.
>
> The down side of how far behind RHEL6 is. We haven't done much with
> RHEL7.
>
> joe
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlRvS6oACgkQb0mzA2gRTpk0VACfRs0WXtn5WagMkMcUN6a721E1
> MbUAmwXy3gIPxkxl+13Y1p0YIZVRRY8g
> =cvE/
> -----END PGP SIGNATURE-----
> _______________________________________________
> Purdue-pm mailing list
> Purdue-pm at pm.org
> http://mail.pm.org/mailman/listinfo/purdue-pm

--
Rick Westerman
westerman at purdue.edu

Bioinformatics specialist at the Genomics Facility.
Phone: (765) 494-0505 FAX: (765) 496-7255
Department of Horticulture and Landscape Architecture
625 Agriculture Mall Drive
West Lafayette, IN 47907-2010
Physically located in room S049, WSLR building

From mark at ecn.purdue.edu Fri Nov 21 08:27:05 2014
From: mark at ecn.purdue.edu (Mark Senn)
Date: Fri, 21 Nov 2014 11:27:05 -0500
Subject: [Purdue-pm] Problem with she-bang and PERL5OPT
In-Reply-To: <11182C02-ACE2-4BD4-B2BB-40D21EB32E41@purdue.edu>
References: <1092346367.84641.1416513178058.JavaMail.root@mailhub020.itcs.purdue.edu>
 <18297.1416535477@pier.ecn.purdue.edu>
 <6F6936D9-CE62-43E4-8DE5-03B00D39B977@purdue.edu>
 <27188.1416576251@pier.ecn.purdue.edu>
 <11182C02-ACE2-4BD4-B2BB-40D21EB32E41@purdue.edu>
Message-ID: <36661.1416587225@pier.ecn.purdue.edu>

Rick Westerman wrote on 2014-11-21 at 09:01
| Unfortunately mod_perl does not allow individual programs to run in
| 'taint' mode so it is not an answer to my question of how to run
| programs in non-taint mode. However I'll take your endorsement of
| mod_perl to be a vote in favor of running 'taint' globally. So far
| 1:for, 0:against.

My endorsement of mod_perl was not pro or anti 'taint'.

| Oh, I haven't mentioned how I run 'taint' in my web-based programs. I do
| so by specifying explicitly the perl path. I.e., no use of
| '/usr/bin/env perl'. But this means the program has to be changed to
| use newer versions of perl and is also vulnerable to its version of perl
| disappearing from the system. Something we recently ran into thus my
| recent questions. Dave, on the other hand, doesn't use 'taint' so he
| can use /usr/bin/env. Since 'taint' -- similar to 'strict' and
| 'warnings' and even unit testing -- is just a crutch to help proper
| coding there is not an absolute need for it.

A not-too-good solution: make /link/perl a hard link or symbolic link to
perl and run a cron job to make sure what /link/perl points to is still
there.
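
The cron check mentioned above might look roughly like the sketch
below. Only the path /link/perl comes from the message; the function
name check_perl_link and the wording of the warning are made up for
illustration, and nothing here has been run on RCAC.

```shell
#!/bin/bash
# Sketch only: verify that a perl link still resolves to something runnable.
# check_perl_link is a hypothetical helper name, not an existing tool.

# Return 0 if the given path resolves to a regular, executable file;
# otherwise print a warning on stderr and return 1.
# The -f and -x tests follow symbolic links, so a dangling symlink fails.
check_perl_link() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        return 0
    fi
    echo "perl link check: $1 is missing or not executable" >&2
    return 1
}

# Example call for a script run from cron:
#   check_perl_link /link/perl || exit 1
```

Cron normally mails any output a job produces, so a warning on stderr
is usually enough to get noticed. A hard link cannot dangle the way a
symbolic link can, so the check is mainly useful for the symlink case.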
I keep coming across web frameworks (Catalyst, Dancer, Mojolicious) and
PSGI in my reading. I've never used any of them---I do very little web
stuff---just a few static HTML pages.

-mark