From dan at linder.org Fri May 1 10:42:01 2009
From: dan at linder.org (Dan Linder)
Date: Fri, 1 May 2009 12:42:01 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To:
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
Message-ID: <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>

Hello!

Months ago I asked this list about people's experience with XML handling modules. One handy reply pointed to the Perl 5 Wiki's list of recommended XML modules:
http://www.perlfoundation.org/perl5/index.cgi?recommended_xml_modules
and
http://perl-xml.sourceforge.net/faq/
(Thanks Andy!)

Others suggested XML::Twig, which has my interest... (Thanks Jay!)

Unfortunately, we've recently run into some issues where our code was using the Expat Perl module, which calls out to a binary Expat.so. In our system, this is a no-no - we're trying to stay pure Perl and not use binary modules.

Anyway, I was poking around and I found "XML::TreePP" (PP == Pure Perl). It looks like it's fairly active (last update was January), and it will parse XML to data structures and write XML from hash/array structures.

Anyone else have any experience with this module? Thoughts? Is there another pure-Perl XML module I might have overlooked?

Thanks,

Dan

From jay at jays.net Fri May 1 11:52:47 2009
From: jay at jays.net (Jay Hannah)
Date: Fri, 1 May 2009 13:52:47 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To: <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>
Message-ID: <79834678-5834-47E5-9F9E-FDBABC52FAD7@jays.net>

On May 1, 2009, at 12:42 PM, Dan Linder wrote:
> Unfortunately, we've recently run into some issues where our code
> was using the Expat perl module which calls out to a binary
> Expat.so.
> In our system, this is a no-no - we're trying to stay
> pure Perl and not use binary modules.

Why do you have to avoid Expat? It won't bite you, I promise. We've had zero trouble with that dependency. :)

http://expat.sourceforge.net/

j

From dan at linder.org Fri May 1 13:07:32 2009
From: dan at linder.org (Dan Linder)
Date: Fri, 1 May 2009 15:07:32 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To: <79834678-5834-47E5-9F9E-FDBABC52FAD7@jays.net>
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>
    <79834678-5834-47E5-9F9E-FDBABC52FAD7@jays.net>
Message-ID: <3e2be50905011307x10f1f42ds2447a6924710e2a8@mail.gmail.com>

On Fri, May 1, 2009 at 13:52, Jay Hannah wrote:
> Why do you have to avoid Expat? It won't bite you, I promise. We've had
> zero trouble with that dependency. :)
> http://expat.sourceforge.net/

It's the simple fact that I (single developer) have to keep it updated on numerous binary platforms (Linux x86, Solaris SPARC, Solaris x86, HPUX PA-RISC, HPUX Itanium, OSF/1, AIX PowerPC, etc).

I'm not afraid of the recompile issue, but I'd rather use Perl to write it once and use it anywhere. (Apologies to the Java catch-phrase!)

Dan

************* *********** ******* ***** *** **
"Quis custodiet ipsos custodes?" (Who can watch the watchmen?) -- from the Satires of Juvenal
"I do not fear computers, I fear the lack of them." -- Isaac Asimov (Author)
** *** ***** ******* *********** *************

From jay at jays.net Fri May 1 13:28:11 2009
From: jay at jays.net (Jay Hannah)
Date: Fri, 1 May 2009 15:28:11 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To: <3e2be50905011307x10f1f42ds2447a6924710e2a8@mail.gmail.com>
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>
    <79834678-5834-47E5-9F9E-FDBABC52FAD7@jays.net>
    <3e2be50905011307x10f1f42ds2447a6924710e2a8@mail.gmail.com>
Message-ID: <62C9FE34-D299-4BFE-B52C-3EC9BA432494@jays.net>

On May 1, 2009, at 3:07 PM, Dan Linder wrote:
> It's the simple fact that I (single developer) have to keep it
> updated on numerous binary platforms (Linux x86, Solaris SPARC,
> Solaris x86, HPUX PA-RISC, HPUX Itanium, OSF/1, AIX PowerPC, etc).
>
> I'm not afraid of the recompile issue, but I'd rather use Perl to
> write it once and use it anywhere. (Apologies to the Java
> catch-phrase!)

Wow, that's a real bummer... I'm spoiled. We deploy to a few dozen SuSE Linux machines and we're done.

I wonder how much your company spends every year maintaining the overhead of your diversity?

(How many of those have you tried compiling Expat on? None, because you learned long ago not to play that game? Thank goodness Perl can compile on all those!! (Perl 5 Porters)++ woot!)

j
XML::Twig fanboi

From dan at linder.org Fri May 1 14:46:33 2009
From: dan at linder.org (Dan Linder)
Date: Fri, 1 May 2009 16:46:33 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To: <62C9FE34-D299-4BFE-B52C-3EC9BA432494@jays.net>
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50905011042h686a05a0i3cfa69573f13f81d@mail.gmail.com>
    <79834678-5834-47E5-9F9E-FDBABC52FAD7@jays.net>
    <3e2be50905011307x10f1f42ds2447a6924710e2a8@mail.gmail.com>
    <62C9FE34-D299-4BFE-B52C-3EC9BA432494@jays.net>
Message-ID: <3e2be50905011446l644695cg8274f581b1a2521c@mail.gmail.com>

On Fri, May 1, 2009 at 15:28, Jay Hannah wrote:
> I wonder how much your company spends every year maintaining the overhead
> of your diversity?

:-)

> (How many of those have you tried compiling Expat on?
> None, because you learned long ago not to play that game? Thank goodness
> Perl can compile on all those!! (Perl 5 Porters)++ woot!)

Big shout out indeed!

Dan

From JAY at JAYS.NET Sat May 2 08:50:58 2009
From: JAY at JAYS.NET (Jay Hannah)
Date: Sat, 2 May 2009 10:50:58 -0500
Subject: [Omaha.pm] KiokuDB
Message-ID: <6B8A8050-560D-448B-9999-59E386A2FB30@JAYS.NET>

Huh. This looks neat for rapid prototyping. I might be scared for large-volume data, but maybe I could give it a try for some smaller new service... Looks like the learning curve and code overhead are tiny!

http://www.iinteractive.com/kiokudb/
http://blog.woobling.org/

j

From travis at travisbsd.org Sat May 2 09:17:23 2009
From: travis at travisbsd.org (Travis McArthur)
Date: Sat, 02 May 2009 11:17:23 -0500
Subject: [Omaha.pm] KiokuDB
In-Reply-To: <6B8A8050-560D-448B-9999-59E386A2FB30@JAYS.NET>
References: <6B8A8050-560D-448B-9999-59E386A2FB30@JAYS.NET>
Message-ID: <49FC7213.3000901@travisbsd.org>

Jay Hannah wrote:
> Huh. This looks neat for rapid prototyping. I might be scared for
> large-volume data, but maybe I could give it a try for some smaller
> new service... Looks like the learning curve and code overhead is tiny!
>
> http://www.iinteractive.com/kiokudb/
> http://blog.woobling.org/
>
> j

Thanks for the link Jay, this looks really quite interesting. Kind of sounds like an RDBMS backend for FreezeThaw almost (the Collapser/Linker being Freeze/Thaw respectively).
I'll have to check this out some and see how it performs; I have a new in-house application I might try porting to use this for fun today.

And yeah, I guess my primary concern might be speed of access compared to a traditional RDBMS, since the indexing is done internally. But for the small applications that would most benefit from this, that's probably not an issue.

Best Regards,
Travis

From jay at jays.net Sun May 3 13:23:10 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 3 May 2009 15:23:10 -0500
Subject: [Omaha.pm] CPAN dist release graph ... wow!
Message-ID: <0DE3D82F-847F-4BFC-9218-90BDEF1E6798@jays.net>

http://is.gd/wolD

:)

j

From Twitter @merlyn:
RT @burakgursoy: here is a nice graph: http://is.gd/wolD [showing CPAN dists over time... wow, far better than I imagined.]

From jay at jays.net Sun May 3 14:52:14 2009
From: jay at jays.net (Jay Hannah)
Date: Sun, 3 May 2009 16:52:14 -0500
Subject: [Omaha.pm] Tweak the Perl regex engine: assign to pos()
Message-ID:

http://headrattle.blogspot.com/search/label/perl

OK, Perl is way too cool. I was minding my own business, searching for every occurrence of 'CCAGC' in E-coli, when I hit a snag. Several hundred of my known locations weren't showing up. Why?

Because the Perl regular expression engine, by default, starts searching for the next occurrence of something after the end of the occurrence it just found. This is what most humans want. But you may notice that in the string 'CCAGCCAGC' the thing I'm searching for ('CCAGC') overlaps itself, so the regex engine doesn't see the second one.

"Crap," I thought. But this is Perl -- maybe there's a way? 30 seconds in the documentation (perldoc -f pos) and it said I could assign to pos(). Really? Sweet! Problem solved!
#!/usr/bin/perl
use strict;
use warnings;

# Read the sequence (the first line of the file).
open(my $in, '<', 'E_coli.seq') or die "Can't open E_coli.seq: $!";
my $seq = <$in>;
chomp $seq;
close $in;

my $find_this = 'CCAGC';
while ($seq =~ /$find_this/g) {
   my $start = pos($seq) - length($find_this) + 1;   # 1-based start of this match
   my $stop  = pos($seq);                            # 1-based end of this match
   # Resume one character past where this match began, so that
   # overlapping occurrences are found too.
   pos($seq) = $start;
   print "  Found $find_this at [$start..$stop]\n";
}

From jhannah at omnihotels.com Tue May 5 11:06:55 2009
From: jhannah at omnihotels.com (Jay Hannah)
Date: Tue, 5 May 2009 13:06:55 -0500
Subject: [Omaha.pm] Installing Catalyst and all dependencies "in 30 seconds"
Message-ID: <396CEDAA86B38646ACE2FEAA22C3FBF101A887C6@l3exchange.omnihotels.net>

http://scsys.co.uk:8001/28116

# apt-get install libcatalyst-perl

Debian++ (Ubuntu++)

Why does $work use SuSE?

j

From dthacker9 at cox.net Tue May 5 13:37:59 2009
From: dthacker9 at cox.net (Dave Thacker)
Date: Tue, 5 May 2009 15:37:59 -0500
Subject: [Omaha.pm] Installing Catalyst and all dependencies "in 30 seconds"
In-Reply-To: <396CEDAA86B38646ACE2FEAA22C3FBF101A887C6@l3exchange.omnihotels.net>
References: <396CEDAA86B38646ACE2FEAA22C3FBF101A887C6@l3exchange.omnihotels.net>
Message-ID: <200905051537.59335.dthacker9@cox.net>

On Tuesday 05 May 2009 13:06:55 Jay Hannah wrote:
> http://scsys.co.uk:8001/28116
>
> # apt-get install libcatalyst-perl
>
> Debian++ (Ubuntu++)
>
> Why does $work use SuSE?
>
> j

AD integration. Did it happen yet?

Dave

From jay at jays.net Wed May 6 05:26:10 2009
From: jay at jays.net (Jay Hannah)
Date: Wed, 6 May 2009 07:26:10 -0500
Subject: [Omaha.pm] Installing Catalyst and all dependencies "in 30 seconds"
In-Reply-To: <200905051537.59335.dthacker9@cox.net>
References: <396CEDAA86B38646ACE2FEAA22C3FBF101A887C6@l3exchange.omnihotels.net>
    <200905051537.59335.dthacker9@cox.net>
Message-ID:

On May 5, 2009, at 3:37 PM, Dave Thacker wrote:
> On Tuesday 05 May 2009 13:06:55 Jay Hannah wrote:
>> Why does $work use SuSE?
>
> AD integration. Did it happen yet?
- We have some Apaches doing Kerberos to Active Directory (AD) for Intranet stuff.
- Sean built at least one box where ssh logins are controlled by AD and /home/username is created "automatically" when you log in the first time. Root still has to "adduser" first though (I think this is a feature).
- We haven't moved to the AD-enabled Subversion (Apache) yet.

Not sure what else was supposed to get integrated...

j

From jay at jays.net Wed May 6 05:58:40 2009
From: jay at jays.net (Jay Hannah)
Date: Wed, 6 May 2009 07:58:40 -0500
Subject: [Omaha.pm] CPAN smokers rule!
Message-ID:

Man, I'm glad I didn't have to hire my own army of QA testers and admins for this!

http://www.cpantesters.org/author/JHANNAH.html

j

From jay at jays.net Wed May 6 06:17:47 2009
From: jay at jays.net (Jay Hannah)
Date: Wed, 6 May 2009 08:17:47 -0500
Subject: [Omaha.pm] I've been syndicated!
In-Reply-To: <4875EA78-8A71-4B78-B5C0-7317BE57F686@jays.net>
References: <4875EA78-8A71-4B78-B5C0-7317BE57F686@jays.net>
Message-ID:

On Apr 27, 2009, at 5:10 PM, Jay Hannah wrote:
> In response to mst's obscenity laden blogging challenge
> (http://xrl.us/beptq9) I blogged and poof! I've been syndicated:
>
> http://ironman.enlightenedperl.org/

Wow. 88 perl blogs in the feed so far. You should perl blog too! :)

j

From TELarson at west.com Wed May 6 13:37:55 2009
From: TELarson at west.com (Larson, Timothy E.)
Date: Wed, 6 May 2009 15:37:55 -0500
Subject: [Omaha.pm] CPAN smokers rule!
In-Reply-To:
References:
Message-ID: <226316B3E1F749498E28ACA66321D5BA206D0EBC@oma00cexmbx03.corp.westworlds.com>

Interesting to look at the OS breakdown.

Tim
--
Tim Larson, AMT2 Unix Systems Administrator
InterCall, a division of West Corporation

Be always sure you are right, then go ahead.
- David Crockett

From samuel.tesla at gmail.com Mon May 11 14:29:12 2009
From: samuel.tesla at gmail.com (Samuel Tesla)
Date: Mon, 11 May 2009 16:29:12 -0500
Subject: [Omaha.pm] ODynUG Meeting: May 12
Message-ID: <4bd555f70905111429j2ba84ccs3d667719f3ce0082@mail.gmail.com>

It's that time of the month again. On the heels of Randal's exciting Seaside talk last month, Stephan Wessels is going to be talking more about the Squeak Smalltalk environment.

The meeting is from 7-9PM with social time for 30 minutes beforehand. As always, there will be pizza and soda.

Speaker: Stephan Wessels
Topic: Squeak Tour De Force
Description: Jaws dropped and developers fainted, and that was only the first 30 minutes of the last presentation that Stephan gave us! He's coming back yet again to give his mind-blowing presentation on Squeak. This time he has guaranteed more spills and chills. Squeak: it's not just for kids anymore!

http://odynug.kicks-ass.org

Meeting location is UNO's Peter Kiewit Institute (PKI) building
Room 269
1110 South 67th Street
Omaha, NE

From jay at jays.net Tue May 12 04:36:14 2009
From: jay at jays.net (Jay Hannah)
Date: Tue, 12 May 2009 06:36:14 -0500
Subject: [Omaha.pm] Perlbuzz: form data, Moose
Message-ID:

- Never trust web form data. http://bit.ly/EAhqF
- Top ten great things about Moose http://bit.ly/139SHg

Thanks, perlbuzz! http://twitter.com/perlbuzz

j

From jhannah at omnihotels.com Tue May 12 06:35:34 2009
From: jhannah at omnihotels.com (Jay Hannah)
Date: Tue, 12 May 2009 08:35:34 -0500
Subject: [Omaha.pm] The Programming Language with the Happiest Users
Message-ID: <396CEDAA86B38646ACE2FEAA22C3FBF101A887F1@l3exchange.omnihotels.net>

http://blog.doloreslabs.com/2009/05/the-programming-language-with-the-happiest-users/

:D

j
From rob.townley at gmail.com Wed May 13 10:23:21 2009
From: rob.townley at gmail.com (Rob Townley)
Date: Wed, 13 May 2009 12:23:21 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To:
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50811301326y5507fcc0t1467d1b406fbcd84@mail.gmail.com>
Message-ID: <7e84ed60905131023t5600a2d9s871d9bc74def9b6@mail.gmail.com>

On Sun, Nov 30, 2008 at 5:27 PM, Christopher Cashell wrote:
> On Sun, Nov 30, 2008 at 3:26 PM, Dan Linder wrote:
>> I was looking at it a bit because our XML files have the potential to get
>> quite large (>50GB dumps). On the other hand, the day-to-day files should
>> stay quite manageable (between 100K to 10M), so XML::Twig's ability to
>> process only a portion of an XML file might be overkill.
>
> Just a quick note of warning, it can be very surprising how much RAM
> is required for processing XML documents as they get large. Loading
> the entire document into memory has a way of ballooning really fast.
> We ran into some issues with that on a project at my previous
> employer.
>
> As noted in the Perl XML FAQ:
>
> "The memory requirements of a tree based parser can be surprisingly
> high. Because each node in the tree needs to keep track of links to
> ancestor, sibling and child nodes, the memory required to build a tree
> can easily reach 10-30 times the size of the source document. You
> probably don't need to worry about that though unless your documents
> are multi-megabytes (or you're running on lower spec hardware)."
>
> We had a couple of XML files that were under 10MB and they were
> causing memory usage of nearly 500MB in the initial version of the
> processing application.
>
>> Dan
>
> --
> Christopher

Tree vs. serial processing is one thing, but I'm just curious whether you have tried creating your own XML namespace to save memory?

From jay at jays.net Thu May 14 05:47:25 2009
From: jay at jays.net (Jay Hannah)
Date: Thu, 14 May 2009 07:47:25 -0500
Subject: [Omaha.pm] Suggested XML modules...
In-Reply-To: <7e84ed60905131023t5600a2d9s871d9bc74def9b6@mail.gmail.com>
References: <3e2be50811292231v6a25992i4264d0e26b2aea22@mail.gmail.com>
    <3e2be50811301326y5507fcc0t1467d1b406fbcd84@mail.gmail.com>
    <7e84ed60905131023t5600a2d9s871d9bc74def9b6@mail.gmail.com>
Message-ID: <5E9A21DA-35D4-4ED1-97B0-E28D70BFD289@jays.net>

On May 13, 2009, at 12:23 PM, Rob Townley wrote:
> tree vs serial processing is one thing, but just curious if you have
> tried creating your own xml namespace to save memory?

Eeek. Save memory? How?

I hate XML namespaces with a passion. How our @vendors use them here is

   <customer:FirstName>Jay</customer:FirstName>
   <pet:FirstName>Tipper</pet:FirstName>

Then the FirstName node is defined multiple times in the XML schemas/DTD/XSD/whatever, once for each of the "customer" and "pet" namespaces.

Conceptually, maybe this isn't the worst idea ever. In practice, tons of XML tools* suddenly can't cope with the cascading insanity at all, refuse to even parse the XML, core dump, etc. As a dev it becomes insanely frustrating to track down what's going on. You lock yourself permanently into a massively bloated and expensive XML tool like XML Spy. Writing your own serializers? It can still be done, but you have to jump through *so* many more hoops now.

* XML::Twig seems to do OK, but we don't ask it to validate.

Just say no. Ick.

Has anyone had a good experience with XML namespaces?
$0.02,

j

From jay at jays.net Mon May 18 05:43:22 2009
From: jay at jays.net (Jay Hannah)
Date: Mon, 18 May 2009 07:43:22 -0500
Subject: [Omaha.pm] Perl, Template Toolkit, and GraphViz
Message-ID:

Perl, Template Toolkit, and GraphViz
http://headrattle.blogspot.com/search/label/perl

j
http://ironman.enlightenedperl.org/

From jhannah at omnihotels.com Tue May 19 11:39:55 2009
From: jhannah at omnihotels.com (Jay Hannah)
Date: Tue, 19 May 2009 13:39:55 -0500
Subject: [Omaha.pm] perl, git, github
Message-ID: <396CEDAA86B38646ACE2FEAA22C3FBF101A88804@l3exchange.omnihotels.net>

Here's a really good document about git and github, with all the perl blead specifics thrown in for free. :)

http://dev.perl.org/perl5/docs/perlrepository.html

j

From jay at jays.net Wed May 20 06:53:27 2009
From: jay at jays.net (Jay Hannah)
Date: Wed, 20 May 2009 08:53:27 -0500
Subject: [Omaha.pm] YAPC::NA (YAPC|10)
Message-ID: <18FD46AB-7723-4C42-A278-9E626341D82F@jays.net>

Woot! Omaha.pm made the list again:

http://yapc10.org/yn2009/stats

Should be another fun YAPC. :)

j

From jay at jays.net Fri May 22 07:54:15 2009
From: jay at jays.net (Jay Hannah)
Date: Fri, 22 May 2009 09:54:15 -0500
Subject: [Omaha.pm] Fwd: Adding Perl support for Google App Engine
References: <4A16A699.9000504@goebel.ws>
Message-ID: <08A828F0-679C-4AA5-AAA3-F784148C1A6B@jays.net>

Ooo, that'd be neat.

j

Begin forwarded message:
> From: Garrett Goebel
> Date: May 22, 2009 8:20:25 AM CDT
> To: kc at pm.org
>
> Votes for supporting Perl are in 3rd place behind PHP and Ruby in
> Google App Engine's "Open List" of issues. Perl is less than 300
> votes behind Ruby. If you'd like to help change that... vote and
> spread the word...
>
> See: http://ergoletterbag.blogspot.com/2009/05/adding-perl-support-to-google-app.html

From jay at jays.net Thu May 28 05:40:42 2009
From: jay at jays.net (Jay Hannah)
Date: Thu, 28 May 2009 07:40:42 -0500
Subject: [Omaha.pm] Adding Perl support for Google App Engine
In-Reply-To: <08A828F0-679C-4AA5-AAA3-F784148C1A6B@jays.net>
References: <4A16A699.9000504@goebel.ws>
    <08A828F0-679C-4AA5-AAA3-F784148C1A6B@jays.net>
Message-ID: <3B46ACA5-C1AB-4F54-8E98-414830BB9E19@jays.net>

Perl passed Ruby. Way to click that star, Perl army! :)

http://code.google.com/p/googleappengine/issues/list

j

On May 22, 2009, at 9:54 AM, Jay Hannah wrote:
> Ooo, that'd be neat.
>
> j
>
> Begin forwarded message:
>> From: Garrett Goebel
>> Date: May 22, 2009 8:20:25 AM CDT
>> To: kc at pm.org
>>
>> Votes for supporting Perl are in 3rd place behind PHP and Ruby in
>> Google App Engine's "Open List" of issues. Perl is less than 300
>> votes behind Ruby. If you'd like to help change that... vote and
>> spread the word...
>>
>> See: http://ergoletterbag.blogspot.com/2009/05/adding-perl-support-to-google-app.html

From drazak at ingenii.com Fri May 29 13:40:57 2009
From: drazak at ingenii.com (Andrew Embury)
Date: Fri, 29 May 2009 15:40:57 -0500
Subject: [Omaha.pm] Parsing a Text File
Message-ID: <5261016f0905291340q170d6829xb049c9d7ec823848@mail.gmail.com>

I have an array with multiple text patterns as its members. I also have a large text file whose lines may or may not contain the patterns stored in the array.

I'm having problems figuring out an elegant way to iterate over the file and do a regex match based on the values stored in the array.

Can anyone point me in the right direction on a good way to regex match a file and only output the lines that contain one of the patterns in the array? The text file is large, so iterating over the file multiple times is not a good solution for me.
Thanks,
Drew

From stpierre at NebrWesleyan.edu Fri May 29 14:02:18 2009
From: stpierre at NebrWesleyan.edu (Chris St. Pierre)
Date: Fri, 29 May 2009 16:02:18 -0500 (CDT)
Subject: [Omaha.pm] Parsing a Text File
In-Reply-To: <5261016f0905291340q170d6829xb049c9d7ec823848@mail.gmail.com>
References: <5261016f0905291340q170d6829xb049c9d7ec823848@mail.gmail.com>
Message-ID:

On Fri, 29 May 2009, Andrew Embury wrote:
> Can anyone point me in the right direction on a good way to regex match a
> file and only output the lines that contain one of the patterns in the
> array? The text file is large, so iterating the file multiple times is not a
> good solution for me.

If I understand your question correctly, you might try using Regexp::Assemble. I.e., something like:

use Regexp::Assemble;

# Fold every pattern into a single regex so the file is read only once.
my $ra = Regexp::Assemble->new();
foreach (@regexps) {
    $ra->add($_);
}
my $re = $ra->re;

open(my $FH, '<', "/path/to/file")
    or die "Couldn't read from /path/to/file: $!";
while (<$FH>) {
    /$re/ and print;
}
close($FH);

Untested, YMMV, use only as directed, etc. :)

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University

From jay at jays.net Fri May 29 14:56:40 2009
From: jay at jays.net (Jay Hannah)
Date: Fri, 29 May 2009 16:56:40 -0500
Subject: [Omaha.pm] Parsing a Text File
In-Reply-To:
References: <5261016f0905291340q170d6829xb049c9d7ec823848@mail.gmail.com>
Message-ID: <4A205A18.8020703@jays.net>

Chris St. Pierre wrote:
> my $ra = Regexp::Assemble->new();
> foreach (@regexps) {
>     $ra->add($_);
> }
> my $re = $ra->re;

Ooo! That's slick!

I was going to suggest using qr//o to compile each regex once and only once, then a foreach loop.

perldoc perlop
   qr/STRING/imosx

$0.02,

j

----------------------------------
my @res = (
    qr/AAA/,
    qr/BBB/,
    qr/CCC/
);
while (<DATA>) {
    foreach my $re (@res) {
        # Print the line on the first pattern that matches, then move on.
        ($_ =~ $re) && print && last;
    }
}
__DATA__
blah
blAAAh
blCCCh
blah
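
The two suggestions in this thread can also be approximated in core Perl, without installing Regexp::Assemble. What follows is only a minimal sketch, assuming the array holds literal text (not regexes): the pattern list, the sample lines, and the helper name matching_lines are made up for illustration and appear nowhere in the thread. It joins the escaped patterns into one alternation and compiles it once with qr//, so each line is tested a single time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical patterns, treated as literal text rather than regexes.
my @patterns = ('AAA', 'B.B', 'CCC');

# quotemeta escapes regex metacharacters, so 'B.B' matches only a literal dot.
my $alternation = join '|', map { quotemeta($_) } @patterns;

# Compile the combined pattern once, up front.
my $re = qr/$alternation/;

# Return only the lines that contain at least one of the patterns.
sub matching_lines {
    my ($regex, @lines) = @_;
    return grep { $_ =~ $regex } @lines;
}

# Made-up sample input standing in for the lines of the large file.
my @input = ("blah\n", "blAAAh\n", "blBxBh\n", "blB.Bh\n", "blCCCh\n");
print matching_lines($re, @input);
```

For a genuinely large file the same test would go inside a while (<$fh>) loop, as in Chris's reply, so that only one line is held in memory at a time. Unlike Regexp::Assemble, a plain join does not optimize the alternation, which may matter when the pattern list is long.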