From gwadej at anomaly.org Tue Mar 7 16:47:00 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Tue, 7 Mar 2006 18:47:00 -0600
Subject: [pm-h] The March Meeting
Message-ID: <20060307184700.1adee33a@sovvan>

Here's your monthly reminder of the Houston.pm meeting. We'll be meeting at
the regular location, the HAL-PC headquarters at 7pm next Tuesday night.

The topic will once again be the USB/Delcom VSI project. It's possible that
Paul and I will have the functionality mostly finished by that point. We'll
be able to go over what has been done and get feedback from the group on what
may need to be changed.

As usual, we'll have our normal discussions and Q&A that are off-topic.

We should probably begin looking for a topic for next month.

- Do we want to run another project?
  a. If so, does anyone have any ideas on what they would like to see?
  b. Does anyone want to take the lead on a project?
- Do we want to have another lecture-type meeting next month?
  a. If so, what topic?
  b. Anyone have a topic they would like to cover?
- Would we like to have a general help session?
  a. The idea would be to bring in code that you are having problems with
     and get help from the whole group.
  b. Any volunteers to bring in code or projects?
  c. Any volunteer "experts" in particular areas of Perl code?

G. Wade
--
A 'language' is a dialect with an army.

From gwadej at anomaly.org Sun Mar 12 20:51:56 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Sun, 12 Mar 2006 22:51:56 -0600
Subject: [pm-h] Projects page added to the website
Message-ID: <20060312225156.51990d2b@sovvan>

I have added a new section to the Houston.pm website that we can use to track
projects for the group. This page has information about one active project,
the Delcom VSI/LibUSB project, and two suggested projects, Sudoku solver and
Bayou-Cam.

If you would like to work on one of the projects, feel free to contact the
project lead, or me if the lead is not listed or not available.
If you have a project you would like to list on the site in hopes of getting other members to help or to have a place to put docs and code, email me and I'll help you get set up. Feel free to comment on the new area to help me improve the look, the content, or the number of projects. G. Wade -- Bugs thrive on poor housekeeping and inadequate hygine. Where one is tolerated, many are found. -- Rick Hoselton From gwadej at anomaly.org Tue Mar 14 05:10:51 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Tue, 14 Mar 2006 07:10:51 -0600 Subject: [pm-h] Meeting Reminder Message-ID: <20060314071051.68e55546@sovvan> The March meeting is tonight, Tuesday 3/14. Once again we will be covering some of the progress made on the Delcom VSI/LibUSB project. I would also like to get some feedback on potential other projects that the group (or a portion of the group) would like to take on next. Although it may look that way, we have not abandoned the presentation-style meetings. If someone would like to see (or present) a topic at an upcoming meeting, feel free to email the group or me personally and we'll get you on the schedule. Hope to see you all at the meeting tonight. G. Wade -- Virtual is when it's not but it looks like it is and transparent is when it is but it looks like it isn't. -- Rick Hoselton From dillon1 at houston.oilfield.slb.com Tue Mar 14 15:35:08 2006 From: dillon1 at houston.oilfield.slb.com (Bill Dillon) Date: Tue, 14 Mar 2006 17:35:08 -0600 Subject: [pm-h] Meeting Reminder References: <20060314071051.68e55546@sovvan> Message-ID: <4417532C.50807@houston.oilfield.slb.com> Wade, I have a meeting conflict tonight, and won't be able to make it (same with Jay, I think). If we do spend time on projects, I think it would be good to also mix in some general instruction or talks as well. Have we talked about building GUIs in Perl, using Tk? Might be interesting. Even having something as light as top-ten Perl web-sites might be of interest. 
How about demos of favorite code editors? Regards, --Bill G. Wade Johnson wrote: > The March meeting is tonight, Tuesday 3/14. > > Once again we will be covering some of the progress made on the Delcom > VSI/LibUSB project. > > I would also like to get some feedback on potential other projects that the > group (or a portion of the group) would like to take on next. > > Although it may look that way, we have not abandoned the presentation-style > meetings. If someone would like to see (or present) a topic at an upcoming > meeting, feel free to email the group or me personally and we'll get you on > the schedule. > > Hope to see you all at the meeting tonight. > > G. Wade From gwadej at anomaly.org Tue Mar 14 16:16:29 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Tue, 14 Mar 2006 18:16:29 -0600 Subject: [pm-h] Meeting Reminder In-Reply-To: <4417532C.50807@houston.oilfield.slb.com> References: <20060314071051.68e55546@sovvan> <4417532C.50807@houston.oilfield.slb.com> Message-ID: <20060314181629.47b94501@sovvan> On Tue, 14 Mar 2006 17:35:08 -0600 Bill Dillon wrote: > Wade, > > I have a meeting conflict tonight, and won't be able to make it (same > with Jay, I think). > > If we do spend time on projects, I think it would be good to also mix in > some general instruction or talks as well. That's one vote for a mix. Any others? > Have we talked about building GUIs in Perl, using Tk? Might be interesting. We had a talk in September, 2004 on the basics of GUI development with Perl/Tk. We had been talking about a more advanced talk comparing different GUI toolkits, but that one has been stalled for a few months. > Even having something as light as top-ten Perl web-sites might be of > interest. How about demos of favorite code editors? More good ideas. Thanks, Bill. How about the rest of the group, any opinions for topics between projects or instead of projects? G. Wade -- A 'language' is a dialect with an army. 
From gwadej at anomaly.org Wed Mar 15 19:41:54 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Wed, 15 Mar 2006 21:41:54 -0600 Subject: [pm-h] Last night's meeting Message-ID: <20060315214154.2247b2b9@sovvan> I believe we have completed the presentations on the Delcom VSI and LibUSB project for the time being. There will still be development work continuing, but I think everyone would agree that we've pushed this one a little further than necessary. For next month, I'm thinking about something very different. In the months before this project, we have had a number of sessions on particular modules and a few on intermediate to advanced Perl topics. Looking back over the presentations, it seems that we haven't had a more basic talk in a while. Unless someone has a better idea, I'm going to suggest a talk entitled: "What every Perl programmer should know" This will be more of a beginning to intermediate talk with (hopefully) some audience participation to keep things lively. In order to do a good job with this, I'd like to ask all of you to post your recommendations on things every Perl programmer should know. Let's leave this very open-ended at the moment. We can narrow the topic later if we need to. Any opinions? G. Wade -- Perl isn't really about safety. It's about getting where you're going, and enjoying the trip. It's more important to be a good driver than to have seven feet of sponge rubber all around your car. -- Larry Wall From mikeflan at earthlink.net Thu Mar 16 19:04:11 2006 From: mikeflan at earthlink.net (Mike Flannigan) Date: Thu, 16 Mar 2006 21:04:11 -0600 Subject: [pm-h] Last night's meeting References: <20060315214154.2247b2b9@sovvan> Message-ID: <441A272B.997D19B2@earthlink.net> That sounds like a great idea. Sounds like what I was thinking about long ago. Stuff like: die "something is wrong at line 12.\n" unless ($this =~ s/ght/jj/g); Just simple rules like that. Mike "G. 
Wade Johnson" wrote:

snip
>
> For next month, I'm thinking about something very different. In the months
> before this project, we have had a number of sessions on particular modules
> and a few on intermediate to advanced Perl topics. Looking back over the
> presentations, it seems that we haven't had a more basic talk in a while.
>
> Unless someone has a better idea, I'm going to suggest a talk entitled:
>
> "What every Perl programmer should know"
>
> This will be more of a beginning to intermediate talk with (hopefully) some
> audience participation to keep things lively.
>
> In order to do a good job with this, I'd like to ask all of you to post your
> recommendations on things every Perl programmer should know. Let's leave this
> very open-ended at the moment. We can narrow the topic later if we need to.
>
> Any opinions?
>

From gwadej at anomaly.org Mon Mar 20 17:32:37 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Mon, 20 Mar 2006 19:32:37 -0600
Subject: [pm-h] Sudoku Solver
Message-ID: <20060320193237.73879a98@sovvan>

There is an article in the February issue of Dr. Dobb's Journal showing some
techniques for solving Sudoku puzzles in C++. Most of the article is pretty
theoretical, but they do supply source for download.

G. Wade
--
Beware of bugs in the above code; I have only proved it correct, not tried it.
-- Donald Knuth

From gwadej at anomaly.org Mon Mar 20 17:35:38 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Mon, 20 Mar 2006 19:35:38 -0600
Subject: [pm-h] Next month's topic
Message-ID: <20060320193538.6d3a9c09@sovvan>

Does anyone else have opinions on tips for "things every Perl programmer
should know"?

Failing that, does anyone have a presentation they would like to give?

Failing that, does anyone have suggestions for another topic?

(Failing that, we're limited to what I can come up with.)

G. Wade
--
I like you. You're trouble.
-- Draal - "Voices of Authority"

From mikeflan at earthlink.net Tue Mar 21 05:34:00 2006
From: mikeflan at earthlink.net (Mike Flannigan)
Date: Tue, 21 Mar 2006 07:34:00 -0600
Subject: [pm-h] Next month's topic
References: <20060320193538.6d3a9c09@sovvan>
Message-ID: <442000C8.66A69927@earthlink.net>

I passed this on long ago. This regular expression method is
valuable for reading many types of files:

next unless ($_ =~ /^[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\tAR\t/i);

Mike

"G. Wade Johnson" wrote:

> Does anyone else have opinions on tips for "things every Perl programmer
> should know"?
>
> Failing that, does anyone have a presentation they would like to give?
>
> Failing that, does anyone have suggestions for another topic?
>
> (Failing that, we're limited to what I can come up with.)
> G. Wade
> --
> I like you. You're trouble. -- Draal - "Voices of Authority"
> _______________________________________________
> Houston mailing list
> Houston at pm.org
> http://mail.pm.org/mailman/listinfo/houston

From gwadej at anomaly.org Tue Mar 21 18:17:03 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Tue, 21 Mar 2006 20:17:03 -0600
Subject: [pm-h] Next month's topic
In-Reply-To: <442000C8.66A69927@earthlink.net>
References: <20060320193538.6d3a9c09@sovvan> <442000C8.66A69927@earthlink.net>
Message-ID: <20060321201703.60d4aa78@sovvan>

On Tue, 21 Mar 2006 07:34:00 -0600
Mike Flannigan wrote:

>
> I passed this on long ago. This regular expression method is
> valuable for reading many types of files:
>
> next unless ($_ =~
> /^[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\tAR\t/i);

Oh, the "next unless line_test" technique. One of my favorite variations of
this is:

next unless /\S/;

for skipping blank lines.

That's a really good technique to remind people about.

G. Wade
--
Virtual is when it's not but it looks like it is and transparent is when it is
but it looks like it isn't.
-- Rick Hoselton

From gwadej at anomaly.org Tue Mar 21 19:30:19 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Tue, 21 Mar 2006 21:30:19 -0600
Subject: [pm-h] March meeting
Message-ID: <20060321213019.57728aad@sovvan>

The notes on the March meeting are up. In the next few days, I hope to post a
usable version of the Device::USB::LibUSB module on the project page, along
with some of the test scripts I used for testing things.

G. Wade
--
I have this feeling, that my luck is none too good.
-- "Black Blade", Blue Oyster Cult

From gwadej at anomaly.org Sat Mar 25 19:31:50 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Sat, 25 Mar 2006 21:31:50 -0600
Subject: [pm-h] Delcom VSI project page updated
Message-ID: <20060325213150.1f583268@sovvan>

I've updated the notes on the Delcom VSI/LibUSB project page to reflect the
March meeting. I've also added a mostly-functional copy of the
Device::USB::LibUSB module as a tarball ready for installation.

This module can be installed with the normal

    perl Makefile.PL
    make
    make install

routine. It will require access to a C compiler, but other than that it
should be straightforward.

If you find any problems, feel free to write to me directly or post on the
list.

G. Wade
--
There are trivial truths and there are great Truths. The opposite of a
trivial truth is obviously false. The opposite of a great Truth is also true.
-- Niels Bohr

From gwadej at anomaly.org Sun Mar 26 14:15:10 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Sun, 26 Mar 2006 16:15:10 -0600
Subject: [pm-h] Sudoku Solver in Perl
Message-ID: <20060326161510.049d4218@sovvan>

http://kw.pm.org/talks/2006-01-sudoku/slide1a.html appears to be an
interesting discussion of solving Sudoku puzzles in Perl. The author appears
not to have completed the solver itself and has explored tangents on
generating and grading puzzles. There is also information on terminology.

G. Wade
--
To vacillate or not to vacillate, that is the question ... or is it?
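
The "next unless line_test" technique from the "Next month's topic" thread
could be sketched roughly as follows. The sample lines, and the idea that the
eighth tab-separated column holds a code such as 'AR', are assumptions made up
for illustration. The pattern is a compact equivalent of the seven-field regex
Mike posted, not a different test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical tab-separated input lines; the sample data and the meaning
# of the eighth column are assumptions for illustration only.
my @lines = (
    "a\tb\tc\td\te\tf\tg\tAR\tkeep me\n",
    "\n",
    "   \n",
    "a\tb\tc\td\te\tf\tg\tTX\tskip me\n",
    "1\t2\t3\t4\t5\t6\t7\tar\talso kept\n",
);

my @kept;
for my $line (@lines) {
    next unless $line =~ /\S/;                      # skip blank lines
    next unless $line =~ /^(?:[^\t]*\t){7}AR\t/i;   # eighth field must be AR
    chomp(my $copy = $line);                        # leave @lines untouched
    push @kept, $copy;
}

print "$_\n" for @kept;
```

Each guard rejects a line as early as possible, so the code after the guards
only ever sees lines that passed every test.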
From tigger at io.com Mon Mar 27 15:13:02 2006
From: tigger at io.com (Paul Archer)
Date: Mon, 27 Mar 2006 17:13:02 -0600 (CST)
Subject: [pm-h] complex data structure help
Message-ID: <20060327170412.B21516@eris.io.com>

I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog
files. They're in the same format as wu-ftpd xferlog files. I'd use an
existing solution, but I can't find anything that keeps track of reads vs
writes, which is critical for us.
Anyway, I need to be able to sort by filesystem, client machine, user, time
(with a one-hour base period), read, write, or total usage.
Can anyone suggest a data structure (or pointers to same) that will allow me
to pull data out in an arbitrary fashion (ie users on X day sorted by data
written)?
Once I have the structure, I can deal with doing the reports, but I want to
make sure I don't shoot myself in the foot with the structure.

I was thinking of a hash of hashes, where the keys are filesystems pointing
to hashes where the keys are client machines, etc, etc. But it seems that
approach would be inefficient for lookups based on times or users (for
example).

Any help would be greatly appreciated.

Paul

From buu at erxz.com Mon Mar 27 15:41:39 2006
From: buu at erxz.com (buu@erxz.com)
Date: Mon, 27 Mar 2006 17:41:39 -0600
Subject: [pm-h] complex data structure help
In-Reply-To: <20060327170412.B21516@eris.io.com>
References: <20060327170412.B21516@eris.io.com>
Message-ID: <20060327234139.GC1951@erxz.com>

On Mon, Mar 27, 2006 at 05:13:02PM -0600, Paul Archer wrote:
> I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog
> files. They're in the same format as wu-ftpd xferlog files. I'd use an
> existing solution, but I can't find anything that keeps track of reads vs
> writes, which is critical for us.
> Anyway, I need to be able to sort by filesystem, client machine, user, time
> (with a one-hour base period) read, write, or total usage.
> Can anyone suggest a data structure (or pointers to same) that will allow me
> to pull data out in an arbitrary fashion (ie users on X day sorted by data
> written)?
> Once I have the structure, I can deal with doing the reports, but I want to
> make sure I don't shoot myself in the foot with the structure.
>
> I was thinking of a hash of hashes, where the keys are filesystems pointing
> to hashes where the keys are client machines, etc, etc. But it seems that
> approach would be inefficient for lookups based on times or users (for
> example).
>
> Any help would be greatly appreciated.
>
> Paul
> _______________________________________________
> Houston mailing list
> Houston at pm.org
> http://mail.pm.org/mailman/listinfo/houston

Um. Have you considered a relational database? Sounds ideal for your
problem..

From gwadej at anomaly.org Mon Mar 27 17:16:05 2006
From: gwadej at anomaly.org (G. Wade Johnson)
Date: Mon, 27 Mar 2006 19:16:05 -0600
Subject: [pm-h] complex data structure help
In-Reply-To: <20060327234139.GC1951@erxz.com>
References: <20060327170412.B21516@eris.io.com> <20060327234139.GC1951@erxz.com>
Message-ID: <20060327191605.7e0a2f44@sovvan>

I'm not familiar with that log format, but taking the database suggestion
one step in an odd direction, DBD::CSV might be able to put a relational
database front-end on a log file.

In the past, my normal approach for this sort of thing was an array of hashes
(one hash per line). The array is easily sorted using 'sort' and can be
filtered using 'grep'.

Depending on how big the data set is and how complicated the query, a database
might be a better choice.

G. Wade

On Mon, 27 Mar 2006 17:41:39 -0600
buu at erxz.com wrote:

> On Mon, Mar 27, 2006 at 05:13:02PM -0600, Paul Archer wrote:
> > I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog
> > files. They're in the same format as wu-ftpd xferlog files. I'd use an
> > existing solution, but I can't find anything that keeps track of reads vs
> > writes, which is critical for us.
> > Anyway, I need to be able to sort by filesystem, client machine, user,
> > time (with a one-hour base period) read, write, or total usage.
> > Can anyone suggest a data structure (or pointers to same) that will allow
> > me to pull data out in an arbitrary fashion (ie users on X day sorted by
> > data written)?
> > Once I have the structure, I can deal with doing the reports, but I want > > to make sure I don't shoot myself in the foot with the structure. > > > > I was thinking of a hash of hashes, where the keys are filesystems > > pointing to hashes where the keys are client machines, etc, etc. But it > > seems that approach would be inefficent for lookups based on times or > > users (for example). > > > > Any help would be greatly appreciated. > > > > Paul > > _______________________________________________ > > Houston mailing list > > Houston at pm.org > > http://mail.pm.org/mailman/listinfo/houston > > Um. Have you considered a relational database? Sounds ideal for your > problem.. > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston -- No, no, you're not thinking, you're just being logical. -- Neils Bohr From gwadej at anomaly.org Mon Mar 27 20:22:20 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Mon, 27 Mar 2006 22:22:20 -0600 Subject: [pm-h] Device::USB Message-ID: <20060327222220.12947752@sovvan> The author of Device::USB has agreed to let us take over that module name for the new implementation. I'm currently converting the code to the new name and extending the tests to make a reasonable module. When I have something ready, I'll update the project page. G. Wade -- I have this feeling, that my luck is none too good. -- "Black Blade", Blue Oyster Cult From kevin at shaum.com Tue Mar 28 01:59:32 2006 From: kevin at shaum.com (Kevin Shaum) Date: Tue, 28 Mar 2006 03:59:32 -0600 Subject: [pm-h] [SPAM] [PBML] complex data structure help In-Reply-To: <20060327170412.B21516@eris.io.com> References: <20060327170412.B21516@eris.io.com> Message-ID: <200603280359.32640.kevin@shaum.com> On Monday 27 March 2006 5:13 pm, Paul Archer wrote: > I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog > files. They're in the same format as wu-ftpd xferlog files. 
I'd use an
> existing solution, but I can't find anything that keeps track of reads vs
> writes, which is critical for us.
> Anyway, I need to be able to sort by filesystem, client machine, user, time
> (with a one-hour base period) read, write, or total usage.
> Can anyone suggest a data structure (or pointers to same) that will allow
> me to pull data out in an arbitrary fashion (ie users on X day sorted by
> data written)?
> Once I have the structure, I can deal with doing the reports, but I want to
> make sure I don't shoot myself in the foot with the structure.
>
> I was thinking of a hash of hashes, where the keys are filesystems pointing
> to hashes where the keys are client machines, etc, etc. But it seems that
> approach would be inefficient for lookups based on times or users (for
> example).

The simplest thing to do would be to store it all as a simple list of
(references to) lists, then 'grep' and 'sort' the big list as the query
requires.

    # Each entry: [ hostname, username, time ]
    # A sort block must return -1, 0 or 1, so compare with 'cmp', not 'lt'.
    @result = sort { $a->[1] cmp $b->[1] }
              grep { $_->[2] >= $time0 and $_->[2] <= $time1 }
              grep { $_->[0] eq 'myhost' }
              @dataset;

A more readable (but possibly less efficient) version would store each entry
in the big list as (a reference to) a hash:

    # Same query, with named fields instead of positions.
    @result = sort { $a->{username} cmp $b->{username} }
              grep { $_->{time} >= $time0 and $_->{time} < $time1 }
              grep { $_->{hostname} eq 'myhost' }
              @dataset;

If the data set is large enough that that's not practical, then the suggestion
to go to a relational database (e.g., SQLite) makes sense. But it sounds like
you're thinking of keeping it all in RAM anyway.

Hope this helps.

Kevin

From tigger at io.com Tue Mar 28 07:32:40 2006
From: tigger at io.com (Paul Archer)
Date: Tue, 28 Mar 2006 09:32:40 -0600 (CST)
Subject: [pm-h] [SPAM] [PBML] complex data structure help
In-Reply-To: <200603280359.32640.kevin@shaum.com>
References: <20060327170412.B21516@eris.io.com> <200603280359.32640.kevin@shaum.com>
Message-ID: <20060328093019.Q21516@eris.io.com>

Thanks for the suggestions.
I'm torn between using Perl structures (easier in the short term) and a database (harder in the short term, but better for long-term storage). Since we're planning on being able to store anywhere from months to years worth of data, a database is probably my best bet. Now I just gotta put on my (tiny, ill-fitting) DBA hat, and scratch out a schema. 8-) Paul 3:59am, Kevin Shaum wrote: > On Monday 27 March 2006 5:13 pm, Paul Archer wrote: >> I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog >> files. They're in the same format as wu-ftpd xferlog files. I'd use an >> existing solution, but I can't find anything that keeps track of reads vs >> writes, which is critical for us. >> Anyway, I need to be able to sort by filesystem, client machine, user, time >> (with a one-hour base period) read, write, or total usage. >> Can anyone suggest a data structure (or pointers to same) that will allow >> me to pull data out in an arbitrary fashion (ie users on X day sorted by >> data written)? >> Once I have the structure, I can deal with doing the reports, but I want to >> make sure I don't shoot myself in the foot with the structure. >> >> I was thinking of a hash of hashes, where the keys are filesystems pointing >> to hashes where the keys are client machines, etc, etc. But it seems that >> approach would be inefficent for lookups based on times or users (for >> example). > > The simplest thing to do would be to store it all as a simple list of > (references to) lists, then 'grep' and 'sort' the big list as the query > requires. 
>
> @result = sort { $a->[1] cmp $b->[1] }
> grep { $_->[2] >= $time0 and $_->[2] <= $time1 }
> grep { $_->[0] eq 'myhost' }
> @dataset;
>
> A more readable (but possibly less efficient) version would store each entry
> in the big list as (a reference to) a hash:
>
> @result = sort { $a->{username} cmp $b->{username} }
> grep { $_->{time} >= $time0 and $_->{time} < $time1 }
> grep { $_->{hostname} eq 'myhost' }
> @dataset;
>
> If the data set is large enough that that's not practical, then the suggestion
> to go to a relational database (e.g., SQLite) makes sense. But it sounds like
> you're thinking of keeping it all in RAM anyway.
>
> Hope this helps.
>
> Kevin
> _______________________________________________
> Houston mailing list
> Houston at pm.org
> http://mail.pm.org/mailman/listinfo/houston
>

-----------------------------------------------
"Working with babies had its problems... but then I tried working with
chickens." Jim Henson, talking about making "Labyrinth"
-----------------------------------------------

From tigger at io.com Tue Mar 28 09:31:25 2006
From: tigger at io.com (Paul Archer)
Date: Tue, 28 Mar 2006 11:31:25 -0600 (CST)
Subject: [pm-h] complex data structure help
In-Reply-To: <20060327191605.7e0a2f44@sovvan>
References: <20060327170412.B21516@eris.io.com> <20060327234139.GC1951@erxz.com> <20060327191605.7e0a2f44@sovvan>
Message-ID: <20060328112133.B21516@eris.io.com>

I guess the problem I'm having is that I need to consolidate information.
Since this is an NFS log, each line represents a file read or written. That's
too much information (hundreds of MBs a day). I need to be able to distill it
to just summary information. I'm just not sure how to handle that.

I figure that the smallest unit I'd have is what one user on one machine read
or wrote on one filesystem during an hour.
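
A roll-up to that smallest unit (one user, one machine, one filesystem, one
hour) might be sketched in Perl along these lines. This is only an
illustration: the record fields (fs, user, client, time, op, bytes), the
sample values, and the add_record helper are all hypothetical, and parsing the
real nfslog lines is left out. Keying a hash on the four identifying fields
makes each log line a single hash lookup instead of a search for a matching
entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, already-parsed log entries. The field names, the sample
# values, and the epoch-seconds 'time' field are assumptions for
# illustration; they are not the real nfslog/xferlog layout.
my @records = (
    { fs => '/export/home', user => 'alice', client => 'ws1', time => 36005, op => 'read',  bytes => 100 },
    { fs => '/export/home', user => 'alice', client => 'ws1', time => 36900, op => 'read',  bytes => 200 },
    { fs => '/export/home', user => 'alice', client => 'ws1', time => 37000, op => 'write', bytes => 50  },
    { fs => '/export/home', user => 'alice', client => 'ws1', time => 39700, op => 'read',  bytes => 10  },
    { fs => '/export/data', user => 'bob',   client => 'ws2', time => 36010, op => 'write', bytes => 500 },
);

# One summary row per (filesystem, user, client, hour).
my %summary;

# add_record is a hypothetical helper: find or create the row for this
# entry's key, then add its byte count to the read or written total.
sub add_record {
    my ($summary, $rec) = @_;
    my $hour = $rec->{time} - ($rec->{time} % 3600);    # start of the hour
    my $key  = join "\0", $rec->{fs}, $rec->{user}, $rec->{client}, $hour;
    my $row  = $summary->{$key} ||= {
        fs   => $rec->{fs},
        user => $rec->{user},
        client => $rec->{client},
        hour => $hour,
        read => 0,
        written => 0,
    };
    my $column = $rec->{op} eq 'read' ? 'read' : 'written';
    $row->{$column} += $rec->{bytes};
    return $row;
}

add_record(\%summary, $_) for @records;

# Example report: every summary row, largest 'written' total first.
for my $row (sort { $b->{written} <=> $a->{written} } values %summary) {
    printf "%-14s %-6s %-4s %6d read=%d written=%d\n",
        @{$row}{qw(fs user client hour read written)};
}
```

Because every row keeps its own copy of the four fields, the values of the
hash can be handed straight to 'sort' and 'grep' for reports, or written out
as rows if the data later moves into a database.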
Maybe a simple format: filesystem user client-machine time-to-the-hour read written Then for every line, I check to see if I have an entry that matches the first four parameters. If so, I add the number of bytes read or written. If not, I create a new entry. Then I can sort by whatever field I want, and limit my searches however I need to. Only that seems inefficient. Could I normalize that somehow? Paul Yesterday, G. Wade Johnson wrote: > I'm not familiar with the that log format, but taking the detabase suggestion > one step in an odd direction, DBD::CSV might be able to put a relational > database front-end on a log file. > > In the past, my normal approach for this sort of thing was an array of hashes. > (One hash perl line) The array is easily sorted with usung 'sort' and can be > filtered using 'grep'. > > Depending on how big the data set is and how complicated the query a database > might be a better choice. > > G. Wade > > On Mon, 27 Mar 2006 17:41:39 -0600 > buu at erxz.com wrote: > >> On Mon, Mar 27, 2006 at 05:13:02PM -0600, Paul Archer wrote: >>> I'm writing a log analyzer (a la Webalyzer) to analyze Solaris' nfslog >>> files. They're in the same format as wu-ftpd xferlog files. I'd use an >>> existing solution, but I can't find anything that keeps track of reads vs >>> writes, which is critical for us. >>> Anyway, I need to be able to sort by filesystem, client machine, user, >>> time (with a one-hour base period) read, write, or total usage. >>> Can anyone suggest a data structure (or pointers to same) that will allow >>> me to pull data out in an arbitrary fashion (ie users on X day sorted by >>> data written)? >>> Once I have the structure, I can deal with doing the reports, but I want >>> to make sure I don't shoot myself in the foot with the structure. >>> >>> I was thinking of a hash of hashes, where the keys are filesystems >>> pointing to hashes where the keys are client machines, etc, etc. 
But it >>> seems that approach would be inefficent for lookups based on times or >>> users (for example). >>> >>> Any help would be greatly appreciated. >>> >>> Paul >>> _______________________________________________ >>> Houston mailing list >>> Houston at pm.org >>> http://mail.pm.org/mailman/listinfo/houston >> >> Um. Have you considered a relational database? Sounds ideal for your >> problem.. >> _______________________________________________ >> Houston mailing list >> Houston at pm.org >> http://mail.pm.org/mailman/listinfo/houston > > > -- > No, no, you're not thinking, you're just being logical. > -- Neils Bohr > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston > ----------------------------------------------------- "Somebody did say Swedish porn, there-- but someone always does..." --Clive Anderson, host of "Whose Line Is It, Anyway", after asking the audience for movie suggestions ----------------------------------------------------- From tigger at io.com Tue Mar 28 10:42:39 2006 From: tigger at io.com (Paul Archer) Date: Tue, 28 Mar 2006 12:42:39 -0600 (CST) Subject: [pm-h] complex data structure help In-Reply-To: <20060328112133.B21516@eris.io.com> References: <20060327170412.B21516@eris.io.com> <20060327234139.GC1951@erxz.com> <20060327191605.7e0a2f44@sovvan> <20060328112133.B21516@eris.io.com> Message-ID: <20060328124103.C21516@eris.io.com> 11:31am, Paul Archer wrote: > > Maybe a simple format: > > filesystem user client-machine time-to-the-hour read written > > Then for every line, I check to see if I have an entry that matches the > first four parameters. If so, I add the number of bytes read or written. If > not, I create a new entry. Then I can sort by whatever field I want, and > limit my searches however I need to. > > Only that seems inefficient. Could I normalize that somehow? > > Paul > I talked this over with a coworker. 
He suggested one table each for users, client machines, and filesystems, just for lookups. Then one table for a combination of the three, and one table for timeperiods with the bytes written or read for each time period. I think that'll work... From sisk at mojotoad.com Tue Mar 28 11:30:16 2006 From: sisk at mojotoad.com (Matt Sisk) Date: Tue, 28 Mar 2006 13:30:16 -0600 Subject: [pm-h] complex data structure help In-Reply-To: <20060328124103.C21516@eris.io.com> References: <20060327170412.B21516@eris.io.com> <20060327234139.GC1951@erxz.com> <20060327191605.7e0a2f44@sovvan> <20060328112133.B21516@eris.io.com> <20060328124103.C21516@eris.io.com> Message-ID: <44298EC8.6050304@mojotoad.com> Paul Archer wrote: > 11:31am, Paul Archer wrote: > > >> >> >> Only that seems inefficient. Could I normalize that somehow? >> > I talked this over with a coworker. He suggested one table each for users, > client machines, and filesystems, just for lookups. Then one table for a > combination of the three, and one table for timeperiods with the bytes > written or read for each time period. > That's not just normalized...it's first normal form. ;) (more or less) Matt From tigger at io.com Tue Mar 28 13:11:02 2006 From: tigger at io.com (Paul Archer) Date: Tue, 28 Mar 2006 15:11:02 -0600 (CST) Subject: [pm-h] complex data structure help In-Reply-To: <44298EC8.6050304@mojotoad.com> References: <20060327170412.B21516@eris.io.com> <20060327234139.GC1951@erxz.com> <20060327191605.7e0a2f44@sovvan> <20060328112133.B21516@eris.io.com> <20060328124103.C21516@eris.io.com> <44298EC8.6050304@mojotoad.com> Message-ID: <20060328151023.Q21516@eris.io.com> 1:30pm, Matt Sisk wrote: > Paul Archer wrote: >> 11:31am, Paul Archer wrote: >> >> >>> >>> >>> Only that seems inefficient. Could I normalize that somehow? >>> >> I talked this over with a coworker. He suggested one table each for users, >> client machines, and filesystems, just for lookups. 
Then one table for a >> combination of the three, and one table for timeperiods with the bytes >> written or read for each time period. >> > > That's not just normalized...it's first normal form. ;) > > (more or less) > Matt Hmmm...so what would it take to get it to 3rd normal form? From tigger at io.com Tue Mar 28 14:01:59 2006 From: tigger at io.com (Paul Archer) Date: Tue, 28 Mar 2006 16:01:59 -0600 (CST) Subject: [pm-h] converting IP<->hostname and UID<->username Message-ID: <20060328155727.V21516@eris.io.com> Can anyone suggest modules that can convert IP addresses to and from hostnames, and UIDs to and from usernames? I've searched CPAN, but I didn't come up with anything. Or should I just call 'getent'? (That works for UID<->username, but not for IP addresses not in the host file...) --------Brady's First Law of Problem Solving:------- When confronted by a difficult problem, you can solve it more easily by reducing it to the question, "How would the Lone Ranger have handled this?" ---------------------------------------------------- From tigger at io.com Tue Mar 28 14:12:26 2006 From: tigger at io.com (Paul Archer) Date: Tue, 28 Mar 2006 16:12:26 -0600 (CST) Subject: [pm-h] [PBML] converting IP<->hostname and UID<->username In-Reply-To: <20060328155727.V21516@eris.io.com> References: <20060328155727.V21516@eris.io.com> Message-ID: <20060328161137.X21516@eris.io.com> 4:01pm, Paul Archer wrote: > Can anyone suggest modules that can convert IP addresses to and from > hostnames, and UIDs to and from usernames? I've searched CPAN, but I > didn't come up with anything. > Or should I just call 'getent'? (That works for UID<->username, but not for > IP addresses not in the host file...) > Um, ignore that last comment. I was consistently fat fingering a test IP address. (127 != 172) ------------------------------------------------------------------- Perl elegant? Perl is like your grandfather's garage. 
Sure, he kept most of it tidy to please your grandmother but there was always one corner where you could find the most amazing junk. And some days, when you were particularly lucky, he'd show you how it worked. ----------Shawn Corey --------------
From alan at ajackson.org Tue Mar 28 17:15:02 2006 From: alan at ajackson.org (Alan Jackson) Date: Tue, 28 Mar 2006 19:15:02 -0600 Subject: [pm-h] converting IP<->hostname and UID<->username In-Reply-To: <20060328155727.V21516@eris.io.com> References: <20060328155727.V21516@eris.io.com> Message-ID: <20060328191502.d6fb7472.alan@ajackson.org> On Tue, 28 Mar 2006 16:01:59 -0600 (CST) Paul Archer wrote: > Can anyone suggest modules that can convert IP addresses to and from > hostnames, and UIDs to and from usernames? I've searched CPAN, but I > didn't come up with anything. > Or should I just call 'getent'? (That works for UID<->username, but not for > IP addresses not in the host file...) > Not an easy problem. I've written a lot of anti-spam software, and have beaten on this problem several times. Here is a snippet of code that does a pretty good job... $host = `/usr/bin/host $ip`; my $foo = (split(/\s+/, $host))[4]; if ($host =~ /not found/ || length($foo)<4) { $host = `zcw -h $ip | grep Abuse`; } if (length $host < 1) {$host = 'UNK';} if ($host eq 'reached') {$host = 'UNK';} $host = (split(/\s+/,$host))[-1]; $host = (split(/@/,$host))[-1]; zcw is a nice bit of code I got from www.cyberabuse.org. Of course, before I even go through these gyrations, I have a simple database that I keep every IP address I have received e-mail from, the domain, and a counter, and I check that first. -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour.
- Blake | ----------------------------------------------------------------------- From tigger at io.com Tue Mar 28 17:25:40 2006 From: tigger at io.com (Paul Archer) Date: Tue, 28 Mar 2006 19:25:40 -0600 (CST) Subject: [pm-h] converting IP<->hostname and UID<->username In-Reply-To: <20060328191502.d6fb7472.alan@ajackson.org> References: <20060328155727.V21516@eris.io.com> <20060328191502.d6fb7472.alan@ajackson.org> Message-ID: <20060328192340.L94279@fnord.io.com> Interesting stuff there. But if I'm going to call an external program, I think I'll stick to getent. It's better than host, dig, etc, for IP/hostname lookups 'cause it uses the kernel resolver which pays attention to /etc/nsswitch.conf (which means you can get lookups out of local files, nis, nis+, dns, or ldap, depending on your machine's configuration). Thanks, Paul 7:15pm, Alan Jackson wrote: > On Tue, 28 Mar 2006 16:01:59 -0600 (CST) > Paul Archer wrote: > >> Can anyone suggest modules that can convert IP addresses to and from >> hostnames, and UIDs to and from usernames? I've searched CPAN, but I >> didn't come up with anything. >> Or should I just call 'getent'? (That works for UID<->username, but not for >> IP addresses not in the host file...) >> > > Not an easy problem. I've written a lot of anti-spam software, and have beaten > on this problem several times. Here is a snippet of code that does a pretty > good job... > > $host = `/usr/bin/host $ip`; > my $foo = (split(/\s+/, $host))[4]; > if ($host =~ /not found/ || length($foo)<4) { > $host = `zcw -h $ip | grep Abuse`; > } > if (length $host < 1) {$host = 'UNK';} > if ($host eq 'reached') {$host = 'UNK';} > $host = (split(/\s+/,$host))[-1]; > $host = (split(/@/,$host))[-1]; > > zcw is a nice bit of code I got from www.cyberabuse.org. > > Of course, before I even go through these gyrations, I have a simple > database that I keep every IP address I have received e-mail from, > the domain, and a counter, and I check that first. 
> > > -- > ----------------------------------------------------------------------- > | Alan K. Jackson | To see a World in a Grain of Sand | > | alan at ajackson.org | And a Heaven in a Wild Flower, | > | www.ajackson.org | Hold Infinity in the palm of your hand | > | Houston, Texas | And Eternity in an hour. - Blake | > ----------------------------------------------------------------------- > _______________________________________________ > Houston mailing list > Houston at pm.org > http://mail.pm.org/mailman/listinfo/houston > --------------------------------------------------- Tech Support: "I need you to boot the computer." Customer: (THUMP! Pause.) "No, that didn't help." ---------(http://www.rinkworks.com/stupid)--------- From tigger at io.com Wed Mar 29 08:04:13 2006 From: tigger at io.com (Paul Archer) Date: Wed, 29 Mar 2006 10:04:13 -0600 (CST) Subject: [pm-h] Bayou project (or: you had too much time on your hands anyway, Wade) Message-ID: <20060329100159.P21516@eris.io.com> http://ronja.twibright.com/ Open source plans for home-built FSO ("free space optics") point-to-point connectivity. Would be a great fit for the bayou project. And would probably be really enticing to any former EE's out there... Paul ------------------------------------------------------------- Note the obsessive use of abbreviations and avoidance of capital letters; this is a system invented by people to whom repetitive stress disorder is what black lung is to miners. Long names get worn down to three-letter nubbins, like stones smoothed by a river. 
---Neal Stephenson, on Unix filesystem naming conventions---- From tigger at io.com Thu Mar 30 10:47:12 2006 From: tigger at io.com (Paul Archer) Date: Thu, 30 Mar 2006 12:47:12 -0600 (CST) Subject: [pm-h] print on warning tip Message-ID: <20060330124649.P21516@eris.io.com> I had a problem where I needed to see what a few values were if I got a 'use of uninitialized value' warning, but I'm parsing a logfile that can easily be over a million lines long (Solaris' nfslog), so I couldn't really just print everything a million times. I found this in perlfaq8: BEGIN { $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; }; $SIG{__DIE__} = sub{ print STDERR "Perl: ", @_; exit 1}; } This hints at an answer, but the problem is that I need to print a lexically scoped variable. So I changed things up a bit: our $warnflag; BEGIN { $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; $warnflag=1;}; } And in my code: print "username is $username\tid is $id, UID is $UID\n" if $warnflag; $warnflag = 0 if $warnflag; Hope that helps someone. And if there's a better way to do this, I'd love to see that, too. Paul From gwadej at anomaly.org Thu Mar 30 17:44:31 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Thu, 30 Mar 2006 19:44:31 -0600 Subject: [pm-h] Deja vu Message-ID: <20060330194431.0d5cc7b2@sovvan> As a wonderful example of coincidence in action, Paul brings up the idea of a group project building a Sudoku Solver in Perl. (BTW, Paul, thanks for introducing me to another addiction. Even my little boy is getting into them.) Today, I received the Spring 2006 issue of the Perl Review. In it are three articles relating to Sudoku, including both a generator and a solver. We might need to pick another project, I think this one is soon to be done to death. (Unless, of course, we can think of a truly unique way to solve them.) Oh, well. G. Wade -- In theory, theory and practice are the same. In practice, they're not.
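On the earlier "converting IP<->hostname and UID<->username" question: Perl's built-in getpwuid, getpwnam and gethostbyaddr, together with inet_aton and inet_ntoa from the core Socket module, cover all four directions without shelling out. What follows is only a minimal sketch under some assumptions: IPv4 only, and the four wrapper names (uid_to_name, name_to_uid, ip_to_host, host_to_ip) are made up for illustration. These calls go through the C library, so on most Unix systems they should follow the same name-service configuration that getent does, but that is worth checking on the machine in question.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa AF_INET);

# UID -> username; undef if the UID is unknown.
sub uid_to_name {
    my ($uid) = @_;
    my $name = getpwuid($uid);    # scalar context gives the login name
    return $name;
}

# username -> UID; undef if the user is unknown.
sub name_to_uid {
    my ($name) = @_;
    my $uid = getpwnam($name);    # scalar context gives the numeric UID
    return $uid;
}

# IPv4 address -> hostname; undef if there is no reverse mapping.
sub ip_to_host {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    return undef unless defined $packed;
    my $host = gethostbyaddr($packed, AF_INET);
    return $host;
}

# hostname (or dotted quad) -> IPv4 address; undef if it cannot be resolved.
sub host_to_ip {
    my ($host) = @_;
    my $packed = inet_aton($host);    # inet_aton also resolves host names
    return undef unless defined $packed;
    return inet_ntoa($packed);
}

my $me = uid_to_name($<);    # $< is the real UID of this process
print "running as ", (defined $me ? $me : "uid $<"), "\n";
print "127.0.0.1 is ", (ip_to_host('127.0.0.1') || 'unresolved'), "\n";
```

Each wrapper returns undef on failure, so callers can fall back to printing the raw number or address.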
From gwadej at anomaly.org Thu Mar 30 17:45:46 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Thu, 30 Mar 2006 19:45:46 -0600 Subject: [pm-h] Tips and Tricks Message-ID: <20060330194546.63c71cd4@sovvan> We've had a few suggestions for tips for next month's meeting, but I was really hoping for more. Any other things that you really think every Perl programmer should know (and might not). G. Wade -- Cannot say. Saying I would know, do not know, so cannot say. -- Zathras - "The War without End" From tigger at io.com Thu Mar 30 18:59:24 2006 From: tigger at io.com (Paul Archer) Date: Thu, 30 Mar 2006 20:59:24 -0600 (CST) Subject: [pm-h] Tips and Tricks In-Reply-To: <20060330194546.63c71cd4@sovvan> References: <20060330194546.63c71cd4@sovvan> Message-ID: <20060330205627.V12345@eris.io.com> 7:45pm, G. Wade Johnson wrote: > We've had a few suggestions for tips for next month's meeting, but I was > really hoping for more. > > Any other things that you really think every Perl programmer should know (and > might not). > Well, there was my tip today about warnings. I wish I had known that a long time ago. And I'm finishing up a log parser for work that I could share. It's got some fun data-caching tricks and a nice way of parsing only part of a log file. From tigger at io.com Thu Mar 30 19:01:30 2006 From: tigger at io.com (Paul Archer) Date: Thu, 30 Mar 2006 21:01:30 -0600 (CST) Subject: [pm-h] Deja vu In-Reply-To: <20060330194431.0d5cc7b2@sovvan> References: <20060330194431.0d5cc7b2@sovvan> Message-ID: <20060330205933.T12345@eris.io.com> 7:44pm, G. Wade Johnson wrote: > As a wonderful example of coincidence in action, Paul brings up the idea of a > group project building a Sudoku Solver in Perl. (BTW, Paul, thanks for > introducing me to another addiction. Even my little boy is getting > into them.) > > Today, I received the Spring 2006 issue of the Perl Review. In it are three > articles relating to Sudoku, including both a generator and a solver.
> > We might need to pick another project, I think this one is soon to be done to > death. (Unless, of course, we can think of a truly unique way to solve them.) > Hmmm...randomly populate the tables and then check to see if it's solved? That'd be unique. 8-) From gwadej at anomaly.org Thu Mar 30 19:48:54 2006 From: gwadej at anomaly.org (G. Wade Johnson) Date: Thu, 30 Mar 2006 21:48:54 -0600 Subject: [pm-h] Deja vu In-Reply-To: <20060330205933.T12345@eris.io.com> References: <20060330194431.0d5cc7b2@sovvan> <20060330205933.T12345@eris.io.com> Message-ID: <20060330214854.2b32aa27@sovvan> On Thu, 30 Mar 2006 21:01:30 -0600 (CST) Paul Archer wrote: > 7:44pm, G. Wade Johnson wrote: > > > As a wonderful example of coincidence in action, Paul brings up the idea > > of a group project building a Sudoku Solver in Perl. (BTW, Paul, thanks > > for introducing me to another addiction Even my little boy is > > getting into them.) > > > > Today, I received the Spring 2006 issue of the Perl Review. In it are > > three articles relating to Sudoku, including both a generator and a > > solver. > > > > We might need to pick another project, I think this one is soon to be done > > to death. (Unless, of course, we can think of a truly unique way to solve > > them.) > > > > Hmmm...randomly populate the tables and then check to see if it's solved? > That'd be unique. 8-) The hard part would be getting the algorithm to complete before the computer evaporates into component atoms. G. Wade -- The three principal virtues of a programmer are Laziness, Impatience, and Hubris. -- Larry Wall From gwadej at anomaly.org Thu Mar 30 19:51:10 2006 From: gwadej at anomaly.org (G. 
Wade Johnson) Date: Thu, 30 Mar 2006 21:51:10 -0600 Subject: [pm-h] Tips and Tricks In-Reply-To: <20060330205627.V12345@eris.io.com> References: <20060330194546.63c71cd4@sovvan> <20060330205627.V12345@eris.io.com> Message-ID: <20060330215110.0d12e1b0@sovvan> On Thu, 30 Mar 2006 20:59:24 -0600 (CST) Paul Archer wrote: > 7:45pm, G. Wade Johnson wrote: > > > We've had a few suggestions for tips for next month's meeting, but I was > > really hoping for more. > > > > Any other things that you really think every Perl programmer should know > > (and might not). > > > > Well, there was my tip today about warnings. I wish I had known that a long > time ago. > > And I'm finishing up a log parser for work that I could share. It's got some > fun data-caching tricks and a nice way of parsing only part of a log file. As I said, we've got a few tricks in from people. There are a lot of people on this list, I wonder how many of them know one or two tips that are just what someone else is looking for. Any takers? G. Wade -- You write code as if the person who will maintain your code is a violent psychopath who knows where you live. -- John F. Woods From kevin at shaum.com Fri Mar 31 00:02:56 2006 From: kevin at shaum.com (Kevin Shaum) Date: Fri, 31 Mar 2006 02:02:56 -0600 Subject: [pm-h] print on warning tip In-Reply-To: <20060330124649.P21516@eris.io.com> References: <20060330124649.P21516@eris.io.com> Message-ID: <200603310202.56323.kevin@shaum.com> On Thursday 30 March 2006 12:47 pm, Paul Archer wrote: > This hints at an answer, but the problem is that I need to print a > lexically scoped variable. Would the plain print statement form work if you put the BEGIN block within the scope of the variable you want to print? If that doesn't work, try making it an INIT block instead of a BEGIN block. (Executes at the beginning of program execution, instead of at the beginning of program compilation.)
And if that doesn't work, try enclosing the print statement within an eval {...} or eval "...". (The latter is slower, but is compiled at execution time instead of compile time, so it might be able to capture the lexical variable.) And if that doesn't work... well, I'm out of ideas; go back to the way you were doing it. :-) Kevin
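
One more variation on the same idea, offered as a rough sketch and not as what Paul's parser actually does: skip the BEGIN block and install the __WARN__ handler at run time, inside the scope where the lexicals live. The handler is then a closure and can read those variables directly, so no flag is needed. The process_records routine and its [ username, id ] record layout are hypothetical stand-ins for the real nfslog parsing.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the log parser: each record is [ username, id ].
# An undef field triggers an "uninitialized value" warning when interpolated.
sub process_records {
    my @records = @_;
    my @report;    # one entry of context per warning raised

    for my $rec (@records) {
        my ($username, $id) = @$rec;

        # Installed at run time inside the scope of $username and $id, so the
        # handler is a closure over them. 'local' restores the previous
        # handler at the end of each iteration.
        local $SIG{__WARN__} = sub {
            my ($msg) = @_;
            my @shown = map { defined $_ ? $_ : '(undef)' } ($username, $id);
            push @report, "username=$shown[0] id=$shown[1]: $msg";
        };

        my $line = "user $username has id $id";    # warns if a field is undef
    }
    return @report;
}

my @report = process_records([ 'paul', 42 ], [ 'wade', undef ], [ 'alan', 7 ]);
print STDERR $_ for @report;
```

Re-installing the handler on every record has a cost over a million lines. Declaring the lexicals once outside the loop and installing the handler once after them should work the same way, since the closure sees whatever values the variables hold when the warning fires.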