From James.Genus.Jr at nsc.com  Wed Aug  4 10:24:21 1999
From: James.Genus.Jr at nsc.com (James Genus Jr)
Date: Wed Aug  4 23:57:31 2004
Subject: Programmer's File Editor
Message-ID: <"0591637A85B2516E*/c=US/admd= /prmd=National/o=notes/ou=Americas/s=Genus Jr/g=James/"@MHS>

Sorry I couldn't make it last night, but my wife was not feeling well so I stayed
home.

I wanted to share some info on the Programmer's File Editor (PFE), which, for
anyone who does scripting on Win32, is a pleasant change from Notepad, Word, etc.
For those who don't have an editor and don't want to spend any money, check it out.

The following quote is from the PFE website run by Alan Phillips:

"PFE is a large-capacity, multi-file editor that runs on Windows 98,
Windows 95, Windows NT 4.0 and Windows 2000 on Intel-compatible
processors, and on Windows 3.1x. Although it's primarily oriented towards
program developers and contains features like the ability to run compilers
and development applications, it also makes a very good general purpose
editor for any function at all" - Alan Phillips


Alan Phillips	( A.Phillips@lancaster.ac.uk )
		( http://www.lancs.ac.uk/people/cpaap/pfe )

See everyone in September.
James

From freter at freter.com  Mon Aug  9 14:53:29 1999
From: freter at freter.com (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: object Perl
Message-ID: <37AF31B9.25D4E703@admc.com>

At last week's meeting we discussed the difference between using the CGI
module with the object interface and with the procedural interface.  I
wrote a simple CGI script both ways, using the object interface and the
procedural interface of CGI.pm.

I think the procedural interface is easier to read.  Let's find out what
the rest of you think.

First the object interface.
# import nothing
use CGI;

# create CGI object
my $cgi_obj = CGI->new;

# call CGI object methods inside of print function
print
    $cgi_obj->header,
    $cgi_obj->start_html(-title   => 'Baltimore Perl Mongers',
                         -BGCOLOR => 'white'),
    $cgi_obj->h1('Baltimore Perl Mongers'),
    $cgi_obj->hr,
    $cgi_obj->h3('Perl Mongers is a not-for-profit organization
        whose mission is to establish Perl user groups'),
    $cgi_obj->end_html;

...and now the procedural interface.
# import functions in tags 'standard' and 'html3'
use CGI qw(:standard :html3);

# use CGI functions inside of print function
print
    header,
    start_html(-title   => 'Baltimore Perl Mongers',
               -BGCOLOR => 'white'),
    h1('Baltimore Perl Mongers'),
    hr,
    h3('Perl Mongers is a not-for-profit organization
        whose mission is to establish Perl user groups'),
    end_html;

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov

From sandoz at umbc.edu  Mon Aug  9 15:12:39 1999
From: sandoz at umbc.edu (James W. Sandoz; (BIO;FAC))
Date: Wed Aug  4 23:57:31 2004
Subject: object Perl
In-Reply-To: <37AF31B9.25D4E703@admc.com>
Message-ID: <Pine.SGI.3.96A.990809160756.1249757A-100000@umbc7.umbc.edu>

> I think the procedural interface is easier to read.  Let's find out what
> the rest of you think.

Craig,
	Thanks for the two scripts. I agree that the procedural format is
easier to read and use it exclusively myself. I've never taken the
time and effort to test drive the oo format.

Is one 'better' to use than the other for more complicated things:
	saving state?
	directing to different pages depending on value of param()?
	other things?

	Jim

Mr. James W. Sandoz, Instructor, UMBC Dept of Biol Sciences,  
				 1000 Hilltop Circle
				 Catonsville, MD 21250
voice: (410) 455-3497; fax: 455-3875; net: sandoz@umbc.edu


From cfreter at digex.net  Mon Aug  9 15:19:56 1999
From: cfreter at digex.net (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: object Perl
References: <Pine.SGI.3.96A.990809160756.1249757A-100000@umbc7.umbc.edu>
Message-ID: <37AF37EC.85796992@digex.net>

James,

> > I think the procedural interface is easier to read.  Let's find out what
> > the rest of you think.
> 
> Craig,
>         Thanks for the two scripts. I agree that the procedural format is
> easier to read and use it exclusively myself. I've never taken the
> time and effort to test drive the oo format.
> 
> Is one 'better' to use than the other for more complicated things:
>         saving state?
>         directing to different pages depending on value of param()?
>         other things?

The CGI module does a good job of preserving state across requests. 
This works the same with either the object or procedural interface.


-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov

From rmanning at erols.com  Mon Aug  9 19:56:36 1999
From: rmanning at erols.com (Rob Manning)
Date: Wed Aug  4 23:57:31 2004
Subject: object Perl
References: <Pine.SGI.3.96A.990809160756.1249757A-100000@umbc7.umbc.edu>
Message-ID: <37AF78C3.214C5BE@erols.com>


I've spent a good deal of time developing object-oriented
applications in Java, and if I had to guess what topics were
going to be covered in the new OO Perl Programming book, I'd say
the following at least (which might answer the 'which is better' question
you posed, Jim):

Data Encapsulation/Abstraction - a fancy way of saying you don't have
to keep track of where your global vars are and who's using them.
This is very useful when you have methods with many arguments
or ones that take data structures as arguments.  This technique also
provides methods/functions for manipulating the internal data
(variables) "safely" - that is, according to the object's purpose and
not the whim of the user, which could put you into a dangerous state.

Inheritance - you can do a lot by extending the class you need to suit
your purposes while changing very little of the original code.

Polymorphism - your code can do "smart" things like providing an
implementation-specific version of a method that its ancestors
define.  This allows you to focus your efforts on what makes an
object different from its relatives instead of rewriting the code
that is shared.

Readability - people use objects and interfaces each day in every
area of life.  Interfaces are common, such as steering wheels on cars
or keyboards on typewriters.  We don't have to understand how a
combustion engine converts fuel into energy and then applies that
energy through the transmission to the wheels in what we know
as a car.  We just get in, start the car, and before we know it we're
down the highway, concentrating on how we can use this object
we're driving and the map or directions objects we have to get us
to our destination.  So it's natural for us to use objects in programming
just as in day-to-day life.
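
The first three ideas above can be sketched in a few lines of Perl.  This
is only an illustration: the Shape and Square classes and their methods
are hypothetical names, not part of any module discussed in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base class: keeps its data in a blessed hash and exposes it via methods
package Shape;

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} || 'shape' };
    return bless $self, $class;
}

# Accessor: callers never need to touch $self->{name} directly (encapsulation)
sub name { my $self = shift; return $self->{name} }

# Default implementation that subclasses may override
sub area { return 0 }

# Shared code: works for any subclass because it calls area() on $self
sub describe {
    my $self = shift;
    return sprintf '%s with area %.2f', $self->name, $self->area;
}

# Subclass: inherits name() and describe() from Shape (inheritance)
package Square;
our @ISA = ('Shape');

sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(name => 'square');
    $self->{side} = $args{side};
    return $self;
}

# Polymorphism: a Square-specific version of area()
sub area { my $self = shift; return $self->{side} ** 2 }

package main;

my @shapes = (Shape->new(name => 'blob'), Square->new(side => 3));
print $_->describe, "\n" for @shapes;
```

The describe method is written once in Shape, yet it reports the right
area for a Square because the call to area() is resolved on the object
it is given.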

As far as which one to use, sometimes you can choose and sometimes
you can't.  If you're using someone else's module this may be determined
for you, but if you are implementing a module you might want to give
this some thought.  Certainly larger, built-to-scale applications would
benefit greatly from an object-oriented approach.  But if it's a script that
you'll use to save yourself half an hour of selective text editing, you might
decide that even arranging your code in the form of a module is too
cumbersome.  I think Steve's point the other night was valid: when
you start passing a lot of data structures as arguments into your
functions, you might want to see if it wouldn't be better to create some
class definitions.

Food for thought...

Rob


From swaldman at mchange.com  Mon Aug  9 23:11:26 1999
From: swaldman at mchange.com (Steve Waldman)
Date: Wed Aug  4 23:57:31 2004
Subject: object Perl
References: <37AF31B9.25D4E703@admc.com>
Message-ID: <37AFA66E.C5A7BB19@mchange.com>

Web applications are an instructive example.

As you suggest, in a simple CGI model, all this "OO" crap just seems
like extra syntax mucking up the works.

But consider -- CGI was rev 0, the very simplest possible interface for
using the web as an interface to applications rather than a glorified
way of ftp'ing text files.

In the simple CGI model, there is a single HTTP request, and it is your
job to construct a single HTTP response. Treating the request and
response info as global state works fine under this model.

But CGI's simple approach -- one request, one response, one process --
is inadequate for high-performance web applications. If you'd like many
requests to share expensive resources -- database connection handles
spring to mind -- all of that request/response state for all those
clients is gonna have to live in the address space of a single running
process. All of a sudden, you have to start asking the question "which
response am I manipulating?" or "which request am I extracting
parameters from?"

Of course, you could make a global hashtable and manage this yourself.
But you'd find the procedural syntax looking ugly, fast. Being able to
simply create a $cgi_obj and manipulate its variables without affecting
the state of the hundred other simultaneous requests will seem very
elegant by comparison.
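
To make the contrast concrete, here is a rough sketch in plain Perl.  The
Request class and both param routines are hypothetical stand-ins written
for this illustration; they are not CGI.pm's API, and no real web server
is involved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Procedural style: one global hash holds "the" request's parameters.
# This is fine while a process only ever handles one request.
our %PARAMS;
sub param { my $key = shift; return $PARAMS{$key} }

# Object style: each request carries its own parameters.
package Request;

sub new {
    my ($class, %params) = @_;
    return bless { params => { %params } }, $class;
}

sub param {
    my ($self, $key) = @_;
    return $self->{params}{$key};
}

package main;

# Two requests alive in the same process at the same time
my $req_a = Request->new(user => 'alice', page => 'home');
my $req_b = Request->new(user => 'bob',   page => 'jobs');

# With the global hash, loading the second request overwrites the first
%PARAMS = (user => 'alice');
%PARAMS = (user => 'bob');
print 'global: ', param('user'), "\n";

# With objects, each request still answers for itself
print 'req_a: ', $req_a->param('user'), "\n";
print 'req_b: ', $req_b->param('user'), "\n";
```

The global version has no way to answer "which request?" without extra
bookkeeping, while each Request object already is the answer.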

OO stuff is all about structure -- there is nothing you can do with OO
that you couldn't do without it. For very simple, one-off problems, OO
stuff is just extra baggage. But as problems get more complex, or if
your simple things might be reused in large, complicated systems, it
starts to make a lot of sense to create walls that segregate data and to
organize subsystems into types that people can understand without
remembering all the internals. The marginal cost of the extra syntax
diminishes as the size and complexity of the system grows.

    smiles,
       Steve



From sandoz at umbc.edu  Wed Aug 11 18:14:37 1999
From: sandoz at umbc.edu (James W. Sandoz; (BIO;FAC))
Date: Wed Aug  4 23:57:31 2004
Subject: Job Posting
Message-ID: <Pine.SGI.3.96A.990811191203.1518419A-100000@umbc7.umbc.edu>

Greetings!
	I hope I'm not stepping on any toes by posting this.  If so, let
me know and I'll refrain from doing so in the future. (Rob, is there a
policy?)
	The following was posted on comp.lang.perl.misc

comp.lang.perl.misc #171207 (0 + 104 more)
From: "Marc Seldin" <mseldin@clark.net>
[1] Position Open At SunSpot
Date: Wed Aug 11 17:37:16 EDT 1999
Organization: Intermedia Business Internet - Beltsville, MD

SunSpot, the web site of The Baltimore Sun, has a Programmer Analyst
position open. I'm looking for a perl/sql/solaris junkie with a few years
of experience to assume a lot of responsibility for the site. Casual
atmosphere, telecommuting days, Quake after hours. Send me a resume, be
sure to let me know how much you're looking for.

mseldin@sunspot.net

--
________________________
Marc Seldin
Chief Technology Officer, Online
(410) 468-2634 office


Mr. James W. Sandoz, Instructor, UMBC Dept of Biol Sciences,  
				 1000 Hilltop Circle
				 Catonsville, MD 21250
voice: (410) 455-3497; fax: 455-3875; net: sandoz@umbc.edu


From Rob_Manning at mail.ci.baltimore.md.us  Thu Aug 12 07:11:44 1999
From: Rob_Manning at mail.ci.baltimore.md.us (Manning, Rob)
Date: Wed Aug  4 23:57:31 2004
Subject: Job Posting
Message-ID: <118CFE9A2035D111AE7B0060081C75131EB105@finance.ci.baltimore.md.us>

All,

I've been taking job offer posting requests from various recruiters
and placing them on the website - http://baltimore.pm.org/jobs.html

My thoughts on posting ads on the list are thus:

- No one likes to be spammed - sent mail that is irrelevant to the topic
  of the forum.

- People may decide to unsubscribe from a list which fosters this
  type of traffic - which defeats the purpose of having a list.

I think this posting is relevant to our group and it 
has come from a member as opposed to someone
whose only interest in the group is $$.

I'll place it on the website and if anyone else has thoughts
on this - agree or disagree - please sound off!  :-)

Rob

Rob Manning                                     manningr@tcsnet.net
Senior Systems Analyst                      Work (410) 396-4963
TeleCommunication Systems               Fax  (410) 837-0546


From dan_a_jacobson at yahoo.com  Thu Aug 12 23:23:49 1999
From: dan_a_jacobson at yahoo.com (Dan Jacobson)
Date: Wed Aug  4 23:57:31 2004
Subject: Job Posting
Message-ID: <19990813042350.12741.rocketmail@web105.yahoomail.com>


I agree - those are reasonable criteria.  If it is relevant to the
local perl community and a member of the group is the one who posts it
then it's probably okay.  If this gets abused we can change the
'policy.'

Dan

--- "Manning, Rob" <Rob_Manning@mail.ci.baltimore.md.us> wrote:
> ...
> 
> I think this posting is relevant to our group and it 
> has come from a member as opposed to someone
> whose only interest in the group is $$.
> 
> I'll place it on the website and if anyone else has thoughts
> on this - agree or disagree - please sound off!  :-)
> 
> Rob
> 



From Rob_Manning at mail.ci.baltimore.md.us  Tue Aug 17 15:49:39 1999
From: Rob_Manning at mail.ci.baltimore.md.us (Manning, Rob)
Date: Wed Aug  4 23:57:31 2004
Subject: Job announcements
Message-ID: <118CFE9A2035D111AE7B0060081C75131EB117@finance.ci.baltimore.md.us>


I've seen members' bios on other PM sites and I don't have a problem
with it.  If you'll send me your resume in HTML, I can set up a page that
contains links for all members who are interested.  Alternatively, these
links could reference personal web pages.  Anyone else have thoughts on
this?

Rob
	

Rob Manning                                     manningr@tcsnet.net
Senior Systems Analyst                      Work (410) 396-4963
TeleCommunication Systems               Fax  (410) 837-0546
--
Looking for a  perl user's group in Baltimore? - http://baltimore.pm.org

> -----Original Message-----
> From:	Archibald Warnock [SMTP:warnock@awcubed.com]
> Sent:	Tuesday, August 17, 1999 3:47 PM
> To:	manningr@tcsnet.net
> Subject:	Job announcements
> 
> Hi Rob,
> 
> I'm back from travel.  I'm going to try to make the September meeting -
> I hope things stay clear so I can manage.
> 
> In addition to looking for job postings, I occasionally have need of
> subcontractors to do short-term development projects, often in perl.
> Think it would be worthwhile to post resumes of local perl programmers
> on the web site, too?
> 
> Archie
> 
> -- Archie Warnock                       Internet: warnock@awcubed.com
> -- A/WWW Enterprises                    Phone/FAX: 301-854-2987
> --                      http://www.awcubed.com
> --       As a matter of fact, I _do_ speak for my employer.
> 

From sandoz at umbc.edu  Thu Aug 26 09:37:33 1999
From: sandoz at umbc.edu (James W. Sandoz; (BIO;FAC))
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
Message-ID: <Pine.SGI.4.10A.B3.9908261029130.3766044-100000@umbc7.umbc.edu>

OK.  I've written a script which parses a class list sent by our
registrar.  The email is exported and then parsed so that the students can
be entered into a spreadsheet (it's 'prettified' regarding case, and each
field becomes comma-separated).  The class list can contain up to 400
students, but in my case contains fewer.
The script works just fine, but it looks awful (that is, not 'perlish').
It's evolved over the past half year. I probably should rewrite it from
scratch, but that's a later problem.

If anyone has suggestions I'd appreciate them.  Below is a typical class
list email (I hope line wrapping doesn't interfere; each new line begins
with the SSN), and below that is the Perl script.

Email from registrar:

>From classlists@umbc.edu Thu Aug 12 21:22:56 1999
Date: Thu, 12 Aug 1999 14:57:00
From: classlists@umbc.edu
To: sandoz@umbc.edu
Cc: dina@umbc.edu
Subject: Class List for Fall 1999 BIOL302L0402

123-45-6789  APPLEJAK, ABBLE                                                       SOPHOMORE     BIOL       Reg   2.00  301-555-5555 applej1@umbc.edu
321-54-9876  EINSTEIN, ALBERT I.                                                   JUNIOR        BIOL       Reg   2.00  410-555-1234 aeinst1@umbc.edu
111-22-3333  KUBLE-KAHN, KRIS K.                                                   JUNIOR        VPAV/BIOL  Reg   2.00  301-555-2222 kkuble1@umbc.edu


Script:
#!/usr/local/bin/perl5 -wi.bak

#=============================
#"format_classlist" by JW Sandoz, Department of Biology, UMBC  
# August 25, 1999
# Normal disclaimers: Worked fine for me.  Should for you. No guarantees,
# though.
#=============================

# This script formats classlists at UMBC as mailed through EASI/myUMBC 
# to a format more easily parsed into a spreadsheet.
# One needs to export the email to a file in home directory (easily done
# in Pine with 'export').  Then type
# "format_classlist <filename_of_classlist_that_you_exported>"
# The parsed file contains all the fields that the email class list
# holds, EXCEPT the SSN is parsed to the last four digits (I use these as
# the Password for the student).
# In addition, the umbc username is parsed so that the unique id is
# captured in the field "Login ID".  The entire email address is retained
# as well.

$x = shift;
unshift @ARGV, $x;  # capturing filename and then returning it to argv

while (<>) {
	if (m/[A-Z]+\d\d\d[A-Z]?(\d\d\d\d)/) {$sect = $1}# capture sect
		#else {}
	local $_  = lc();	# lowercases everything
	s/^\D.*//g; 	 	# removes lines without starting number
	s/^\s*$//;			# removes blank lines
	s/\d\d\d-\d\d-//g;	# leaves last 4 digits of ssn
	s/(\w+.*,)\s(\w+\.?)?(\s+)(\w)/$1$2$3$4/; #capturing names
	s/\s{2,}\b/,/g; #replace multiple space with ','
	s/\s+,/,/g; 	#remove spaces before existing ','s
	s/(\d)\s/$1,/;	#puts a comma at the end of the phone number
	s/  / /;		#removes one of the spaces if two exist
	s/(_{9,12}) /$1,/; #puts comma after the run of underscores (no phone #)
	s/([-])([a-z])/$1\u$2/g; #Caps second (hyphenated) name
	s/ (\w)/ \u$1/;	#uppercases the middle initial (the space before it is kept)
	s/(\w+)(\@.*)/\l$1,\l$1$2/; #separate username into Login ID + username
	if ($1) {
	print "$sect," . $_;  # prints changes to file
	}		#prepends the section number to each student's record. Useful
			#if a course has more than one section.
}
	
open (FILE, '<', $x) or die "Cannot read $x: $!";
	@a = <FILE>;
close FILE;

#the following prettifies the text: leading caps instead of all caps

@proper = map {(my $y = $_) =~ s/\,(.)/\,\u$+/g; $y } @a;
	#oops. It uppercases the username as well.
@proper2 = map {(my $y = $_) =~ s/\,([A-Z]\w{1,6})\,([A-Z]\w{1,6}\@)/\,\l$1,\l$2/; $y } @proper;
# @proper2 fixes (lowercases) username 

unshift @proper2, ('Sect,','SSN,','Last Name,','First Name,','Standing,','Major,','Grade_Method,','Credits,','Phone,','Login ID,','email',"\n");
# above adds a heading for each field

open (FH2, '>', $x) or die "Cannot write $x: $!";
	print FH2 @proper2; # writes to the file
close FH2;


Mr. James W. Sandoz, Instructor, UMBC Dept of Biol Sciences,  
				 1000 Hilltop Circle
				 Catonsville, MD 21250
voice: (410) 455-3497; fax: 455-3875; net: sandoz@umbc.edu


From cfreter at digex.net  Thu Aug 26 14:19:16 1999
From: cfreter at digex.net (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
References: <Pine.SGI.4.10A.B3.9908261029130.3766044-100000@umbc7.umbc.edu>
Message-ID: <37C59334.F99EC4D2@digex.net>

James,

I modified your student parser script.  I used a different approach, in
that I try to match the entire student line with a single regular
expression.  I don't know if that makes the script more 'perlish', but
you might find my approach interesting.
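
The attachment itself did not survive in the archive, so what follows is
only a rough sketch of the single-regex idea and not Craig's test2.pl.
The field layout is assumed from the three sample lines in James's
message, the names $student_re, proper and parse_student are made up for
this illustration, and the section number is picked out of the Subject
line with the same pattern James's script uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One pattern describes a whole student line; each field is a capture group.
my $student_re = qr{
    ^ \d{3}-\d{2}-(\d{4}) \s+     # SSN: capture only the last four digits
    ([^,]+) , \s* (.+?) \s{2,}    # last name, then first name and initial
    (\S+) \s+                     # class standing
    (\S+) \s+                     # major
    (\S+) \s+                     # grading method
    (\d+\.\d+) \s+                # credits
    (\S+) \s+                     # phone number
    ( (\w+) \@ \S+ )              # email address; the login ID is the inner group
    \s* \z
}x;

# Title-case a field: lowercase it, then capitalize the start of each word
sub proper {
    my $text = lc(shift);
    $text =~ s/\b([a-z])/\u$1/g;
    return $text;
}

# Turn one class-list line into a CSV record, or return nothing if the
# line is not a student line (mail headers, blank lines, and so on)
sub parse_student {
    my ($line, $sect) = @_;
    my @f = $line =~ $student_re or return;
    my ($ssn4, $last, $first, $standing, $major,
        $method, $credits, $phone, $email, $login) = @f;
    return join ',', $sect, $ssn4, proper($last), proper($first),
        proper($standing), $major, $method, $credits, $phone,
        lc($login), lc($email);
}

my @lines = (
    'Subject: Class List for Fall 1999 BIOL302L0402',
    '',
    '123-45-6789  APPLEJAK, ABBLE         SOPHOMORE  BIOL       Reg   2.00  301-555-5555 applej1@umbc.edu',
    '321-54-9876  EINSTEIN, ALBERT I.     JUNIOR     BIOL       Reg   2.00  410-555-1234 aeinst1@umbc.edu',
    '111-22-3333  KUBLE-KAHN, KRIS K.     JUNIOR     VPAV/BIOL  Reg   2.00  301-555-2222 kkuble1@umbc.edu',
);

my $sect = '';
for my $line (@lines) {
    if (my $record = parse_student($line, $sect)) {
        print "$record\n";
    }
    elsif ($line =~ /[A-Z]+\d{3}[A-Z]?(\d{4})/) {
        $sect = $1;    # section number taken from the Subject line
    }
}
```

Because a line either matches the whole pattern or is skipped, there is no
need for a second pass over the file to repair the capitalization.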

"James W. Sandoz; (BIO;FAC)" wrote:
> OK.  I've written a script which parses a class list sent by our
> registrar.  The email is exported and then parsed so that the students can
> be entered into a spreadsheet (It's 'prettified' regarding case and each
> field becomes comma-separated).  The class list can contain up to 400
> students, but in my case contains fewer.
> The script works just fine, but it looks awful (that is, not 'perlish').
> It's evolved over the past half year. I probably should re-write it from
> scratch, but that's a later problem.
> 
> If anyone has suggestions I'd appreciate them.  Below is a typical class
> list email (i hope linewrapping doesn't interfere. Each new line begins
> with the SSN). and below that is the perl script.
> 
> Email from registrar:
> 
> >From classlists@umbc.edu Thu Aug 12 21:22:56 1999
> Date: Thu, 12 Aug 1999 14:57:00
> From: classlists@umbc.edu
> To: sandoz@umbc.edu
> Cc: dina@umbc.edu
> Subject: Class List for Fall 1999 BIOL302L0402
> 
> 123-45-6789  APPLEJAK, ABBLE                                                       SOPHOMORE     BIOL       Reg   2.00  301-555-5555 applej1@umbc.edu
> 321-54-9876  EINSTEIN, ALBERT I.                                                   JUNIOR        BIOL       Reg   2.00  410-555-1234 aeinst1@umbc.edu
> 111-22-3333  KUBLE-KAHN, KRIS K.                                                   JUNIOR        VPAV/BIOL  Reg   2.00  301-555-2222 kkuble1@umbc.edu
> 

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.pl
Type: application/x-perl
Size: 1712 bytes
Desc: not available
Url : http://mail.pm.org/archives/baltimore-pm/attachments/19990826/d1fa4c42/test2.bin
From dharris at drh.net  Thu Aug 26 14:26:24 1999
From: dharris at drh.net (David Harris)
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
In-Reply-To: <37C59334.F99EC4D2@digex.net>
Message-ID: <005a01beeff8$e2a75ea0$0500a8c0@delf>


I've done this a lot in Perl, using a single large regular expression.
However, sometimes I think that I am pushing it, because more than once I
have created a regular expression that does so much backtracking that it chews
up 20 seconds of CPU before I kill it.

For example, I got into lots of trouble with this regular expression, which was
designed to parse zone files. You read the whole zone file into $stuff and loop,
grabbing records off the top with this regex and removing blank lines and
comments with others.

                       $stuff =~
                       s/
                               ^
                               \s*
                               (?:(\S+)\s+)?
                               (?:\d+\s+)?
                               (?:IN\s+)?
                               (?:([a-zA-Z]+)\s+)
                               (
                               (?:
                                       [^\(\n]+ (?: \; .* )?
                                       |
                                       \(
                                       (?:
                                               [^\)\n\;]*
                                               (?: \; .* )? \n
                                               |
                                               [^\)\n\;]+
                                       )*?
                                       \)
                               )+?
                               )
                               \n
                       //x

Anybody know anything about this?

 - David Harris
   Principal Engineer, DRH Internet Services




From cfreter at digex.net  Thu Aug 26 15:40:35 1999
From: cfreter at digex.net (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
References: <005a01beeff8$e2a75ea0$0500a8c0@delf>
Message-ID: <37C5A643.F61D8916@digex.net>

David,

Because a DNS zone file contains different kinds of records (e.g. A,
CNAME, MX, NS, SOA), you may want to break up the regular expression.  A
separate regular expression for each record type would eliminate much of
the backtracking, since you are now matching more specific data.

My 2 cents worth.
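
A rough sketch of what "one regular expression per record type" might look
like. It is an illustration under stated assumptions, not code from the thread:
it handles single-line records only, the names parse_record and %rdata_re are
hypothetical, and the per-type patterns are deliberately loose. The optional
name, TTL and class are matched once, then the record data is handed to a small
pattern chosen by type.

```perl
use strict;
use warnings;

# Hypothetical per-type patterns for the record data (rdata) only.
my %rdata_re = (
    A     => qr/^(\d{1,3}(?:\.\d{1,3}){3})$/,
    CNAME => qr/^(\S+)$/,
    NS    => qr/^(\S+)$/,
    MX    => qr/^(\d+)\s+(\S+)$/,
);

# Optional name, optional TTL, optional class "IN", then type and rdata.
my $line_re = qr/^(\S+)?\s+(?:(\d+)\s+)?(?:IN\s+)?([A-Za-z]+)\s+(.*)$/;

# Parse one single-line record.
# Returns (TYPE, rdata fields...) or an empty list if the line is not recognised.
sub parse_record {
    my ($line) = @_;
    $line =~ s/\s*;.*//;    # drop a trailing comment
    $line =~ s/\s+$//;      # drop trailing whitespace

    my ($name, $ttl, $type, $rdata) = $line =~ $line_re or return;
    my $re     = $rdata_re{ uc $type } or return;    # unknown record type
    my @fields = $rdata =~ $re or return;            # rdata does not fit the type
    return (uc $type, @fields);
}
```

Whether this actually cuts the backtracking depends on where the time is going;
it mainly keeps each pattern small enough to reason about.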

> I've done this a lot in perl.. using a single large regular expression.
> However, sometimes I think that I am pushing it because more than one time I
> have created a regular expression that does so much backtracking that it chews
> up 20 seconds of CPU before I kill it.
> 
> For example, I got into lots of trouble with this regular expression, which was
> designed to parse zone files. You read the whole zone file into $stuff and loop
> grabbing records off the top with this regex and removing blank lines and
> comments with others.
> 
>                        $stuff =~
>                        s/
>                                ^
>                                \s*
>                                (?:(\S+)\s+)?
>                                (?:\d+\s+)?
>                                (?:IN\s+)?
>                                (?:([a-zA-Z]+)\s+)
>                                (
>                                (?:
>                                        [^\(\n]+ (?: \; .* )?
>                                        |
>                                        \(
>                                        (?:
>                                                [^\)\n\;]*
>                                                (?: \; .* )? \n
>                                                |
>                                                [^\)\n\;]+
>                                        )*?
>                                        \)
>                                )+?
>                                )
>                                \n
>                        //x
> 
> Anybody know anything about this?
> 
>  - David Harris
>    Principal Engineer, DRH Internet Services

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov

From dharris at drh.net  Thu Aug 26 15:53:41 1999
From: dharris at drh.net (David Harris)
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
In-Reply-To: <37C5A643.F61D8916@digex.net>
Message-ID: <005c01bef005$141b78c0$0500a8c0@delf>


Craig Freter wrote:
> David,
>
> Because a DNS zone file contains different kinds of records (e.g. A,
> CNAME, MX, NS, SOA), you may want to break up the regular expression.  A
> separate regular expression for each record type would eliminate much of
> the backtracking, since you are now matching more specific data.
>
> My 2 cents worth.

I don't think splitting the regex into one per kind of record would help,
because each record may still include or omit the name, the address class, and
the time to live. Each record may also use the ( .. ) syntax to span newlines.
This is what causes all the backtracking.

I guess replacing "([a-zA-Z]+)" with something like "(a|cname|mx|ns|soa)" could
help reduce the backtracking.
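
A small sketch of that idea, assuming only the five record types named above
and using shortened versions of the original pattern (the names $loose, $strict
and record_type are made up for the example). With the alternation, a token
that is not a known type can no longer be accepted as one, so the engine has
fewer ways to line up the optional fields.

```perl
use strict;
use warnings;

# Only the record types mentioned in this thread; extend as needed.
my $type = qr/(?:A|CNAME|MX|NS|SOA)/i;

# Front part of the original pattern, with and without the restriction.
my $loose  = qr/^\s*(?:(\S+)\s+)?(?:\d+\s+)?(?:IN\s+)?([a-zA-Z]+)\s+/;
my $strict = qr/^\s*(?:(\S+)\s+)?(?:\d+\s+)?(?:IN\s+)?($type)\s+/;

# Return the token captured as the record type, or undef if there is no match.
sub record_type {
    my ($re, $line) = @_;
    return $line =~ $re ? $2 : undef;
}
```

For a junk line such as "foo IN bar baz", record_type($loose, ...) accepts
"bar" as a type, while record_type($strict, ...) returns undef.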

My solution was to get rid of all the junk that dealt with the ( .. ) line
continuation, so that if a record had multiline data, I got the first line of
data and the rest of the lines were not parsed. I didn't care about the actual
data, so this worked for me.

However, I'm more interested to find out why the regex caused huge amounts of
backtracking.
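
One likely culprit (an assumption, since no trace of the match was posted) is
the nesting of quantifiers: [^\)\n\;]+ sits inside a (?: ... )*? group, and
[^\(\n]+ sits inside a (?: ... )+? group. When the rest of the pattern cannot
match, the engine may try every way of dividing a run of characters between the
inner and outer repeats. The sketch below avoids that by stripping comments
first and then using an "unrolled" pattern in which every repeat has to start
with a literal "(". The name zone_records is hypothetical, and the code assumes
parentheses are not nested and ";" never appears inside a quoted string.

```perl
use strict;
use warnings;

# Split zone-file text into records, joining ( ... ) continuations.
sub zone_records {
    my ($stuff) = @_;
    $stuff =~ s/;[^\n]*//g;                 # strip comments in one cheap pass

    my @records = $stuff =~ /
        ^
        (
            [^()\n]*                        # plain text on the first line
            (?: \( [^)]* \) [^()\n]* )*     # each repeat must begin with a literal paren
        )
        (?= \n | \z )                       # record ends at a newline or end of input
    /gmx;

    # Collapse whitespace (including joined newlines) and drop empty records.
    for (@records) { s/\s+/ /g; s/^ //; s/ $//; }
    return grep { length } @records;
}
```

Each record then comes back as a single line that a much simpler field-matching
pattern can take apart.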

 - David Harris
   Principal Engineer, DRH Internet Services


From dharris at drh.net  Thu Aug 26 15:57:11 1999
From: dharris at drh.net (David Harris)
Date: Wed Aug  4 23:57:31 2004
Subject: fun with regular expressions
Message-ID: <005d01bef005$9143e080$0500a8c0@delf>


I wrote this one night for the heck of it, because some guy asked if I could do
it. It's a cute regular expression, so I thought you all might be interested in
it for the novelty.

$_="JOIN THE JAX PERL MONGERS J zbjrnmuhkkd Pdqk Mnmfdqr J\n";
tr/za-y/a-z/; / ((.)) /; do { print substr($_,25); }
while ( s/((.).{23}) $2 (.)(.*) (.)$/$1$5 $2 $4 $3/ )

Ya just gotta love perl!

 - David Harris
   Principal Engineer, DRH Internet Services


From cfreter at digex.net  Thu Aug 26 16:24:55 1999
From: cfreter at digex.net (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
References: <005c01bef005$141b78c0$0500a8c0@delf>
Message-ID: <37C5B0A7.21C55482@digex.net>

David Harris wrote:
> My solution was to get rid of all the junk that dealt with the ( .. ) line
> continuation, so that if a record had multiline data, I got the first line of
> data and the rest of the lines were not parsed. I didn't care about the actual
> data, so this worked for me.

Since the SOA record is *usually* the only multiline record in a zone
file, this should work for all other record types.

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov

From sandoz at umbc.edu  Fri Aug 27 05:01:48 1999
From: sandoz at umbc.edu (James W. Sandoz; (BIO;FAC))
Date: Wed Aug  4 23:57:31 2004
Subject: Script Question
In-Reply-To: <37C59334.F99EC4D2@digex.net>
Message-ID: <Pine.SGI.4.10A.B3.9908270448270.4197705-100000@umbc7.umbc.edu>

Wow!
I've embarrassed myself with clunky code.

Thanks Craig.  It not only looks better, it runs faster.  I constructed a
student list of about 33000 records and benchmarked your script and mine.  

Test2.pl took 13 wallclock secs
Format took 46 wallclock secs

	Jim Sandoz

Mr. James W. Sandoz, Instructor, UMBC Dept of Biol Sciences,  
				 1000 Hilltop Circle
				 Catonsville, MD 21250
voice: (410) 455-3497; fax: 455-3875; net: sandoz@umbc.edu


From mjd at plover.com  Mon Aug 30 00:41:15 1999
From: mjd at plover.com (Mark-Jason Dominus)
Date: Wed Aug  4 23:57:31 2004
Subject: `Perl Hardware Store' talk
Message-ID: <19990830054115.28967.qmail@plover.com>


Hi, folks.  Rob Manning has graciously invited me to come visit, and
I'll be giving my `Perl Hardware Store' talks, which were very popular
at the big Perl Conferences last year and last month.  The subtitle
of the talk is `Tools you didn't know you needed'.  Each talk presents
six programming techniques, some big and some small, that might be
useful to you in your Perl programming life.

Complete notes for the first of the two talks are available at

	http://www.plover.com/~mjd/perl/TPC/1998/Hardware.html

if you want to get some advance warning of what I'll be discussing.
The tools I actually present will depend on how much time I have and
what grabs me at the moment.

Afterwards I hope we'll get to hang out somewhere.

I'm looking forward to my visit.  Thanks for inviting me!


Mark-Jason Dominus 	  			               mjd@plover.com

From cshannon at mdo.net  Tue Aug 31 06:51:05 1999
From: cshannon at mdo.net (Chris Shannon)
Date: Wed Aug  4 23:57:31 2004
Subject: CGI.pm Question
Message-ID: <002101bef3a7$1c462680$1b6c8acf@p1m9w6>

Hi.

I've been trying to get the below script to work, but so far, this is what
happens:

After doing chmod 777 on the directory where the files are to be uploaded,
the script creates an empty file in the upload directory, and the name
returned by param('filename') really is the file name. However, nothing
uploads.  The script is invoked from a web browser by a form with a file
upload field.

Using perl -c -w, the syntax checks out OK and there are no warnings except
that the $bytesread scalar is reported as occurring only once, which can
be forgiven here.

What could be wrong?

Thanks in advance, Chris Shannon (Shannon Design)

#!/usr/bin/perl
use lib '/http/yourwebsite/mylib';
use CGI qw(:standard);

$ofn = '/http/yourwebsite.com/upload/upfile.dmp';

$| = 1;

$fh=param('filename');
open OUTFILE, ">$ofn";

while ($bytesread=read($fh,$buffer,1024)) {
   print OUTFILE $buffer;
}

close($fh);
close(OUTFILE);

chmod (0666, "$ofn");


From cfreter at digex.net  Tue Aug 31 08:48:44 1999
From: cfreter at digex.net (Craig Freter)
Date: Wed Aug  4 23:57:31 2004
Subject: CGI.pm Question
References: <002101bef3a7$1c462680$1b6c8acf@p1m9w6>
Message-ID: <37CBDD3C.5DC16688@digex.net>

Chris,

Since this script runs as a CGI, you need to print out an HTTP header.
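
A minimal sketch of the copy step with error checking, so that a failed open or
an empty read shows up instead of silently leaving a zero-length file. The sub
name save_upload is hypothetical. The comments at the end show where CGI.pm's
header() and upload() would fit and are not run here. Another thing worth
checking (an assumption, since the HTML form was not posted) is that the form
is sent with enctype="multipart/form-data"; without it the browser submits only
the file name.

```perl
use strict;
use warnings;

# Copy everything from an open input handle to $out_path in 1024-byte chunks.
# Returns the number of bytes written; dies if the output file cannot be used.
sub save_upload {
    my ($in_fh, $out_path) = @_;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    binmode $in_fh;     # uploads may be binary
    binmode $out;

    my $total = 0;
    my $buffer;
    while (my $bytesread = read($in_fh, $buffer, 1024)) {
        print {$out} $buffer;
        $total += $bytesread;
    }
    close $out or die "Cannot close $out_path: $!";
    return $total;
}

# Assumed usage inside the CGI script (with "use CGI qw(:standard);"):
#   print header('text/plain');    # HTTP header first, before any other output
#   my $bytes = save_upload(upload('filename'), $ofn);
#   print "Saved $bytes bytes\n";
```

Reporting the byte count back to the browser makes it obvious whether any data
arrived at all.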

> Hi.
> 
> I've been trying to get the below script to work, but so far, this is what
> happens:
> 
> After doing chmod 777 on the directory where the files are to be uploaded,
> the script will create an empty file in the upload directory, the name
> returned by
> param('filename') really is the file name, however, nothing uploads.  The
> script is activated by a web browser using a form with an upload file field
> in a form.
> 
> Using perl -c -w, the syntax checks out OK and there are no warnings except
> that the $bytesread scalar is pointed out as occurring only once, which can
> be forgiven here.
> 
> What could be wrong?
> 
> Thanks in advance, Chris Shannon (Shannon Design)
> 
> #!/usr/bin/perl
> use lib '/http/yourwebsite/mylib';
> use CGI qw(:standard);
> 
> $ofn = '/http/yourwebsite.com/upload/upfile.dmp';
> 
> $| = 1;
> 
> $fh=param('filename');
> open OUTFILE, ">$ofn";
> 
> while ($bytesread=read($fh,$buffer,1024)) {
>    print OUTFILE $buffer;
> }
> 
> close($fh);
> close(OUTFILE);
> 
> chmod (0666, "$ofn");

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov