Script Question
Craig Freter
cfreter at digex.net
Thu Aug 26 14:19:16 CDT 1999
James,
I modified your student parser script. I used a different approach, in
that I try to match the entire student line with a single regular
expression. I don't know if that makes the script more 'perlish', but
you might find my approach interesting.
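
The attachment is not preserved in the archive, so what follows is only a rough sketch of what a single-regex approach might look like, not the attached test2.pl itself. It assumes the fields arrive in the order shown in the sample class list below, that the standing is a single upper-case word, and that the exported file contains a literal "@" in the address (the archive displays it as " at "). The sub name parse_student and the hash keys are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse one student line with a single regex.
# Returns a hash ref of fields, or undef if the line does not match.
sub parse_student {
    my ($line) = @_;
    return unless $line =~ m{
        ^\s*
        \d{3}-\d{2}-(\d{4})     \s+   # SSN; capture only the last four digits
        ([^,]+),                \s+   # last name, everything up to the comma
        (.+?)                   \s+   # first name and optional middle initial
        ([A-Z]+)                \s+   # standing (assumed to be one word)
        (\S+)                   \s+   # major, e.g. VPAV/BIOL
        (\S+)                   \s+   # grading method
        (\d+\.\d+)              \s+   # credits
        (\d{3}-\d{3}-\d{4}|_+)  \s+   # phone, or underscores when none is listed
        ((\w+)\@\S+)                  # full address, with the login id inside it
        \s*\z
    }x;

    return {
        ssn4     => $1,
        last     => $2,
        first    => $3,
        standing => $4,
        major    => $5,
        method   => $6,
        credits  => $7,
        phone    => $8,
        email    => $9,
        login    => $10,
    };
}

my $s = parse_student(
    '111-22-3333 KUBLE-KAHN, KRIS K. JUNIOR VPAV/BIOL Reg 2.00 301-555-2222 kkuble1@umbc.edu'
);
print join(',', @{$s}{qw(ssn4 last first standing major method credits phone login email)}), "\n"
    if $s;
```

Lines that are not student records (mail headers, blank lines) simply fail to match, so no separate clean-up substitutions are needed for them.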
"James W. Sandoz; (BIO;FAC)" wrote:
> OK. I've written a script which parses a class list sent by our
> registrar. The email is exported and then parsed so that the students can
> be entered into a spreadsheet (It's 'prettified' regarding case and each
> field becomes comma-separated). The class list can contain up to 400
> students, but in my case contains fewer.
> The script works just fine, but it looks awful (that is, not 'perlish').
> It's evolved over the past half year. I probably should re-write it from
> scratch, but that's a later problem.
>
> If anyone has suggestions I'd appreciate them. Below is a typical class
> list email (I hope line wrapping doesn't interfere; each new line begins
> with the SSN), and below that is the Perl script.
>
> Email from registrar:
>
> From classlists at umbc.edu Thu Aug 12 21:22:56 1999
> Date: Thu, 12 Aug 1999 14:57:00
> From: classlists at umbc.edu
> To: sandoz at umbc.edu
> Cc: dina at umbc.edu
> Subject: Class List for Fall 1999 BIOL302L0402
>
> 123-45-6789 APPLEJAK, ABBLE SOPHOMORE BIOL Reg 2.00 301-555-5555 applej1 at umbc.edu
> 321-54-9876 EINSTEIN, ALBERT I. JUNIOR BIOL Reg 2.00 410-555-1234 aeinst1 at umbc.edu
> 111-22-3333 KUBLE-KAHN, KRIS K. JUNIOR VPAV/BIOL Reg 2.00 301-555-2222 kkuble1 at umbc.edu
>
> Script:
> #!/usr/local/bin/perl5 -wi.bak
>
> #=============================
> #"format_classlist" by JW Sandoz, Department of Biology, UMBC
> # August 25, 1999
> # Normal disclaimers: Worked fine for me. Should for you. No guarantees,
> # though.
> #=============================
>
> # This script formats classlists at UMBC as mailed through EASI/myUMBC
> # to a format more easily parsed into a spreadsheet.
> # One needs to export the email to a file in home directory (easily done
> # in Pine with 'export'). Then type
> # "parse_classlist <filename_of_classlist_that_you_exported>"
> # The parsed file contains all the fields that the email class list
> # holds, EXCEPT the SSN is parsed to the last four digits (I use these as
> # the Password for the student).
> # In addition, the umbc username is parsed so that the unique id is
> # captured in the field "Login ID". The entire email address is retained
> # as well.
>
> $x = shift;
> unshift @ARGV, $x; # capturing filename and then returning it to argv
>
> while (<>) {
>     if (m/[A-Z]+\d\d\d[A-Z]?(\d\d\d\d)/) {$sect = $1} # capture sect
>     #else {}
>     local $_ = lc(); # lowercases everything
>     s/^\D.*//g; # removes lines without starting number
>     s/^\s*$//; # removes blank lines
>     s/\d\d\d-\d\d-//g; # leaves last 4 digits of ssn
>     s/(\w+.*,)\s(\w+\.?)?(\s+)(\w)/$1$2$3$4/; # capturing names
>     s/\s{2,}\b/,/g; # replace multiple spaces with ','
>     s/\s+,/,/g; # remove spaces before existing ','s
>     s/(\d)\s/$1,/; # puts a comma at the end of the phone number
>     s/  / /; # removes one of the spaces if two exist
>     s/(_{9,12}) /$1,/; # puts comma at end of dashes (no phone #)
>     s/([-])([a-z])/$1\u$2/g; # caps second (hyphenated) name
>     s/ (\w)/ \u$1/; # removes leading space from MI and uppercases it
>     s/(\w+)(\@.*)/\l$1,\l$1$2/; # separate username into Login ID + username
>     if ($1) {
>         print "$sect," . $_; # prints changes to file
>     } # prepends the section number to each student's record. Useful
>       # if a course has more than one section.
> }
>
> open (FILE, "$x") or die $!;
> @a = <FILE>;
> close FILE;
>
> #the following prettifies the text: leading caps instead of all caps
>
> @proper = map {(my $y = $_) =~ s/\,(.)/\,\u$+/g; $y } @a;
> #oops. It uppercases the username as well.
> @proper2 = map {(my $y = $_) =~ s/\,([A-Z]\w{1,6})\,([A-Z]\w{1,6}\@)/\,\l$1,\l$2/; $y } @proper;
> # @proper2 fixes (lowercases) username
>
> unshift @proper2, ('Sect,','SSN,','Last Name,','First Name,','Standing,','Major,','Grade_Method,','Credits,','Phone,','Login ID,','email',"\n");
> # above adds a heading for each field
>
> open (FH2, ">$x") or die $!;
> print FH2 @proper2; # writes to the file
> close FH2;
>
> Mr. James W. Sandoz, Instructor, UMBC Dept of Biol Sciences,
> 1000 Hilltop Circle
> Catonsville, MD 21250
> voice: (410) 455-3497; fax: 455-3875; net: sandoz at umbc.edu
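
On the "oops" noted in the quoted script, where upper-casing after every comma also recases the username: one possible way around it is to change case field by field rather than on the whole comma-separated line. The sketch below is only an illustration under that assumption; title_case is a hypothetical helper name, and the field list is hard-coded sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: lower-case the text, then capitalise the first
# letter of each word, so "KUBLE-KAHN" becomes "Kuble-Kahn".
sub title_case {
    my ($text) = @_;
    $text = lc $text;
    $text =~ s/\b([a-z])/\u$1/g;
    return $text;
}

# Recase only the name fields; the login ID and address are left untouched.
my @fields = ('KUBLE-KAHN', 'KRIS K.', 'kkuble1', 'kkuble1@umbc.edu');
my @pretty = ((map { title_case($_) } @fields[0, 1]), @fields[2, 3]);
print join(',', @pretty), "\n";
```

Because the word boundary also falls after a hyphen, the second half of a hyphenated surname is capitalised without a separate substitution.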
--
All that is complex is not useful,
and all that is useful is simple.
-- Mikhail Kalashnikov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.pl
Type: application/x-perl
Size: 1712 bytes
Desc: not available
Url : http://mail.pm.org/archives/baltimore-pm/attachments/19990826/d1fa4c42/test2.bin