SPUG: Fw: greetings

Ken McGlothlen mcglk at artlogix.com
Wed Oct 17 21:29:22 CDT 2001


"Russell Miller" <duskglow2000 at yahoo.com> writes:

| ok, we've got a six million line file to read.  I had written a program that
| scaled just fine for smaller files, but it choked on this file, and took an
| extreme amount of time. [...]

| open (FILE, "<$INPUT");
| @array = <FILE>;
| close FILE;
| foreach $k (@array) { ... }

Well, Russell, the problem here is that you're reading the entire file into an
array.  It's no wonder your system got rather bogged down: the process had to
pull in enough RAM to hold all six million lines at once.

| So, we changed it to read sequentially:
| 
| open (FILE, "<$input");
| foreach $k (<FILE>) { ... }

The problem here, though, is that you're not actually reading it sequentially.
Believe it or not, this is pretty much exactly the same as the previous
example.  Why?  Because <FILE> is being read in a list context.  Granted, it's
not a named array, but it's exactly the same as if you'd written

        foreach $k ( "line 1", "line 2", "line 3", ... "line 6,000,000" ) {...}

| So I changed the foreach line to:
| 
| while ($k = <FILE>) { ... }

Which is exactly right; because <FILE> is being read in a scalar context this
time, it only returns one line at a time.  That's gonna save you a lot of RAM,
and therefore swapping time.
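
For what it's worth, here is a minimal sketch of that line-at-a-time loop as a
complete script.  The temporary file, the lexical filehandle, and the variable
names are my own stand-ins rather than anything from Russell's program; the
three-argument open and the "or die" check are just habits I'd suggest, not a
requirement for the fix.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small stand-in input file (hypothetical data, not the real
# six-million-line file).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 5;
close $tmp_fh or die "close failed: $!";

# Three-argument open with a lexical handle and an error check.
open(my $fh, '<', $path) or die "Cannot open $path: $!";

my $count = 0;
# Scalar context: each pass through the loop reads exactly one line,
# so only one line is held in memory at a time.
while (my $line = <$fh>) {
    chomp $line;
    $count++;    # real per-line processing would go here
}
close $fh;

print "read $count lines\n";
```

Memory use stays roughly constant no matter how long the file is, because each
line is discarded before the next one is read.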

This is a fairly common gotcha that hits everyone from time to time.  Implied
contexts can be confusing.  If you remember what each construct expects in a
particular location (while wants a single scalar condition, for wants a list),
you'll know how the <...> operator will behave there.  For example:

        while( <FILE> ) { ... }

will read one line at a time, but

        for( <FILE> ) { ... }

will read the entire file up front, and then process it.
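
If you'd like to see the difference for yourself, here is a rough sketch that
checks eof() from inside the first pass of each loop.  The temporary file and
all the variable names are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A three-line stand-in file (hypothetical data).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 3;
close $tmp_fh or die "close failed: $!";

# while: scalar context, so after the first line there is still
# unread data left on the handle.
open(my $fh1, '<', $path) or die "Cannot open $path: $!";
my $while_at_eof;
while (my $line = <$fh1>) {
    $while_at_eof = eof($fh1) ? 1 : 0;
    last;
}
close $fh1;

# for: list context, so the whole file has already been consumed
# before the loop body runs for the first time.
open(my $fh2, '<', $path) or die "Cannot open $path: $!";
my $for_at_eof;
for my $line (<$fh2>) {
    $for_at_eof = eof($fh2) ? 1 : 0;
    last;
}
close $fh2;

print "while: at EOF during first pass? $while_at_eof\n";
print "for:   at EOF during first pass? $for_at_eof\n";
```

The while loop reports 0 and the for loop reports 1: by the time the for body
sees its first line, every line in the file has already been read into a list.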

Hope that helps.
