APM: Perl, Win32 OLE, and Excel

John Warner jwarner at texas.net
Thu Jul 23 10:20:19 PDT 2009


All,

I have a project where I am trying to filter through a large amount of data
from an Excel spreadsheet.  Since I don't have access to the databases where
the data actually resides, I have to use a spreadsheet that was given to me.
The spreadsheet contains 79 columns and approximately 113k rows.  The data
are customer satisfaction survey results along with a plethora of other
garbage I don't need.  I am only interested in a few columns.

My code goes like this...

Create an Excel Object
Use Object to open Source and Destination spreadsheets
Find the column and row boundaries of where data is within the source.
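For reference, the setup portion looks roughly like the sketch below; the file
paths are placeholders, and the UsedRange call stands in for however the
boundaries actually get found:

use strict;
use warnings;
use Win32::OLE;

# create the Excel object, reusing a running instance if there is one
my $excel = Win32::OLE->GetActiveObject('Excel.Application')
         || Win32::OLE->new('Excel.Application', 'Quit')
    or die "Cannot start Excel: ", Win32::OLE->LastError;
$excel->{Visible}       = 0;
$excel->{DisplayAlerts} = 0;

# open the source and destination workbooks (paths are placeholders)
my $srcBook   = $excel->Workbooks->Open('C:\survey.xls');
my $destBook  = $excel->Workbooks->Open('C:\filtered.xls');
my $srcSheet  = $srcBook->Worksheets(1);
my $destSheet = $destBook->Worksheets(1);

# row and column boundaries of the populated area
my $LastRow = $srcSheet->UsedRange->Rows->{Count};
my $LastCol = $srcSheet->UsedRange->Columns->{Count};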

my @ArrayOfNames = ('Bill', 'Bob', 'Jane', 'Tom', 'Dick', 'Harry');

#Columns
#	Source	Destination	Description
#	Column	Column
#	  28	    3		Responsible Tech
#	  55	    5		Tech Sat Score
#	  57	    6		Overall Sat Score
#
foreach my $row (2..$LastRow)	#skip header row on row 1
{
	#check the responsible tech
	foreach my $t (@ArrayOfNames)
	{
		#fetch the value once; a method call such as
		#$srcSheet->Cells(...)->{Value} does not interpolate inside m//
		my $tech = $srcSheet->Cells($row, 28)->{Value};
		print "Current:  $t  \t Incident tech:  $tech ";

		if ($t =~ m/\Q$tech\E/)
		{
			print "found a match!\n";
			if ($srcSheet->Cells($row, 55)->{Value} < 7 ||
			    $srcSheet->Cells($row, 57)->{Value} < 7)
			{
				#copy data from source to destination
			}
		}
		else
		{
			#print "not a match \n";
			next;
		}
	}
}

My question:  With 113k rows to go through, Perl runs out of memory and the
processing takes quite a while.  How can I be more efficient?  
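
One idea I have been toying with (untested) is to pull the rows over in blocks
with a single OLE call per block, since my understanding is that reading
->{Value} on a multi-cell Range comes back as a reference to an array of row
array refs.  That would mean one COM round trip per block instead of several
per row, roughly like this:

my %wanted = map { $_ => 1 } @ArrayOfNames;

my $chunk = 1000;	# rows per fetch, to keep memory in check
for (my $start = 2; $start <= $LastRow; $start += $chunk)
{
	my $end = $start + $chunk - 1;
	$end = $LastRow if $end > $LastRow;

	# fetch columns 1..79 for this block of rows in one call
	my $block = $srcSheet->Range($srcSheet->Cells($start, 1),
	                             $srcSheet->Cells($end, 79))->{Value};

	foreach my $r (@$block)
	{
		# 0-based slots: column 28 -> 27, 55 -> 54, 57 -> 56
		my ($tech, $techSat, $overallSat) = @{$r}[27, 54, 56];
		next unless defined $tech && $wanted{$tech};

		if ($techSat < 7 || $overallSat < 7)
		{
			#copy data from source to destination
		}
	}
}

Does that sound like a reasonable direction, or is there a better way?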


John Warner
jwarner at texas.net
H:  512.251.1270
C:  512.426.3813
