[Purdue-pm] Loading in unused modules

Wed Nov 19 05:41:22 PST 2014

At yesterday's meeting we talked a bit about the performance hit from loading modules that are never actually used.  A common example is writing:

use Data::Dumper ;

And then never actually calling the Dumper routine. Quite a few CPAN modules depend on 'Data::Dumper' ... http://deps.cpantesters.org/depended-on-by.pl?dist=Data-Dumper-2.154 ... but one has to wonder how many of those really use DD and how many just have leftover DD code in them.
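One rough way to spot leftover DD code is to check whether a source file loads the module but never mentions Dumper again. The sketch below is only a crude regex heuristic: the helper name loads_but_never_dumps is made up for illustration, and it ignores POD, string contents, and anything a real parser would catch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the source text loads Data::Dumper
# but never appears to reference it again, otherwise 0.
# A regex heuristic only, not a real Perl parser.
sub loads_but_never_dumps {
    my ($source) = @_;

    # Drop whole-line comments so a commented-out call does not count as a use.
    $source =~ s/^\s*#.*$//mg;

    # Is the module loaded at all?
    return 0 unless $source =~ /^\s*use\s+Data::Dumper\b/m;

    # Remove the load statement(s), then look for any remaining mention.
    $source =~ s/^\s*use\s+Data::Dumper\b[^;]*;//mg;
    return $source =~ /\bDumper\b/ ? 0 : 1;
}

my $leftover = "use Data::Dumper ;\nprint \"hello world\\n\";\n";
my $in_use   = "use Data::Dumper ;\nprint Dumper( { a => 1 } );\n";

printf "leftover: %d\n", loads_but_never_dumps($leftover);   # prints 1
printf "in use:   %d\n", loads_but_never_dumps($in_use);     # prints 0
```

Running something like this over a checkout would only give candidates to look at by hand, since a module can also use DD through a string eval or some other indirect route.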

Anyway, the question is whether there is a performance hit, and how much of a module gets read in if it is never used. Looking at DD, and using NYTProf on a very simple "hello world" program:

Without 'use Data::Dumper' ... ~50 ms
With 'use Data::Dumper' ... ~75 ms; 50 statements from Data::Dumper are executed

So obviously a hit, as one might expect: the file has to be read in and its lines parsed, if for no other reason than to figure out which routines are exported.
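For anyone who wants to see the load cost without a profiler, here is a minimal sketch that times the load in-process and lists which files it pulled in. This is not how the numbers above were obtained (those came from NYTProf), and the variable names are my own. The file list is a lower bound, since anything already loaded by strict, warnings, or Time::HiRes will not show up as new.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Snapshot of the files perl has loaded so far (%INC maps file name => path).
my %before = %INC;

my $t0 = time;
require Data::Dumper;      # run-time load, so it can be timed
Data::Dumper->import;      # require plus import is roughly what 'use' does
my $elapsed_ms = ( time - $t0 ) * 1000;

# Anything new in %INC was read and compiled because of the load.
my @newly_loaded = sort grep { !exists $before{$_} } keys %INC;

printf "Loading Data::Dumper took %.2f ms\n", $elapsed_ms;
print  "  pulled in: $_\n" for @newly_loaded;
```

The figure will vary with the machine, the perl version, and whether the files are already in the disk cache, so do not expect it to match the NYTProf numbers.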

Of course, in the overall scheme of things a 25 ms increase is not that much, although if the program is run many times the cost could add up.  Until recently our database initialization module had a 'use Data::Dumper' but made no actual use of it.  However, since our module also loads the YAML and DBI modules, the extra 25 ms is not that much.  A very simple program that does a simple SELECT statement has a run time of around 300 ms, so the unneeded 'use Data::Dumper' is adding about 8% (25 ms against roughly 300 ms) to the run time, and undoubtedly a lot less for more complex programs.
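If DD is only wanted for occasional debugging, one option is to load it at run time inside the routine that needs it. A minimal sketch follows, assuming a made-up helper called debug_dump. This is not what our module does, just an illustration of deferring the load with 'require'.

```perl
use strict;
use warnings;

# Hypothetical debug helper: Data::Dumper is loaded on the first call only,
# so a program that never calls debug_dump() never pays the load cost.
sub debug_dump {
    my (@things) = @_;
    require Data::Dumper;                   # does nothing after the first load
    return Data::Dumper::Dumper(@things);   # fully qualified: nothing was imported
}

print "loaded before call? ", ( exists $INC{'Data/Dumper.pm'} ? "yes" : "no" ), "\n";
print debug_dump( { host => 'localhost' } );
print "loaded after call?  ", ( exists $INC{'Data/Dumper.pm'} ? "yes" : "no" ), "\n";
```

The trade-off is that the load cost moves to the first call instead of disappearing, which is fine for a debug path but worth knowing about.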

-- 
Rick Westerman 
westerman at purdue.edu

Bioinformatics specialist at the Genomics Facility.
Phone: (765) 494-0505           FAX: (765) 496-7255
Department of Horticulture and Landscape Architecture
625 Agriculture Mall Drive
West Lafayette, IN 47907-2010
Physically located in room S049, WSLR building