[LA.pm] perl CGI querying of directory filenames most efficient method?

Fri Sep 23 02:42:27 PDT 2005

On 9/22/05, Peter Benjamin <pete at peterbenjamin.com> wrote:
>
> I hope I've written an email that can be understood.
> My advice is to read it all the way through before
> replying, as it is a complex "overall" efficiency
> question, involving not just the Perl CGI code,
> but also the web server needing the same directory.
>
> --
>
> What would be the most CPU/IO efficient method to
> test whether the following filenames exist to
> get images to display on a web page, where any
> matching filenames should be displayed?
>
> The client would upload only one file from the
> following possibilities:

You really could generate a static page right after the upload instead of
generating a page on demand when requested.
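
A rough sketch of what I mean, assuming the upload handler already knows
which image URLs belong to the SKU. The name write_image_fragment and the
idea of writing an HTML fragment per SKU are my own assumptions, not
anything from your existing code:

```perl
use strict;
use warnings;

# Hypothetical helper: called once by the upload handler, not per request.
# Writes one <img> tag per image URL into $out_file.
# Assumes the URLs are already safe to embed in HTML.
sub write_image_fragment {
    my ($out_file, @image_urls) = @_;

    # Write to a temporary name and rename afterwards, so the web
    # server never serves a half-written file.
    my $tmp = "$out_file.tmp.$$";
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} qq{<img src="$_" alt="">\n} for @image_urls;
    close $fh or die "Cannot close $tmp: $!";
    rename $tmp, $out_file or die "Cannot rename $tmp: $!";
    return $out_file;
}
```

The page (or a server-side include) then just reads the fragment, and the
directory is only examined when something is actually uploaded.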

> skunumber.jpg
> skunumber.gif
> skunumber.medium.jpg
> skunumber.medium.gif
> skunumber.large.jpg
> skunumber.large.gif
>
> Currently, it uses if-elsif-elsif-elsif... using the -e test
> against the full pathname.
>
> Plus any matching names from this list:
>
> skunumber.A.jpg (second image to display with the one above)
> skunumber.B.jpg (third, etc...)
> skunumber.C.jpg
> skunumber.D.jpg
>
> This uses 4 if statements using the -e test against the full pathname.

my @displayable = grep { -e $_ } @candidate_files ;
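
To spell that out a bit: a sketch of how @candidate_files could be built
from your two lists. The sub name displayable_images and the $dir and $sku
parameters are made up for illustration. It keeps the first-match behaviour
of your if-elsif chain for the main image and takes every A to D image that
exists:

```perl
use strict;
use warnings;

# Hypothetical helper: return the image files that exist for one SKU.
# The name patterns follow the two lists in the original question.
sub displayable_images {
    my ($dir, $sku) = @_;

    # Main image: the first existing name wins, in the same order
    # as the if-elsif chain.
    my @main = map { "$dir/$sku$_" }
        qw(.jpg .gif .medium.jpg .medium.gif .large.jpg .large.gif);
    my ($first) = grep { -e $_ } @main;

    # Extra images A..D: every one that exists is shown.
    my @extra = grep { -e $_ } map { "$dir/$sku.$_.jpg" } 'A' .. 'D';

    return ( defined $first ? ($first) : (), @extra );
}
```

That is at most ten -e tests per page, and the result has 0 to 5 entries.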

> So, either 0, 1, or 2 to 5 images might be displayed.
>
> Would it be faster to get the entire directory listing of
> 12,000 images (and growing) with this type of statement:

12,000 images in one directory is suspicious, isn't it? Why not have a
directory per SKU number, or a directory per user? There must be some way to
partition your data set so that one directory is not holding everything.
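
For instance, with one directory per SKU, listing a SKU's images only has to
look at that SKU's handful of files. This is just a sketch: the layout
<root>/<sku>/ and the names sku_dir and sku_files are my assumptions, and it
assumes SKU numbers contain no glob metacharacters:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical layout: <root>/<sku>/<sku>.jpg, <root>/<sku>/<sku>.A.jpg, ...
sub sku_dir {
    my ($root, $sku) = @_;
    return "$root/$sku";
}

# List every file in one SKU's directory. bsd_glob, unlike plain glob,
# does not split the pattern on whitespace in the path.
sub sku_files {
    my ($root, $sku) = @_;
    my @files = bsd_glob( sku_dir($root, $sku) . '/*.*' );
    return sort @files;
}
```

A SKU with no directory simply yields an empty list.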

> cd pathname;
> foreach $filename ( <skunumber*.*> ) {
> if-elsif-else
> }

foreach my $file (<skunumber/*.*>) {
    # note the "/" so that each sku's files are in a separate directory
}

I'm not sure about efficiency, but you might look into inodes and how they
relate to storing files in a directory. An ext2 or ext3 filesystem gets a
fixed number of inodes when it is created, and you get 3 options at that
point (the usage-type setting of mke2fs, if I remember right): one for a
filesystem holding a bunch of itty-bitty teensy weensy files, one for people
downloading huge files (like videos and music) and one for in-between.

Maybe a readdir would be even faster?
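
Something like this, as a sketch only. The sub name images_for_sku is made
up, and the pattern is my reading of your two filename lists. Whether one
scan of 12,000 entries beats ten -e tests is something you would have to
measure on your own filesystem:

```perl
use strict;
use warnings;

# Hypothetical helper: one directory scan, no per-file -e tests.
# Returns bare filenames (no directory part), sorted.
sub images_for_sku {
    my ($dir, $sku) = @_;

    # \Q...\E quotes any regex metacharacters in the SKU.
    my $wanted = qr/^\Q$sku\E
                    (?: (?:\.(?:medium|large))? \.(?:jpg|gif)   # main image
                      | \.[A-D]\.jpg )                          # extra images
                    $/x;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @found = sort grep { $_ =~ $wanted } readdir $dh;
    closedir $dh;
    return @found;
}
```

Unlike the if-elsif version, this returns every matching main image rather
than only the first, so the caller still has to pick one if several exist.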

Why has speed become an issue? Correctness is much more important in my
view... get a few more machines and load-balance. :)