SPUG:CGI header question

Thu Jun 19 13:46:20 CDT 2003

Aaron,
	Good points...

	It's true that I don't store my CGI scripts in the database. :)  I do
however use CVS to distribute my scripts to the web servers.  I wasn't
thinking about these when I was talking about data.  I guess I think of the
system as having two components, scripts and data.  I don't have any static
content that's not stored in the database.  So I guess I've got two things
to back up (the CVS repository, which should really be stored in a database
too, and the database), which is still better than having three things to
back up (CVS, the database, and data stored in the file system).

	I don't know about you, but when I get an error on a drive I throw it out
and get a new one. :)  I don't really plan for losing part of the data on a
drive, so I'm only prepared to deal with recovering entire machines at
once.  It is easier to pull a file from a .tar than to restore an entire
database into a temp space so you can pull out one image, but I do far more
backups than restores, so it seems like a good trade off to me.
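
	To make the restore side of that trade-off concrete, here is a minimal
sketch of pulling a single file out of a tar backup, assuming Python's
standard tarfile module.  The function name extract_one and the member names
are hypothetical, chosen only for illustration.

```python
import tarfile
from pathlib import Path


def extract_one(archive_path, member_name, dest_dir):
    """Copy a single named member out of a tar archive into dest_dir.

    Hypothetical helper: only the requested member is read, the rest of
    the archive is left alone.
    """
    with tarfile.open(archive_path, "r:*") as archive:
        member = archive.getmember(member_name)  # raises KeyError if absent
        source = archive.extractfile(member)
        if source is None:
            raise ValueError(f"{member_name} is not a regular file")
        # Write under the base name only, so the archive's own directory
        # layout cannot place the file outside dest_dir.
        target = Path(dest_dir) / Path(member_name).name
        target.write_bytes(source.read())
        return target
```

The database equivalent would mean restoring the dump somewhere first and
then selecting the one row, which is the extra step described above.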

	I'm not sure what your devil's advocate remark is referencing.  Which part
do you disagree with?  That file systems have limits to number of files in a
directory?  That I have directories with too many files in them to do a
rm -rf *?  That you can have millions of records in a database table?  That
it's a pain in the butt to have to deal with trees of directories to store
files that really should be in the same place?  I believe that all of these
are true, with the possible exception of the limited number of files in a
single directory, which is just something I've been told.  I wasn't being
tongue in cheek.
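
	On the 'rm -rf *' point: the usual cause of that failure is the shell
expanding '*' into more arguments than the kernel will accept ("Argument
list too long"), rather than a limit in rm itself.  A minimal sketch of
clearing such a directory without any glob expansion follows, assuming Python
and a hypothetical helper named remove_files.  It removes regular files only
and leaves subdirectories alone.

```python
import os


def remove_files(directory):
    """Delete every regular file directly inside directory.

    Entries are read one at a time with os.scandir, so no argument list
    is ever built and the number of files does not matter.
    Returns the number of files removed.
    """
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files go.
            if entry.is_file(follow_symlinks=False):
                os.unlink(entry.path)
                removed += 1
    return removed
```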

	I understand what you're saying about the hammer and screwdriver, but I
think we're talking more about hand cranked drills vs. power drills.  Both
the file system and the database are tools for organizing and manipulating
data.
	I never work with a database and think, "If only the db would do thing X
that file systems do", but often when I'm looking for files in the file
system, screwing with piping file names and data around to grep or whatnot, I
think "If only I could write a nice query to get this information it would
make my life so much easier!"
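
	One way to get that kind of query over plain files is to index their
metadata into a small database.  The following is only a sketch, assuming
Python with its bundled sqlite3 module; the function name index_directory
and the table layout (name, size, mtime) are made up for illustration.

```python
import os
import sqlite3


def index_directory(directory):
    """Load name, size and mtime of each regular file into an in-memory
    SQLite table called files, and return the open connection."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files (name TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
    )
    with os.scandir(directory) as entries:
        rows = [
            (entry.name, entry.stat().st_size, entry.stat().st_mtime)
            for entry in entries
            if entry.is_file()
        ]
    db.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)
    db.commit()
    return db
```

With the index built, a question like "which files are larger than 100 KB"
becomes db.execute("SELECT name FROM files WHERE size > 102400") instead of
a pipeline of find and grep.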

	What are the benefits to storing the data in the file system, other than
the two you mentioned (it's easier to restore partial backups, don't have to
mess with telling a browser that a file has changed because Apache will take
care of it)?  These two don't seem so compelling to me, but I'm sure there
are benefits that I don't know about.
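
	For the second of those two points, a script serving content out of a
database has to do the change detection that Apache does for static files.
A minimal sketch of that logic, assuming Python and assuming the row's
modification time is stored as a UTC timestamp; conditional_headers is a
hypothetical name, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_headers(last_modified, if_modified_since=None):
    """Return (status, headers) for a conditional GET.

    last_modified: timezone-aware UTC datetime of the stored content.
    if_modified_since: raw If-Modified-Since header value, or None.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = [("Last-Modified", format_datetime(last_modified, usegmt=True))]
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            return 304, headers  # browser copy is still current
    return 200, headers
```

In a CGI script the 304 case would be sent as a "Status: 304 Not Modified"
header with no body, and the 200 case would be followed by the content.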

	I admit that I'm far more familiar with databases than I am with Unix (or
Windows or Mac OS) system tools, so I find databases far easier to work
with.  I concede that I might find the system tools/file system route as
satisfying, or more so, if I were more familiar with them, but there's a lot
of value in using what you're comfortable with.

Thanks,
Peter Darley
-----Original Message-----
From: spug-list-admin at mail.pm.org [mailto:spug-list-admin at mail.pm.org]
On Behalf Of Aaron Salo
Sent: Thursday, June 19, 2003 10:46 AM
To: SPUG
Subject: RE: SPUG:CGI header question

At 09:56 AM 6/19/2003 -0700, Peter Darley wrote:
>	A partial rebuttal :)

Much appreciated. I am open-minded to changing my opinion, just have never
heard a compelling argument to do so. Let the discussion begin!
<g>

>		I want to have central data storage, meaning a single machine that
>serves all the data/files/etc.  If I store stuff in the database I don't
>have to mess with other paths of data from my central storage to the web
>server(s).  It may be easy to set up an NFS share or whatever (which isn't
>actually the case), but then I need to maintain twice as many services on
>the server, secure twice as many access points, and rely on systems that
>are harder to secure (NFS vs. database access).

Understood, hypothetically, but in the instance we're talking about I'm
having trouble getting traction here. The only way you could gain a benefit
from this would be to eliminate the file system based content entirely,
thereby removing the need for a place to store files completely.  We are
talking about a web application, right? Are you also going to store all
HTML static content in the database as CLOBs? What about your CGI scripts?
Probably not, unless you are the most awesome mod_rewrite god of all time
and you're dynamically generating your Perl scripts from database calls and
mod_rewrite skullduggery of Damianesque proportions. That stuff is in a
filesystem. So by definition you already have a filesystem and a database.

Even in a load balanced distributed front end system with a central
database, you either have a central place for your static content to reside
(NFS) or you're replicating it out to each of the front end systems from a
staging server. Either way you have content in a filesystem already.
Putting the images into the filesystem where they can be served up by
Apache along with the HTML and other static content is no stretch, and no
increase in services or security concerns beyond the status quo. The only
exception would be if you had a fully database driven system with ALL
content in the database and NO content in a file system, and that is highly
unlikely. Even fully dynamic CMS systems have content in file systems, even
if that content consists of templates.

>		When performing backups I only have one thing to back up: the database.
>If data is stored in the FS and in the DB I have to back up both, and it
>gets to be significantly more work to back up and restore.

Not sure I agree with you here. If you're a prudent sort you're backing up
both your filesystem and your database anyhow, because your server has
files on it that you're already backing up. I hope. If you need to
restore, because you had a disk failure and you need to get some missing
files, do you want to go through the entire database volume and find your
restore point to get those blobs back, or do you want to granularly (word?)
untar the missing files out of the filesystem backup?

>		I don't have to have extra functions/etc to work with/delete/whatever
>data.  I can treat all data the same and have the same functions for
>working with them.  There's nothing about binary data that is magically
>different from other data, so why have a whole second system to work with
>it?

Actually, depending on your database, there are some arcane and proprietary
ops you need to do to work with blobs in a general sense. Let's skip that
for now.

>I don't know a lot about file systems, but it's my understanding that you
>are limited in the number of files in a single directory.  I do know a lot
>about databases however, and I know that there's no problem storing
>millions and millions of records in a single table.  If I exceeded the
>limits of the file system in a single directory I then have to start making
>trees of directories to hold all my files, which makes for a huge mess.  I
>currently have directories for a system I wrote that didn't store stuff in
>the database, and I can't go into the directory and do a 'rm -rf *' because
>there are more files than rm can deal with.

I'll presume that you are being the devil's advocate here and have your
tongue firmly in cheek.

>		In general databases are far easier to work with.  That's why people talk
>about 'making the file system work like a database' and never talk about
>'making a database work like the file system'.

I have a hammer and a screwdriver. The reason why is that they are suited
to different tasks. Although I have occasionally used the handle of a
screwdriver to pound in a nail, or used the blade to pry one out, the
hammer is best suited to those jobs. I don't remember ever wishing that I
had a "hammer that worked like a screwdriver". That's why I have both.

_____________________________________________________________
Seattle Perl Users Group Mailing List
POST TO: spug-list at mail.pm.org
ACCOUNT CONFIG: http://mail.pm.org/mailman/listinfo/spug-list
MEETINGS: 3rd Tuesdays, U-District, Seattle WA
WEB PAGE: www.seattleperl.org