[Omaha.pm] Mapping Perl structures to a SQL table...

Fri Oct 30 14:06:57 PDT 2009

Heya Dan,

It would help to have a bit of information about the general structure you
are currently using, before giving any ideas about storing the data in a SQL
Database.  Obviously your doing Directories and Files, but the structure
helps in determining the way to make things work in the SQL Database.  Poor
design of the SQL Database, will lead to poor execution of SQL instructions,
as much more is required to get the data you want.

To give you some idea of how I would convert a file system database into a
SQL Database, I'll give you an example of a file structure database, and a
description of what entry means, then show the resulting database structure
for SQL.
An example, of a simple design using Folders:

db/
  cust_records/
    cust1_info.dat
    cust1_purchase.dat
    cust2_info.dat
    cust2_purchase.dat
  inventory/
    item1_info.dat
    item2_info.dat

cust#_info.dat contains:  Customer Name, Address, Phone Number, Shipping
Info, and such.
cust#_purchase.dat contains: Customer's a record of all the purchases that a
customer has made.

item#_info.dat contains: Name of Item, Description, Price, Quantity,
Shipping price, Shipping Weight.

Now, to convert this into a SQL Database, I would formulate it as such:

db
  customers
    id                   - INTEGER, PRIMARY KEY, AUTOINCREMENT
    name                 - STRING
    address              - STRING
    phone                - STRING
    ship_to              - STRING
  transactions
    id                   - INTEGER, PRIMARY KEY, AUTOINCREMENT
    cust_id              - INTEGER, PRIMARY KEY
    item_id              - INTEGER
    quantity             - INTEGER
    purchased_date       - INTEGER
  inventory
    id                   - INTEGER, PRIMARY KEY, AUTOINCREMENT
    item_name            - STRING
    description          - STRING
    price                - STRING
    total_per_quantity   - INTEGER
    ship_price           - INTEGER
    ship_weight          - INTEGER

Now some explaining about what the right hand side is all about in the above
layout. The first field in all tables are 'id', which is marked as INTEGER,
PRIMARY KEY, AUTOINCREMENT.  Integer denotes a number, of course, Primary
Key tells the SQL engine to make quick look ups based upon this field being
one of the more often checked fields to look up records in a database.
Finally the Auto increment (Which is one word in SQL), denotes the fact that
each new record put into the SQL database, should take the total number of
rows, and add one to that number, to assign the identification number for
this record.   And lastly, a String is a variable length of text data to be
stored.  Most SQL engines will allow for 5 or 6 paragraphs worth of text,
but this can be expensive in storage and retrieval.  If you know that a
field is only going to be so many characters, such as Phone, maximum being
13 characters, then you can use VARCHAR(13) as the maximum length of the
data that is going to be stored in that field.

There's also FLOAT, which allows for decimal points, but Integers in most
SQL Engines will take decimal numbers, and keep the decimals.  But it's
always best to see what data types a SQL engine supports, before making a
final decision.  Most DBI's will automatically provide a way to store common
data types in the database, at their best formulation to save as much space
for the database engine to handle.  So look at Perl's DBI for Constructing
Tables to see what assistance it will bring you.  Lastly, one other data
type I didn't cover in the above database, is the BLOB data type.  BLOB data
types are for storing Binary data in, should you find the need to store some
binary data in the database.

With blobs, there are no conversions done to store the data in the database,
it's stored as is (As in, as you provide it to the SQL Database), and can
contain any valid byte sequence in it.  Meaning, anything between 0 and 255
can be stored here.  Most SQL Engines will store UTF-8/16 characters in
strings without stripping them, but when in doubt, you can use the Blob data
type.

Now, with the explanations of the data types out of the way, the structure
is efficiently designed, for the simple fact, that if you have the ID of the
customer, you can get all the items that they purchased, and get each items
information from the id's that you get from the transaction table.  You can
even use the SQL Instruction JOIN to get all the data you need in a single
execute, for example, if you wanted to get the name of the person, the name
of the item, and the total cost, you could simply do:

SELECT name, item_name, price
FROM transactions
JOIN customers
  ON customers.id = transactions.cust_id
JOIN inventory
  ON inventory.id = transactions.item_id;

This will return a list of all transactions in the format of:
name | item_name | price

Examples being:

"John Doe","ASUS PC",299.99
"Mary Johnson", "Microsoft Mouse", 19.99
etc, etc.

It makes cross-table look ups a lot easier, to get the relevant data for
what you need, and only what you need.  And all of it is handled by the SQL
Engine, not Perl, or whatever high level language you use, so the execution
speed is greatly improved.

HTH,

Mario

On Fri, Oct 30, 2009 at 1:31 PM, Dan Linder <dan at linder.org> wrote:

> I'm taking on the task of converting our in-house tool to use the Perl
> DBI module to replace the Data::Dumper/eval() it currently uses to
> store and retrieve data.  Not pretty, but it has worked pretty well
> for the small data sets we've been using.
>
> We now have some people commenting on the speed - some have pages take
> 7+ minutes to bring up waiting for the back-end perl code to ripple
> through the directory structure and eval() the necessary files to
> build the page.  The "eval" function seems to be the bulk of the time
> as I expected...
>
> What I'm looking for is some general comments and discussion about the
> mental task of mapping these hash tables into a SQL table.  I'm not
> really looking for a tool, more a high level discussion about ways to
> store the data and still remain flexible.
>
> Dan
>
> --
> ******************* ***************** ************* ***********
> ******* ***** *** **
> "Quis custodiet ipsos custodes?" (Who can watch the watchmen?) -- from
> the Satires of Juvenal
> "I do not fear computers, I fear the lack of them." -- Isaac Asimov
> (Author)
> ** *** ***** ******* *********** ************* *****************
> *******************
> _______________________________________________
> Omaha-pm mailing list
> Omaha-pm at pm.org
> http://mail.pm.org/mailman/listinfo/omaha-pm
>

-- 
Mario Steele
http://www.trilake.net
http://www.ruby-im.net
http://rubyforge.org/projects/wxruby/
http://rubyforge.org/projects/wxride/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.pm.org/pipermail/omaha-pm/attachments/20091030/552822a1/attachment.html>