[VPM] [RFC] SQL::AST (SQL Abstract Syntax Tree) - DBI related

Mon Aug 11 16:58:59 CDT 2003

Hello, this request for comment is in particular aimed at those of you who use databases and/or DBI with Perl.

Thanks to a topic covered in Piers Cawley's most recent Perl 6 Summary (generic Parrot code generators), I think I have finally figured out what is the standard industry terminology to describe and/or use to name a new module I am making: "AST" or "Abstract Syntax Tree".

P.S.  Please CC your reply directly to my address (darren at DarrenDuncan.net) in addition to sending it to the discussion list.

I am writing you today to get second opinions for proposed names on my new module, which I would like to register as soon as possible.

The current working title I have, which came out of a productive RFC I did on May 29th, is "SQL::ObjectModel".  I did submit that name on June 2nd to get registered; it never got registered, but on the bright side it gave me time to come up with the possibly better title.

I propose something like "SQL::AST" or "SQL::AbstractSyntax" or "SQL::AbstractSyntaxTree".  I like the first one best, because it is short.  Or maybe "SQL::AS".

SQL::ObjectModel is currently on CPAN in alpha development status, and I plan to upload a major code/design revision in the next few days.  But if possible I would like to know if the new module name is workable, in which case I will upload under that name.

So my question for you is, does "SQL::AST" sound good, and is the term "AST" well known enough that people who would likely use this module would know its meaning?  Or would I attract more users by expanding the title.  Or should I stick with ObjectModel?  In my mind, AST seems to be a lot more descriptive to what the module actually does, or rather "is".

On a related question, after I have finished the PurePerl implementation of this module, I plan to make a C library implementation, and so what would be a good name for that?  In my experience, C libraries often have names like "libABC" (such as libXML); might I be best to have a name like that, such as "libSQLAST".   Or would it sound better without that, such as "SQLAST"?

Thank you very much in advance for any feedback.

-- Darren Duncan

P.S.  For your convenience, below is a copy of some pod from SQL::ObjectModel v0.03, which describes it.  It will be updated/rewritten after the rename.

---------------------------------

This Perl 5 object class is intended to be a powerful but easy to use
replacement for SQL strings (including support for placeholders), which you can
use to make queries against a database.  Each SQL::ObjectModel object can
represent a non-ambiguous structured command for a database to execute, or one
can be a non-ambiguous structured description of a database schema object.  This
class supports all types of database operations, including both data
manipulation and schema manipulation, as well as managing database instances
and users.  You typically construct a database query by setting appropriate
attributes of these objects, and you execute a database query by evaluating the
same attributes.  SQL::ObjectModel objects are designed to be equivalent to SQL
in both the type of information they carry and in their conceptual structure.
This is analagous to how XML DOMs are objects that are equivalent to XML
strings, and they can be converted back and forth at will.  If you know SQL, or
even just relational database theory in general, then this module should be
easy to learn.

SQL::ObjectModels are intended to represent all kinds of SQL, both DML and DDL,
both ANSI standard and RDBMS vendor extensions.  Unlike basically all of the
other SQL generating/parsing modules I know about, which are limited to basic
DML and only support table definition DDL, this class supports arbitrarily
complex select statements, with composite keys and unions, and calls to stored
functions; this class can also define views and stored procedures and triggers.
Some of the existing modules, even though they construct complete SQL, will
take/require fragments of SQL as input (such as "where" clauses)  By contrast,
SQL::ObjectModel takes no SQL fragments.  All of its inputs are atomic, which
means it is also easier to analyse the objects for implementing a wider range
of functionality than previously expected; for example, it is much easier to
analyse any select statement and generate update/insert/delete statements for
the virtual rows fetched with it (a process known as updateable views).

Considering that each database product has its own dialect of SQL which it
implements, you would have to code SQL differently depending on which database
you are using.  One common difference is the syntax for specifying an outer
join in a select query.  Another common difference is how to specify that a
table column is an integer or a boolean or a character string.  Moreover, each
database has a distinct feature set, so you may be able to do tasks with one
database that you can't do with another.  In fact, some databases don't support
SQL at all, but have similar features that are accessible thorough alternate
interfaces. SQL::ObjectModel is designed to represent a normalized superset of
all database features that one may reasonably want to use.  "Superset" means
that if even one database supports a feature, you will be able to invoke it
with this class. You can also reference some features which no database
currently implements, but it would be reasonable for one to do so later.
"Normalized" means that if multiple databases support the same feature but have
different syntax for referencing it, there will be exactly one way of referring
to it with SQL::ObjectModel.  So by using this class, you will never have to
change your database-using code when moving between databases, as long as both
of them support the features you are using (or they are emulated).  That said,
it is generally expected that if a database is missing a specific feature that
is easy to emulate, then code which evaluates SQL::ObjectModels will emulate it
(for example, emulating "left()" with "substr()"); in such cases, it is
expected that when you use such features they will work with any database.  For
example, if you want a model-specified boolean data type, you will always get
it, whether it is implemented  on a per-database-basis as a "boolean" or an
"int(1)" or a "number(1,0)".  Or a model-specified "str" data type you will
always get it, whether it is called "text" or "varchar2" or "sql_varchar".

SQL::ObjectModel is intended to be just a stateless container for database
query or schema information.  It does not talk to any databases by itself and
it does not generate or parse any SQL; rather, it is intended that other third
party modules or code of your choice will handle this task.  In fact,
SQL::ObjectModel is designed so that many existing database related modules
could be updated to use it internally for storing state information, including
SQL generating or translating modules, and schema management modules, and
modules which implement object persistence in a database.  Conceptually
speaking, the DBI module itself could be updated to take SQL::ObjectModel
objects as arguments to its "prepare" method, as an alternative (optional) to
the SQL strings it currently takes.  Code which implements the things that
SQL::ObjectModel describes can do this in any way that they want, which can
mean either generating and executing SQL, or generating Perl code that does the
same task and evaling it, should they want to (the latter can be a means of
emulation).  This class should make all of that easy.

SQL::ObjectModel is especially suited for use with applications or modules that
make use of data dictionaries to control what they do.  It is common in
applications that they interpret their data dictionaries and generate SQL to
accomplish some of their work, which means making sure generated SQL is in the
right dialect or syntax, and making sure literal values are escaped correctly.
By using this module, applications can simply copy appropriate individual
elements in their data dictionaries to SQL::ObjectModel properties, including
column names, table names, function names, literal values, bind variable names,
and they don't have to do any string parsing or assembling.

Now, I can only imagine why all of the other SQL generating/parsing modules
that I know about have excluded privileged support for more advanced database
features like stored procedures.  Either the authors didn't have a need for it,
or they figured that any other prospective users wouldn't need it, or they
found it too difficult to implement so far and maybe planned to do it later. As
for me, I can see tremendous value in various advanced features, and so I have
included privileged support for them in SQL::ObjectModel.  You simply have to
work on projects of a significant size to get an idea that these features would
provide a large speed, reliability, and security savings for you.  Look at many
large corporate or government systems, such as those which have hundreds of
tables or millions of records, and that may have complicated business logic
which governs whether data is consistent/valid or not.  Within reasonable
limits, the more work you can get the database to do internally, the better.  I
believe that if these features can also be represented in a database-neutral
format, such as what SQL::ObjectModel attempts to do, then users can get the
full power of a database without being locked into a single vendor due to all
their investment in vendor-specific SQL stored procedure code.  If customers
can move a lot more easily, it will help encourage database vendors to keep
improving their products or lower prices to keep their customers, and users in
general would benefit.  So I do have reasons for trying to tackle the advanced
database features in SQL::ObjectModel.