[VPM] FYI - modern Muldis D code examples

Thu Apr 9 03:59:12 PDT 2009

Hello,

This message is an update to my post from yesterday of the same subject.  If you 
don't have it handy, then a copy is archived at 
http://mm.darrenduncan.net/pipermail/muldis-db-devel/2009-April/000023.html .

I have significantly updated the Muldis D code samples of yesterday and the new 
versions are included at the end of this message.

The new versions have exactly the same meaning (that is, translate the same SQL) 
but are much better formed; please ignore the old versions in favor of the new.

The first main difference is correcting an unintentional large error.  In all 
places except the NOW() example where I used a "var Type $foo", such as:

     procedure proc1 () {
         main {
             var Tuple $t;
             inn.init_t( &t => $t );
             assign_insertion( &r => $fed.data.db1.foo, t => $t );
         }
         inner_updater init_t (Tuple &$t) {
             $t := Tuple:{ col1 => 1, col2 => 2 };
         }
     }

... I actually meant to not use a var, such as:

     procedure proc1 () {
         main {
             inn.do_insert( &foo => $fed.data.db1.foo );
         }
         inner_updater do_insert (Relation &$foo) {
             assign_insertion( &r => $foo,
                 t => Tuple:{ col1 => 1, col2 => 2 } );
         }
     }

So for those cases please treat the latter style as what I intended in the first 
place.

Aside from that, what was posted before is what I intended to say; however I've 
now made a variety of improvements:

1.  No "main { }" is necessary when the routine has no inner routines.
2.  An arg name can be omitted when invoking a routine with just one param.
3.  The ":=" syntax that used to mean bind a name/alias to a value expression (a 
"with" clause) is now spelled "=", same as it is in the Haskell language.
4.  Now ":=" only is an alt syntax for invoking the "assign" updater routine.
5.  Added alternate / terser syntax for several more built-in functions; for 
example, "semijoin" now has an infix alternate named "matching", and 
"restriction" now has an infix alternate named "where".
6.  Inlined most value exprs so very few explicitly named ones are left.

I also expect many more such improvements, particularly like #5, will come 
about, but I haven't figured out yet what format they would take.

Now, following some feedback I got mainly from one person, I'll offer a few FAQ 
items in case you're confused by some things:

1.  In regards to how compact these Muldis D code examples look compared to the 
original SQL, code compactness varies depending on use cases, and sometimes the 
Muldis D code would be smaller than the SQL code.  You would notice this more as 
the SQL examples get more complicated (submissions of sample SQL code to 
translate welcome); SQL would no longer seem smaller.  The Muldis D code is also 
huffmanized more around better practices, for example using bind parameters 
versus hard-coding your data.

2.  Muldis D tries to be more authentic with its terminology, so words like 
"relation" are used here in the way that DBAs and makers of DBMSs use them, or 
people who are familiar with the logic or math behind the relational model.  See 
http://en.wikipedia.org/wiki/Relational_model if it helps.  So a "relation" *is* 
a value, same as an "array" is a value, and a relation type can be the type of a 
parameter or result or variable.  When you see "relation" think "rowset", or 
when you see "relvar / relation-typed variable" think "table".  Note that to try 
and head off such understandings, the Basics.pod of the Muldis D spec starts off 
with a glossary of common or commonly misinterpreted terms.  (Don't be confused 
by terminology used with some ORMs that give a different meaning to "relation".)

3.  Muldis D does not have any reserved words at all, and doesn't stop you from 
naming your types/routines/vars/attrs/etc whatever you want (with very few 
exceptions).  However, all named things live in namespaces, and thus this 
freedom doesn't cause any interference between things a user defines and things 
that are built-in to the language.  In many cases, spelling out the names of 
things in full is optional, and so the code examples just use the unqualified 
versions for brevity.  To be specific, names must be invoked in their full form 
except in two specific cases; the first is if they are system-defined entities 
that aren't variables, and the other is if they are lexicals (that are 
variables).  So, wherever you see a plain "$foo" it can be read as "$lex.foo", 
and any plain "op()" can be read as "sys.std.*.op()".  And so, for example this:

     procedure proc1 (Instant &$time) {
         fetch_current_instant( &$time );
     }

... with fully spelled out names is:

     procedure proc1 (sys.std.Core.Type.Instant &$time) {
         sys.std.Temporal.Instant.fetch_current_instant( $lex.time );
     }

4.  The concept of "inner routines" is just an artifact of Muldis D's design, 
which enforces clear separation of pure and impure code (functions and updaters 
are pure, deterministic, and atomic; procedures are not), and how it represents 
nested distinct lexical scopes/blocks, or conceptually inline-defined closures 
etc, as well as how it supports internal recursion, as well as one way to keep 
the grammar/parsers/generators simpler.  So when you see "inner routine", think 
"inlined closure" or "nested block" or "pure section".  Now, you don't actually 
have to use inner routines; those items could be declared as full routines 
instead, but inners save you from polluting public namespaces, and really are 
the closest analogy to languages with actual inlining.

5.  Muldis D can inference types, but it still uses explicit type declarations 
in code partly to aid type checking of code in isolation as well aid in 
self-documentation.  Types like "Relation" and "Tuple" are actually quite broad 
and in practice you may often declare with explicit subset types of those.

6.  Muldis D code *is* declarative.  While routines might look sometimes like an 
ordered sequence of steps to follow, they really are just a description of what 
result is desired.  In particular, Muldis D is designed such that most code 
would be forced into pure sections, and so a DBMS is very much empowered to 
optimize it.

7.  As a general explanation for why the Muldis D examples looks the way they do ...

One of Muldis D's main features is that it is homoiconic, meaning that its code 
is also data, and you can introspect or alter code at runtime.  This manifests 
mainly with the special global variables called the system catalog, which serve 
the same function as SQL's INFORMATION_SCHEMA but that mine breaks down 
everything and doesn't just store routines etc as code strings.

Now I designed Muldis D starting at the AST level, meaning as it is represented 
in the system catalog, figuring out how to effectively represent all types and 
data and routines as an abstract syntax tree, which can also be the target 
format of parsers and the source format for code generators.

Using this syntax tree form as a starting point, I have then been creating a 
concrete Muldis D syntax, "PTMD_Tiny" that can map to the AST / system catalog 
more or less 1:1, and this concrete syntax is what the code examples are in.

So the concrete syntax is really just a layer of sugar over the AST.  Over time 
I will continue to add more sugar to make the concrete syntax sufficiently terse 
and easier to use.

Meanwhile, what you see is how far along that road of adding sugar I got to.

8.  Keep in mind that my example code is meant to be a more literal translation 
of what the SQL is saying.  If the SQL spelled out columns to return, so does my 
example.  If the SQL said return all columns, then mine doesn't spell them out.

Anyway, thank you for your time.

-- Darren Duncan

==========

SQL:

     SELECT 1

Muldis D Text:

     function func1 (NNInt <--) {
         1;
     }

or:

     function func1 (Relation <--) {
         Relation:{ { attr1 => 1 } };
     }

==========

SQL:

     SELECT NOW() AS time FROM dual AS duality

Muldis D Text:

     procedure proc1 (Instant &$time) {
         fetch_current_instant( &$time );
     }

or:

     procedure proc1 (Relation &$r) {
         main {
             var Instant $now;
             fetch_current_instant( &$now );
             inn.wrapup( &r => $r, now => $now );
         }
         inner_updater wrapup (Relation &$r, Instant $now) {
             $r := Relation:{ { time => $now } };
         }
     }

==========

SQL:

     SELECT 1 FROM foo LEFT OUTER JOIN bar ON ( foo.col1 = bar.col2 )

Muldis D Text:

     # Note: This example assumes neither of foo|bar have a col named colJ,
       and moreover that foo|bar have no other col names in common. #

     procedure proc1 (Relation &$result) {
         main {
             inn.query( &result => $result, db => $fed.data.db1 );
         }
         inner_updater query (Relation &$result, Database $db) {
             $outer_join = outer_join_with_maybes(
                 primary => ($db.foo{col1=>colJ}),
                 secondary => ($db.bar{col2=>colJ}) );
             $result := static_extension(
                 topic => ($outer_join{}),
                 attrs => Tuple:{ attr1 => 1 } );
         }
     }

or, if we want the result of the outer join itself:

     procedure proc1 (Relation &$result) {
         main {
             inn.query( &result => $result, db => $fed.data.db1 );
         }
         inner_updater query (Relation &$result, Database $db) {
             $result := outer_join_with_maybes(
                 primary => ($db.foo{col1=>colJ}),
                 secondary => ($db.bar{col2=>colJ}) );
         }
     }

or, if we simply want to know if there were matching rows in foo and bar:

     procedure proc1 (Bool &$result) {
         main {
             inn.query( &result => $result, db => $fed.data.db1 );
         }
         inner_updater query (Bool &$result, Database $db) {
             $result := is_not_empty(
                 (($db.foo{col1=>colJ}) join ($db.bar{col1=>colJ})) );
         }
     }

==========

SQL:

     SELECT * FROM foo WHERE name = 'John'

Muldis D Text (with the data to match on hard-coded):

     procedure proc1 (Relation &$result) {
         main {
             inn.selfoo( &result => $result, foo => $fed.data.db1.foo );
         }
         inner_updater selfoo (Relation &$result, Relation $foo) {
             $result := ($foo matching Relation:{ { name => 'John' } });
         }
     }

or, with the data to match on in a bind var:

     procedure proc1 (Relation &$result, Text $name) {
         main {
             inn.selfoo( &result => $result,
                 foo => $fed.data.db1.foo, name => $name );
         }
         inner_updater selfoo
                 (Relation &$result, Relation $foo, Text $name) {
             $result := ($foo matching Relation:{ { name => $name } });
         }
     }

or:

     procedure proc1 (Relation &$result, Relation $filter) {
         main {
             inn.selfoo( &result => $result, foo => $fed.data.db1.foo,
                 filter => $filter );
         }
         inner_updater selfoo
                 (Relation &$result, Relation $foo, Relation $filter) {
             $result := ($foo matching $filter);
         }
     }

==========

SQL:

     SELECT COUNT(*) FROM foo
     WHERE name = 'John' AND ( title = 'Mr' OR abbrev = 'Dr' )

Muldis D Text:

     procedure proc1 (NNInt &$count) {
         main {
             inn.selfoo( &count => $count, foo => $fed.data.db1.foo );
         }
         inner_updater selfoo (NNInt &$count, Relation $foo) {
             $count := (cardinality of ($foo where F->inn.filter));
         }
         inner_function filter (Bool <-- Tuple $topic) {
             (($topic.name === 'John')
                 and (($topic.title === 'Mr') or ($topic.abbrev === 'Dr')));
         }
     }

==========

SQL:

     SELECT COUNT(DISTINCT(*)) FROM foo
     WHERE ( name = 'John' AND title = 'Mr' ) OR abbrev = 'Dr'

Muldis D Text:

     # Note: Above DISTINCT not translated as all ops are set-based. #

     procedure proc1 (NNInt &$count) {
         main {
             inn.selfoo( &count => $count, foo => $fed.data.db1.foo );
         }
         inner_updater selfoo (NNInt &$count, Relation $foo) {
             $count := (cardinality of ($foo where F->inn.filter));
         }
         inner_function filter (Bool <-- Tuple $topic) {
             ((($topic.name === 'John') and ($topic.title === 'Mr'))
                 or ($topic.abbrev === 'Dr'));
         }
     }

==========

SQL:

     SELECT foo, bar, baz FROM foo ORDER BY bar, baz DESC GROUP BY 1,3,2

Muldis D Text:

     # Note: Above GROUP BY not translated as it's a no-op. #

     procedure proc1 (Array &$result) {
         main {
             inn.selfoo( &result => $result, foo => $fed.data.db1.foo );
         }
         inner_updater selfoo (Array &$result, Relation $foo) {
             $result := (array from ($foo{foo,bar,baz})
                 ordered using F->inn.sortf);
         }
         inner_function sortf (Order <-- Tuple $topic, Tuple $other) {
             Order_reduction( Array:[
                 ($topic.bar <=> $other.bar),
                 ($other.baz <=> $topic.baz)
             ] );
         }
     }

==========

SQL:

     SELECT * FROM ( SELECT 1 ) AS foo

Muldis D Text:

     function func1 (Relation <--) {
         $foo = Relation:{ { attr1 => 1 } };
         $'' = $foo;
     }

==========

SQL:

     INSERT INTO foo ( col1, col2 ) VALUES ( 1, 3 )

Muldis D Text (with data to insert hard-coded):

     procedure proc1 () {
         main {
             inn.do_insert( &foo => $fed.data.db1.foo );
         }
         inner_updater do_insert (Relation &$foo) {
             assign_insertion( &r => $foo,
                 t => Tuple:{ col1 => 1, col2 => 2 } );
         }
     }

or, with the data to insert as a bind var:

     procedure proc1 (type.tuple_from.var.fed.data.db1.foo $t) {
         assign_insertion( &r => $fed.data.db1.foo, t => $t );
     }

==========

SQL:

     INSERT INTO foo ( col1, col2 ) VALUES ( 1, 3 ), ( 2, 4 )

Muldis D Text (with data to insert hard-coded):

     procedure proc1 () {
         main {
             inn.do_insert( &foo => $fed.data.db1.foo );
         }
         inner_updater do_insert (Relation &$foo) {
             assign_union( &topic => $foo,
                 other => Relation:[col1,col2];{ [1,3], [2,4] } );
         }
     }

or, with the data to insert as a bind var:

     procedure proc1 (type.var.fed.data.db1.foo $r) {
         assign_union( &topic => $fed.data.db1.foo, other => $r );
     }

==========

SQL:

     UPDATE foo SET col1 = 1

Muldis D Text (with data to substitute hard-coded):

     procedure proc1 () {
         main {
             inn.do_update( &foo => $fed.data.db1.foo );
         }
         inner_updater do_update (Relation &$foo) {
             assign_static_substitution(
                 &topic => $foo, attrs => Tuple:{ col1 => 1 } );
         }
     }

or, with the data to substitute as a bind var:

     procedure proc1 (Tuple $attrs) {
         assign_static_substitution(
             &topic => $fed.data.db1.foo, attrs => $attrs );
     }

or:

     procedure proc1 (Int $col1) {
         main {
             inn.do_update( &foo => $fed.data.db1.foo, col1 => $col1 );
         }
         inner_updater do_update (Relation &$foo, Int $col1) {
             assign_static_substitution(
                 &topic => $foo, attrs => Tuple:{ col1 => $col1 } );
         }
     }

==========

SQL:

     UPDATE foo SET col1 = 1, col2 = 6

Muldis D Text (with data to substitute hard-coded):

     procedure proc1 () {
         main {
             inn.do_update( &foo => $fed.data.db1.foo );
         }
         inner_updater do_update (Relation &$foo) {
             assign_static_substitution(
                 &topic => $foo, attrs => Tuple:{ col1 => 1, col2 => 6 } );
         }
     }

or, with the data to substitute as a bind var:

     procedure proc1 (Tuple $attrs) {
         assign_static_substitution(
             &topic => $fed.data.db1.foo, attrs => $attrs );
     }

or:

     procedure proc1 (Int $col1, Int $col2) {
         main {
             inn.do_update( &foo => $fed.data.db1.foo,
                 col1 => $col1, col2 => $col2 );
         }
         inner_updater do_update (Relation &$foo, Int $col1, Int $col2) {
             assign_static_substitution( &topic => $foo,
                 attrs => Tuple:{ col1 => $col1, col2 => $col2 } );
         }
     }

==========

SQL:

     DELETE FROM foo WHERE col1 = 10

Muldis D Text (with data to filter by hard-coded):

     procedure proc1 () {
         main {
             inn.do_delete( &foo => $fed.data.db1.foo );
         }
         inner_updater do_delete (Relation &$foo) {
             assign_semidifference( &source => $foo,
                 filter => Relation:{ { col1 => 10 } } );
         }
     }

or, with the data to filter by as a bind var:

     procedure proc1 (Relation $filter) {
         assign_semidifference(
             &source => $fed.data.db1.foo, filter => $filter );
     }

or:

     procedure proc1 (Int $col1) {
         main {
             inn.do_delete( &foo => $fed.data.db1.foo, col1 => $col1 );
         }
         inner_updater do_delete (Relation &$foo, Int $col1) {
             assign_semidifference( &source => $foo,
                 filter => Relation:{ { col1 => $col1 } } );
         }
     }

==========

SQL:

     INSERT INTO foo ( col1, col2 ) SELECT col1, col2 FROM bar;

Muldis D Text (if bar has more than 2 attributes):

     procedure proc1 () {
         main {
             inn.do_insert( &foo => $fed.data.db1.foo,
                 bar => $fed.data.db1.bar );
         }
         inner_updater do_insert (Relation &$foo, Relation $bar) {
             assign_union( &topic => $foo, other => ($bar{col1,col2}) );
         }
     }

or, if bar has only the 2 attributes:

     procedure proc1 () {
         assign_union( &topic => $fed.data.db1.foo,
             other => $fed.data.db1.bar );
     }

==========