[Pdx-pm] off-topic: C90 declaration question

Marvin Humphrey marvin at rectangular.com
Sun Apr 16 09:00:33 PDT 2006


I have a C programming question, and I have two excuses for asking it  
here.  First, I need the answer so that my XS code can be more  
portable and more people can use my CPAN modules.  Second, it relates  
to the op-tree discussion from the March meeting.

In Perl, the "my" keyword signals that a lexical variable needs to be  
allocated.  If there's a "my" declaration anywhere in a block, the op- 
tree has an op dedicated to allocating space for that lexical.  If  
you eject from the block via return/next/etc. before the allocation  
op, the allocation never happens.

If you are operating under "use strict;" and you use a fully- 
qualified package global variable, the allocation op is still there  
-- Perl understands that it needs to allocate space for a variable  
before it can be used.  The same implicit allocation happens for any  
global variable when "use strict;" isn't in force.

Under C90, all variable declarations have to occur at the top of a  
block.  If you try to compile a C program that contains this...

     meaning_of_life() {
         have_fun();      /* "code" */
         int i;           /* declare a variable, after "code" */
         i = 42;
         return i;
     }

... and you pass the "-pedantic" flag to the gcc compiler so that it  
warns about non-C90-compliant code, you get this warning:

     meaning.c: In function 'meaning_of_life':
     meaning.c:23: warning: ISO C90 forbids mixed declarations and code

That example is perfectly legal under the later C99 standard, but it  
doesn't fly under C90 and there are a lot of compilers out there that  
choke on it.
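
One workaround that stays inside C90 is to open a nested block, since every block gets its own declaration section at the top.  A minimal sketch, assuming have_fun() is a hypothetical stub standing in for arbitrary executable code (the fun_calls counter is likewise just for illustration):

```c
/* have_fun() is a hypothetical stand-in for arbitrary executable code. */
static int fun_calls = 0;

static void have_fun(void) {
    fun_calls++;
}

int meaning_of_life(void) {
    have_fun();          /* "code" */
    {                    /* a nested block opens a new declaration section */
        int i;           /* declared first thing in its own block */
        i = 42;
        return i;
    }
}
```

The declaration of i now comes before any statement in its own block, so this form should not trigger the mixed-declarations warning.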

I understand that by imposing this constraint the creators of C were  
trying to make life easy for compiler writers.  In my head, I imagine  
an op-tree that has all the variable allocation routines up front.   
"We'll tell the compiler about all the variables we might possibly  
need in this block before we do anything with any of them."

     meaning_of_life() {
         int i, j; /* nuthin' but allocation ops here */

         i = 41;   /* NOW we begin executable code */
         j = 1;
         return i + j;
     }

However, that model gets messier when you consider that C90 allows  
you to "initialize" variables at the same time you declare them.

     meaning_of_life() {
         int i = 41;  /* not "code", according to gcc */
         int j = 1;

         have_fun();  /* start executable code */
         return i + j;
     }

That's a little harder for the compiler.  It has to figure out that i  
and j need space, and also that they need to be assigned specific  
values.  However, that's not a big deal if we're only assigning  
constant values which are known at compile-time, right?  I'm still  
imagining an "initialization phase" at the beginning of the block  
that doesn't have to pay any attention to the state of the program...

Nope, C90 is more liberal than that.  Declaration statements are  
executed in order, and they can refer to earlier values.

     meaning_of_life(int opinion) {
         int meaning = opinion;
         int i       = meaning - 1;  /* THIS isn't "code"?! */
         int j       = meaning - 41;

         have_fun();  /* start "code" */
         return i + j;
     }

That works fine -- or at least gcc doesn't complain.  I'm surprised.   
The idea of a monolithic, stateless init phase at the start of each  
block has gone out the window.  Now I'm imagining an op tree a lot  
like Perl's.

     1: allocate meaning
     2: assign value of opinion to meaning
     3: perform subtraction and store result in register
     4: allocate i
     5: assign register value to i
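
To make that ordering concrete, here is a minimal sketch, assuming a hypothetical helper note() that records each value handed to it (note, trace and n_noted are names invented for this illustration).  Each initializer runs when its declaration is reached, so the trace should come out in source order.

```c
/* note() is a hypothetical helper: it logs each value it is handed. */
static int trace[3];
static int n_noted = 0;

static int note(int value) {
    trace[n_noted++] = value;
    return value;
}

int meaning_of_life(int opinion) {
    int meaning = note(opinion);       /* runs first */
    int i       = note(meaning - 1);   /* runs second, reads meaning */
    int j       = note(meaning - 41);  /* runs third */

    return i + j;
}
```

A single call to meaning_of_life(42) leaves 42, 41, 1 in trace, in that order, and returns 42.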

It gets murkier.  Seemingly, you can even do this under C90, as gcc  
doesn't complain:

     meaning_of_life() {
         int i = function_which_returns_forty_one(); /* NOT "code"?! */
         int j = 1;
         return i + j;
     }

BUT... you *can't* do this:

     meaning_of_life() {
         int i;
         i = function_which_returns_forty_one();     /* "code" !! */
         int j = 1;      /* BZZT! Too late to declare a variable. */
         return i + j;
     }

... and that's where my bafflement becomes total.  Tell me, how is  
that any more difficult for the compiler writer than the previous  
example?

Here's my question: Can the right hand side of a C90 declaration/ 
initialization contain an arbitrarily complex expression?

Marvin Humphrey
Rectangular Research
