[kw-pm] how to *delete* all "#if 0" content from a source tree

fishbot eric at uc.org
Sun Mar 18 08:36:29 PDT 2007


I've used Text::Balanced (which is core, incidentally, though I
forget the as-of) at $work to great effect, though understanding
the API can be a challenge, particularly if you want to do more
than basic things.

Adding to the trickiness here is that you actually want to track
all the #if * ... #endif blocks, but you are only interested in
the ones starting with #if 0.  That is, you have to track all the
nestings, then prune that list.  And the interesting #if 0s might
be nested in uninteresting #ifs, or vice-versa.

Also, modern compilers let you have whitespace before the '#' and
between the '#' and the 'if' and 'endif'.  Plus #endif can be for
#ifs or #ifdefs, etc.

So, it's not a trivial challenge.  In theory, Text::Balanced
extract_tagged/get_extract_tagged, but damned if I can get it to
work at all.

I rolled my own, I've tested it... just prints the before and
after right now, but making it replace files, recurse directories
is just a SMOP.

Code review/opinions welcome.  It's just a Sunday morning
coffee-hack, so it doesn't handle extreme cases.  It's sloppy
with what it accepts as #ifs.  Sort of cool is that if you decide
you want to expunge #ifdef DEBUG etc. as well, it's just a matter
of altering is_interesting().

fishbot

---- original message : 2007-03-17 12:13pm : Richard Dice ----

> This isn't particularly a Perl question, is it?
>
> That said, if you're interested in doing this with Perl, you should look
> into Damian Conway's module called Text::Balanced.
>
> http://search.cpan.org/~dconway/Text-Balanced-v2.0.0/lib/Text/Balanced.pm
>
> This only identifies the nested #if 0 ... #endif blocks. Traversing the
> source tree, keeping what you what in the files, editing and saving the
> files in place, etc., is all up to you.  But these are pretty easy to do
> with Perl (plus some standard Unix shell tools like "find" and "xargs") too.
> :-)
>
> Cheers,
> Richard
>
> On 3/17/07, Robert P. J. Day <rpjday at mindspring.com> wrote:
> >
> >
> >   is there a convenient way to, throughout an entire source tree,
> > delete all content in all source or header files that's been
> > de-activated with an "#if 0"?  that would, of course, include the "#if
> > 0" line and all content up to and including the terminating "#endif"
> > (which might have some extraneous comment content after it).
> >
> >   thoughts?
> >
> > rday
> > _______________________________________________
> > kw-pm mailing list
> > kw-pm at pm.org
> > http://mail.pm.org/mailman/listinfo/kw-pm
> >
>
-------------- next part --------------
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

# global defs
my $ws     = qr/[ \t]/;  # permitted whitespace
my $opener = qr/^$ws*#$ws*if\w*\s+[^\n]+\n?/ms;
my $closer = qr/^$ws*#$ws*endif[^\n]*\n?/ms; 

clean( shift );

# main: takes filename
sub clean 
{
   my $filename = shift;
   open my $fh, "<", $filename
      or die( "couldn't open '$filename': $!" );
   my $text = do { local $/; <$fh>; };

   print "INPUT:\n";
   print $text;
   print "\nOUTPUT:\n";
   print out( parse( toke( $text )));
}

# toke: takes text, returns marked tokens
sub toke
{
   my $text = shift;
   my @chunks = split m/($opener|$closer)/, $text;
   my @seg;

   for ( @chunks )
   {
      next unless defined;
      my $type = m/^$opener/ ? 'opener' :
                 m/^$closer/ ? 'closer' :
                 'text';
      push @seg, { type => $type, text => $_ };
   }
   
   return \@seg;
}

# parse: takes named tokens, builds nested tree
sub parse
{
   my $tokes = shift;
   my $stack = [];
   my $root = { data => [] };
   push @$stack, $root;
   
   for my $t ( @$tokes )
   {
      if ( $t->{type} eq 'text' )
      {
         push @{$stack->[$#$stack]->{data}}, $t;
         next;
      }

      if ( $t->{type} eq 'closer' )
      {
         $stack->[$#$stack]->{closer} = $t->{text};
         pop @$stack;
         next;
      }

      my $new = { 
         type   => 'nest',
         data   => [],
         opener => $t->{text},
         closer => undef,
         };
      push @{$stack->[$#$stack]->{data}}, $new;
      push @$stack, $new;
   }
   return $root;
}

# is_interesting: takes node, returns boolean
sub is_interesting
{
   my $node = shift;
   return if $node->{opener} =~ m/^$ws*#$ws*if$ws+0$ws*$/;
   return 1;
}

# out: takes node root, returns string, uses is_interesting
sub out 
{
   my $root = shift;
   my $out = "";

   for ( @{$root->{data}} )
   {
      if ( $_->{type} eq 'text' )
      {  
         $out .= $_->{text};
      } else {
         # recurse
         $out .= $_->{opener} . out( $_ ) . $_->{closer}
            if is_interesting( $_ );
      }
   }

   return $out;
}
__END__

INPUT:
# if 1

#if 0
foo
bar
#if foo
 # ifdef   0
  #endif
bar
#endif
const char * foo = "#endif";
#endif
quux
# if 0
  #endif
  # endif

fnord #if 0
fnord #endif
foo

OUTPUT:
# if 1

quux
  # endif

fnord #if 0
fnord #endif
foo


More information about the kw-pm mailing list