SPUG: FW: function source grepping tool

Tue Jun 13 21:48:05 CDT 2000

> -----Original Message-----
> From:	Patterson, David S (David) 
> Sent:	Tuesday, June 13, 2000 7:41 PM
> To:	Dubuc, Paul M (Paul)
> Subject:	RE: function source grepping tool
> 
> Ok, I spent about an hour on this.  Modified a search/replace utility I
> had to isolate blocks of code (complete functions, in your case) and then
> search for items within those blocks.
> 
> Here is the utility:  
> 
>  <<findblock.txt>> 
> I did not clean up the help page to reflect all the tweaking I did to the
> original utility program 'findtext'.
> 
> Here's how you would use it:
> 
> 1)  To list all the files, and functions within the files containing the
> string MY_QUEST, run the program in the file's directory (or a directory
> above it) as follows
> 
> findblock  -s '*.c' -D '^\w+ *\(.+?^{.+?^}' 'MY_QUEST'
> 
> 2) To just list the function names, pipe the output through grep, as
> follows:
> 
> findblock  -s '*.c' -D '^\w+ *\(.+?^{.+?^}' 'MY_QUEST' | grep '^<'
> 
> The -D option would have to be modified in the case of C++ code to allow
> for colons in the function name.
> 
> findblock  -s '*.cpp' -D '^[\w:]+ *\(.+?^{.+?^}' 'case' | grep '^<'
> ((I think, --didn't test this one))
> 
> Note that the -R (replace) option does not work as described in the help
> text (due to the modifications for this problem).
> It does work, however, in the original utility.  Let me know if you are
> interested in having a copy of it.
> 
> Best regards,
> 
> Dave Patterson
> 
> 
> -----Original Message-----
> From:	Dubuc, Paul M (Paul) 
> Sent:	Monday, June 12, 2000 12:31 PM
> To:	Patterson, David S (David)
> Subject:	Re: function source grepping tool
> 
> Yes the functions are all ANSI C/C++.
> 
> Paul
> 
> "Patterson, David S (David)" wrote:
> 
> > I have a tool I wrote a long time ago that allows full perl regexp
> > searches or search/replaces through files in an entire directory tree.
> > I've often had situations where I wanted to extend it in such a way
> > that the search pattern for locating the block of interest was separate
> > from the search pattern I used to do the search/replace of text.  I
> > wrote this tool (called findtext) in perl 4.  Perl 4 is missing a lot of
> > the sophistication in the regular expression syntax that perl 5 has.  It
> > might be interesting to upgrade it (if necessary) to do this job.
> >
> > Question:  are your functions of the following form?
> >
> > type
> > function_name (param list...
> >                         ...)
> > {
> >    function body
> >    ...
> > }
> >
> > If not, how are they formatted?
> >
> >
> > > -----Original Message-----
> > > From: Dubuc, Paul M (Paul)
> > > Sent: Monday, June 12, 2000 10:12 AM
> > > To:   Patterson, David S (David)
> > > Subject:      Re: function source grepping tool
> > >
> > > > I've been distracted from the problem by work demands, but I haven't
> > > > really seen a good solution.  I would be glad to read yours.  Thanks.
> > >
> > > Paul
> > >
> > > "Patterson, David S (David)" wrote:
> > >
> > > > > Did you find a solution to this?  I had several ideas about this
> > > > > (and probably one good one).  Let me know if you're still
> > > > > interested....
> > > >
> > > > Best regards,
> > > >
> > > > >       ---
> > > > >                 "To do great important tasks, two things are
> > > > > necessary, a plan and not enough time..."
> > > >
> > > > >       David Patterson
> > > > >       Software Engineer
> > > > >       Lucent Technologies *
> > > > >       6464 185th Ave NE
> > > > >       Redmond, WA  98052-6736
> > > > >       425-558-8008 x 2172
> > > > >       888-501-4835 Pgr
> > > > >       davidpa at lucent.com
> > > > >
> > >
> > > --
> > > Paul M. Dubuc                     (614) 860-7836 (voice & fax)
> > > Lucent Technologies               dubuc at lucent.com
> > > Rm. 3s319                         http://waterworks.cb.lucent.com/pmd/
> > > 6200 E. Broad St.
> > > Columbus, OH 43213-1569
> > >
> 
> --
> Paul M. Dubuc                     (614) 860-7836 (voice & fax)
> Lucent Technologies               dubuc at lucent.com
> Rm. 3s319                         http://waterworks.cb.lucent.com/pmd/
> 6200 E. Broad St.
> Columbus, OH 43213-1569
> 
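
The block-isolation step described in Dave's message can be tried without the full utility.  What follows is only a minimal sketch, assuming the same -D pattern from his first usage example and the same /gsm match that the attached script applies to each file.  The sample C text and the helper name functions_containing are hypothetical, made up for illustration.  The braces are escaped here because some newer perl releases warn about or reject an unescaped literal brace in a pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample source; stands in for the contents of one *.c file.
my $source = <<'END_C';
int
first_func (int a)
{
  return a + MY_QUEST;
}

void
second_func (void)
{
  do_nothing ();
}
END_C

# Same block pattern as the -D argument in the message, with braces escaped.
my $block_re = '^\w+ *\(.+?^\{.+?^\}';

# Hypothetical helper: return the name of every function whose block
# (name line through closing brace) matches $search_re.
sub functions_containing {
    my ($text, $search_re) = @_;
    my @names;

    # /g collects every block, /s lets "." cross newlines,
    # /m lets "^" match at the start of each line.
    foreach my $block ($text =~ /$block_re/gsm) {
        next unless $block =~ /$search_re/;
        my ($name) = $block =~ /^(\w+)/;
        push @names, $name;
    }
    return @names;
}

print join("\n", functions_containing($source, 'MY_QUEST')), "\n";
```

With the sample text above this prints first_func only.  As in the original pattern, the match starts on the line holding the function name, so a return type on its own line is not part of the block, and a function whose closing brace is indented would not be isolated correctly.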
-------------- next part --------------
#!/opt/exp/bin/perl
##  findtext   by Dave Patterson   12 May 94
##      Ver 2.1 02 Sep 94
##      Ver 2.2 15 Jun 95 -- Added -s, -L switches
##      Ver 2.3 14 Aug 95 -- Refined DOS vs whole word search styles
##      Ver 3.0 14 Sep 98 -- Repair -P bugs
##
## NAME
##      findtext - displays filename and lines containing matches to
##      RegExpr  in  text files in current or deeper subdirectories.
##      Optionally replaces matches with a replacement text string.
##
## SYNTAX
##      findtext [-s "FileExpr"] [-i] [[-S] "TextSearchExpr"] [-I]
##               [-R "TextReplExpr"] [-P "TextReplSwitches"] 
##               [-acEefIiLsvWw] [-l n] [-D "InputRecSep"] [-r [-C|-b]]
##      [-?]
##
## DESCRIPTION
##      findtext will search text files for a match to  the  Regular
##      Expression in the current and deeper subdirectories.  Output
##      consists of a relative path/filename followed by lines  con-
##      taining  matches  to  the  regular expression prepended with
##      "<".
##
##      findtext only searches text files and skips  any  file  that
##      appears  to  have  binary  or  non-text  contents.  findtext
##      searches for any string match by default, but whole words or
##      arbitrarily complex regular expressions  can be searched for
##      using the -w, -W, -e, or -E options.
##
##      In  DOS-style  search  expressions,  /? + */  are treated as
##      /.? .+ .*/,  other nonalphanumerics are treated as literals,
##      and expressions are assumed to be right & left-justified.
##
##      findtext optionally replaces matched strings with a replace-
##      ment  text string.  Options include auto-confirmation and/or
##      auto-backup of changed files.
##
##
## OPTIONS
##      -a Alphabetize (sort) file listing before searching.
##
##      -b backup modified files.  (Can't be used with -f)
##
##      -c set file search pattern to "*.[Cch]"
##
##      -C confirm  replacements.  Program  prompts "Confirm: ([y]/n/f/a)"
##         where:
##         y - confirm change,
##         n - do not make change,
##         f - confirm change for remainder of current file,
##         a - confirm all further changes.
##
##      -d dry run replacements (display but don't actually write to
##         file proposed changes)
##
##      -D Specify alternate input record delimiter (separator).
##         Default is \n (line by line).
##
##      -e file search pattern is an extended regular expression
##         (SED- or PERL-style.  Default is DOS-style).
##
##      -E text search pattern is an extended regular expression
##         (SED- or PERL-style.  Default is DOS-style).
##
##      -f force processing of backup files.  Normally backup and
##         RCS  files  (*,n  &  *,v) are not processed.
##         (Can't be used with -b)
##
##      -i ignore case during file search.
##
##      -I ignore case during text search.
##
##      -l levels of directories to process.  Don't descend deeper
##         than this number (default: no limit)
##
##      -L follow (descend into) soft-linked directories.
##         (warning: infinite loops are possible in this mode)
##
##      -o Override read-only flag on file (try to, anyway).
##
##      -P "Expr" text replace switches (any of {geio}).  Causes the following
##         evaluation:  s/{-S "Expr"}/{-R "Expr"}/{-P "Expr"}.  Assumes that
##         s/TextSearchExpr/TextReplExpr/TextReplSwitches constitutes a fully 
##         specified perl extended regular expression.  Any valid perl search 
##         expression is allowed.  Runs slower than when using the built-in
##         search/replace feature, but gives you more options.  See the
##         perl manual for syntax.
##
##      -q Quiet mode; only list file names with text matches.
##         Skips the diff listing.
##
##      -s "Expr" file search pattern.
##
##      -S "Expr" text search pattern. ("Expr" alone implies -S)
##
##      -R "RepStr" replacement string for text matching text search pattern.
##
##      -v verbose mode shows search matches and replaced lines, if appl.
##
##      -w Search for whole word matches for file name pattern.
##         (Note: do not use ^ or $ anchors in pattern with this switch).
##
##      -W Search for whole word matches for text search pattern.
##         (Note: do not use ^ or $ anchors in pattern with this switch).
##
##      -? show syntax (also -h or -H)
##
## CAVEATS
##      Quotes around RegExpr, RepStr are required if not a simple
##      alphanumeric text string.
##
## EXAMPLES
##      findtext -s zoo -S dogs # Find all files whose names
##                                contain the string "zoo" and search for
##                                the string "dogs".
##
##      findtext -S dogs -E     # Find all files containing string "dogs".
##
##      # Find files ending in ".c" and then find the word dogs and
##      # replace with word cats:
##
##      findtext -s "*.c" -S dogs -R cats -W
##
## SEE ALSO
##      Findfile (1), renfiles (1)
##

# Default values:

$NLEVELS = 9999; # Max levels

# Process command line args:

if (@ARGV)
{
  foreach $ARG (@ARGV)
  {
    if ($DD == 2)
    {
      $DD = 1;

      $IRD = $ARG;
    }
    elsif ($L == 2)
    {
      $L = 1;

      $NLEVELS = $ARG;
    }
    elsif ($PP == 2)
    {
      $PP = 1;

      $SWITCHES = $ARG;
    }
    elsif ($S == 2)
    {
      $S = 1;

      $REGEXP = $ARG;
    }
    elsif ($SS == 2)
    {
      $SS = 1;

      $SREGEXP = $ARG;
    }
    elsif ($RR == 2)
    {
      $RR = 1;

      $SREPL = $ARG;
    }
    elsif ($ARG eq "-a")  ## alphabetize (sort) file list
    {
      $A  = 1;
    }
    elsif ($ARG eq "-b")  ## backup changed files
    {
      $B  = 1;
    }
    elsif ($ARG eq "-c")  ## set file search string to "*.[Cch]"
    {
      $C  = 1;
    }
    elsif ($ARG eq "-d")  ## dry run changes
    {
      $D  = 1;
    }
    elsif ($ARG eq "-D")  ## alternate input record delimiter
    {
      $DD = 2;
    }
    elsif ($ARG eq "-C")  ## confirm replacements
    {
      $CC = 1;
    }
    elsif ($ARG eq "-e")  ## file search pattern is ERE
    {
      $E  = 1;
    }
    elsif ($ARG eq "-E")  ## text search pattern is ERE
    {
      $EE = 1;
    }
    elsif ($ARG eq "-f")  ## force processing of backup files
    {
      $F  = 1;
    }
    elsif ($ARG eq "-i")  ## case insensitive file search
    {
      $I  = "i";
    }
    elsif ($ARG eq "-I")  ## case insensitive text search
    {
      $II = "i";
    }
    elsif ($ARG eq "-l")  ## limit depth
    {
      $L  = 2;
    }
    elsif ($ARG eq "-L")  ## follow sym-links
    {
      $LL = 1;
    }
    elsif ($ARG eq "-o")  ## override read-only flag
    {
      $O  = 1;
    }
    elsif ($ARG eq "-P")  ## text search string is a fully specified perl ERE.
    {                     ## Next arg contains search/replace switches.
      $PP = 2;
    }
    elsif ($ARG eq "-q")  ## Quiet mode.  No diff listings.
    {
      $Q  = 1;
    }
    elsif ($ARG eq "-R")  ## text replacement string
    {
      $RR = 2;
    }
    elsif ($ARG eq "-s")  ## file search string
    {
      $S  = 2;
    }
    elsif ($ARG eq "-S")  ## text search string
    {
      $SS = 2;
    }
    elsif ($ARG eq "-v")  ## verbose (same as -d -f switches in findfile)
    {
      $V  = 1;
    }
    elsif ($ARG eq "-w")  ## search for whole words matching file pattern
    {
      $W  = 1;
    }
    elsif ($ARG eq "-W")  ## search for whole words matching text pattern
    {
      $WW = 1;
    }
    elsif ($ARG =~ /^\-/)
    {
      $H  = 1;
    }
    else
    {
      $SREGEXP = $ARG;
    }
  }
}

if ($H || $NLEVELS < 1 || $RR < 0 || $SS < 0 || $L < 0 ||
   ($B || $CC) && ! $RR || $B && $F)
{
  system ("cat $0 | grep \"^##\" | more");
  exit (-1);
}

# Set up for file search string:
if ($C)
{
  $REGEXP = '.+\.[Cch]';
  $E = 1;
  $W = 1;
}
elsif (! $REGEXP) # Default case is to process all text files:
{
  $REGEXP = '.+';
  $E = 1;
}

if ($E)   # ere style
{
  $RE = $REGEXP;

  if ($W)
  {
    # Make expression valid for whole words only:
    $RE = "(^|[\\W_])" . $RE . "(\$|[\\W_])";
  }
}
else      # dos style
{
  $RE = $REGEXP;

  $DOT = ".";

  # Place a \ in front of all non-alpha chars:
  $RE =~ s/(\W)/\\$1/g;

  # Convert DOS wildcards to ERE wildcards:
  $RE =~ s/\\([*?+])/$DOT$1/g;

  $RE = "^" . $RE . "\$";
}

study $RE;

if ($V)
{
  print "File ERE = /$RE/$I\n";
}

# Set up for text search string:

if (! $SREGEXP) # Default case is to show all text in files:
{
  $SREGEXP = '^';
  $EE = 1;
}

if ($EE || $PP)  # ere style
{
  $SRE = $SREGEXP;
}
else             # dos style
{
  $SRE = $SREGEXP;

  $DOT = ".";

  # Place a \ in front of all non-alpha chars:
  $SRE =~ s/(\W)/\\$1/g;

  # Convert DOS wildcards to ERE wildcards:
  $SRE =~ s/\\([*?+])/$DOT$1/g;
}

if ($WW)
{
  # Make expression valid for whole words only:
  $SRE = '(^|\W)' . $SRE . '($|\W)';
}

study $SRE;

if ($V)
{
  if ($PP)
  {
    print "Text ERE = s/$SRE/$SREPL/$SWITCHES\n";
  }
  elsif ($RR)
  {
    print "Text ERE = s/$SRE/$SREPL/$II\n";
  }
  else
  {
    print "Text ERE = /$SRE/$II\n";
  }
}

$PWD = `pwd`;

chop $PWD;

if ($V)
{
  print "$PWD/\n";
}

$LEVEL = 0;

&Ckdir (".", $PWD);

exit (0);

sub Ckdir
{
  $OLDDIRPATH = $DIRPATH;

  local (@DIRLIST, @DIRLIST0, $DIRPATH,
         $FN, $PARENTDIR, $CURRENTDIR);

  $PARENTDIR  = $_[1];
  $CURRENTDIR = "${PARENTDIR}/${_[0]}";

  if (! chdir ($_[0]))
  {
    print "Error: Couldn't cd down to $_[0]!!!\n";

    return;  # SHOULD never get here, but...
  }

  $LEVEL++;

  opendir (DIR, ".");

  if ($A)  # alphabetize (sort) file names first
  {
    @DIRLIST0 = readdir (DIR);

    @DIRLIST  = sort @DIRLIST0;
  }
  else
  {
    @DIRLIST  = readdir (DIR);
  }

  $DIRPATH = "${OLDDIRPATH}${_[0]}/";

  closedir (DIR);

  foreach $FN (@DIRLIST) # first print files:
  {
    if (! -d $FN)
    {
      if ($I ? $FN =~ /$RE/i : $FN =~ /$RE/)
      {
        # Process file if it is a text file and (file is not a comma version
        # or file is a comma version and "force processing" flag is on):

        if (-T $FN && ($FN !~ /,\w+$/ || $F))  # Text processing section:
        {
          $CF      = 0;
          $MATCH   = 0;
          $FLISTED = 0;
          $FNT     = $FN;  # All reading is done on file $FNT

          if ($RR)  # if replace text option, back up file before opening
                    # it for output:
          {
            if (! -w $FN)
            {
              print "$FN not writeable.  Skipping;\n\n";

              next;
            }

            if (! $D)
            {
              $FNT = &Backup_file ($FN, "", 0);

              $STATUS = open (FO1, ">$FN");

              if ($STATUS != 1)
              {
                die "Error opening $FN for writing.  Exiting;\n\n";
              }
            }
          }

          $STATUS = open (FI1, $FNT);

          undef $/;

          while (<FI1>)  # Loops only once with undef $/...
          {
            @A = ($_ =~ /$IRD/gsm);

            $/ = "\n";

            foreach $_ (@A)
            {
              # print "\n[$_]\n";

              if (! $II ? /$SRE/o : /$SRE/io)
              {
                $MATCH++;

                if (! $FLISTED)
                {
                  print "${DIRPATH}${FN}\n";

                  $FLISTED++;
                }

                if (! $Q)
                {
                  print "< $_\n\n";
                }

                if ($RR)
                {
                  if ($CC)
                  {
                    $ORIG = $_;
                  }

                  if ($PP)  # Fully specified regexp evaluation:
                  {
                    eval "s/$SRE/$SREPL/$SWITCHES";
                  }
                  elsif ($EE)  # if ERE text search string:
                  {
                    $II ? s/$SRE/$SREPL/gio : s/$SRE/$SREPL/go;
                  }
                  else  # Dos-like version
                  {
                    $II ? s/$SRE/$1$SREPL$2/gio : s/$SRE/$1$SREPL$2/go;
                  }

                  if (! $Q)
                  {
                    print "> $_";
                  }

                  if ($CC && ! $CF)
                  {
                    print "Confirm: ([y]/n/f/a) ";

                    $ANS = <STDIN>; chop $ANS;

                    if ($ANS =~ /^[Nn]/)
                    {
                      print "Line not changed.\n\n";

                      $_ = $ORIG;

                      $MATCH--;
                    }
                    elsif ($ANS =~ /^[Ff]/) # No further confirms this file.
                    {
                      $CF = 1;
                    }
                    elsif ($ANS =~ /^[Aa]/) # No further confirms required.
                    {
                      $CC = 0;
                    }
                    else
                    {
                      print "\n";
                    }
                  }
                }
              }

              if ($RR  &&  ! $D)
              {
                print FO1 $_;
              }
            }

            close FI1;

            if ($RR  &&  ! $D)
            {
              close FO1;

              if (($MATCH && ! $B)  ||  ! $MATCH)
              {
                system ("rm -f $FNT"); # kill the backup
              }
            }

            if ($MATCH  &&  ! $Q)
            {
              print "\n";
            }
          }
        }
      }
    }
  }

  foreach $FN (@DIRLIST) # next print directory names:
  {
    if ((-d $FN)  &&  ! ($FN =~ /^\.\.?$/))
    {
      if (-x $FN)
      {
        $AD = "";
      }
      else
      {
        $AD = "  (Access denied)";
      }

      if ($V)  # case where we always want to print the dir:
      {     
        if (-l $FN)
        {
          print "${DIRPATH}${FN}@/${AD}\n";
        }
        else
        {
          print "${DIRPATH}${FN}/${AD}\n";
        }
      }

      # This is where we decide whether to descend into the current dir:
      # We do if: (1) haven't exceeded the max level, and
      #           (2) directory allows us access, and
      #           (3) we are allowed to follow symbolic links

      if ($LEVEL < $NLEVELS  &&  ! $AD  &&  ($LL  ||  ! (-l $FN)))
      {
        &Ckdir ("$FN", "$CURRENTDIR");
      }
    }
  }

  $LEVEL--;

  chdir ("$PARENTDIR");
}

# ############################################################################
#
#   Sub Name:    Backup_file
#
#   Description: Looks for all versions of file name from $FILE_SPEC
#                in $BACKUP_LOC.  Makes a backup of file as follows:
#                Program copies filename to filename,# where # is next highest
#                backup number in sequence.  File name cannot be
#                a wild card.
#
#   Arguments:   NAME         DESCRIPTION
#                $FILE_SPEC   Fully specified file to backup
#                $BACKUP_LOC  Directory to place backup copy, e.g:
#                             "./.backup" - for ./.backup directory.
#                             ""          - for same directory.
#                $NOTIFY      0 - No backup message
#                             n - (n != 0) print backup message
#   Globals:
#
#   Returns:     $BACKUP_FILE                 Name of backup file, -or-
#                <aborts>                     Failure
#
# ############################################################################

sub Backup_file
{
  local ($FILE_SPEC, $BACKUP_LOC, $NOTIFY) = @_;
  local ($BKP_SPEC);       # file name to search for in backup dir.
  local (@FILE);           # list of existing versions of $FILE_NAME
  local ($FILE_NAME);      # Name of file (no dir path)
  local ($FILE_NAME_RX);   # Name of file converted to a perl search string
  local ($HIGH_VN);        # Highest version no. found so far
  local ($HIGH_FN);        # FN of highest version no. found so far
  local ($LATEST_BKP_VN);  # Computed new version no for backup
  local ($LATEST_BKP_FN);  # Fully specified new backup file name
  local ($THIS_FILE_VN);   # Current file version number

  if (@_ != 3)
  {
    die "Error: sub Backup_file argc error!\n";
  }

  if (-l $FILE_SPEC)
  {
    # handle softlinks by backing up file link is connected to:

    $TMP = `ls -l $FILE_SPEC`;

    ($FILE_SPEC) = ($TMP =~ /-> (.+)$/);

    # print "FILE_SPEC is now [$FILE_SPEC]\n";
  }

  # Extract the file name from the file spec:
  ($FILE_NAME) = ($FILE_SPEC =~ /([^\/]+)$/);

  # Create a regular expression version of the file name:
  $FILE_NAME_RX = $FILE_NAME;
  $FILE_NAME_RX =~ s/([\W])/\\$1/g;

  if ($BACKUP_LOC)
  {
    $BKP_SPEC = "$BACKUP_LOC/$FILE_NAME";
  }
  else
  {
    $BKP_SPEC = "$FILE_SPEC";
  }

  $HIGH_VN = -1;
  $HIGH_FN = "${FILE_SPEC}_$$";

  # Create a list of all files matching the filespec in the
  # backup directory:
  @FILE = `ls -d $BKP_SPEC* 2>/dev/null`;

  foreach (@FILE)
  {
    $CURFILE = $_;

    chop $CURFILE;

    # Determine the version number of this file (-1 if none):
    ($THIS_FILE_VN) = ($CURFILE =~ /$FILE_NAME_RX,(\d+)$/);
    $THIS_FILE_VN = (length ($THIS_FILE_VN)  ?  $THIS_FILE_VN  :  -1);

    if ($THIS_FILE_VN > $HIGH_VN)
    {
      $HIGH_VN = $THIS_FILE_VN;
      $HIGH_FN = $CURFILE;
    }
  }

  $LATEST_BKP_VN = $HIGH_VN + 1;

  $LATEST_BKP_FN = "$BKP_SPEC,$LATEST_BKP_VN";

  # print "> FILE_SPEC[$FILE_SPEC] HIGH_FN[$HIGH_FN] LATEST_BKP_FN[$LATEST_BKP_FN]\n";

  if (-d $FILE_SPEC)
  {
    $? = "1";
  }
  else
  {
    system ("cmp -s $FILE_SPEC $HIGH_FN");
  }

  if ($? || "$FILE_SPEC" eq "$HIGH_FN")
  {
    if (-e $FILE_SPEC)
    {
      if (-d $FILE_SPEC)
      {
        $RFLAG = "-r"; # If file is a directory, back it up recursively
        $DFLAG = "directory ";
      }

      if ($NOTIFY)
      {
        print "Backing up $DFLAG$FILE_SPEC to $LATEST_BKP_FN.\n";
      }

      $ERR1 = system ("cp -p $RFLAG $FILE_SPEC $LATEST_BKP_FN");

    }
    else # File doesn't exist so just touch backup file.
    {
      if (-z $HIGH_FN)
      {
        warn "$FILE_SPEC does not exist.  Backup file $LATEST_BKP_FN is empty.\n";
      }
    }

    if ($ERR1)
    {
      warn "\nPROBLEM: Can't copy $FILE_SPEC to $LATEST_BKP_FN\n";
      die  "Exiting.\n\n";
    }
  }
  else # Don't back up if last backup file is identical:
  {
    if ($NOTIFY)
    {
      print "Identical file $HIGH_FN already exists.\n";
    }

    $LATEST_BKP_FN = $HIGH_FN;
  }

  return $LATEST_BKP_FN;
}