SPUG: Parsing (or at least matching) C function prototypes

Bill Campbell bill at celestial.com
Mon Mar 1 19:52:20 PST 2010


On Mon, Mar 01, 2010, Michael R. Wolf wrote:
> In order to create an instrumentation layer in his C code, a friend  
> wants to extract function prototypes from his C code.
>
> His first try worked with regular expressions.
>
> And now his second try also works with regular expressions.
>
> His solution may be "good enough", but on the (likely chance) that he  
> calls me again, does anyone know of code that could help *parse* (or at 
> least recognize with mo' bettah regexp's) his C code?

The attached perl script is based on a shell script in the book
``Portable C and Unix System Programming'' by J.E. Lapin.  I
originally wrote this as an exercise, since I really didn't
understand the incredibly complex ``sed'' scripts in his version.

This depends on some defines and minor modifications to the C
code: global variables and functions are marked ``public'' and
static ones ``private''.  The script then creates a declarations
file (.x) containing the appropriate ``extern'' lines for each .c
file in its arguments.  The defines, which may be generated with
the -d option, are basically these:

#define public
#define PUBLIC
#define private static
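
As a rough sketch of what a marked-up source file might look like
(the file name counter.c and the identifiers step, counter and bump
are made up for illustration and do not come from the book or the
script):

```c
/* counter.c -- hypothetical module marked up for cx */
#ifndef public
#  define public            /* expands to nothing: external linkage */
#  define PUBLIC
#  define private static    /* file-local linkage */
#endif

private int step = 1;            /* not exported, so cx ignores this line */

public int counter = 0;          /* exported global variable */

public int bump(int times)       /* exported function, header on one line */
{
    int i;

    for (i = 0; i < times; i++)
        counter += step;
    return counter;
}
```

Reading the script below, running it on such a file should produce a
counter.x with lines along the lines of ``extern int counter;'' and
``extern int bump(int);''.  Since only lines that start with
``public'' are picked up, the whole function header has to sit on
one line.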

Back in the day when I was doing a lot of C programming, I used
this extensively, and it made my life much simpler.

Please no harsh comments about my ancient perl script which dates
back well over 20 years.  It's CBE (Crude But Effective).

If anybody is interested in the original cx script, it starts on
page 213 of the Lapin book.

Bill
-- 
INTERNET:   bill at celestial.com  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:          (206) 236-1676  Mercer Island, WA 98040-0820
Fax:            (206) 232-9186  Skype: jwccsllc (206) 855-5792

The Constitution is a written instrument.  As such, its meaning
does not alter.  That which it meant when it was adopted, it
means now.  -- SOUTH CAROLINA v. US, 199 U.S. 437, 448 (1905)
-------------- next part --------------
:
#!/csrel25/bin/perl
eval "exec /csrel25/bin/perl -S $0 $*"
	if $running_under_some_shell;

# $Header: /vol/cscvs/lbin/cx,v 2.11 1997/03/01 22:25:59 bill Exp $
# $Date: 1997/03/01 22:25:59 $
# @(#) $Id: cx,v 2.11 1997/03/01 22:25:59 bill Exp $

( $progname = $0 ) =~ s!.*/!!; # save this very early

$USAGE = "
#
#   Usage: $progname [-v] options cfile [cfile...]
#
# Options   Argument    Description
#   -d                  Generate defines for public/private
#   -v                  Verbose
#
";


sub usage {
	die join("\n", @_) .
	"\n$USAGE\n";
}

do "getopts.pl";

&usage("Invalid Option") unless do Getopts("dvV");

$suffix = ($verbose = ($opt_v || $opt_V)) ? '' : $$;

# $< = $>;	# make it ignore taintedness

$\ = "\n";	# terminate lines automatically with newline

if($#ARGV < 0) {
	print "$USAGE -- No arguments";
	exit(1);
}
$tmpfile = "/tmp/cx$suffix";	# temporary file
print STDERR "Tmpfile = $tmpfile" if($verbose);

while($_ = shift) {
	print STDERR "file = $_" if($verbose);
	($file = $_) =~ s/\.c$//;
	$cfile = $file . '.c';
	if ( ! -r $cfile ) {
		# cannot find a .c file for $file.
		print "$progname: $cfile not found";
		exit(1);
	}
	($hfile = $file) =~ s!.*/(.*)$!$1!;	# strip directory, keep basename
	$hfile .= '.h';
	if ( -r $hfile ) {
		# include the .h file with the same name as the .c file
		$incfile = "#include \"$hfile\"";
	}
	else {
		$incfile = "/* There is no $file.h file */";
	}
	$xfile = $file . '.x';
	open(INPUT, 
		"egrep '^[ \t]*public|^#[ \t]*if|^#[ \t]*else|^#[ \t]*endif' $cfile |");
	open(OUT, "> $tmpfile");	# open output file
	select(OUT);
	if ($opt_d) {
		print "#ifndef public	/*{*/";
		print "#\tdefine public";
		print "#\tdefine PUBLIC";
		print "#\tdefine private static";
		print "#endif /* } public */\n";
	}
	print "\t/* $xfile -- declarations file for module $file */";
	print $incfile;
	line: while(<INPUT>) {
		chop;
		rescan: if(/^#\s*if/) {
			$if_line = $_;					# save this line
			chop($_ = <INPUT>);					# get the next line
			print STDERR "test1 >$_<" if($verbose);
			next line if(/^#\s*endif/);	# skips if...endif
			if(/^#\s*else/) {			# may be if...else...endif
				$else_line = $_;
				chop($_ = <INPUT>);
				print STDERR "test2 >$_<" if($verbose);
				next line if(/^#\s*endif/);	# skips if...else...endif
				print STDERR "else test failed" if($verbose);
				print $if_line;
				print $else_line;
				goto rescan;
			}
			print $if_line;					# if line matched
			goto rescan;
		}
		if(/public/) {
			# Change public to extern
			s/public/extern/;
			# Remove function parameter lists.
			unless ( /\bARG[1-9]/ || /\b_PROTO/ || /\b_VARARGS/ ) {
				print STDERR "input = >$_<" if($verbose);
				s/\((..*)\)/&get_protos($1)/e;
				s/\(\)/(void)/g;
				print STDERR "output = >$_<" if($verbose);
			}
			# Remove leftmost array dimension.
			if (/\[/) {
				s/]/]CX/;
				s/\[.*CX/[]/;
			}
			# Remove anything trailing a semicolon
			s/;.*/;/;
			# Remove Variable Initialization
			s/\s*=.*/;/;
		}
		s/;*$/;/ unless /^#/;	# make sure each line is terminated
		s/\s+/ /g;				# replace multiple whitespace.
		print STDERR "output = >$_<" if($verbose);
		print;
	}
	# if the new cx file is identical to the old, don't update the old.
	# if they are different, move the new one onto the old.
	close(INPUT);
	close(OUT);

	if(! -f $xfile || system("cmp -s $xfile $tmpfile > /dev/null")) {
		system("mv $tmpfile $xfile");
	}
	else {
		print STDERR "$xfile is up to date";
		unlink($tmpfile);
	}
}
exit(0);

sub get_protos {
	local($args) = @_;
	local($_);
	local(@out);
	$lpq = '\('; $rpq = '\)';
	# $args =~ s/$rpq\s*$lpq$//;
	print STDERR "args >$args<" if $verbose;
	for (split(/, */, $args)) {
		if (!s/(\S+\s*$lpq\**)\s*\w+\s*($rpq.*)/$1$2/) {
			s/\*([^\*])/* $1/g;	# change 'char *cp' -> 'char * cp'
			s/\s+\S+$//;		# drop arg name
		}
		push(@out, $_);
	}
	'(' . join(', ', @out) . ');';
}
__END__
public int	(*menucall[]) () = {

