Phoenix.pm: Parsing comments

Thomas Whitney whitneyt at agcs.com
Wed Nov 28 17:03:46 CST 2001


Thomas Whitney wrote:

> Thanks Tim,
>
> I found out later from the guy I was helping that the question I posted was only part of the problem. This tool needs to go through, find the
> three types of comments, and replace the comment text with a unique string, then save the part that was replaced. Comments of the same
> type cannot be nested, and it is guaranteed not to have any weird mismatched comment brackets. But there can be empty comments, and it needs
> to keep track of those. My solution is listed below in case anybody is interested. Please let me know if anybody has a better way to do it.
>

A little better (the eval block is not needed, and the test line now includes an empty comment):

my $p = 1;
my $repl = 'STR';
my @save = ();
my $line = "/*co{mm}ent*/ not {another comment} {} not (*one more comment*) not a comment";
$line =~ s/(\/\*)(.*?)(\*\/)|(\(\*)(.*?)(\*\))|(\{)(.*?)(\})/
             push @save, $2||$5||$8||'' ; ($1||$4||$7).$repl.$p++.($3||$6||$9)
          /xesg;

print "$line\n";
print "[$_]\n" for @save;
exit;
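
One caveat with the $2||$5||$8||'' idiom: a comment whose text is just "0" is false in Perl, so it would be saved as an empty string. A minimal sketch of one way around that, assuming it is acceptable to pick the matching alternative by testing the opening delimiter with defined(). The helper name strip_comments and the choice of "!" as the s/// delimiter are only for illustration, they are not part of the original tool:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replaces the text of each
# comment in $line with "$repl<n>" and returns the new line plus a
# reference to the list of saved comment texts.
sub strip_comments {
    my ($line, $repl) = @_;
    my $p = 1;
    my @save;
    $line =~ s!(/\*)(.*?)(\*/)|(\(\*)(.*?)(\*\))|(\{)(.*?)(\})!
        # Exactly one alternative matched. Its opening delimiter is
        # defined, so use that to choose the right capture groups.
        my ($open, $text, $close) =
              defined $1 ? ($1, $2, $3)
            : defined $4 ? ($4, $5, $6)
            :              ($7, $8, $9);
        push @save, $text;              # keeps "0" and "" as they are
        $open . $repl . $p++ . $close;  # value of the replacement
    !esg;
    return ($line, \@save);
}

my ($out, $saved) = strip_comments(
    "/*0*/ not {another comment} {} not (*one more comment*)", 'STR');
print "$out\n";
print "[$_]\n" for @$saved;
```

The same assumptions as before still apply: comments of one type are not nested and the brackets are never mismatched.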




>
> my $p = 1;
> my $repl = 'STR';
> my @save = ();
> my $line = "/*co{mm}ent*/ not {another comment} not (*one more comment*) not a comment";
> $line =~ s/(\/\*)(.*?)(\*\/)|(\(\*)(.*?)(\*\))|(\{)(.*?)(\})/
>               eval{ push @save, $2||$5||$8||'' ; return ($1||$4||$7).$repl.$p++.($3||$6||$9) }
>           /xesg;
>
> print "$line\n";
> print map "[$_]\n", @save;
>
> Thanks
> Tom
>
> Tim Ayers wrote:
>
> > >>>>> "T" == Thomas Whitney <whitneyt at agcs.com> writes:
> > T> Hi Group,
> > T> I am helping somebody write a simple comment parser. "{}" comments can be inside "/**/" comments, and there could be an empty comment.
> >
> > I don't understand exactly. What is the comment delimiter? {}? /**/?
> > From your code below it looks like comments can be delimited by /**/,
> > {}, or even (**). [ Ed: Why does anyone need 3 kinds of balanced
> > comment delimiters? That makes a hard problem even harder. ]
> >
> > Read "perldoc -q balance". This is a hard problem. Here are a couple
> > of examples why:
> >
> > /* the comment end-delimiter is */ */
> > /* /* nested comment */ */
> >
> > If you want something that always works look at the Parse::RecDescent
> > module.
> >
> > T> Below is an attempt at it. It appears to work except the |ed
> > T> expressions return empty.  I could probably do it with a few lines,
> > T> but does anybody have any ideas for a better one liner?
> >
> > I've been trying to write a slick way that works when there isn't any
> > monkey business, but I haven't found it yet. In the meantime you can
> > fix yours with a little filtering. Not elegant, but it works.
> >
> >   $_ = "/*co{mm}ent*/";
> >   my @save =  grep /\S/,
> >                 m%(?:/\*(.*?)\*/ |
> >                      \{(.*?)\}   |
> >                      \(\*(.*?)\*\))%xsg;
> >   print "[$_]\n" for @save;
> >
> > HTH and
> > Hope you have a very nice day, :-)
> > Tim Ayers (tim.ayers at reuters.com)
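
The two problem inputs Tim lists above can be tried directly. This is only a small sketch showing what the non-greedy /* ... */ pattern from the thread captures on them; the helper name first_c_comment is made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text captured by the non-greedy
# /* ... */ pattern, or undef if there is no match.
sub first_c_comment {
    my ($text) = @_;
    return $text =~ m{/\*(.*?)\*/}s ? $1 : undef;
}

for my $input ('/* the comment end-delimiter is */ */',
               '/* /* nested comment */ */') {
    my $got = first_c_comment($input);
    print "input:    $input\n";
    print "captured: [$got]\n";
}
```

In both cases the match stops at the first "*/", so a stray " */" is left behind in the line. That is why the regex approach is only safe when nesting and mismatched brackets are ruled out up front.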



