SPUG: parsing Perl from Perl...

dancerboy dancerboy at strangelight.com
Wed Nov 7 01:48:59 CST 2001


At 8:54 PM -0800 11/6/01, Tim Maher/CONSULTIX wrote:
>> If necessary, of course, it wouldn't be that difficult simply to code
>> a Perl parser from scratch, using regexps and event streams and all
>
>You're not the first to underestimate the difficulty of this task, either!
>

Heh.  Well, I didn't say it would be trivial  :)

Yeah, I noticed it would be at least somewhat tricky when I began on 
the "easier" project: detecting named sub definitions inside of code 
blocks (and carping about them). (If anyone wants to know why I think 
defining named subroutines inside of code blocks should be 
syntactically illegal, I would be happy to provide a very long rant 
on the subject...  :)

I mean, *all* I had to do was keep track of whether 
/\bsub\s+\w+\s+\{/ had an unmatched { somewhere in front of it, 
right?...
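
For illustration, here is a minimal sketch of that naive check, assuming that counting raw { and } characters is good enough (it is not). The helper name naive_nested_subs is hypothetical; the pattern is the one quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based line numbers on which a named
# sub definition appears while at least one "{" seems to be still open.
# Braces are counted blindly, so strings, regexps, comments, POD and
# heredocs can all confuse it.
sub naive_nested_subs {
    my @lines = @_;
    my $depth = 0;
    my @hits;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        # Pattern taken verbatim from the message above.
        push @hits, $n if $depth > 0 && $line =~ /\bsub\s+\w+\s+\{/;
        # tr/// with an empty replacement list only counts characters.
        my $opens  = ($line =~ tr/{//);
        my $closes = ($line =~ tr/}//);
        $depth += $opens - $closes;
    }
    return @hits;
}

# A genuine hit: the sub really is inside an if-block.
my @real = naive_nested_subs('if ($x) {', '    sub bar {', '    }', '}');

# A false positive: the only "open" brace is inside a string literal.
my @bogus = naive_nested_subs('my $s = "{";', 'sub foo {', '}');

print "real: @real\n";
print "bogus: @bogus\n";
```

Both calls report line 2, which is the problem: the check cannot tell a real enclosing block from a brace that merely appears in a string.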

Even the =pod syntax is stranger than I expected.  Here's an 
interesting multiple-choice question that I bet NO ONE on SPUG will 
get right, unless they fire up the Perl interpreter and actually test 
it:

Say you're parsing a Perl file one line at a time, storing each line 
in $_.  You're currently parsing a section of POD documentation. 
Which of the following is the *correct* test to determine when the 
POD documentation should end, and normal Perl parsing should resume 
on the following line?

/^=cut/
/^=cut\b/
/^=cut\s/
/^=cut\s*$/
/^=cut\n/
/^=\s*cut\s*$/
/^=\s*cut\s*/
/^=\s*cut\b/

Would you believe it's actually the first one, /^=cut/?  Yes, you can end a 
section of POD documentation with:

=cuthulu is cute but poorly spelled

(!)
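
As a hedged sketch of the line-at-a-time loop the quiz describes: the helper below (strip_pod, a hypothetical name) drops POD lines and uses the first pattern, /^=cut/, as the end test, matching the behavior reported here. The start test /^=[a-zA-Z]/ is an assumption of this sketch, not something stated above, and other Perl versions may treat "=cut" followed by more letters differently, so check against the interpreter you actually use.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the lines that are not POD.
sub strip_pod {
    my @lines  = @_;
    my $in_pod = 0;
    my @code;
    for (@lines) {
        if ($in_pod) {
            # The terminating line is itself still part of the POD;
            # normal parsing resumes on the following line.
            $in_pod = 0 if /^=cut/;
            next;
        }
        # Assumption: "=" plus a letter at the start of a line opens POD.
        if (/^=[a-zA-Z]/) {
            $in_pod = 1;
            next;
        }
        push @code, $_;
    }
    return @code;
}

my @kept = strip_pod(
    "print 1;\n",
    "=pod\n",
    "some documentation\n",
    "=cuthulu is cute but poorly spelled\n",
    "print 2;\n",
);
print @kept;
```

With this end test, only the two print lines survive; swapping in /^=cut\b/ or /^=cut\s/ would leave the sketch stuck in POD mode for the rest of the input.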

At 9:32 PM -0800 11/6/01, David Dyck wrote:
>You might want to look at other editors that already do this.
>     (perhaps vim, elvis, and emacs )

Of course, most of the editors do it wrong  :)  (I usually use BBEdit 
-- which is generally considered one of the best source-code editors 
for the Mac -- and you wouldn't believe how easily it croaks on 
standard Perl syntax...  My Perl code is full of comments that say "I 
know this looks weird: I wrote it this way so that BBEdit would get 
the syntax-coloring right for the rest of the file.")

>
>> 2.  A pre-processor that enforces slightly stricter syntax than that
>> of the normal Perl compiler. (Specifically, not allowing named
>> subroutines to be defined inside blocks of code.)
>
>Look at B::Lint

Looked.  Doesn't do what I want.


>The B:: modules use perl to parse the perl into intermediate code,
>and then extract information from the compiled code.

I confess I didn't dig through the B:: modules all that thoroughly, 
but it looked to me like they were all meant for manipulating the 
opcodes, *after* compilation.  I'm looking for something that will 
let me peek into the step *before* compilation: where the code is 
being split into tokens.  I didn't see any B:: modules that appeared 
to do that.  Am I missing something?
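
To illustrate what "after compilation" means here, a small sketch using the core B module: it asks an already-compiled subroutine for its name and package, so nothing at the token level is visible. The sub name example_sub is made up for the demonstration.

```perl
use strict;
use warnings;
use B ();

sub example_sub { return 42 }

# svref_2object wraps a compiled code reference in a B::CV object;
# by this point the source text has already been tokenized and parsed.
my $cv   = B::svref_2object(\&example_sub);
my $name = $cv->GV->NAME;           # name of the glob the sub lives in
my $pkg  = $cv->GV->STASH->NAME;    # package that glob belongs to

print "${pkg}::$name\n";
```

Everything B exposes this way describes the compiled result; whether the sub was written inside a block in the source is a question about the earlier tokenizing step.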

Right now, cannibalizing perltidy looks like my best option.  Thanks 
for the pointer to that!

-jason