SPUG: parsing Perl from Perl...

dancerboy dancerboy at strangelight.com
Wed Nov 7 01:48:59 CST 2001

At 8:54 PM -0800 11/6/01, Tim Maher/CONSULTIX wrote:
>> If necessary, of course, it wouldn't be that difficult simply to code
>> a Perl parser from scratch, using regexp's and event streams and all
>You're not the first to underestimate the difficulty of this task, either!

Heh.  Well, I didn't say it would be trivial  :)

Yeah, I noticed it would be at least somewhat tricky when I began on 
the "easier" project: detecting named sub definitions inside of code 
blocks (and carping about them). (If anyone wants to know why I think 
defining named subroutines inside of code blocks should be 
syntactically illegal, I would be happy to provide a very long rant 
on the subject...  :)

I mean, *all* I had to do was keep track of whether 
/\bsub\s+\w+\s+\{/ had an unmatched { somewhere in front of it, 
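
For illustration, here is a minimal sketch of that naive approach: scan each line for braces and named sub declarations, and flag any named sub seen while the brace depth is above zero. The function name find_nested_subs is hypothetical, and the pattern uses \s* before the brace (slightly looser than the regex quoted above) so that "sub foo{" is also caught. It deliberately ignores strings, regex literals, comments, here-docs and POD, which is exactly where the approach stops being easy.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [name, line_number] pairs for
# every named sub declared while at least one '{' is still unmatched.
# Naive by design: braces inside strings, regexes, comments, here-docs
# and POD are counted as if they were real block delimiters.
sub find_nested_subs {
    my ($source) = @_;
    my $depth   = 0;
    my $line_no = 0;
    my @found;

    for my $line (split /\n/, $source) {
        $line_no++;
        # Each match is a named sub declaration, a lone '{', or a lone '}'.
        while ($line =~ /\bsub\s+(\w+)\s*\{|(\{)|(\})/g) {
            if (defined $1) {
                push @found, [$1, $line_no] if $depth > 0;
                $depth++;               # the sub's own opening brace
            }
            elsif (defined $2) {
                $depth++;
            }
            else {
                $depth-- if $depth > 0; # never go negative on stray '}'
            }
        }
    }
    return @found;
}

# Usage: carp about each offender.
my $code = "sub outer {\n    sub inner {\n    }\n}\n";
for my $hit (find_nested_subs($code)) {
    warn "named sub '$hit->[0]' defined inside a block at line $hit->[1]\n";
}
```

Anything like "my %h = ( '{' => 1 );" throws the depth count off, so this is only a starting point.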

Even the =pod syntax is stranger than I expected.  Here's an 
interesting multiple-choice question that I bet NO ONE on SPUG will 
get right, unless they fire up the Perl interpreter and actually test 

Say you're parsing a Perl file one line at a time, storing each line 
in $_.  You're currently parsing a section of POD documentation. 
Which of the following is the *correct* test to determine when the 
POD documentation should end, and normal Perl parsing should resume 
on the following line?


Would you believe it's actually the first one?  Yes, you can end a 
section of POD documentation with:

=cuthulu is cute but poorly spelled
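
A small sketch of what that implies for a line-at-a-time parser, assuming the test is simply "does the line start with the four characters =cut". That pattern is inferred from the example above, not copied from the list of choices. The helper names ends_pod and strip_pod are hypothetical, and the start-of-POD check (a line beginning with = followed by a letter) is a simplification that ignores where in a statement the line falls. Other perl versions may be stricter about what can follow =cut, so test against the interpreter you actually target.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $line ends a POD section under the
# prefix-only rule described above (anything may follow "=cut").
sub ends_pod {
    my ($line) = @_;
    return $line =~ /^=cut/ ? 1 : 0;
}

# Hypothetical helper: return $source with POD sections removed.
# Assumes POD starts at any line beginning with '=' plus a letter.
sub strip_pod {
    my ($source) = @_;
    my $in_pod = 0;
    my @code;

    for my $line (split /^/, $source) {   # keeps each line's newline
        if ($in_pod) {
            # The terminating line is itself POD; code resumes on the
            # following line.
            $in_pod = 0 if ends_pod($line);
            next;
        }
        if ($line =~ /^=[a-zA-Z]/) {
            $in_pod = 1;
            next;
        }
        push @code, $line;
    }
    return join '', @code;
}

print strip_pod("print 1;\n=pod\n\nsome docs\n\n=cuthulu is cute\nprint 2;\n");
```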


At 9:32 PM -0800 11/6/01, David Dyck wrote:
>You might want to look at other editors that already do this.
>     (perhaps vim, elvis, and emacs )

Of course, most of the editors do it wrong  :)  (I usually use BBEdit 
-- which is generally considered one of the best source-code editors 
for the Mac -- and you wouldn't believe how easily it croaks on 
standard Perl syntax...  My Perl code is full of comments that say "I 
know this looks weird: I wrote it this way so that BBEdit would get 
the syntax-coloring right for the rest of the file."  )

>> 2.  A pre-processor that enforces slightly stricter syntax than that
>> of the normal Perl compiler. (Specifically, not allowing named
>> subroutines to be defined inside blocks of code.)
>Look at B::Lint

Looked.  Doesn't do what I want.

>The B:: modules use perl to parse the perl into intermediate code,
>and then extract information from the compiled code.

I confess I didn't dig through the B:: modules all that thoroughly, 
but it looked to me like they were all meant for manipulating the 
opcodes, *after* compilation.  I'm looking for something that will 
let me peek into the step *before* compilation: where the code is 
being split into tokens.  I didn't see any B:: modules that appeared 
to do that.  Am I missing something?

Right now, cannibalizing perltidy looks like my best option.  Thanks 
for the pointer to that!

