SPUG: Word boundary regex treated differently by 5.6 and 5.005033

Tim Buckwalter TimBuckwalter at aol.com
Thu Apr 26 13:35:51 CDT 2001


I got the same results, and I'm using 5.6.0.
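
Until the \B question is settled, here is a minimal sketch of the initials step from Ben's script that sidesteps it. It uses neither \B nor /g, so it should not depend on where the regex engine resumes after a match. The sub name initials() is my own invention and is not in the original script; I am also assuming the goal is just the upper-cased first character of each word.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Ben's script): build the initials of a
# name by splitting into words instead of deleting non-initial characters.
sub initials {
    my ($name) = @_;
    # Split on runs of non-word characters. grep drops the empty leading
    # field that split produces when the name starts with a non-word char.
    my @words = grep { length($_) } split /\W+/, $name;
    # Take the first character of each word, join them, and upper-case.
    return uc(join('', map { substr($_, 0, 1) } @words));
}

print initials('Charles Bronson'), "\n";   # prints "CB"
```

For ordinary names this should give the same string as the two substitutions followed by uc() in the excerpt quoted below, but I have not checked every edge case.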

-- 
Tim Buckwalter
Senior Language Engineer
AOL Mobile (formerly Tegic)
1000 Dexter Ave N, Suite 300
Seattle, WA 98109-3574
206.268.7552 phone
206.343.7004 fax
206.343.7001 front desk
TimBuckwalter at aol.com
www.tegic.com

Dan Ebert wrote:
> 
> I ran the test on perl 5.6 and got this:
> 
>  <C> h <arles Bronson>
>  <Cha> r <les Bronson>
>  <Charl> e <s Bronson>
>  <Charles B> r <onson>
>  <Charles Bro> n <son>
>  <Charles Brons> o <n>
> 
> Dan.
> -----------------------------------------------------------
> Optimists:  the glass is half full.
> Pessimists: the glass is half empty.
> Engineers:  the glass is twice as big as it needs to be.
> -----------------------------------------------------------
> 
> On Thu, 26 Apr 2001, Michael LaGaly wrote:
> 
> > Actually, it looks like the 5.6 is not returning to the original $text="Charles Bronson" as it tests each successive character, but is instead doing something that amounts to in-place deletion and then testing again.
> >
> > So for:
> >
> > perl -e "use strict;(my $text = qq(Charles Bronson)) =~ s/\B\w//g;print qq(here it is: $text\n\n);"
> >
> > Charles Bronson
> > ^                test : is on a word boundary: go to next
> >  ^               test : is not on a word boundary: delete this char
> > C arles Bronson  test : is on a word boundary: go to next
> >   ^
> >
> > Why don't you try the following in 5.6?  This will show you the text left of the match, the match itself, and the text right of the match, as the regex engine sees them:
> >
> > perl -le "$t = qq(Charles Bronson); print qq( <$`> $& <$'>) while $t =~ m/\B\w/g"
> >
> > On 5.00503 this gets:
> >  <C> h <arles Bronson>
> >  <Ch> a <rles Bronson>
> >  <Cha> r <les Bronson>
> >  <Char> l <es Bronson>
> >  <Charl> e <s Bronson>
> >  <Charle> s < Bronson>
> >  <Charles B> r <onson>
> >  <Charles Br> o <nson>
> >  <Charles Bro> n <son>
> >  <Charles Bron> s <on>
> >  <Charles Brons> o <n>
> >  <Charles Bronso> n <>
> >
> > I'm curious to see what 5.6 gives you.
> >
> > Michael
> >
> >   ----- Original Message -----
> >   From: Ben Burnett
> >   To: Colin Meyer ; Ben Burnett
> >   Cc: spug-list at pm.org
> >   Sent: Thursday, April 26, 2001 12:09 AM
> >   Subject: Re: SPUG: Word boundary regex treated differently by 5.6 and 5.005033
> >
> >
> >   At 05:51 PM 4/25/01 -0700, Colin Meyer wrote:
> >   >More detail can be seen from the regex debugger:
> >   >perl -M're debug' -le '$t = "abcdefg"; print pos $t while $t =~ m/\B\w/g'
> >
> >   I have to admit I haven't spent much time with the Perl debugger. I'll take
> >   a closer look at this.
> >
> >   >It is hard for me to decide if this is a new bug or a bug fix for an old
> >   >problem. The camel says that /g causes the regex to "start the next
> >   >match on the same variable at a position *just past* where the last
> >   >match stopped." The older versions of Perl seem to be looking at the
> >   >character that the last match ended on in order to determine the border
> >   >or non-border properties of the character at pos($t). Well, it's either
> >   >a bug with Perl, or a bug with its documentation. In either case, a
> >   >report should be submitted with perlbug.
> >
> >   I think it's probably a bug with Perl itself.  I can't imagine this change
> >   in behavior was intentional.  I'll have to submit it in the morning.
> >
> >   >What sort of problem were you attempting to solve when you came across
> >   >this one? ;-)
> >
> >   Here is an excerpt of code showing the regex hard at work in a motorcycle
> >   rental application CGI script.
> >   ...
> >                    # we need to give this request a registration number while we are here.
> >                    # this number will be built out of the initials of each word in the
> >                    # applicant's name, a unique session_key, the applicant's state, and
> >                    # the first two letters of the city that the applicant is in
> >                            my $key = time();
> >                            $key .= "-" . getppid() or $LogH->append("couldn't getppid to add to session key");
> >                            my $request_id = $PASSED_VARS{'name'};
> >                            $request_id =~ s/\B\w//g;
> >                            $request_id =~ s/\W//g;
> >                            $request_id .= "-" . $key; # . "-";
> >                            # $request_id .= $PASSED_VARS{'state'} . "-";
> >                            # my $city_portion = $PASSED_VARS{'city'};
> >                            # $city_portion =~ m/^([\w]{2})/;
> >                            # $city_portion = $1;
> >                            # $request_id .= $city_portion ;
> >                            $request_id = uc($request_id);
> >   ...
> >
> >   I'll eventually work out some other form of unique id for these requests
> >   that isn't so verbose, but I wanted it to be human readable during testing.
> >
> >
> >   -Ben
> >
> >
> >    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >        POST TO: spug-list at pm.org       PROBLEMS: owner-spug-list at pm.org
> >         Subscriptions; Email to majordomo at pm.org:  ACTION  LIST  EMAIL
> >     Replace ACTION by subscribe or unsubscribe, EMAIL by your Email-address
> >    For daily traffic, use spug-list for LIST ;  for weekly, spug-list-digest
> >     Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
> >
> >
> >
> >
> 
