[sf-perl] The SF Perl Raku Study Group, 03/28 at 1pm PDT

yary not.com at gmail.com
Mon Mar 29 18:06:00 PDT 2021


Took me a bit of looking, but the issue is in the details: the decision
character for "stuff" is '-', since that is always the beginning of "ruler",
the first regex under "control_2", which is the first regex following
"stuff". Changing the decision point to '-' and re-ordering the stuff regex
fixes the proof-of-concept, and speeds it up roughly 40% on my system.

    regex stuff
    {
        [            # stuff is a group of either
            <-[-]>+: # a ratcheting string of non-decision points.
                     # Removing ratcheting makes it hang on Yary's system.
          ||         # or
            '-'      # a "dash" decision point
        ]*           # 0-many of those. Greedy or non-greedy both work,
                     # at about the same speed.
    }  # end regex
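
For anyone who wants to try the idea without pulling the repo, here is a
minimal sketch. Only "stuff" is the regex above; the grammar name, the
simplified "ruler" and "footer" tokens, and the sample text are made up for
illustration and are not the ones in the doomfiles script.

```raku
# Hypothetical miniature grammar: free text, then a line of dashes,
# then one trailing line. Only "stuff" comes from this thread.
grammar Sketch {
    regex TOP   { <stuff> <ruler> <footer> }

    regex stuff {
        [
            <-[-]>+:   # ratcheting run of non-dash characters
          ||
            '-'        # a single dash, the decision point
        ]*
    }

    token ruler  { ^^ '-'+ \n }   # assumed: a line made only of dashes
    token footer { \N+ \n? }      # assumed: one trailing line of text
}

my $text = "some text - with a dash\nmore text\n-----\nNEXT: foo\n";
my $m = Sketch.parse($text);

say $m ?? 'parsed' !! 'failed to parse';
say $m<stuff>.Str.raku if $m;
```

Because the non-dash run is ratcheted, a failed "ruler" makes "stuff" give
back whole chunks (a run of text or a single dash) rather than one
character at a time, which is where I would expect the saving to come from.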

Will commit this soon.

(and thanks for making this runnable & pointing out that I can use your
test files all in the repo)

-y


On Mon, Mar 29, 2021 at 8:07 PM Joseph Brenner <doomvox at gmail.com> wrote:

> That's interesting thinking... I've been playing around with the idea
> over here, but haven't got it to work (it fails to parse):
>
>
> https://github.com/doomvox/raku-study/blob/main/bin/2021mar28/doomfiles_browse_sequence-iii.raku
>
> This version does work-- just using greedy matching on "stuff" makes
> it use orders-of-magnitude less resources:
>
>
> https://github.com/doomvox/raku-study/blob/main/bin/2021mar28/doomfiles_browse_sequence-ii.raku
>
>
> And yes, I would guess that the greedy matching probably works
> here because the following material is effectively pinned at the end
> of the document.
>
> Note that you should be able to run these scripts as written, provided
> you also pull copies of these source files:
>
> https://github.com/doomvox/raku-study/tree/main/dat/doomfiles
>
>
>
> On 3/29/21, yary <not.com at gmail.com> wrote:
> > Hi Joe & other Raku study group attendees,
> >
> > At the time I left, we were looking at a grammar with a speed-memory
> > issue on large-ish files. I had a germ of an idea which I couldn't
> > express, and from the meeting notes I see you have a simple fix "by
> > changing stuff regex (.*?) to greedy (.*)". I suspect the greedy
> > optimization works because the thing after the "stuff" regex is near
> > the end of the file. Thus if it was instead close to the beginning,
> > greedy would have a similar issue and non-greedy would fix it.
> >
> > With a night to sleep on it, the thing I was thinking & trying to say is
> > that, in the specialized HTML-grammar you had, the decision points are
> > all at left-brackets. By re-writing "stuff" so that it will only
> > backtrack when it hits a bracket, I expect more speed-memory gains.
> >
> > How well does this perform vs the simple .* greedy fix?
> >
> >     regex stuff
> >     { (  # capture stuff (positional capture might not be needed)
> >         [               # Stuff is a group of either
> >             \<          # a left-bracket decision point
> >           ||            # or
> >             <-[ \< ]>+: # a ratcheting string of non-decision points
> >         ]*              # 0-many of those. Greedy or non-greedy both
> >                         # work?
> >     ) }  # end capture, end regex
> >
> > This was harder to express verbally & in code than I expected!
> >
> > -y
> >
> >
> > On Sun, Mar 28, 2021 at 4:23 PM Joseph Brenner <doomvox at gmail.com> wrote:
> >
> >> I did send this one out, but it doesn't seem that it went out exactly,
> >> so let's try this one more time.   The Study Group is happening,
> >> already in progress, though we'll be taking a break next week and
> >> broadcasting a burning yule log with the soundtrack to Jesus Christ
> >> Superstar.  (Just kidding)
> >>
> >>
> >> Flaming Carrot, "Night Patrol" (1986) by Bob Burden:
> >>
> >>     I feel it rising now...
> >>     ... like little bubbles...
> >>     THE MOON IS FULL...
> >>     ... in a full moon, your brain floats to top of your head...
> >>     I feel it...
> >>     beginning to boil...
> >>     a lot will happen tonight.
> >>
> >> The Raku Study Group
> >>
> >> March 28, 2021  1pm in California, 8pm in the UK
> >>
> >> Zoom meeting link:
> >>
> >>
> >> https://us02web.zoom.us/j/81127128506?pwd=N0I5bkxUZTRLaWwxN2RJTGlsT254QT09
> >>
> >> Passcode: 4RakuRoll
> >>
> >> RSVPs are useful, though not needed:
> >>   https://www.meetup.com/San-Francisco-Perl/events/277163968/
> >> _______________________________________________
> >> SanFrancisco-pm mailing list
> >> SanFrancisco-pm at pm.org
> >> https://mail.pm.org/mailman/listinfo/sanfrancisco-pm
> >>
> >
>