[tpm] I wish I was better at regex's

Shaun Fryer sfryer at sourcery.ca
Wed Mar 9 11:55:01 PST 2011


Ah. My bad! You win!
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729




On Wed, Mar 9, 2011 at 2:53 PM, Rob Janes <janes.rob at gmail.com> wrote:
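> you're missing a double quote before key, so the quotes pair up wrongly: the
> value ends up outside any quoted string and its ; is taken as the start of
> the comment.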
>
> ================
> line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
> matches!
> data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
> comment is stuff\'" ; comment with = " ' and ;
>
>
> ================
> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
> matches!
> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
> comment is comment with = " ' and ;
>
>
> On Wed, Mar 9, 2011 at 2:48 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> funny. I got...
>>
>> line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
>> matches!
>> data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
>> comment is stuff\'" ; comment with = " ' and ;
>>
>> as you can see, the (stuff\'";) got put in the comment.
>>
>> On Wed, Mar 9, 2011 at 2:46 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>> ================
>>> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
>>> matches!
>>> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
>>> comment is comment
>>>
>>> There's a key (in double quotes), an =, a value (in double quotes),
>>> and a comment.
>>>
>>> I don't understand.  looks to me like it did pull off the comment properly.
>>>
>>> On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>> nice (though pretty convoluted). the following breaks it.
>>>>
>>>> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>>
>>>> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>> this strips off the comment
>>>>>
>>>>> #!/usr/bin/env perl
>>>>>
>>>>> use strict;
>>>>> use warnings;
>>>>>
>>>>> while (<DATA>) {
>>>>>  chomp;
>>>>>
>>>>>  print "\n\n================\nline: $_\n";
>>>>>
>>>>>  if (m{
>>>>>        ^
>>>>>        (
>>>>>          (?:[^\\"';\s]|
>>>>>            \\.|
>>>>>            \s|
>>>>>            "(?:[^"\\]|\\.)*"|
>>>>>            '(?:[^'\\]|\\.)*'
>>>>>          )+
>>>>>        )
>>>>>        (?:\s*;\s*(.*))?
>>>>>        $
>>>>>        }x)
>>>>>  {
>>>>>    my ($words, $comment) = ($1, $2);
>>>>>    $comment = "" unless defined($comment);
>>>>>    print "matches!\n";
>>>>>
>>>>>    print "data is $words\ncomment is $comment\n";
>>>>>  }
>>>>>  else
>>>>>  {
>>>>>     print "match FAILED!\n";
>>>>>  }
>>>>> }
>>>>>
>>>>> __DATA__
>>>>> key="value"
>>>>> key=value
>>>>> key="value;"
>>>>> key="value1;value2"
>>>>> key="value1;value2"     ; comment
>>>>> key='value1;value2'     ; comment
>>>>> "key"="value1"
>>>>> "key"="value1"          ; comment
>>>>> "key"="value1;value2"
>>>>> "key"="value1;value2"   ; comment
>>>>> "key"="val\"ue1;value2"
>>>>> "key"="val\"ue1;value2" ; comment
>>>>> "key"='val\'ue1;value2' ; comment
>>>>> "key"='val\"ue1;value2' ; comment
>>>>> key="this=that" ; an = in the value
>>>>> key="value" ; a " in the comment
>>>>> this is a title   ; and this is a comment
>>>>>
>>>>>
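For anyone following along: the pattern in Rob's script above can be wrapped in a small function, which makes it easier to check one line at a time. This is only a sketch. The name split_comment and the trailing-whitespace trim are additions for illustration, not part of his script; the pattern itself is copied from it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns (data, comment),
# or an empty list when the line does not match at all.
sub split_comment {
    my ($line) = @_;
    if ($line =~ m{
            ^
            (
              (?: [^\\"';\s]            # any ordinary character
                | \\.                   # a backslash and the character after it
                | \s                    # whitespace
                | "(?:[^"\\]|\\.)*"     # a complete double-quoted string
                | '(?:[^'\\]|\\.)*'     # a complete single-quoted string
              )+
            )
            (?:\s*;\s*(.*))?
            $
        }x)
    {
        # Copy the captures before running another regex.
        my ($data, $comment) = ($1, $2);
        $comment = "" unless defined $comment;
        $data =~ s/\s+\z//;             # added: drop whitespace before the ";"
        return ($data, $comment);
    }
    return;
}

my ($data, $comment) = split_comment(q{key="value1;value2"     ; comment});
print "data is $data\ncomment is $comment\n";
```

A line with an unbalanced quote in the data part, such as key="value with no closing quote, does not match and comes back as an empty list.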
>>>>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>> Indeed. Or:
>>>>>>
>>>>>> "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>> there's
>>>>>>>
>>>>>>> key="this=that"  ; an = in the value
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> key="value" ; a " in the comment
>>>>>>>
>>>>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>> while (<DATA>) {
>>>>>>>>    chomp;
>>>>>>>>    my ($key, $val) = split /=/;
>>>>>>>>    my ($quote) = ($val =~ m{^(["'])}g);
>>>>>>>>    if ($quote) {
>>>>>>>>        ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>>>>    }
>>>>>>>>    else {
>>>>>>>>        $val =~ s/$strip//;
>>>>>>>>    }
>>>>>>>>    print $key, '=', $val, "\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>> __DATA__
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value;"
>>>>>>>> key="value1;value2"
>>>>>>>> key="value1;value2"     ; comment
>>>>>>>> key='value1;value2'     ; comment
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1"          ; comment
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>
>>>>>>>> On Wed, Mar 9, 2011 at 2:12 PM,  <daniel at benoy.name> wrote:
>>>>>>>>> Doesn't work.  Output:
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> key="value"
>>>>>>>>> key=value
>>>>>>>>> key="value1
>>>>>>>>> key="value1;value2"
>>>>>>>>> key='value1;value2'
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1
>>>>>>>>> "key"="value1;value2"
>>>>>>>>> "key"="val\"ue1
>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>> "key"='val\'ue1;value2'
>>>>>>>>> "key"='val\"ue1;value2'
>>>>>>>>> ----
>>>>>>>>>
>>>>>>>>> Look at line 3.
>>>>>>>>>
>>>>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>>>>
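Both of the problems Daniel describes above are easy to reproduce in isolation. A small sketch, assuming nothing beyond the original pattern; the helper name strip_naive is made up for illustration:

```perl
use strict;
use warnings;

# The pattern from the first attempt in the thread.
my $strip = qr{;[^\;]+$};

# Hypothetical helper, for illustration only.
sub strip_naive {
    my ($line) = @_;
    $line =~ s/$strip//;
    return $line;
}

# A semicolon inside quotes is treated as the start of a comment.
print strip_naive('key="value1;value2"'), "\n";   # prints key="value1

# [^\;]+ needs at least one character after the semicolon,
# so a bare trailing semicolon is left in place.
print strip_naive('key=value ;'), "\n";           # prints key=value ;
```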
>>>>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>>>>
>>>>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>>>>> while (<DATA>) {
>>>>>>>>>    chomp;
>>>>>>>>>    $_ =~ s/$strip//;
>>>>>>>>>    print $_, "\n";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here's the way I would do it:
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> while (<DATA>) {
>>>>>>>>>    chomp;
>>>>>>>>>
>>>>>>>>>    my $stripped = "";
>>>>>>>>>    my $quotechar = "";
>>>>>>>>>    foreach my $char (split(//, $_)) {
>>>>>>>>>        if ($quotechar) { # We're currently quoted
>>>>>>>>>            if ($char eq $quotechar) { # end of quote
>>>>>>>>>                $quotechar = "";
>>>>>>>>>            }
>>>>>>>>>        } else { # We're not currently quoted
>>>>>>>>>            if ($char eq ';') { # The comment has begun!
>>>>>>>>>                last();
>>>>>>>>>            } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>>>>                $quotechar = $char;
>>>>>>>>>            }
>>>>>>>>>        }
>>>>>>>>>        $stripped .= $char;
>>>>>>>>>    }
>>>>>>>>>    print "$stripped\n";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> __DATA__
>>>>>>>>> key="value"
>>>>>>>>> key=value
>>>>>>>>> key="value1;value2"
>>>>>>>>> key="value1;value2"     ; comment
>>>>>>>>> key='value1;value2'     ; comment
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1"          ; comment
>>>>>>>>> "key"="value1;value2"
>>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>> ----
>>>>>>>>>
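One thing to watch with the character loop above: it treats every quote character as a delimiter, so the \" in "val\"ue1;value2" closes the string early and the ; that follows is taken as the start of a comment. Below is a sketch of the same loop with one extra flag for backslash escapes. The name strip_comment, the assumption that a backslash escapes the next character both inside and outside quotes, and the trailing-whitespace trim are all additions for illustration, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: strip a trailing ";" comment while honouring
# quoted strings and backslash escapes.
sub strip_comment {
    my ($line) = @_;
    my $stripped  = "";
    my $quotechar = "";   # the quote we are inside, or "" if none
    my $escaped   = 0;    # true if the previous character was a backslash

    foreach my $char (split //, $line) {
        if ($escaped) {
            $escaped = 0;                      # escaped char is always literal
        } elsif ($char eq '\\') {
            $escaped = 1;                      # next char is escaped
        } elsif ($quotechar) {
            $quotechar = "" if $char eq $quotechar;   # end of quote
        } elsif ($char eq ';') {
            last;                              # the comment has begun
        } elsif ($char eq '"' || $char eq "'") {
            $quotechar = $char;                # start of quote
        }
        $stripped .= $char;
    }

    $stripped =~ s/\s+\z//;   # added: drop whitespace left before the ";"
    return $stripped;
}

print strip_comment(q{"key"="val\"ue1;value2" ; comment}), "\n";
```

Like the original loop, this makes a single pass over the line and needs no backtracking, which is the main attraction over a regex here.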
>>>>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>>>>
>>>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>>>> while (<DATA>) {
>>>>>>>>>>    chomp;
>>>>>>>>>>    $_ =~ s/$strip//;
>>>>>>>>>>    print $_, "\n";
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> __DATA__
>>>>>>>>>> key="value"
>>>>>>>>>> key=value
>>>>>>>>>> key="value1;value2"
>>>>>>>>>> key="value1;value2"     ; comment
>>>>>>>>>> key='value1;value2'     ; comment
>>>>>>>>>> "key"="value1"
>>>>>>>>>> "key"="value1"          ; comment
>>>>>>>>>> "key"="value1;value2"
>>>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the ideas, but the issue isn't extracting the keys and
>>>>>>>>>>> values (and dequoting them). The task was only to strip trailing
>>>>>>>>>>> comments while respecting quoted strings.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> toronto-pm mailing list
>>>>>>>>>>> toronto-pm at pm.org
>>>>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm
>>>>>>>>>>>

