[tpm] I wish I was better at regex's

Rob Janes janes.rob at gmail.com
Wed Mar 9 11:53:10 PST 2011


You're missing a double quote before key, so the quotes in your line pair up differently than you intended.

================
line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
matches!
data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
comment is stuff\'" ; comment with = " ' and ;


================
line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
matches!
data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
comment is comment with = " ' and ;
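
If anyone wants to reproduce the two runs above without editing the __DATA__ section, here is a minimal sketch. It assumes the regex from the quoted script is unchanged; the name split_comment and the square brackets in the output are just for illustration and are not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same pattern as in the quoted script, precompiled and annotated.
my $line_re = qr{
    ^
    (
      (?: [^\\"';\s]             # any ordinary character
        | \\.                    # a backslash-escaped character
        | \s                     # whitespace
        | "(?:[^"\\]|\\.)*"      # a double-quoted string
        | '(?:[^'\\]|\\.)*'      # a single-quoted string
      )+
    )
    (?: \s* ; \s* (.*) )?        # optional trailing comment
    $
}x;

# Hypothetical helper: returns (data, comment), or an empty list on no match.
sub split_comment {
    my ($line) = @_;
    return unless $line =~ $line_re;
    my ($data, $comment) = ($1, $2);
    $comment = "" unless defined $comment;
    return ($data, $comment);
}

my @lines = (
    # No opening quote before key: the quotes pair up as " = " instead.
    q{key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;},
    # Opening quote present: key and value are each one quoted string.
    q{"key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;},
);

for my $line (@lines) {
    my ($data, $comment) = split_comment($line);
    next unless defined $data;
    print "data is [$data]\ncomment is [$comment]\n\n";
}
```

The brackets make it visible that the data part keeps the whitespace in front of the semicolon, because \s is one of the alternatives inside the data group.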


On Wed, Mar 9, 2011 at 2:48 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
> funny. I got...
>
> line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
> matches!
> data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
> comment is stuff\'" ; comment with = " ' and ;
>
> as you can see, the (stuff\'";) got put in the comment.
> --
> Shaun Fryer
> cell: 1-647-709-6509
> voip: 1-647-723-2729
>
> On Wed, Mar 9, 2011 at 2:46 PM, Rob Janes <janes.rob at gmail.com> wrote:
>> ================
>> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
>> matches!
>> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
>> comment is comment
>>
>> There's a key (in double quotes), an =, a value (in double quotes),
>> and a comment.
>>
>> I don't understand. It looks to me like it did pull off the comment properly.
>>
>> On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>> nice (though pretty convoluted). the following breaks it.
>>>
>>> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>
>>> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>> this strips off the comment
>>>>
>>>> #!/usr/bin/env perl
>>>>
>>>> use strict;
>>>> use warnings;
>>>>
>>>> while (<DATA>) {
>>>>  chomp;
>>>>
>>>>  print "\n\n================\nline: $_\n";
>>>>
>>>>  if (m{
>>>>        ^
>>>>        (
>>>>          (?:[^\\"';\s]|
>>>>            \\.|
>>>>            \s|
>>>>            "(?:[^"\\]|\\.)*"|
>>>>            '(?:[^'\\]|\\.)*'
>>>>          )+
>>>>        )
>>>>        (?:\s*;\s*(.*))?
>>>>        $
>>>>        }x)
>>>>  {
>>>>    my ($words, $comment) = ($1, $2);
>>>>    $comment = "" unless defined($comment);
>>>>    print "matches!\n";
>>>>
>>>>    print "data is $words\ncomment is $comment\n";
>>>>  }
>>>>  else
>>>>  {
>>>>     print "match FAILED!\n";
>>>>  }
>>>> }
>>>>
>>>> __DATA__
>>>> key="value"
>>>> key=value
>>>> key="value;"
>>>> key="value1;value2"
>>>> key="value1;value2"     ; comment
>>>> key='value1;value2'     ; comment
>>>> "key"="value1"
>>>> "key"="value1"          ; comment
>>>> "key"="value1;value2"
>>>> "key"="value1;value2"   ; comment
>>>> "key"="val\"ue1;value2"
>>>> "key"="val\"ue1;value2" ; comment
>>>> "key"='val\'ue1;value2' ; comment
>>>> "key"='val\"ue1;value2' ; comment
>>>> key="this=that" ; an = in the value
>>>> key="value" ; a " in the comment
>>>> this is a title   ; and this is a comment
>>>>
>>>>
>>>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ;
>>>>> comment with " ' and ;
>>>>>
>>>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>> there's
>>>>>>
>>>>>> key="this=that"  ; an = in the value
>>>>>>
>>>>>> and
>>>>>>
>>>>>> key="value" ; a " in the comment
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>> while (<DATA>) {
>>>>>>>    chomp;
>>>>>>>    my ($key, $val) = split /=/;
>>>>>>>    my ($quote) = ($val =~ m{^(["'])}g);
>>>>>>>    if ($quote) {
>>>>>>>        ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>>>    }
>>>>>>>    else {
>>>>>>>        $val =~ s/$strip//;
>>>>>>>    }
>>>>>>>    print $key, '=', $val, "\n";
>>>>>>> }
>>>>>>>
>>>>>>> __DATA__
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value;"
>>>>>>> key="value1;value2"
>>>>>>> key="value1;value2"     ; comment
>>>>>>> key='value1;value2'     ; comment
>>>>>>> "key"="value1"
>>>>>>> "key"="value1"          ; comment
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="value1;value2"   ; comment
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>
>>>>>>> On Wed, Mar 9, 2011 at 2:12 PM,  <daniel at benoy.name> wrote:
>>>>>>>> Doesn't work.  Output:
>>>>>>>>
>>>>>>>> ----
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value1
>>>>>>>> key="value1;value2"
>>>>>>>> key='value1;value2'
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="val\"ue1
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"='val\'ue1;value2'
>>>>>>>> "key"='val\"ue1;value2'
>>>>>>>> ----
>>>>>>>>
>>>>>>>> Look at line 3.
>>>>>>>>
>>>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>>>
>>>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>>>
>>>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>>>> while (<DATA>) {
>>>>>>>>    chomp;
>>>>>>>>    $_ =~ s/$strip//;
>>>>>>>>    print $_, "\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> Here's the way I would do it:
>>>>>>>>
>>>>>>>> ----
>>>>>>>> while (<DATA>) {
>>>>>>>>    chomp;
>>>>>>>>
>>>>>>>>    my $stripped = "";
>>>>>>>>    my $quotechar = "";
>>>>>>>>    foreach my $char (split(//, $_)) {
>>>>>>>>        if ($quotechar) { # We're currently quoted
>>>>>>>>            if ($char eq $quotechar) { # end of quote
>>>>>>>>                $quotechar = "";
>>>>>>>>            }
>>>>>>>>        } else { # We're not currently quoted
>>>>>>>>            if ($char eq ';') { # The comment has begun!
>>>>>>>>                last();
>>>>>>>>            } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>>>                $quotechar = $char;
>>>>>>>>            }
>>>>>>>>        }
>>>>>>>>        $stripped .= $char;
>>>>>>>>    }
>>>>>>>>    print "$stripped\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>> __DATA__
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value1;value2"
>>>>>>>> key="value1;value2"     ; comment
>>>>>>>> key='value1;value2'     ; comment
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1"          ; comment
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>> ----
>>>>>>>>
>>>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>>>
>>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>>> while (<DATA>) {
>>>>>>>>>    chomp;
>>>>>>>>>    $_ =~ s/$strip//;
>>>>>>>>>    print $_, "\n";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> __DATA__
>>>>>>>>> key="value"
>>>>>>>>> key=value
>>>>>>>>> key="value1;value2"
>>>>>>>>> key="value1;value2"     ; comment
>>>>>>>>> key='value1;value2'     ; comment
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1"          ; comment
>>>>>>>>> "key"="value1;value2"
>>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>>
>>>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>>>
>>>>>>>>>> Thanks for the ideas, but..
>>>>>>>>>> the issue isn't with extracting the keys and the values (and dequoting them),
>>>>>>>>>> the task was only to strip trailing comments (while obeying quoted strings)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> toronto-pm mailing list
>>>>>>>>>> toronto-pm at pm.org
>>>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm

