[tpm] I wish I was better at regex's

Shaun Fryer sfryer at sourcery.ca
Wed Mar 9 11:48:30 PST 2011


Funny. I got...

line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
matches!
data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
comment is stuff\'" ; comment with = " ' and ;

As you can see, the data got cut at the ; inside the quoted value, and the rest (stuff\'" ;) got put in the comment.
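
One possible reading of the output above, offered as a guess rather than a diagnosis: the line as printed starts with key= instead of "key=, so the opening double quote may have been lost before the script saw it. The sketch below feeds both variants through Rob's pattern, copied unchanged from his post. split_comment is a hypothetical helper name used only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Rob's pattern from earlier in the thread, unchanged.
my $re = qr{
    ^
    (
      (?:[^\\"';\s]|
        \\.|
        \s|
        "(?:[^"\\]|\\.)*"|
        '(?:[^'\\]|\\.)*'
      )+
    )
    (?:\s*;\s*(.*))?
    $
}x;

# Hypothetical helper (not from the thread): returns (data, comment),
# or an empty list when the pattern does not match at all.
sub split_comment {
    my ($line) = @_;
    return unless $line =~ $re;
    return ($1, defined $2 ? $2 : "");
}

# The test line without its opening double quote, as it appears in the run above.
my $tail = q{key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;};

# First pass: opening quote restored. Second pass: opening quote missing.
for my $line (q{"} . $tail, $tail) {
    my ($data, $comment) = split_comment($line);
    print "line: $line\n";
    if (defined $data) {
        print "data is $data\ncomment is $comment\n\n";
    }
    else {
        print "match FAILED!\n\n";
    }
}
```

With the opening quote in place, the comment comes out as: comment with = " ' and ;
Without it, the quotes pair up one position off and the split lands right after "more", as in the output above.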
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729




On Wed, Mar 9, 2011 at 2:46 PM, Rob Janes <janes.rob at gmail.com> wrote:
> ================
> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
> matches!
> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
> comment is comment
>
> There's a key (in double quotes), an =, a value (in double quotes),
> and a comment.
>
> I don't understand.  looks to me like it did pull off the comment properly.
>
> On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> nice (though pretty convoluted). the following breaks it.
>>
>> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
>> with " ' and ;
>>
>> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>> this strips off the comment
>>>
>>> #!/usr/bin/env perl
>>>
>>> use strict;
>>> use warnings;
>>>
>>> while (<DATA>) {
>>>  chomp;
>>>
>>>  print "\n\n================\nline: $_\n";
>>>
>>>  if (m{
>>>        ^
>>>        (
>>>          (?:[^\\"';\s]|
>>>            \\.|
>>>            \s|
>>>            "(?:[^"\\]|\\.)*"|
>>>            '(?:[^'\\]|\\.)*'
>>>          )+
>>>        )
>>>        (?:\s*;\s*(.*))?
>>>        $
>>>        }x)
>>>  {
>>>    my ($words, $comment) = ($1, $2);
>>>    $comment = "" unless defined($comment);
>>>    print "matches!\n";
>>>
>>>    print "data is $words\ncomment is $comment\n";
>>>  }
>>>  else
>>>  {
>>>     print "match FAILED!\n";
>>>  }
>>> }
>>>
>>> __DATA__
>>> key="value"
>>> key=value
>>> key="value;"
>>> key="value1;value2"
>>> key="value1;value2"     ; comment
>>> key='value1;value2'     ; comment
>>> "key"="value1"
>>> "key"="value1"          ; comment
>>> "key"="value1;value2"
>>> "key"="value1;value2"   ; comment
>>> "key"="val\"ue1;value2"
>>> "key"="val\"ue1;value2" ; comment
>>> "key"='val\'ue1;value2' ; comment
>>> "key"='val\"ue1;value2' ; comment
>>> key="this=that" ; an = in the value
>>> key="value" ; a " in the comment
>>> this is a title   ; and this is a comment
>>>
>>>
>>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ;
>>>> comment with " ' and ;
>>>>
>>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>> there's
>>>>>
>>>>> key="this=that"  ; an = in the value
>>>>>
>>>>> and
>>>>>
>>>>> key="value" ; a " in the comment
>>>>>
>>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>> my $strip = qr{;[^\;]+$};
>>>>>> while (<DATA>) {
>>>>>>    chomp;
>>>>>>    my ($key, $val) = split /=/;
>>>>>>    my ($quote) = ($val =~ m{^(["'])}g);
>>>>>>    if ($quote) {
>>>>>>        ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>>    }
>>>>>>    else {
>>>>>>        $val =~ s/$strip//;
>>>>>>    }
>>>>>>    print $key, '=', $val, "\n";
>>>>>> }
>>>>>>
>>>>>> __DATA__
>>>>>> key="value"
>>>>>> key=value
>>>>>> key="value;"
>>>>>> key="value1;value2"
>>>>>> key="value1;value2"     ; comment
>>>>>> key='value1;value2'     ; comment
>>>>>> "key"="value1"
>>>>>> "key"="value1"          ; comment
>>>>>> "key"="value1;value2"
>>>>>> "key"="value1;value2"   ; comment
>>>>>> "key"="val\"ue1;value2"
>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 2:12 PM,  <daniel at benoy.name> wrote:
>>>>>>> Doesn't work.  Output:
>>>>>>>
>>>>>>> ----
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value1
>>>>>>> key="value1;value2"
>>>>>>> key='value1;value2'
>>>>>>> "key"="value1"
>>>>>>> "key"="value1"
>>>>>>> "key"="value1
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="val\"ue1
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"='val\'ue1;value2'
>>>>>>> "key"='val\"ue1;value2'
>>>>>>> ----
>>>>>>>
>>>>>>> Look at line 3.
>>>>>>>
>>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>>
>>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>>
>>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>>> while (<DATA>) {
>>>>>>>    chomp;
>>>>>>>    $_ =~ s/$strip//;
>>>>>>>    print $_, "\n";
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Here's the way I would do it:
>>>>>>>
>>>>>>> ----
>>>>>>> while (<DATA>) {
>>>>>>>    chomp;
>>>>>>>
>>>>>>>    my $stripped = "";   # start empty so a line that is all comment prints cleanly
>>>>>>>    my $quotechar = "";
>>>>>>>    my $escaped = 0;
>>>>>>>    foreach my $char (split(//, $_)) {
>>>>>>>        if ($escaped) { # Previous char was a backslash; take this one literally
>>>>>>>            $escaped = 0;
>>>>>>>        } elsif ($char eq '\\') { # A backslash escapes the next char
>>>>>>>            $escaped = 1;
>>>>>>>        } elsif ($quotechar) { # We're currently quoted
>>>>>>>            if ($char eq $quotechar) { # end of quote
>>>>>>>                $quotechar = "";
>>>>>>>            }
>>>>>>>        } else { # We're not currently quoted
>>>>>>>            if ($char eq ';') { # The comment has begun!
>>>>>>>                last;
>>>>>>>            } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>>                $quotechar = $char;
>>>>>>>            }
>>>>>>>        }
>>>>>>>        $stripped .= $char;
>>>>>>>    }
>>>>>>>    print "$stripped\n";
>>>>>>> }
>>>>>>>
>>>>>>> __DATA__
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value1;value2"
>>>>>>> key="value1;value2"     ; comment
>>>>>>> key='value1;value2'     ; comment
>>>>>>> "key"="value1"
>>>>>>> "key"="value1"          ; comment
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="value1;value2"   ; comment
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>> ----
>>>>>>>
>>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>>
>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>> while (<DATA>) {
>>>>>>>>    chomp;
>>>>>>>>    $_ =~ s/$strip//;
>>>>>>>>    print $_, "\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>> __DATA__
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value1;value2"
>>>>>>>> key="value1;value2"     ; comment
>>>>>>>> key='value1;value2'     ; comment
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1"          ; comment
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="value1;value2"   ; comment
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>
>>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>>
>>>>>>>>> Thanks for the ideas, but...
>>>>>>>>> the issue isn't with extracting the keys and the values (and dequoting them);
>>>>>>>>> the task was only to strip trailing comments (while obeying quoted strings).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> toronto-pm mailing list
>>>>>>>>> toronto-pm at pm.org
>>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm
>>>>>>>>>
>>>>>>>>>

