[tpm] I wish I was better at regex's

Shaun Fryer sfryer at sourcery.ca
Wed Mar 9 11:43:21 PST 2011


Nice (though pretty convoluted). The following breaks it:

"key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
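
For comparison, here is a rough sketch of the character-scanning approach
quoted further down, extended to skip over backslash-escaped characters. It
is only an illustration under assumptions of my own, not anything agreed on
the list: a backslash escapes the next character both inside and outside
quotes, an unquoted ; starts the comment, and whitespace before the comment
is dropped. The name strip_comment is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the line with any
# trailing ;comment removed.
# Assumptions: a backslash escapes the next character, and a ; only
# starts a comment when it appears outside of quotes.
sub strip_comment {
    my ($line) = @_;
    my $out   = '';
    my $quote = '';    # the currently open quote character, '' if none
    my @chars = split //, $line;
    while (@chars) {
        my $c = shift @chars;
        if ($c eq '\\') {
            # keep the backslash and whatever character it escapes
            $out .= $c;
            $out .= shift @chars if @chars;
            next;
        }
        if ($quote) {
            $quote = '' if $c eq $quote;    # end of quote
        }
        elsif ($c eq ';') {
            last;                           # the comment has begun
        }
        elsif ($c eq '"' || $c eq "'") {
            $quote = $c;                    # start of quote
        }
        $out .= $c;
    }
    $out =~ s/\s+$//;    # drop whitespace left in front of the comment
    return $out;
}

# A single-quoted heredoc keeps the backslashes exactly as typed.
my @samples = split /\n/, <<'END';
"key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
key="value;"
key='value1;value2'     ; comment
this is a title   ; and this is a comment
END

print strip_comment($_), "\n" for @samples;
```

On the input above this keeps everything up to the closing double quote of
the value and drops the comment. An unterminated quote is not reported; the
rest of the line is simply treated as quoted.
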
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729




On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
> this strips off the comment
>
> #!/usr/bin/env perl
>
> use strict;
> use warnings;
>
> while (<DATA>) {
>  chomp;
>
>  print "\n\n================\nline: $_\n";
>
>  if (m{
>        ^
>        (
>          (?:[^\\"';\s]|
>            \\.|
>            \s|
>            "(?:[^"\\]|\\.)*"|
>            '(?:[^'\\]|\\.)*'
>          )+
>        )
>        (?:\s*;\s*(.*))?
>        $
>        }x)
>  {
>    my ($words, $comment) = ($1, $2);
>    $comment = "" unless defined($comment);
>    print "matches!\n";
>
>    print "data is $words\ncomment is $comment\n";
>  }
>  else
>  {
>     print "match FAILED!\n";
>  }
> }
>
> __DATA__
> key="value"
> key=value
> key="value;"
> key="value1;value2"
> key="value1;value2"     ; comment
> key='value1;value2'     ; comment
> "key"="value1"
> "key"="value1"          ; comment
> "key"="value1;value2"
> "key"="value1;value2"   ; comment
> "key"="val\"ue1;value2"
> "key"="val\"ue1;value2" ; comment
> "key"='val\'ue1;value2' ; comment
> "key"='val\"ue1;value2' ; comment
> key="this=that" ; an = in the value
> key="value" ; a " in the comment
> this is a title   ; and this is a comment
>
>
> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ;
>> comment with " ' and ;
>>
>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>> there's
>>>
>>> key="this=that"  ; an = in the value
>>>
>>> and
>>>
>>> key="value" ; a " in the comment
>>>
>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>> my $strip = qr{;[^\;]+$};
>>>> while (<DATA>) {
>>>>    chomp;
>>>>    my ($key, $val) = split /=/;
>>>>    my ($quote) = ($val =~ m{^(["'])}g);
>>>>    if ($quote) {
>>>>        ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>    }
>>>>    else {
>>>>        $val =~ s/$strip//;
>>>>    }
>>>>    print $key, '=', $val, "\n";
>>>> }
>>>>
>>>> __DATA__
>>>> key="value"
>>>> key=value
>>>> key="value;"
>>>> key="value1;value2"
>>>> key="value1;value2"     ; comment
>>>> key='value1;value2'     ; comment
>>>> "key"="value1"
>>>> "key"="value1"          ; comment
>>>> "key"="value1;value2"
>>>> "key"="value1;value2"   ; comment
>>>> "key"="val\"ue1;value2"
>>>> "key"="val\"ue1;value2" ; comment
>>>> "key"='val\'ue1;value2' ; comment
>>>> "key"='val\"ue1;value2' ; comment
>>>>
>>>> On Wed, Mar 9, 2011 at 2:12 PM,  <daniel at benoy.name> wrote:
>>>>> Doesn't work.  Output:
>>>>>
>>>>> ----
>>>>> key="value"
>>>>> key=value
>>>>> key="value1
>>>>> key="value1;value2"
>>>>> key='value1;value2'
>>>>> "key"="value1"
>>>>> "key"="value1"
>>>>> "key"="value1
>>>>> "key"="value1;value2"
>>>>> "key"="val\"ue1
>>>>> "key"="val\"ue1;value2"
>>>>> "key"='val\'ue1;value2'
>>>>> "key"='val\"ue1;value2'
>>>>> ----
>>>>>
>>>>> Look at line 3.
>>>>>
>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>
>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>
>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>> while (<DATA>) {
>>>>>    chomp;
>>>>>    $_ =~ s/$strip//;
>>>>>    print $_, "\n";
>>>>> }
>>>>>
>>>>>
>>>>> Here's the way I would do it:
>>>>>
>>>>> ----
>>>>> while (<DATA>) {
>>>>>    chomp;
>>>>>
>>>>>    my $stripped = "";
>>>>>    my $quotechar = "";
>>>>>    foreach my $char (split(//, $_)) {
>>>>>        if ($quotechar) { # We're currently quoted
>>>>>            if ($char eq $quotechar) { # end of quote
>>>>>                $quotechar = "";
>>>>>            }
>>>>>        } else { # We're not currently quoted
>>>>>            if ($char eq ';') { # The comment has begun!
>>>>>                last();
>>>>>            } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>                $quotechar = $char;
>>>>>            }
>>>>>        }
>>>>>        $stripped .= $char;
>>>>>    }
>>>>>    print "$stripped\n";
>>>>> }
>>>>>
>>>>> __DATA__
>>>>> key="value"
>>>>> key=value
>>>>> key="value1;value2"
>>>>> key="value1;value2"     ; comment
>>>>> key='value1;value2'     ; comment
>>>>> "key"="value1"
>>>>> "key"="value1"          ; comment
>>>>> "key"="value1;value2"
>>>>> "key"="value1;value2"   ; comment
>>>>> "key"="val\"ue1;value2"
>>>>> "key"="val\"ue1;value2" ; comment
>>>>> "key"='val\'ue1;value2' ; comment
>>>>> "key"='val\"ue1;value2' ; comment
>>>>> ----
>>>>>
>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>
>>>>>> my $strip = qr{;[^\;]+$};
>>>>>> while (<DATA>) {
>>>>>>    chomp;
>>>>>>    $_ =~ s/$strip//;
>>>>>>    print $_, "\n";
>>>>>> }
>>>>>>
>>>>>> __DATA__
>>>>>> key="value"
>>>>>> key=value
>>>>>> key="value1;value2"
>>>>>> key="value1;value2"     ; comment
>>>>>> key='value1;value2'     ; comment
>>>>>> "key"="value1"
>>>>>> "key"="value1"          ; comment
>>>>>> "key"="value1;value2"
>>>>>> "key"="value1;value2"   ; comment
>>>>>> "key"="val\"ue1;value2"
>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>
>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>
>>>>>>> Thanks for the ideas, but the issue isn't with extracting the keys and
>>>>>>> the values (and dequoting them). The task was only to strip trailing
>>>>>>> comments (while obeying quoted strings).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> toronto-pm mailing list
>>>>>>> toronto-pm at pm.org
>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm