[tpm] I wish I was better at regex's

Shaun Fryer sfryer at sourcery.ca
Wed Mar 9 11:33:04 PST 2011


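Indeed. Or a line where the quoted key and value contain escaped quotes,
an = and a ;, and the trailing comment itself contains " ' and ; characters:

"key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;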
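
For lines like that, one possible approach is to walk the line as a series
of tokens (a complete quoted string, or a single character that is neither
a quote nor a ;) and cut at the first ; that no token swallows. This is
only a sketch: the names $token and strip_comment are made up for
illustration, and it assumes that inside quotes a backslash escapes the
next character, as in the \" and \' samples in this thread. A line with an
unterminated quote is cut at that quote, which may or may not be what you
want.

```perl
use strict;
use warnings;

# One token: a complete quoted string, or one ordinary character.
# Assumption: inside quotes, a backslash escapes the following character.
my $token = qr{
      "(?:[^"\\]|\\.)*"    # double-quoted string
    | '(?:[^'\\]|\\.)*'    # single-quoted string
    | [^;"']               # anything else except ; and quote characters
}x;

# Hypothetical helper: return the line without its trailing ; comment.
sub strip_comment {
    my ($line) = @_;
    # Consume tokens from the start; matching stops at the first unquoted ;
    my ($kept) = $line =~ m{^((?:$token)*)};
    $kept =~ s/\s+$//;    # drop the whitespace that preceded the comment
    return $kept;
}

for my $line (
    q{key="this=that"  ; an = in the value},
    q{key="value" ; a " in the comment},
    q{"key"='val\'ue1;value2' ; comment},
) {
    print strip_comment($line), "\n";
}
```

Because a ; inside a quoted string is consumed as part of the string
token, only a ; outside quotes ends the match, and whatever follows it
(including stray quotes in the comment) is never examined.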
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729




On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
> there's
>
> key="this=that"  ; an = in the value
>
> and
>
> key="value" ; a " in the comment
>
> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> my $strip = qr{;[^\;]+$};
>> while (<DATA>) {
>>    chomp;
>>    my ($key, $val) = split /=/;
>>    my ($quote) = ($val =~ m{^(["'])}g);
>>    if ($quote) {
>>        ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>    }
>>    else {
>>        $val =~ s/$strip//;
>>    }
>>    print $key, '=', $val, "\n";
>> }
>>
>> __DATA__
>> key="value"
>> key=value
>> key="vlue;"
>> key="value1;value2"
>> key="value1;value2"     ; comment
>> key='value1;value2'     ; comment
>> "key"="value1"
>> "key"="value1"          ; comment
>> "key"="value1;value2"
>> "key"="value1;value2"   ; comment
>> "key"="val\"ue1;value2"
>> "key"="val\"ue1;value2" ; comment
>> "key"='val\'ue1;value2' ; comment
>> "key"='val\"ue1;value2' ; comment
>>
>> On Wed, Mar 9, 2011 at 2:12 PM,  <daniel at benoy.name> wrote:
>>> Doesn't work.  Output:
>>>
>>> ----
>>> key="value"
>>> key=value
>>> key="value1
>>> key="value1;value2"
>>> key='value1;value2'
>>> "key"="value1"
>>> "key"="value1"
>>> "key"="value1
>>> "key"="value1;value2"
>>> "key"="val\"ue1
>>> "key"="val\"ue1;value2"
>>> "key"='val\'ue1;value2'
>>> "key"='val\"ue1;value2'
>>> ----
>>>
>>> Look at line 3.
>>>
>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>
>>> Here's a quick and dirty improvement, but it will still have problems:
>>>
>>> my $strip = qr{;[^\;\"\']*$};
>>> while (<DATA>) {
>>>    chomp;
>>>    $_ =~ s/$strip//;
>>>    print $_, "\n";
>>> }
>>>
>>>
>>> Here's the way I would do it:
>>>
>>> ----
>>> while (<DATA>) {
>>>    chomp;
>>>
>>>    my $stripped = "";
>>>    my $quotechar = "";
>>>    my $escaped = 0;
>>>    foreach my $char (split(//, $_)) {
>>>        if ($escaped) { # Previous char was a backslash; take this one literally
>>>            $escaped = 0;
>>>        } elsif ($quotechar) { # We're currently quoted
>>>            if ($char eq '\\') { # backslash escapes the next char
>>>                $escaped = 1;
>>>            } elsif ($char eq $quotechar) { # end of quote
>>>                $quotechar = "";
>>>            }
>>>        } else { # We're not currently quoted
>>>            if ($char eq ';') { # The comment has begun!
>>>                last;
>>>            } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>                $quotechar = $char;
>>>            }
>>>        }
>>>        $stripped .= $char;
>>>    }
>>>    print "$stripped\n";
>>> }
>>>
>>> __DATA__
>>> key="value"
>>> key=value
>>> key="value1;value2"
>>> key="value1;value2"     ; comment
>>> key='value1;value2'     ; comment
>>> "key"="value1"
>>> "key"="value1"          ; comment
>>> "key"="value1;value2"
>>> "key"="value1;value2"   ; comment
>>> "key"="val\"ue1;value2"
>>> "key"="val\"ue1;value2" ; comment
>>> "key"='val\'ue1;value2' ; comment
>>> "key"='val\"ue1;value2' ; comment
>>> ----
>>>
>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>
>>>> my $strip = qr{;[^\;]+$};
>>>> while (<DATA>) {
>>>>    chomp;
>>>>    $_ =~ s/$strip//;
>>>>    print $_, "\n";
>>>> }
>>>>
>>>> __DATA__
>>>> key="value"
>>>> key=value
>>>> key="value1;value2"
>>>> key="value1;value2"     ; comment
>>>> key='value1;value2'     ; comment
>>>> "key"="value1"
>>>> "key"="value1"          ; comment
>>>> "key"="value1;value2"
>>>> "key"="value1;value2"   ; comment
>>>> "key"="val\"ue1;value2"
>>>> "key"="val\"ue1;value2" ; comment
>>>> "key"='val\'ue1;value2' ; comment
>>>> "key"='val\"ue1;value2' ; comment
>>>>
>>>>
>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>
>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>
>>>>>> here's one that dequotes the key and value ...
>>>>>
>>>>> Thanks for the ideas, but the issue isn't with extracting the keys and
>>>>> the values (and dequoting them); the task was only to strip trailing
>>>>> comments (while obeying quoted strings).
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> toronto-pm mailing list
>>>>> toronto-pm at pm.org
>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm
>>>>>
>>>>>
>

