[tpm] I wish I was better at regex's
Shaun Fryer
sfryer at sourcery.ca
Wed Mar 9 11:55:01 PST 2011
Ah. My bad! You win!
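
For the archive, here is a rough sketch of the character-by-character scan quoted further down, with one change. As far as I can tell, that loop treats \" as a closing quote, so this version also tracks backslash escapes. The sub name strip_comment, the $escaped flag and the trailing-whitespace trim are my own assumptions, not anything posted in the thread. Treating a backslash outside quotes as an escape is a guess modelled on the \\. branch of Rob's pattern.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the line with any
# trailing ;-comment removed, treating ; inside quotes as data.
sub strip_comment {
    my ($line) = @_;
    my $stripped = '';
    my $quote    = '';   # the open quote character, or '' when unquoted
    my $escaped  = 0;    # true when the previous character was a backslash

    for my $char (split //, $line) {
        if ($escaped) {
            $escaped = 0;                    # this character is taken literally
        }
        elsif ($char eq '\\') {
            $escaped = 1;                    # next character is taken literally
        }
        elsif ($quote) {
            $quote = '' if $char eq $quote;  # closing quote
        }
        elsif ($char eq ';') {
            last;                            # unquoted ; starts the comment
        }
        elsif ($char eq '"' || $char eq "'") {
            $quote = $char;                  # opening quote
        }
        $stripped .= $char;
    }

    $stripped =~ s/\s+\z//;                  # assumption: drop whitespace left before the ;
    return $stripped;
}

# A few of the sample lines from the thread.
for my $line (
    'key="value1;value2" ; comment',
    '"key"="val\"ue1;value2" ; comment',
    q{"key"='val\'ue1;value2' ; comment},
    'key="value" ; a " in the comment',
) {
    print strip_comment($line), "\n";
}
```

It only strips the comment. Keys and values are left quoted, which is all the original task asked for.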
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729
On Wed, Mar 9, 2011 at 2:53 PM, Rob Janes <janes.rob at gmail.com> wrote:
> you're missing a double quote before key.
>
> ================
> line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
> matches!
> data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
> comment is stuff\'" ; comment with = " ' and ;
>
>
> ================
> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
> matches!
> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
> comment is comment with = " ' and ;
>
>
> On Wed, Mar 9, 2011 at 2:48 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> funny. I got...
>>
>> line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
>> matches!
>> data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
>> comment is stuff\'" ; comment with = " ' and ;
>>
>> as you can see, the (stuff\'";) got put in the comment.
>>
>> On Wed, Mar 9, 2011 at 2:46 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>> ================
>>> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
>>> matches!
>>> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
>>> comment is comment
>>>
>>> There's a key (in double quotes), an =, a value (in double quotes),
>>> and a comment.
>>>
>>> I don't understand. looks to me like it did pull off the comment properly.
>>>
>>> On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>> nice (though pretty convoluted). the following breaks it.
>>>>
>>>> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>>
>>>> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>> this strips off the comment
>>>>>
>>>>> #!/usr/bin/env perl
>>>>>
>>>>> use strict;
>>>>> use warnings;
>>>>>
>>>>> while (<DATA>) {
>>>>>     chomp;
>>>>>
>>>>>     print "\n\n================\nline: $_\n";
>>>>>
>>>>>     if (m{
>>>>>         ^
>>>>>         (
>>>>>             (?:[^\\"';\s]|
>>>>>                 \\.|
>>>>>                 \s|
>>>>>                 "(?:[^"\\]|\\.)*"|
>>>>>                 '(?:[^'\\]|\\.)*'
>>>>>             )+
>>>>>         )
>>>>>         (?:\s*;\s*(.*))?
>>>>>         $
>>>>>     }x)
>>>>>     {
>>>>>         my ($words, $comment) = ($1, $2);
>>>>>         $comment = "" unless defined($comment);
>>>>>         print "matches!\n";
>>>>>
>>>>>         print "data is $words\ncomment is $comment\n";
>>>>>     }
>>>>>     else
>>>>>     {
>>>>>         print "match FAILED!\n";
>>>>>     }
>>>>> }
>>>>>
>>>>> __DATA__
>>>>> key="value"
>>>>> key=value
>>>>> key="value;"
>>>>> key="value1;value2"
>>>>> key="value1;value2" ; comment
>>>>> key='value1;value2' ; comment
>>>>> "key"="value1"
>>>>> "key"="value1" ; comment
>>>>> "key"="value1;value2"
>>>>> "key"="value1;value2" ; comment
>>>>> "key"="val\"ue1;value2"
>>>>> "key"="val\"ue1;value2" ; comment
>>>>> "key"='val\'ue1;value2' ; comment
>>>>> "key"='val\"ue1;value2' ; comment
>>>>> key="this=that" ; an = in the value
>>>>> key="value" ; a " in the comment
>>>>> this is a title ; and this is a comment
>>>>>
>>>>>
>>>>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>> there's
>>>>>>>
>>>>>>> key="this=that" ; an = in the value
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> key="value" ; a " in the comment
>>>>>>>
>>>>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>> while (<DATA>) {
>>>>>>>>     chomp;
>>>>>>>>     my ($key, $val) = split /=/;
>>>>>>>>     my ($quote) = ($val =~ m{^(["'])}g);
>>>>>>>>     if ($quote) {
>>>>>>>>         ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>>>>     }
>>>>>>>>     else {
>>>>>>>>         $val =~ s/$strip//;
>>>>>>>>     }
>>>>>>>>     print $key, '=', $val, "\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>> __DATA__
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value;"
>>>>>>>> key="value1;value2"
>>>>>>>> key="value1;value2" ; comment
>>>>>>>> key='value1;value2' ; comment
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1" ; comment
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="value1;value2" ; comment
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>
>>>>>>>> On Wed, Mar 9, 2011 at 2:12 PM, <daniel at benoy.name> wrote:
>>>>>>>>> Doesn't work. Output:
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> key="value"
>>>>>>>>> key=value
>>>>>>>>> key="value1
>>>>>>>>> key="value1;value2"
>>>>>>>>> key='value1;value2'
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1
>>>>>>>>> "key"="value1;value2"
>>>>>>>>> "key"="val\"ue1
>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>> "key"='val\'ue1;value2'
>>>>>>>>> "key"='val\"ue1;value2'
>>>>>>>>> ----
>>>>>>>>>
>>>>>>>>> Look at line 3.
>>>>>>>>>
>>>>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>>>>
>>>>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>>>>
>>>>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>>>>> while (<DATA>) {
>>>>>>>>>     chomp;
>>>>>>>>>     $_ =~ s/$strip//;
>>>>>>>>>     print $_, "\n";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here's the way I would do it:
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> while (<DATA>) {
>>>>>>>>>     chomp;
>>>>>>>>>
>>>>>>>>>     my $stripped;
>>>>>>>>>     my $quotechar = "";
>>>>>>>>>     foreach my $char (split(//, $_)) {
>>>>>>>>>         if ($quotechar) { # We're currently quoted
>>>>>>>>>             if ($char eq $quotechar) { # end of quote
>>>>>>>>>                 $quotechar = "";
>>>>>>>>>             }
>>>>>>>>>         } else { # We're not currently quoted
>>>>>>>>>             if ($char eq ';') { # The comment has begun!
>>>>>>>>>                 last();
>>>>>>>>>             } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>>>>                 $quotechar = $char;
>>>>>>>>>             }
>>>>>>>>>         }
>>>>>>>>>         $stripped .= $char;
>>>>>>>>>     }
>>>>>>>>>     print "$stripped\n";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> __DATA__
>>>>>>>>> key="value"
>>>>>>>>> key=value
>>>>>>>>> key="value1;value2"
>>>>>>>>> key="value1;value2" ; comment
>>>>>>>>> key='value1;value2' ; comment
>>>>>>>>> "key"="value1"
>>>>>>>>> "key"="value1" ; comment
>>>>>>>>> "key"="value1;value2"
>>>>>>>>> "key"="value1;value2" ; comment
>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>> ----
>>>>>>>>>
>>>>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>>>>
>>>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>>>> while (<DATA>) {
>>>>>>>>>>     chomp;
>>>>>>>>>>     $_ =~ s/$strip//;
>>>>>>>>>>     print $_, "\n";
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> __DATA__
>>>>>>>>>> key="value"
>>>>>>>>>> key=value
>>>>>>>>>> key="value1;value2"
>>>>>>>>>> key="value1;value2" ; comment
>>>>>>>>>> key='value1;value2' ; comment
>>>>>>>>>> "key"="value1"
>>>>>>>>>> "key"="value1" ; comment
>>>>>>>>>> "key"="value1;value2"
>>>>>>>>>> "key"="value1;value2" ; comment
>>>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the ideas, but...
>>>>>>>>>>> the issue isn't with extracting the keys and the values (and
>>>>>>>>>>> dequoting them); the task was only to strip trailing comments
>>>>>>>>>>> (while obeying quoted strings).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> toronto-pm mailing list
>>>>>>>>>>> toronto-pm at pm.org
>>>>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm