[tpm] I wish I was better at regex's
Rob Janes
janes.rob at gmail.com
Wed Mar 9 11:46:53 PST 2011
================
line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
matches!
data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
comment is comment
There's a key (in double quotes), an =, a value (in double quotes),
and a comment.
I don't understand. It looks to me like it did pull off the comment properly.
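
For what it's worth, here is a minimal sketch of the same check as a standalone script. It assumes the regex from my script quoted below, wrapped in a hypothetical helper called split_comment. That name and the trailing-whitespace trim are illustration only and are not in the original script. It feeds in the whole line, including the part of the comment that got wrapped onto a second line in the mail:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): wraps the same regex.
# Returns (data, comment), or an empty list if the line does not match.
sub split_comment {
    my ($line) = @_;
    return unless $line =~ m{
        ^
        (
            (?: [^\\"';\s]          # ordinary character
              | \\.                 # backslash-escaped character
              | \s                  # whitespace
              | "(?:[^"\\]|\\.)*"   # double-quoted string
              | '(?:[^'\\]|\\.)*'   # single-quoted string
            )+
        )
        (?: \s* ; \s* (.*) )?       # optional trailing comment
        $
    }x;
    my ($data, $comment) = ($1, $2);
    $comment = "" unless defined $comment;
    $data =~ s/\s+$//;    # assumption: trim whitespace left before the ';'
    return ($data, $comment);
}

# Quoted heredoc so the backslashes stay literal, as they would in __DATA__.
my $line = <<'END';
"key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
END
chomp $line;

my ($data, $comment) = split_comment($line);
print "data is $data\ncomment is $comment\n";
```

The data group is greedy and accepts plain whitespace, so it swallows the space in front of the first unquoted ';'. The sketch trims that off afterwards. Everything after that ';' comes back as the comment, quote characters and all.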
On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
> nice (though pretty convoluted). the following breaks it.
>
> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
> --
> Shaun Fryer
> cell: 1-647-709-6509
> voip: 1-647-723-2729
>
> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>> this strips off the comment
>>
>> #!/usr/bin/env perl
>>
>> use strict;
>> use warnings;
>>
>> while (<DATA>) {
>>     chomp;
>>
>>     print "\n\n================\nline: $_\n";
>>
>>     if (m{
>>         ^
>>         (
>>             (?:[^\\"';\s]|
>>             \\.|
>>             \s|
>>             "(?:[^"\\]|\\.)*"|
>>             '(?:[^'\\]|\\.)*'
>>             )+
>>         )
>>         (?:\s*;\s*(.*))?
>>         $
>>     }x)
>>     {
>>         my ($words, $comment) = ($1, $2);
>>         $comment = "" unless defined($comment);
>>         print "matches!\n";
>>
>>         print "data is $words\ncomment is $comment\n";
>>     }
>>     else
>>     {
>>         print "match FAILED!\n";
>>     }
>> }
>>
>> __DATA__
>> key="value"
>> key=value
>> key="value;"
>> key="value1;value2"
>> key="value1;value2" ; comment
>> key='value1;value2' ; comment
>> "key"="value1"
>> "key"="value1" ; comment
>> "key"="value1;value2"
>> "key"="value1;value2" ; comment
>> "key"="val\"ue1;value2"
>> "key"="val\"ue1;value2" ; comment
>> "key"='val\'ue1;value2' ; comment
>> "key"='val\"ue1;value2' ; comment
>> key="this=that" ; an = in the value
>> key="value" ; a " in the comment
>> this is a title ; and this is a comment
>>
>>
>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>
>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>> there's
>>>>
>>>> key="this=that" ; an = in the value
>>>>
>>>> and
>>>>
>>>> key="value" ; a " in the comment
>>>>
>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>> my $strip = qr{;[^\;]+$};
>>>>> while (<DATA>) {
>>>>>     chomp;
>>>>>     my ($key, $val) = split /=/;
>>>>>     my ($quote) = ($val =~ m{^(["'])}g);
>>>>>     if ($quote) {
>>>>>         ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>     }
>>>>>     else {
>>>>>         $val =~ s/$strip//;
>>>>>     }
>>>>>     print $key, '=', $val, "\n";
>>>>> }
>>>>>
>>>>> __DATA__
>>>>> key="value"
>>>>> key=value
>>>>> key="value;"
>>>>> key="value1;value2"
>>>>> key="value1;value2" ; comment
>>>>> key='value1;value2' ; comment
>>>>> "key"="value1"
>>>>> "key"="value1" ; comment
>>>>> "key"="value1;value2"
>>>>> "key"="value1;value2" ; comment
>>>>> "key"="val\"ue1;value2"
>>>>> "key"="val\"ue1;value2" ; comment
>>>>> "key"='val\'ue1;value2' ; comment
>>>>> "key"='val\"ue1;value2' ; comment
>>>>>
>>>>> On Wed, Mar 9, 2011 at 2:12 PM, <daniel at benoy.name> wrote:
>>>>>> Doesn't work. Output:
>>>>>>
>>>>>> ----
>>>>>> key="value"
>>>>>> key=value
>>>>>> key="value1
>>>>>> key="value1;value2"
>>>>>> key='value1;value2'
>>>>>> "key"="value1"
>>>>>> "key"="value1"
>>>>>> "key"="value1
>>>>>> "key"="value1;value2"
>>>>>> "key"="val\"ue1
>>>>>> "key"="val\"ue1;value2"
>>>>>> "key"='val\'ue1;value2'
>>>>>> "key"='val\"ue1;value2'
>>>>>> ----
>>>>>>
>>>>>> Look at line 3.
>>>>>>
>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>
>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>
>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>> while (<DATA>) {
>>>>>>     chomp;
>>>>>>     $_ =~ s/$strip//;
>>>>>>     print $_, "\n";
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Here's the way I would do it:
>>>>>>
>>>>>> ----
>>>>>> while (<DATA>) {
>>>>>>     chomp;
>>>>>>
>>>>>>     my $stripped;
>>>>>>     my $quotechar = "";
>>>>>>     foreach my $char (split(//, $_)) {
>>>>>>         if ($quotechar) { # We're currently quoted
>>>>>>             if ($char eq $quotechar) { # end of quote
>>>>>>                 $quotechar = "";
>>>>>>             }
>>>>>>         } else { # We're not currently quoted
>>>>>>             if ($char eq ';') { # The comment has begun!
>>>>>>                 last();
>>>>>>             } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>                 $quotechar = $char;
>>>>>>             }
>>>>>>         }
>>>>>>         $stripped .= $char;
>>>>>>     }
>>>>>>     print "$stripped\n";
>>>>>> }
>>>>>>
>>>>>> __DATA__
>>>>>> key="value"
>>>>>> key=value
>>>>>> key="value1;value2"
>>>>>> key="value1;value2" ; comment
>>>>>> key='value1;value2' ; comment
>>>>>> "key"="value1"
>>>>>> "key"="value1" ; comment
>>>>>> "key"="value1;value2"
>>>>>> "key"="value1;value2" ; comment
>>>>>> "key"="val\"ue1;value2"
>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>> ----
>>>>>>
>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>
>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>> while (<DATA>) {
>>>>>>>     chomp;
>>>>>>>     $_ =~ s/$strip//;
>>>>>>>     print $_, "\n";
>>>>>>> }
>>>>>>>
>>>>>>> __DATA__
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value1;value2"
>>>>>>> key="value1;value2" ; comment
>>>>>>> key='value1;value2' ; comment
>>>>>>> "key"="value1"
>>>>>>> "key"="value1" ; comment
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="value1;value2" ; comment
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>
>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>
>>>>>>>> Thanks for the ideas, but the issue isn't with extracting the keys and
>>>>>>>> the values (and dequoting them). The task was only to strip trailing
>>>>>>>> comments (while obeying quoted strings).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> toronto-pm mailing list
>>>>>>>> toronto-pm at pm.org
>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm