[tpm] I wish I was better at regex's
Shaun Fryer
sfryer at sourcery.ca
Wed Mar 9 11:48:30 PST 2011
funny. I got...
line: key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;
matches!
data is key=\'stuff\'" = "value1=\'stuff\',value2=\'more
comment is stuff\'" ; comment with = " ' and ;
as you can see, the (stuff\'";) got put in the comment.
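
Comparing with Rob's run, the line echoed above is missing its opening
double quote, so the pattern pairs up `" = "` as the first quoted string
and the `;` inside the value is left unquoted. A minimal sketch to check
that, reusing Rob's pattern as posted; the split_comment helper and the
trailing-whitespace trim are additions for illustration only:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Rob's pattern as posted: group 1 is the data, group 2 the comment.
my $re = qr{
    ^
    (
        (?:[^\\"';\s]|
           \\.|
           \s|
           "(?:[^"\\]|\\.)*"|
           '(?:[^'\\]|\\.)*'
        )+
    )
    (?:\s*;\s*(.*))?
    $
}x;

# Hypothetical helper (not in Rob's script): returns (data, comment),
# or an empty list when the line does not match at all.
sub split_comment {
    my ($line) = @_;
    return unless $line =~ $re;
    my ($data, $comment) = ($1, $2);
    $data =~ s/\s+$//;    # group 1 may swallow the space before the ';'
    return ($data, defined $comment ? $comment : "");
}

# Same line twice: first with the opening double quote, then without it.
my @lines = (
    q{"key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;},
    q{key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with = " ' and ;},
);

for my $line (@lines) {
    my ($data, $comment) = split_comment($line);
    print "line: $line\ndata is $data\ncomment is $comment\n\n";
}
```

With the opening quote in place the comment comes out as
`comment with " ' and ;`. Without it, the split lands on the `;` inside
the value, which is the result shown above.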
--
Shaun Fryer
cell: 1-647-709-6509
voip: 1-647-723-2729
On Wed, Mar 9, 2011 at 2:46 PM, Rob Janes <janes.rob at gmail.com> wrote:
> ================
> line: "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment
> matches!
> data is "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'"
> comment is comment
>
> There's a key (in double quotes), an =, a value (in double quotes),
> and a comment.
>
> I don't understand. looks to me like it did pull off the comment properly.
>
> On Wed, Mar 9, 2011 at 2:43 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>> nice (though pretty convoluted). the following breaks it.
>>
>> "key=\'stuff\'" = "value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>
>> On Wed, Mar 9, 2011 at 2:38 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>> this strips off the comment
>>>
>>> #!/usr/bin/env perl
>>>
>>> use strict;
>>> use warnings;
>>>
>>> while (<DATA>) {
>>>     chomp;
>>>
>>>     print "\n\n================\nline: $_\n";
>>>
>>>     if (m{
>>>         ^
>>>         (
>>>             (?:[^\\"';\s]|
>>>                \\.|
>>>                \s|
>>>                "(?:[^"\\]|\\.)*"|
>>>                '(?:[^'\\]|\\.)*'
>>>             )+
>>>         )
>>>         (?:\s*;\s*(.*))?
>>>         $
>>>     }x)
>>>     {
>>>         my ($words, $comment) = ($1, $2);
>>>         $comment = "" unless defined($comment);
>>>         print "matches!\n";
>>>
>>>         print "data is $words\ncomment is $comment\n";
>>>     }
>>>     else
>>>     {
>>>         print "match FAILED!\n";
>>>     }
>>> }
>>>
>>> __DATA__
>>> key="value"
>>> key=value
>>> key="value;"
>>> key="value1;value2"
>>> key="value1;value2" ; comment
>>> key='value1;value2' ; comment
>>> "key"="value1"
>>> "key"="value1" ; comment
>>> "key"="value1;value2"
>>> "key"="value1;value2" ; comment
>>> "key"="val\"ue1;value2"
>>> "key"="val\"ue1;value2" ; comment
>>> "key"='val\'ue1;value2' ; comment
>>> "key"='val\"ue1;value2' ; comment
>>> key="this=that" ; an = in the value
>>> key="value" ; a " in the comment
>>> this is a title ; and this is a comment
>>>
>>>
>>> On Wed, Mar 9, 2011 at 2:33 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>> Indeed. Or "key=\'stuff\'"="value1=\'stuff\',value2=\'more ;stuff\'" ; comment with " ' and ;
>>>>
>>>> On Wed, Mar 9, 2011 at 2:30 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>> there's
>>>>>
>>>>> key="this=that" ; an = in the value
>>>>>
>>>>> and
>>>>>
>>>>> key="value" ; a " in the comment
>>>>>
>>>>> On Wed, Mar 9, 2011 at 2:16 PM, Shaun Fryer <sfryer at sourcery.ca> wrote:
>>>>>> my $strip = qr{;[^\;]+$};
>>>>>> while (<DATA>) {
>>>>>>     chomp;
>>>>>>     my ($key, $val) = split /=/;
>>>>>>     my ($quote) = ($val =~ m{^(["'])}g);
>>>>>>     if ($quote) {
>>>>>>         ($val) = ($val =~ m{^($quote[^\b]+($quote))}g);
>>>>>>     }
>>>>>>     else {
>>>>>>         $val =~ s/$strip//;
>>>>>>     }
>>>>>>     print $key, '=', $val, "\n";
>>>>>> }
>>>>>>
>>>>>> __DATA__
>>>>>> key="value"
>>>>>> key=value
>>>>>> key="vlue;"
>>>>>> key="value1;value2"
>>>>>> key="value1;value2" ; comment
>>>>>> key='value1;value2' ; comment
>>>>>> "key"="value1"
>>>>>> "key"="value1" ; comment
>>>>>> "key"="value1;value2"
>>>>>> "key"="value1;value2" ; comment
>>>>>> "key"="val\"ue1;value2"
>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>
>>>>>> On Wed, Mar 9, 2011 at 2:12 PM, <daniel at benoy.name> wrote:
>>>>>>> Doesn't work. Output:
>>>>>>>
>>>>>>> ----
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value1
>>>>>>> key="value1;value2"
>>>>>>> key='value1;value2'
>>>>>>> "key"="value1"
>>>>>>> "key"="value1"
>>>>>>> "key"="value1
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="val\"ue1
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"='val\'ue1;value2'
>>>>>>> "key"='val\"ue1;value2'
>>>>>>> ----
>>>>>>>
>>>>>>> Look at line 3.
>>>>>>>
>>>>>>> Also it wouldn't catch a trailing semicolon with nothing after it.
>>>>>>>
>>>>>>> Here's a quick and dirty improvement, but it will still have problems:
>>>>>>>
>>>>>>> my $strip = qr{;[^\;\"\']*$};
>>>>>>> while (<DATA>) {
>>>>>>>     chomp;
>>>>>>>     $_ =~ s/$strip//;
>>>>>>>     print $_, "\n";
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Here's the way I would do it:
>>>>>>>
>>>>>>> ----
>>>>>>> while (<DATA>) {
>>>>>>>     chomp;
>>>>>>>
>>>>>>>     my $stripped;
>>>>>>>     my $quotechar = "";
>>>>>>>     foreach my $char (split(//, $_)) {
>>>>>>>         if ($quotechar) { # We're currently quoted
>>>>>>>             if ($char eq $quotechar) { # end of quote
>>>>>>>                 $quotechar = "";
>>>>>>>             }
>>>>>>>         } else { # We're not currently quoted
>>>>>>>             if ($char eq ';') { # The comment has begun!
>>>>>>>                 last();
>>>>>>>             } elsif ($char eq '"' || $char eq "'") { # start of quote
>>>>>>>                 $quotechar = $char;
>>>>>>>             }
>>>>>>>         }
>>>>>>>         $stripped .= $char;
>>>>>>>     }
>>>>>>>     print "$stripped\n";
>>>>>>> }
>>>>>>>
>>>>>>> __DATA__
>>>>>>> key="value"
>>>>>>> key=value
>>>>>>> key="value1;value2"
>>>>>>> key="value1;value2" ; comment
>>>>>>> key='value1;value2' ; comment
>>>>>>> "key"="value1"
>>>>>>> "key"="value1" ; comment
>>>>>>> "key"="value1;value2"
>>>>>>> "key"="value1;value2" ; comment
>>>>>>> "key"="val\"ue1;value2"
>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>> ----
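
One gap in the character-by-character loop above: it does not treat \" as
an escaped quote, so "key"="val\"ue1;value2" gets cut at the first `;`.
A minimal sketch of the same loop with a backslash check added. The name
strip_comment is made up here, the trailing-whitespace trim is an extra,
and treating a backslash as escaping the next character everywhere
(inside or outside quotes) is an assumption about the file format:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the line with any trailing ';' comment removed.
sub strip_comment {
    my ($line) = @_;
    my $stripped  = "";
    my $quotechar = "";   # the quote character we are inside, if any
    my $escaped   = 0;    # true when the previous character was a backslash

    foreach my $char (split //, $line) {
        if ($escaped) {                  # escaped character: keep it as-is
            $escaped = 0;
        } elsif ($char eq "\\") {        # backslash escapes the next character
            $escaped = 1;
        } elsif ($quotechar) {           # currently quoted
            $quotechar = "" if $char eq $quotechar;   # end of quote
        } elsif ($char eq ';') {         # unquoted ';': the comment has begun
            last;
        } elsif ($char eq '"' || $char eq "'") {      # start of quote
            $quotechar = $char;
        }
        $stripped .= $char;
    }

    $stripped =~ s/\s+$//;               # drop the space left before the ';'
    return $stripped;
}

my @samples = (
    q{key="value1;value2" ; comment},
    q{"key"="val\"ue1;value2" ; comment},
    q{"key"='val\'ue1;value2' ; comment},
    q{key="value" ; a " in the comment},
);

print strip_comment($_), "\n" for @samples;
```

Because it stops at the first unquoted `;`, quotes that appear later in
the comment never get looked at.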
>>>>>>>
>>>>>>> On Wed, 9 Mar 2011 13:48:39 -0500, Shaun Fryer wrote:
>>>>>>>>
>>>>>>>> my $strip = qr{;[^\;]+$};
>>>>>>>> while (<DATA>) {
>>>>>>>>     chomp;
>>>>>>>>     $_ =~ s/$strip//;
>>>>>>>>     print $_, "\n";
>>>>>>>> }
>>>>>>>>
>>>>>>>> __DATA__
>>>>>>>> key="value"
>>>>>>>> key=value
>>>>>>>> key="value1;value2"
>>>>>>>> key="value1;value2" ; comment
>>>>>>>> key='value1;value2' ; comment
>>>>>>>> "key"="value1"
>>>>>>>> "key"="value1" ; comment
>>>>>>>> "key"="value1;value2"
>>>>>>>> "key"="value1;value2" ; comment
>>>>>>>> "key"="val\"ue1;value2"
>>>>>>>> "key"="val\"ue1;value2" ; comment
>>>>>>>> "key"='val\'ue1;value2' ; comment
>>>>>>>> "key"='val\"ue1;value2' ; comment
>>>>>>>>
>>>>>>>> On Wed, Mar 9, 2011 at 1:39 PM, Fulko Hew <fulko.hew at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Mar 9, 2011 at 1:14 PM, Rob Janes <janes.rob at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> here's one that dequotes the key and value ...
>>>>>>>>>
>>>>>>>>> Thanks for the ideas, but..
>>>>>>>>> the issue isn't with extracting the keys and the values (and
>>>>>>>>> dequoting them), the task was only to strip trailing comments
>>>>>>>>> (while obeying quoted strings)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> toronto-pm mailing list
>>>>>>>>> toronto-pm at pm.org
>>>>>>>>> http://mail.pm.org/mailman/listinfo/toronto-pm