[tpm] Regex assistance

Olaf Alders olaf.alders at gmail.com
Thu Aug 11 08:00:44 PDT 2016


> On Aug 11, 2016, at 10:37 AM, Chris Jones <cj at enersave.ca> wrote:
> 
> 
> Hello Perl Mongers,
> 
> I am looking for assistance with a regex. I have a bunch of strings in the form:
> 
> "01.03.16,,Studio one, Space 22,1         500,500,01.051,,"
> or
> ",01.03.16,,Studio one, Space 22,1         500,500,01.051,"
> or
> ",01.03.16,,Studio one, Space 22,1         500,500,01.051,,"
> or
> ",01.03.16,,Studio one, Space 22, ,01.051,,"
> 
> So the middle section can be one or more comma-separated strings.
> 
> I am trying to match and return the first non-blank pattern and the last non-blank pattern
> 01.03.16 and 01.051 – these numbering formats are always the same: xx.xx.xx and yy.yyy
> 
> So far I have a regex that matches the first pattern:
> 
> "([0-9]{2})([\.])([0-9]{2})([\.])([0-9]{2})"
> 
> in any of the examples above.
> 
> I am stuck after that.
> Any insights appreciated!

I know you're looking for a regex, but you can do this with a split as well, which may be easier to read.

use List::AllUtils qw( first );

my $foo = "01.03.16,,Studio one, Space 22,1         500,500,01.051,,";
my @foo = split m{,}, $foo;

# /\S/ skips empty and whitespace-only fields (a plain truth test would also skip "0")
my $first = first { /\S/ } @foo;
my $last  = first { /\S/ } reverse @foo;
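
If you do want a regex, here is a minimal sketch. It assumes the surrounding double quotes are not part of the data, that the first non-blank field is always xx.xx.xx and the last is always yy.yyy, and that only commas or whitespace can appear before the first and after the last. The names @lines, $re and @results are just for illustration.

```perl
use strict;
use warnings;

# The four sample lines from the question (quotes assumed not to be data).
my @lines = (
    '01.03.16,,Studio one, Space 22,1         500,500,01.051,,',
    ',01.03.16,,Studio one, Space 22,1         500,500,01.051,',
    ',01.03.16,,Studio one, Space 22,1         500,500,01.051,,',
    ',01.03.16,,Studio one, Space 22, ,01.051,,',
);

# Skip leading commas/whitespace, capture xx.xx.xx, let the greedy .*
# swallow the middle fields, then capture the trailing yy.yyy field.
my $re = qr{
    ^ [\s,]*
    ( \d{2} \. \d{2} \. \d{2} )   # first non-blank field
    , .* ,
    ( \d{2} \. \d{3} )            # last non-blank field
    [\s,]* $
}x;

my @results;
for my $line (@lines) {
    if ( my ( $first, $last ) = $line =~ $re ) {
        push @results, [ $first, $last ];
        print "$first $last\n";
    }
    else {
        warn "no match: $line\n";
    }
}
```

The pattern is anchored at both ends so the greedy .* can only settle on the last yy.yyy field. A line that does not fit the assumed shape fails to match and is reported, rather than returning a wrong pair.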

Having said that, it looks like you may be parsing a CSV file, in which case a CSV parser from CPAN would help catch any corner cases.
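
As a rough sketch of the CSV route, assuming Text::CSV from CPAN is installed and that each line is handled on its own, something like this should work. The sample line is the fourth one from the question, and the variable names are only illustrative.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new( { binary => 1 } )
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;

my $line = ',01.03.16,,Studio one, Space 22, ,01.051,,';

# parse() returns false if the line is not well-formed CSV.
$csv->parse($line) or die 'Parse failed: ' . $csv->error_diag;

# Keep only fields that contain at least one non-whitespace character.
my @fields = grep { /\S/ } $csv->fields;

my ( $first, $last ) = @fields[ 0, -1 ];
print "$first $last\n";
```

The advantage over a plain split is that quoted fields containing commas are handled for you, should any turn up in the real data.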

Olaf

