[Melbourne-pm] \N in regular expressions and unicode

Jacinta Richardson jarich at perltraining.com.au
Sun Jan 9 23:21:09 PST 2011

G'day folk,

In Perl 5.12.0 a new meta-character was added to Perl's regular expressions.
\N matches anything that isn't a newline. It was added so that when you use
the /s switch (so that . also matches newlines), you still have something other
than [^\n] to give you the default . behaviour.
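
As a minimal sketch of the difference (the sample string and variable names
are made up for illustration), compare . and \N under /s:

```perl
use strict;
use warnings;
use v5.12;

# Hypothetical sample data: two lines separated by a newline.
my $text = "ab\ncd";

# Under /s, . also matches the newline, so .+ runs across both lines.
my ($dot_s) = $text =~ /(.+)/s;

# \N never matches a newline, even under /s, so it stops at the line end.
my ($big_n) = $text =~ /(\N+)/s;

# [^\n] is the older spelling of the same idea.
my ($negated) = $text =~ /([^\n]+)/s;
```

Here $dot_s holds both lines, while $big_n and $negated hold only "ab".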

This follows the existing mnemonics:

	\s - any whitespace
	\S - any non-whitespace
	\w - any word character
	\W - any non-word character

and thus:

	\n - a newline
	\N - not a newline.

However \N{some Unicode name} *also* allows you to specify a Unicode character
by name.  This leaves us with a problem.

The following two snippets are equivalent:

	my ($five_letters) = /(\w{5})/;

	my $five = 5;
	my ($five_letters) = /(\w{$five})/;

Although this is contrived, I can imagine situations where you might not know in
advance how many characters you wish to match.

Likewise, the following two snippets are equivalent:

	my ($five_any) = /(.{5})/s;

	my $five = 5;
	my ($five_any) = /(.{$five})/s;

as you would expect.

We can match non-newlines in 5.12.2 with:

	use v5.12.2;

	my ($five_non_newlines) = /(\N{5})/;

We can match Unicode characters by name with:

	use utf8;
	use charnames ':full';

	my ($symbol) = /(\N{AC CURRENT})/;

What would you expect for the following though?

	use strict;
	use warnings;
	use utf8;
	use charnames ':full';
	use v5.12.2;

	my $var1 = 5;
	my $var2 = "AC CURRENT";

	say $1 if /(\N{$var1})/;

	say $1 if /(\N{$var2})/;

All the best,


PS: I know what we do get:

Unknown charname '$var1' at utf8.pl line 10
Deprecated character(s) in \N{...} starting at '$var1' at utf8.pl line 10
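
One possible way to sidestep the ambiguity, sketched here rather than offered
as the blessed answer, is to make each intent explicit: group \N so the braces
can only be a quantifier, and resolve the character name at run time with
charnames::vianame.  The sample strings and the names $count, $name and $char
are made up for illustration.

```perl
use strict;
use warnings;
use utf8;
use charnames ':full';
use v5.12;

my $count = 5;
my $name  = "AC CURRENT";

# Count case: wrapping \N in a group means {$count} is applied to the
# group, so it cannot be read as a \N{...} character name.
my ($five_non_newlines) = "abcdefg\nhij" =~ /((?:\N){$count})/;

# Name case: charnames::vianame returns the code point for a name, or
# undef if the name is unknown.  Build the character, then interpolate it.
my $code_point = charnames::vianame($name);
die "Unknown charname '$name'" unless defined $code_point;
my $char = chr $code_point;

# \Q...\E guards against names that resolve to a regex meta-character.
my ($symbol) = "mains: $char" =~ /(\Q$char\E)/;
```

With these inputs $five_non_newlines is "abcde" and $symbol is the same
character you would get from "\N{AC CURRENT}".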

   ("`-''-/").___..--''"`-._          |  Jacinta Richardson         |
    `6_ 6  )   `-.  (     ).`-.__.`)  |  Perl Training Australia    |
    (_Y_.)'  ._   )  `._ `. ``-..-'   |      +61 3 9354 6001        |
  _..`--'_..-_/  /--'_.' ,'           | contact at perltraining.com.au |
 (il),-''  (li),'  ((!.-'             |   www.perltraining.com.au   |
