[tpm] Dumb regex question
Liam R E Quin
liam at holoweb.net
Tue Aug 21 09:45:02 PDT 2007
On Tue, 2007-08-21 at 12:15 -0400, Madison Kelly wrote:
> For the life of me, I can't seem to get a simple regex working...
>
> All I want is to be able to match a word-character string that may have
> a hyphen in it.
>
> So:
>
> mizu-bu # should match
> alteeve # should match
> m!zu-bu # should not match
> a|teeve # should not match
Two things to note here:
(1) a hyphen is special in a character class, e.g. [a-z]
(2) you need to anchor the match, since a!b could be two matching
words separated by a "!"
Perl defines \w for a word character, so we can match that or
a hyphen with (\w|-)
and then,
^(\w|-)+$
will do what you want, I think.
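Here is a small script to check that pattern against the four sample
strings. It is only a sketch: the matches_word helper is a name made up
for illustration. The last line shows why the anchors matter, since
without them the "m" in m!zu-bu is enough for a match.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the whole string is word characters or hyphens.
sub matches_word {
    my ($s) = @_;
    return $s =~ /^(\w|-)+$/ ? 1 : 0;
}

for my $s ('mizu-bu', 'alteeve', 'm!zu-bu', 'a|teeve') {
    printf "%-8s %s\n", $s, matches_word($s) ? 'match' : 'no match';
}

# Without the anchors, any string that merely contains a word
# character or a hyphen will match.
print "unanchored m!zu-bu: ",
    ('m!zu-bu' =~ /(\w|-)+/ ? 'match' : 'no match'), "\n";
```

Note that $ also allows one trailing newline; use \z instead if that
matters for your input.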
You can also use a character class as long as the hyphen
is at the start or the end:
^[\w-]+$
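To see why the position of the hyphen matters, here is a short sketch
(the test strings are just illustrative). Between two characters the
hyphen makes a range; first or last in the class it is a literal hyphen.

```perl
use strict;
use warnings;

# Hyphen between two characters: a range, so "b" is in the class
# but "-" itself is not.
print "b vs [a-z]: ", ('b' =~ /^[a-z]$/ ? 'match' : 'no match'), "\n";
print "- vs [a-z]: ", ('-' =~ /^[a-z]$/ ? 'match' : 'no match'), "\n";

# Hyphen last in the class: a literal, so the class is just
# "a", "z" and "-".
print "- vs [az-]: ", ('-' =~ /^[az-]$/ ? 'match' : 'no match'), "\n";
print "b vs [az-]: ", ('b' =~ /^[az-]$/ ? 'match' : 'no match'), "\n";

# Either placement accepts the sample word.
print "mizu-bu vs [\\w-]: ", ('mizu-bu' =~ /^[\w-]+$/ ? 'match' : 'no match'), "\n";
print "mizu-bu vs [-\\w]: ", ('mizu-bu' =~ /^[-\w]+$/ ? 'match' : 'no match'), "\n";
```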
If Perl's definition of a word character (alphanumeric
plus _) isn't what you want, you can use
^[a-zA-Z0-9-]+$
for example, or you can use the bizarre Posix syntax:
^[[:alnum:]-]+$
You'll want "use locale" for that to be sensible.
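The practical difference between these classes shows up with an
underscore. The sketch below compares the three patterns on two made-up
strings; \w accepts the underscore, while the explicit ASCII class and
[[:alnum:]] do not.

```perl
use strict;
use warnings;

# Label/pattern pairs, kept in an array so the output order is fixed.
my @patterns = (
    [ 'word class'     => qr/^[\w-]+$/ ],
    [ 'explicit ASCII' => qr/^[a-zA-Z0-9-]+$/ ],
    [ 'POSIX alnum'    => qr/^[[:alnum:]-]+$/ ],
);

for my $s ('mizu-bu', 'mizu_bu') {
    for my $p (@patterns) {
        my ($name, $re) = @$p;
        printf "%-8s %-15s %s\n", $s, $name, ($s =~ $re ? 'match' : 'no match');
    }
}
```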
If you "use utf8", you can also use the Unicode properties:
^[\p{Letter}\p{Number}-]+$
and this will allow, for example, Hindi words too.
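One caveat worth checking for yourself: \p{Letter} does not cover
combining marks, and many Hindi words are written with them (vowel
signs, the virama). The sketch below is my own illustration, not
something from the original question. It builds two Hindi words from
\x{...} escapes so that it does not depend on the file's encoding, and
compares the pattern above with a variant that adds \p{Mark}.

```perl
use strict;
use warnings;

# "kamal": three letters, no combining marks.
my $kamal = "\x{915}\x{92E}\x{932}";
# "namaste": includes a virama (U+094D) and a vowel sign (U+0947),
# both of which are marks rather than letters.
my $namaste = "\x{928}\x{92E}\x{938}\x{94D}\x{924}\x{947}";

my $letters_only = qr/^[\p{Letter}\p{Number}-]+$/;
my $with_marks   = qr/^[\p{Letter}\p{Mark}\p{Number}-]+$/;

print "kamal,   letters only: ", ($kamal   =~ $letters_only ? 'match' : 'no match'), "\n";
print "namaste, letters only: ", ($namaste =~ $letters_only ? 'match' : 'no match'), "\n";
print "namaste, with marks:   ", ($namaste =~ $with_marks   ? 'match' : 'no match'), "\n";
```

If your words come from a file or the network, they need to be decoded
into character strings first for the properties to apply.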
Hope this helps.
Liam (now living in Prince Edward County)
--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org