[Chicago-talk] parsing HTML

Jay Strauss me at heyjay.com
Fri Feb 23 14:11:58 PST 2007


> I suppose />(\s*[^<\s][^<]*)</ if you want to extract something with a
> non-whitespace character.

I don't understand that regex.

Match a ">", then capture 0 or more whitespace characters, followed by
one character that is neither "<" nor whitespace, followed by 0 or more
characters that are not "<"; end the capture, followed by "<".
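
The reading above can be checked piece by piece. This is a minimal sketch using Python's re module in verbose mode, which is an assumption on my part since the thread only shows the bare pattern. The sample string is hypothetical.

```python
import re

# The quoted pattern, split into its parts with re.VERBOSE so each
# piece can carry a comment. Whitespace in the pattern is ignored.
ANNOTATED = re.compile(r"""
    >          # a literal ">"
    (          # start of capture group 1
      \s*      # zero or more whitespace characters
      [^<\s]   # exactly one character that is neither "<" nor whitespace
      [^<]*    # zero or more characters that are not "<"
    )          # end of capture group 1
    <          # a literal "<"
""", re.VERBOSE)

sample = "<td>  hello world </td>"  # hypothetical input, not from the thread
match = ANNOTATED.search(sample)
captured = match.group(1) if match else None
print(repr(captured))
```

The middle piece, [^<\s], is what forces at least one non-whitespace character into the capture.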

How come />(\s*[^<]*)</ doesn't work?
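
One way to see the difference is to run both patterns over the same text. The sketch below again assumes Python's re module, and the HTML fragment and the names STRICT and LOOSE are made up for illustration.

```python
import re

# The pattern from the quoted reply: requires one non-whitespace character.
STRICT = re.compile(r">(\s*[^<\s][^<]*)<")
# The simpler pattern from the question: every piece is optional.
LOOSE = re.compile(r">(\s*[^<]*)<")

# Hypothetical fragment with an empty cell and indentation between tags.
html = "<tr>\n  <td>  </td><td>42</td>\n</tr>"

strict_hits = STRICT.findall(html)
loose_hits = LOOSE.findall(html)

print(strict_hits)
print(loose_hits)
```

On this input the simpler pattern does match, but it also captures the whitespace-only and empty gaps between adjacent tags, so the real text is mixed in with blank results. The stricter pattern skips those gaps and returns only the cell that contains text.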

Thanks
Jay

