[Philadelphia-pm] selective splitting?

Morgan Jones morgan at morganjones.org
Thu Nov 17 17:40:37 PST 2016


mjd’s talk on Monday has me thinking about peer review and how helpful it can be, so here goes.  I can certainly work around this, but as a learning experience I’m wondering whether someone has a straightforward answer: can I split on only those instances of a character that are not enclosed in, in this case, parentheses?

I have a semicolon-separated string that contains a date, a string, an IP address and a user agent string.  The catch is that the user agent string itself contains a semicolon, although it sits between parentheses.  So what I want is to split on semicolons that are not inside parentheses.

For example:
$v = '20161116172606Z;accepted-terms-of-use via CAS;192.168.1.5;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14';

It seems to me I should be able to split like this:
my ($date, $ignore, $ip, $agent) = split /[^\(]+[^\;]*\;[^\)]*[^\)]+/, $v;

From a little reading, I may need to use lookaround assertions (lookahead and lookbehind), which are new to me.  Here’s an attempt at that, which is of course not working:
my ($date, $ignore, $ip, $agent) = 
	    	split /(?<!()
                       \;
                       (?!))/x, $v;
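
For comparison, here is a minimal sketch of the lookahead idea, not a definitive answer. It assumes the parentheses in the string are balanced and never nested, which holds for the example above but may not hold for every user agent. A semicolon is treated as a separator only when the next parenthesis character after it is not a closing one:

```perl
use strict;
use warnings;

my $v = '20161116172606Z;accepted-terms-of-use via CAS;192.168.1.5;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14';

# Split on ';' unless a ')' can be reached from it without crossing a '('.
# [^()]* skips any run of non-parenthesis characters, so the negative
# lookahead rejects exactly the semicolons that sit inside an open group.
# Assumption: parentheses are balanced and not nested.
my ($date, $ignore, $ip, $agent) = split /;(?![^()]*\))/, $v;

print "$_\n" for $date, $ignore, $ip, $agent;
```

A lookbehind such as (?<!\() only inspects the single character next to the semicolon, so on its own it cannot tell whether an opening parenthesis appeared somewhere earlier; scanning forward for the closing one avoids that limitation.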


Does anyone have a suggestion or see what I’m missing?
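
The workaround mentioned at the top might look like the sketch below. It relies on the assumption that the user agent is always the last of exactly four fields, so the LIMIT argument to split can keep everything after the third semicolon together:

```perl
use strict;
use warnings;

my $v = '20161116172606Z;accepted-terms-of-use via CAS;192.168.1.5;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14';

# The third argument to split is LIMIT: at most 4 fields are returned,
# so any semicolons after the third one remain inside $agent.
# Assumption: the user agent is always the fourth and final field.
my ($date, $ignore, $ip, $agent) = split /;/, $v, 4;

print "$agent\n";
```

This sidesteps the parentheses entirely, which is why it does not answer the regex question itself.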

thanks,

-morgan

