[Melbourne-pm] Untainting locale-based data?

Jacinta Richardson jarich at perltraining.com.au
Mon Mar 19 03:55:49 PDT 2007


Alexis Hazell wrote:
> Hi all,
> 
> i'm working on a Web app in which the supplied data (to wit, people's names) 
> might include Latin 1 supplement characters. So i'm wanting to use taint 
> checking, given the untrusted data source; but making use of the ISO-8859-1 
> locale, to enable \w+ to match Latin 1 supplement characters, will mean that 
> the data remains tainted even after being filtered through a regexp, due to 
> Perl's view of the untrustworthiness of locales. How, then, can i untaint 
> such data?

Taint's restriction is that data from outside your program cannot be used to 
affect other things outside your program (at least not unintentionally). 
Unfortunately, locales also come from outside your program, and it's considered 
possible for a user to edit them before your program runs.  If a user then 
redefines "word characters" to include certain punctuation characters, they 
might use your extra-privileged program to do naughty things.  For example, the 
following code:

	# only accept filenames containing word characters and .s
	($filename) = ($filename =~ m/^([\w.]+)$/);

	if($filename) {
		unlink("/tmp/$filename") or die $!;
	}

is probably safe under taint with no locale in effect, even if run under 
suidperl.  But if you use a locale that the user can edit, and they widen \w so 
that it allows pretty much anything:

	\w = [A-Za-z./;!@#$%^&*()+=-]

then suddenly that previous code isn't so safe.
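
To make that concrete, here is a rough sketch that imitates the effect without 
touching any real locale.  The two character classes and the accepts() helper 
are made up for illustration: $normal_w stands in for what \w matches in plain 
ASCII, and $tampered_w for what it might match under an edited locale.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for \w.  Nothing here reads a real locale.
my $normal_w = qr/[A-Za-z0-9_]/;

# quotemeta backslash-escapes every punctuation character so that each one
# is taken literally inside the bracketed class.
my $extra      = quotemeta('./;!@#$%^&*()+=-');
my $tampered_w = qr/[A-Za-z0-9_$extra]/;

# Same shape as the m/^([\w.]+)$/ filter, with the word class passed in.
sub accepts {
    my ($word_class, $filename) = @_;
    return $filename =~ m/^(?:$word_class|\.)+$/ ? 1 : 0;
}

for my $candidate ('notes.txt', '../etc/passwd') {
    printf "%-15s normal=%d tampered=%d\n",
        $candidate,
        accepts($normal_w,   $candidate),
        accepts($tampered_w, $candidate);
}
```

With the ordinary class the second name is rejected because of the slashes; 
with the widened class it passes the filter and would reach unlink().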

The only way I know of solving this is to avoid using \w, \W, \s and \S in your 
code.  This probably means that you'll have to spell out your character classes 
with the supplement characters.
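
As a minimal sketch of spelling the class out, assuming the input arrives as 
ISO-8859-1 bytes and that a name is made only of letters (real names will 
probably also need spaces, hyphens and apostrophes), something like the 
following might do.  The names $latin1_letter and untaint_name() are my own, 
and the code should sit outside the scope of any "use locale".

```perl
use strict;
use warnings;

# Letters in ISO-8859-1: ASCII A-Z and a-z plus the Latin-1 supplement
# letters.  0xD7 (multiplication sign) and 0xF7 (division sign) fall inside
# that block but are not letters, so the ranges skip them.
my $latin1_letter = qr/[A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/;

# Hypothetical helper: returns the name if it consists only of the letters
# above, or undef otherwise.  The pattern contains no locale-dependent
# construct, so under taint mode the captured value comes back untainted.
sub untaint_name {
    my ($input) = @_;
    return undef unless defined $input;
    my ($name) = $input =~ m/^((?:$latin1_letter)+)\z/;
    return $name;
}

for my $input ("Ren\xE9e", "rm;ls") {
    my $name = untaint_name($input);
    print defined $name ? "accepted\n" : "rejected\n";
}
```

Because the class is fixed in the source, nothing the user does to their 
locale can change what it matches.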

You can also _hope_ that your locale is safe, and do the following:

	if(my ($name) = ($tainted =~ m/^(\w+)/)) {
		# $name is still tainted because of the locale, so force the
		# untaint with a pattern that has no locale-dependent constructs
		($name) = ($name =~ m/^(.*)/);

		# $name is no longer tainted
	}

which might be an acceptable solution.  I don't believe that there are any other 
alternatives.

All the best,

	Jacinta

