[Melbourne-pm] An intermittent problem with open for append

Paul Fenwick pjf at perltraining.com.au
Wed May 28 05:30:03 PDT 2008


G'day Scott/Tim/MPM,

Scott Penrose wrote:

> UNIX - Yes no problem BUT you must be under the internal buffer on the  
> system, and it is line bound. So multi line insert will not be in  
> order, but single lines will.

I do agree that O_APPEND on a local unix filesystem is atomic provided 
you're within the relevant limit for block IO.  I beg to disagree that it 
has anything to do with *lines*.  As far as your OS and filesystem are 
concerned, a file is just a bunch of bytes.  If you write a 40MB "line" to 
that file, you can be pretty sure it won't be an atomic write.  If you write 
ten "lines" of six characters each, you can be pretty certain it *will* be 
atomic.

The preferred size for block IO on your filesystem can be found in element 11 
(the blksize field) of the list returned by Perl's stat() function.  On most 
systems that corresponds to the size of a block on the filesystem, and is 
typically about 4k on ext2/ext3.  AFAIK, it should also correspond to the 
smallest atomic write on your system.
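
For example, a minimal sketch (the log path is just for illustration):

	use strict;
	use warnings;

	open (my $fh, '>>', '/tmp/myfile.log') or die "open: $!";

	# Element 11 of stat() is the preferred block size for IO (blksize).
	my $blksize = (stat $fh)[11];
	print "Preferred IO block size: $blksize bytes\n";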

> Using the Sync and Buffer changes Paul suggested won't improve the
> situation or make it any safer.

My suggestion of forcing writes after we've written a logical record was to 
catch three possible problems:

1) If the data was completely missing from the file, it could be because the 
process is being zapped by a signal.  This could be the case if the 
web-server zaps processes when the connection goes away, as Toby suggested 
earlier in this thread.  Perl doesn't usually flush its buffers when dying 
from a signal, and so we can lose the write.  You can observe this with a 
simple program like:

	use strict;
	use warnings;
	use Fatal qw(open);	# make open() die on failure
	open (my $fh, '>>', '/tmp/myfile.log');
	# Copy STDIN to the log; output is buffered by default.
	while (<STDIN>) {
		print {$fh} $_;
	}

Type a few lines, and then hit CTRL-C.  You'll discover that myfile.log ends 
up empty.  Tim indicated that he was *missing* data, and being zapped by a 
signal is a possible culprit[1].  That's less likely now that Tim has 
indicated he's unbuffering the whole filehandle (provided this is done 
before it's written to).
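
(For concreteness, unbuffering a filehandle looks something like this; a 
sketch only, with a made-up log path:)

	use IO::Handle;

	open (my $fh, '>>', '/tmp/myfile.log') or die "open: $!";
	$fh->autoflush(1);	# must be set before anything is written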

2) If we're writing a lot of records, and we're leaving the flushing up to 
stdio, then stdio is free to flush at a point that falls in the middle of a 
record.  In this case we can end up with our record being mangled.  You can see this 
in action by taking the above script, and repeatedly pasting a bunch of data 
into it while doing a 'tail -f' on myfile.log.  When your data *does* get 
written to the file, you'll notice that the end of the data written doesn't 
correspond to the end of the data that's been pasted (unless you're pasting 
in blocks which are an exact multiple of your buffer-size).  The last part 
of the data will be written when perl closes its filehandles (after we've 
hit CTRL-D to indicate end-of-input).

This can particularly be a problem with long-running processes that are 
writing to a shared logfile.
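
A rough sketch of the kind of long-running writer I mean (the filename and 
record size are made up); with default buffering, whichever record happens to 
straddle a buffer boundary is written to the file in two pieces:

	use strict;
	use warnings;

	open (my $fh, '>>', '/tmp/shared.log') or die "open: $!";

	my $record = ("x" x 79) . "\n";		# 80-byte logical records

	# Flushes happen whenever the buffer fills, not at record boundaries,
	# so another process can append in the middle of one of our records.
	print {$fh} $record for (1 .. 1_000);

	close $fh;	# any remaining partial buffer is written here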

3) If we completely unbuffer the filehandle, and then use multiple print()s 
to write our data, then the data from other processes can become 
intermingled with ours, since we'll be flushing after every print().  If 
we're manually calling ->flush() then we can ensure all our data is kept 
together, provided it fits within a single IO block.
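
Putting that together, something like the following sketch (with a made-up 
record format) builds the complete logical record first, writes it with a 
single print(), and then flushes it explicitly, so it reaches the file in 
one piece provided it fits within a single IO block:

	use strict;
	use warnings;
	use IO::Handle;

	open (my $fh, '>>', '/tmp/myfile.log') or die "open: $!";

	# Build the whole logical record, then write and flush it in one go.
	my $record = join("\t", time(), $$, "something happened") . "\n";
	print {$fh} $record;
	$fh->flush or die "flush: $!";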

> Windows - Anyone know?

Windows append isn't atomic; it's emulated by perl, which seeks to the end 
and then writes.  That means you can quite happily end up with race 
conditions and corrupted data if you don't take steps to avoid it (such as 
locking).
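
If anyone needs a portable append, the usual belt-and-braces approach is to 
take an exclusive flock(), seek to the end, write, flush, and release the 
lock.  A sketch (it assumes every writer cooperates by taking the same lock):

	use strict;
	use warnings;
	use Fcntl qw(:flock SEEK_END);
	use IO::Handle;

	open (my $fh, '>>', '/tmp/myfile.log') or die "open: $!";

	flock($fh, LOCK_EX)    or die "flock: $!";
	seek($fh, 0, SEEK_END) or die "seek: $!";   # someone may have appended
	print {$fh} "my whole record\n";
	$fh->flush             or die "flush: $!";
	flock($fh, LOCK_UN)    or die "unlock: $!";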

Cheerio,

	Paul

-- 
Paul Fenwick <pjf at perltraining.com.au> | http://perltraining.com.au/
Director of Training                   | Ph:  +61 3 9354 6001
Perl Training Australia                | Fax: +61 3 9354 2681

