[Melbourne-pm] An intermittent problem with open for append

Scott Penrose scottp at dd.com.au
Wed May 28 05:41:21 PDT 2008


On 28/05/2008, at 10:30 PM, Paul Fenwick wrote:

> G'day Scott/Tim/MPM,
>
> Scott Penrose wrote:
>
>> UNIX - Yes no problem BUT you must be under the internal buffer on
>> the system, and it is line bound. So multi line insert will not be
>> in order, but single lines will.
>
> I do agree that O_APPEND on a local unix filesystem is atomic  
> provided you're within the relevant limit for block IO.  I beg to  
> disagree that it has anything to do with *lines*.  As far as your OS  
> and filesystem is concerned, a file is just a bunch of bytes.  If  
> you write a 40MB "line" to that file, you can be pretty sure it  
> won't be an atomic write.  If you write ten "lines" of six  
> characters each, you can be pretty certain it *will* be atomic.

Quite right. It is the block that matters; what I meant is that if you
write multiple lines you may exceed that block size.

So this usually works:
	print OUT "Some Error line\n";
while this often does not:
	print OUT join("\n", @all_my_errors);

Sorry about that.
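
To make the block-size point concrete, here is a minimal sketch,
assuming a local Unix filesystem and records that are plain byte
strings well under the block limit. The helper name append_record is
made up for illustration. It builds the whole record first and hands
it to the kernel in one syswrite on a handle opened with O_APPEND, so
Perl's own buffering never gets a chance to split it.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# Hypothetical helper: append one complete record with a single write.
# Assumes $record is a byte string (no wide characters).
sub append_record {
    my ($path, $record) = @_;

    # O_APPEND asks the OS to position every write at the end of the file.
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Cannot open $path: $!";

    # syswrite bypasses Perl's buffering and issues one write call.
    my $bytes = syswrite($fh, $record);
    defined $bytes or die "Write to $path failed: $!";
    $bytes == length($record) or die "Short write to $path";

    close($fh) or die "Cannot close $path: $!";
    return $bytes;
}
```

Calling append_record('myfile.log', join("\n", @all_my_errors) . "\n")
keeps the multi-line case to one write, though a record bigger than
the block limit can still interleave with other writers.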

> 2) If we're writing a lot of records, and we're leaving the flushing  
> up to stdio, then stdio is free to flush data that intersects a  
> record boundary. In this case we can end up with our record being  
> mangled.  You can see this in action by taking the above script, and  
> repeatedly pasting a bunch of data into it while doing a 'tail -f'  
> on myfile.log.  When your data *does* get written to the file,  
> you'll notice that the end of the data written doesn't correspond to  
> the end of the data that's been pasted (unless you're pasting in  
> blocks which are an exact multiple of your buffer-size).  The last  
> part of the data will be written when perl closes its filehandles  
> (after we've hit CTRL-D to indicate end-of-input).

Sorry, no: the record will still be mangled. Flushing does not fix
that. If you are writing something larger than the buffer size, the
only answer is locking; nothing else works.
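
A rough sketch of the locking approach, using Perl's built-in flock.
The helper name append_locked is made up for illustration, and it only
helps if every writer to the log takes the same lock, since flock is
advisory. The seek after the lock is harmless where the OS honours
O_APPEND and matters where append is emulated with seek-then-write.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: append several lines as one record under an
# exclusive advisory lock, so concurrent writers cannot interleave.
sub append_locked {
    my ($path, @lines) = @_;
    my $record = join("\n", @lines) . "\n";

    open(my $fh, '>>', $path) or die "Cannot open $path: $!";

    # Block until no other cooperating writer holds the lock.
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    # Another writer may have extended the file while we waited.
    seek($fh, 0, SEEK_END) or die "Cannot seek in $path: $!";

    print {$fh} $record or die "Cannot write to $path: $!";

    # close flushes the buffered data and then releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return length $record;
}
```

With this, append_locked('myfile.log', @all_my_errors) should keep the
whole batch together regardless of its size, at the cost of writers
waiting on each other.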

Your answer above works only if there is a single script writing to
the log, in which case you are just fixing the internal flushing of
the data.
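
For that single-writer case, a minimal sketch of taking the flushing
decision away from stdio is to turn on autoflush for the log handle.
The helper name open_log_autoflush is made up for illustration. Each
print is then flushed as soon as it finishes, so small records reach
the file whole. It does nothing for multiple writers, and a record
larger than the buffer may still go out in several pieces.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: open a log for append with autoflush enabled,
# so buffered output is flushed after every print.
sub open_log_autoflush {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    $fh->autoflush(1);
    return $fh;
}
```

Usage would look like: my $log = open_log_autoflush('myfile.log');
print {$log} "Some Error line\n"; and a 'tail -f' on the file should
then show each record as it is printed.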

> Windows append isn't atomic, it's emulated by perl.  It seeks, and  
> then writes, meaning you can quite happily end up with race  
> conditions and corrupted data if you don't take steps to avoid it  
> (such as locking).

Typical, I expected that :) Then again it uses threads to emulate  
forks, so maybe not as big a problem :-)

Scott

