[tpm] Fast(er) way of deleting X lines from start of a file

Wed Sep 30 14:20:01 PDT 2009

On Wed, 2009-09-30 at 11:53 -0400, Uri Guttman wrote:
> >>>>> "MK" == Madison Kelly <linux at alteeve.com> writes:
> 
>   MK> Hi all,
>   MK>   Thus far, I've opened a file, read it in, shifted/ignored the first
>   MK> X number of lines and then wrote out the resulting file. This works,
>   MK> but strikes me as horribly inefficient. Doubly so when the file could
>   MK> be larger than RAM.

In this case I'd consider a short C program that used mmap(), memmove()
(or the older bcopy()) and ftruncate().  The point is to avoid per-line
processing, and to avoid having the whole file in memory (mmap() maps
the file on disk into memory locations without reading it all in up
front; pages are only fetched as they are touched).
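
Here is a rough sketch of what such a program's core might look like.
It is an illustration under assumptions, not tested production code:
the function name drop_leading_lines() is made up for this example, it
assumes a POSIX system and a file that fits in the process's address
space, and it does no locking at all.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: drop the first `nlines` lines of `path` in place.
 * Returns 0 on success, -1 on error. No locking is attempted. */
int drop_leading_lines(const char *path, long nlines)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return -1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return -1; }
    size_t size = (size_t)st.st_size;

    /* Nothing to do; also avoids mmap() of length 0, which fails. */
    if (size == 0 || nlines <= 0) { close(fd); return 0; }

    char *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); close(fd); return -1; }

    /* Find the byte offset just past the nlines-th newline. */
    size_t off = 0;
    while (nlines > 0 && off < size) {
        char *nl = memchr(base + off, '\n', size - off);
        if (nl == NULL) { off = size; break; }  /* fewer lines than asked */
        off = (size_t)(nl - base) + 1;
        nlines--;
    }
    assert(off <= size);

    /* Slide the tail to the front; memmove() copes with the overlap. */
    memmove(base, base + off, size - off);

    int rc = 0;
    if (msync(base, size, MS_SYNC) < 0) { perror("msync"); rc = -1; }
    if (munmap(base, size) < 0) { perror("munmap"); rc = -1; }

    /* Chop off the now-duplicated bytes at the end. */
    if (ftruncate(fd, (off_t)(size - off)) < 0) { perror("ftruncate"); rc = -1; }

    close(fd);
    return rc;
}
```

Calling drop_leading_lines("some.log", 1000) would then remove the
first thousand lines of a (hypothetical) some.log.  Note that every
byte after the cut still gets rewritten, so this saves memory and
per-line overhead but not disk I/O.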

In perl, you can use sysread and syswrite to avoid per-line processing.
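
The same block-at-a-time idea, sketched in C for comparison (perl's
sysread and syswrite sit on top of read(2) and write(2)).  This is
only an illustration: the names drop_lines_chunked() and
offset_after_lines() and the 64 KB buffer size are my own choices, and
again there is no locking.  Unlike the mmap() approach it only ever
holds one buffer of the file in memory.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: byte offset just past the first `nlines` lines,
 * found by scanning fixed-size blocks. Returns -1 on read error. */
static off_t offset_after_lines(int fd, long nlines)
{
    char buf[65536];
    off_t pos = 0;
    while (nlines > 0) {
        ssize_t got = pread(fd, buf, sizeof buf, pos);
        if (got < 0) return -1;
        if (got == 0) break;                /* fewer lines than requested */
        for (ssize_t i = 0; i < got; i++)
            if (buf[i] == '\n' && --nlines == 0)
                return pos + i + 1;
        pos += got;
    }
    return pos;
}

/* Hypothetical helper: drop the first `nlines` lines of `path` in place,
 * copying the remainder toward the start one block at a time. */
int drop_lines_chunked(const char *path, long nlines)
{
    char buf[65536];
    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return -1; }

    off_t src = offset_after_lines(fd, nlines);
    if (src < 0) { perror("pread"); close(fd); return -1; }
    off_t dst = 0;

    for (;;) {
        ssize_t got = pread(fd, buf, sizeof buf, src);
        if (got < 0) { perror("pread"); close(fd); return -1; }
        if (got == 0) break;                /* reached end of file */
        ssize_t done = 0;
        while (done < got) {                /* handle short writes */
            ssize_t put = pwrite(fd, buf + done, (size_t)(got - done),
                                 dst + done);
            if (put < 0) { perror("pwrite"); close(fd); return -1; }
            done += put;
        }
        src += got;
        dst += got;
    }
    assert(dst <= src);  /* writes never overtake the unread data */

    /* Cut the file down to the bytes we kept. */
    if (ftruncate(fd, dst) < 0) { perror("ftruncate"); close(fd); return -1; }
    close(fd);
    return 0;
}
```

Because each block is written to a position at or before where it was
read from, the copy never overwrites data it has not yet read.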

Watch out, though: if something is writing to the file while you do
this, you can end up with corruption. You can use file locking to
guard against that.

However, Liam's Rule of Optimization :-) is that the fastest way to do
something is not to do it at all.  For example, what if you had a
separate file for every X lines? Then you'd just delete the oldest file.

Or, buy more memory so it all fits :-)

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org