APM: File reading optimization

Tue Feb 17 15:16:42 CST 2004

On Tue, Feb 17, 2004 at 10:10:55AM -0800, Chris Vaughan wrote:
> Brian,
> 
> If you have the flexibility, you may want to consider changing
> your protocol away from separators and towards packets.  If you
> don't have the flexibility, then don't read on.
> 
> Have the sender send the packed data length (in bytes), then the
> data itself, in a loop.  The reader would simply block waiting
> for the first 4 bytes, construct a count by unpacking the
> integer, and block reading that many bytes from the handle, forming
> your message.  After the reader reads the message, it blocks
> again waiting for the next count.
> 
> The downside to this solution is that the reader has two logical
> reading states.  If the reader gets out of sync for any reason,
> you're screwed.

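This is a pretty good solution.  The reader and writer should never get
out of sync, but I've just been through a nightmare with this (one I
created myself).  Just remember that send() and recv() are NOT guaranteed
to transfer the number of bytes you asked for!  A call can return after
handling only part of the request, so check the return value and loop
until the whole message has gone through.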
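
Here is a minimal sketch of the length-prefixed scheme with the short-count
loops in place.  It is only an illustration under some assumptions: the
sub names (read_exactly, write_all, send_message, recv_message) are made
up for this example, the 4-byte count is packed in network byte order
('N'), which the thread does not specify, and it uses sysread/syswrite
rather than send/recv so the same code works on pipes and device handles
as well as sockets.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Read exactly $want bytes, looping because sysread may return fewer.
# Returns the bytes, or undef if EOF arrives first.
sub read_exactly {
    my ($fh, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        # Append at the current end of $buf; ask only for what is missing.
        my $got = sysread($fh, $buf, $want - length($buf), length($buf));
        if (!defined $got) {
            next if $!{EINTR};          # interrupted by a signal: retry
            die "sysread failed: $!";
        }
        return undef if $got == 0;      # EOF before the full count arrived
    }
    return $buf;
}

# Write all of $data, looping because syswrite may write only part of it.
sub write_all {
    my ($fh, $data) = @_;
    my $off = 0;
    while ($off < length $data) {
        my $n = syswrite($fh, $data, length($data) - $off, $off);
        if (!defined $n) {
            next if $!{EINTR};
            die "syswrite failed: $!";
        }
        $off += $n;
    }
    return 1;
}

# Sender side: 4-byte big-endian length, then the payload (assumed bytes).
sub send_message {
    my ($fh, $payload) = @_;
    return write_all($fh, pack('N', length $payload) . $payload);
}

# Reader side: block for the count, then block for that many bytes.
# Returns undef on EOF (clean at a message boundary, or truncated).
sub recv_message {
    my ($fh) = @_;
    my $header = read_exactly($fh, 4);
    return undef unless defined $header;
    return read_exactly($fh, unpack('N', $header));
}
```

The reader then becomes "while (defined(my $msg = recv_message($fh))) { ... }".
If you still need select(), the same idea applies, but you would keep a
per-handle buffer and append whatever each sysread returns instead of
blocking inside read_exactly.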

> 
> Regards,
> Chris
> 
> --- Brian Michalk <michalk at awpi.com> wrote:
> > I am in a quandary about how to do efficient filehandle reading.
> > I'm trying to make it uniform across all of the filehandles that may be
> > named pipes, device driver handles, network sockets, or stdio.
> > 
> > I have some slow devices on a serial line, and other fast devices that
> > continually generate data at high rates.  My protocol is all line oriented,
> > and that naturally leads me to use something like <>, but read the
> > following:
> > perldoc -f select
> >             WARNING: One should not attempt to mix buffered I/O (like "read"
> >             or <FH>) with "select", except as permitted by POSIX, and even
> >             then only on POSIX systems. You have to use "sysread" instead.
> > 
> > However sysread doesn't care about line separators.  Instead, I have to
> > search through the incoming data for separators and store partial reads in
> > my own buffer.  This is not a problem, I have code, and it works.  The
> > performance is bad.  C code would have the same type of problem.
> > 
> > The serial port dribbles in GPS data at 9600 baud, causing the select() to
> > return without a complete line being available, so I store the one or
> > two characters at a time in the internal buffer.  The radar data, however,
> > can come in at 100 Hz, at about 12K of data per line, and I've got
> > buffering turned on for performance, so I have to go searching through the
> > data to find the line separators.
> > 
> > Are there any better solutions?
> > 
> > _______________________________________________
> > Austin mailing list
> > Austin at mail.pm.org
> > http://mail.pm.org/mailman/listinfo/austin
> 
> 
> =====
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     Chris Vaughan    | "I love deadlines.  I like the
>                      |  swooshing sound as they fly by."
>  vaughan99 at yahoo.com |   - Douglas Adams
> 

-- 

Wayne Walker
wwalker at bybent.com                 Do you use Linux?!
http://www.bybent.com              Get Counted!  http://counter.li.org/
Perl - http://www.perl.org/        Perl User Groups - http://www.pm.org/
Jabber IM:  wwalker at jabber.phototropia.org       AIM:     lwwalkerbybent