[Melbourne-pm] Designing modules to handle large data files

Tulloh, David david.tulloh at AirservicesAustralia.com
Thu Aug 19 00:35:22 PDT 2010

On 19/08/10 17:15, Toby Corkindale wrote:
> Suggestion 1:
> Perhaps you should import the data file into a database, then let the
> database do all the hard work for you? By all means put a layer over the
> DB interface so as to make it nice for people to use.
> You are running the risk of reinventing the wheel otherwise.
> Suggestion 2:
> If you want to stick with processing the file in situ, then you'll
> need to approach it with a streaming processor, rather than loading the
> whole thing into memory at once.
> Are you familiar with that concept?

Thanks for the ideas.

My hesitation with the first suggestion is that a database feels like
overkill for what are normally simple data structures.  Ideally all the
data would be kept permanently in a database, but that's unlikely to
happen soon.  I'll take another look at temporary SQLite databases as an
option.
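For what it's worth, a temporary SQLite database can live entirely in
memory, so there's no file to manage at all.  A minimal sketch, assuming
DBI and DBD::SQLite are installed; the table and column names are made
up for illustration:

```perl
use strict;
use warnings;
use DBI;

# An in-memory SQLite database: nothing to create on disk or clean up.
# Swap :memory: for a scratch filename if the data won't fit in RAM.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, AutoCommit => 0 });

# Hypothetical schema for a simple record type.
$dbh->do("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)");

my $ins = $dbh->prepare("INSERT INTO records (id, payload) VALUES (?, ?)");
$ins->execute($_, "row $_") for 1 .. 1000;
$dbh->commit;   # one transaction around the bulk load is much faster

# Random access is now just a query.
my ($payload) = $dbh->selectrow_array(
    "SELECT payload FROM records WHERE id = ?", undef, 42);
print "$payload\n";   # row 42
```

The nice part is that the "layer over the DB interface" Toby mentions
can hide all of this: callers ask for record 42 and never know whether
it came from a hash, a stream, or SQLite.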

The catch with processing in situ is that I often want random access,
and some file formats need at least one full pass anyway (formats with
data entries plus later cancellation entries, for example).
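Those two needs can still be combined without loading everything: stream
through the file once, remember the byte offset of each record with
tell(), drop any record a later cancellation entry names, then seek()
back to whichever survivors are wanted.  A sketch in core Perl; the
"id,payload" / "CANCEL,id" line format is invented for the example:

```perl
use strict;
use warnings;

# Build a tiny sample file; in practice this would be the real data file.
my $path = "/tmp/stream_demo.dat";
open my $out, ">", $path or die $!;
print {$out} "1,alpha\n2,beta\nCANCEL,1\n3,gamma\n";
close $out;

open my $fh, "<", $path or die $!;
my %offset;                      # record id => byte offset of its line
while (1) {
    my $pos = tell $fh;
    defined(my $line = <$fh>) or last;
    chomp $line;
    my ($id, $rest) = split /,/, $line, 2;
    if ($id eq "CANCEL") {
        delete $offset{$rest};   # a cancellation voids an earlier record
    } else {
        $offset{$id} = $pos;
    }
}

# Random access after the single full pass: seek straight to record 3.
seek $fh, $offset{3}, 0;
my $line = <$fh>;
chomp $line;
print "$line\n";   # 3,gamma
close $fh;
```

Memory use is one hash entry per live record rather than the whole file,
which is usually the difference that matters.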

The more I ponder, the more I feel that my objectives are too broad for
a single solution.  Switching to a database for the complex, messy data
sets and streaming for the simpler ones may be the ticket, possibly with
a file size check early on to decide between them.
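The early size check could be as simple as dispatching on -s; the
threshold and the two handler subs below are placeholders for whatever
the real loaders end up being:

```perl
use strict;
use warnings;

# Threshold is a guess; tune it against the real data sets.
my $MAX_IN_MEMORY = 50 * 1024 * 1024;   # 50 MB

sub load_with_streaming { return "streaming" }   # placeholder handlers
sub load_into_database  { return "database"  }

sub choose_loader {
    my ($file) = @_;
    my $size = -s $file;
    die "no such file: $file" unless defined $size;
    return $size <= $MAX_IN_MEMORY
        ? load_with_streaming($file)
        : load_into_database($file);
}

# Demo with a small scratch file.
my $path = "/tmp/size_demo.dat";
open my $fh, ">", $path or die $!;
print {$fh} "x" x 100;
close $fh;
print choose_loader($path), "\n";   # streaming
```

Keeping the decision inside one dispatcher means callers of the module
never need to know which backend was picked.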

