[Melbourne-pm] Designing modules to handle large data files
david.tulloh at AirservicesAustralia.com
Thu Aug 19 00:35:22 PDT 2010
On 19/08/10 17:15, Toby Corkindale wrote:
> Suggestion 1:
> Perhaps you should import the data file into a database, then let the
> database do all the hard work for you? By all means put a layer over the
> DB interface so as to make it nice for people to use.
> You are running the risk of reinventing the wheel otherwise.
> Suggestion 2:
> If you want to stick with processing the file in situ, then you'll
> need to approach it with a streaming processor, rather than loading the
> whole thing into memory at once.
> Are you familiar with that concept?
Thanks for the ideas.
My hesitation with the first suggestion is that a database felt like
overkill for what are normally simple data structures. Ideally I would
like all the data kept permanently in a database, but that's unlikely to
happen soon. I'll have another look at temporary SQLite databases as an
option.
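For the record, a throwaway in-memory SQLite database is cheap to stand
up with DBI, so "temporary database" need not mean much ceremony. A
rough sketch, assuming a made-up id/value record layout purely for
illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# A temporary SQLite database that lives only in memory;
# it vanishes when the handle is disconnected.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 0 } );

# Hypothetical schema, for illustration only.
$dbh->do('CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)');

# Bulk-load inside one transaction for speed, then let SQL
# handle the random access and filtering.
my $sth = $dbh->prepare('INSERT INTO records (id, value) VALUES (?, ?)');
while ( my $line = <STDIN> ) {
    chomp $line;
    my ( $id, $value ) = split /,/, $line, 2;
    $sth->execute( $id, $value );
}
$dbh->commit;
```

Swapping ':memory:' for a filename gives a disk-backed temporary
database instead, which keeps memory flat for the really big inputs.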
The catch with processing in situ is that I often want random access,
and some file formats need at least one full pass regardless (formats
with data and cancellation entries, for example).
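A two-pass stream is one way around the cancellation problem: the first
pass only collects the IDs of cancelled entries, the second emits
whatever survives, so the whole file never sits in memory. A rough
sketch, assuming an invented format where a leading "C" marks a
cancellation and "D" a data record:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $file = shift @ARGV or die "usage: $0 FILE\n";
open my $fh, '<', $file or die "open $file: $!";

# Pass 1: remember which record IDs are cancelled.
my %cancelled;
while (<$fh>) {
    $cancelled{$1} = 1 if /^C,(\d+)/;    # hypothetical cancellation line
}

# Pass 2: rewind and stream the data lines, skipping cancelled
# records; only the ID set is held in memory.
seek $fh, 0, 0 or die "seek: $!";
while (<$fh>) {
    next unless /^D,(\d+)/;              # hypothetical data line
    next if $cancelled{$1};
    print;
}
close $fh;
```

The memory cost is proportional to the number of cancellations rather
than the file size, which is usually a much better trade.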
The more I ponder, the more I feel my objectives are too broad for a
single solution. Switching to a database for the complex, messy data
sets and streaming for the simpler ones may be the ticket, possibly
with a file size check early on to choose between them.
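The dispatch itself could be as simple as a stat on the file, with the
threshold picked to taste; a minimal sketch (the 50 MB cut-off is an
arbitrary placeholder):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical threshold: files above ~50 MB go via SQLite,
# smaller ones are streamed directly.
my $THRESHOLD = 50 * 1024 * 1024;

sub choose_backend {
    my ($file) = @_;
    my $size = -s $file;                 # file size in bytes
    die "can't stat $file: $!" unless defined $size;
    return $size > $THRESHOLD ? 'sqlite' : 'stream';
}
```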