> The problem is, there are supposed to be 179 nodes of interest over 16
> million+ lines, so I was not able to easily determine the structure,
> as editors seemed to die trying to read it.
> I sat down this morning and split the file into 160 parts (files) of
> size ~ 100,000 lines, looked at the first and last parts (files 1 and
> 160), and I think I have the structure now.  I'll run some random
> sample of the other chunks to verify.  But I think it is easily solved
> now without XML parsers.
> Just FYI in case anyone is interested:)

Good for you. Divide and conquer is always a good debugging principle.
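
For anyone who wants to repeat this on another oversized file, here is a minimal sketch in Python of the split-and-sample approach described above. It is only an illustration, not necessarily how the original split was done (plain `split -l 100000` from coreutils does the splitting step just as well). The function names `split_by_lines` and `sample_heads`, the `part_NNNN.txt` naming, and the UTF-8 encoding are all assumptions of mine.

```python
import random
from itertools import islice
from pathlib import Path


def split_by_lines(src, out_dir, lines_per_part=100_000):
    """Stream src into numbered part files of at most lines_per_part lines each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    # errors="replace" is an assumption: it keeps going on bad bytes
    # instead of raising, which is usually what you want when just exploring.
    with open(src, "r", encoding="utf-8", errors="replace") as fh:
        while True:
            # Only one chunk is held in memory at a time.
            chunk = list(islice(fh, lines_per_part))
            if not chunk:
                break
            part = out_dir / f"part_{len(parts) + 1:04d}.txt"
            part.write_text("".join(chunk), encoding="utf-8")
            parts.append(part)
    return parts


def sample_heads(parts, k=5, head=20, seed=None):
    """Return the first `head` lines of k randomly chosen part files."""
    rng = random.Random(seed)
    chosen = rng.sample(parts, min(k, len(parts)))
    heads = {}
    for part in sorted(chosen):
        with open(part, "r", encoding="utf-8") as fh:
            heads[part.name] = list(islice(fh, head))
    return heads
```

Calling `split_by_lines("big.xml", "parts")` on a 16-million-line file would give about 160 parts at the default size, and `sample_heads(parts, k=10)` then pulls the opening lines of ten random parts so the guessed structure can be checked without opening anything in an editor. The file name `big.xml` is hypothetical.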

