[Za-pm] introduction
Jonathan McKeown
jonathan at hst.org.za
Thu Apr 10 02:25:37 PDT 2008
On Wednesday 09 April 2008 16:38, aesop at fables.co.za wrote:
> I have a flat file database (Clue) running on DOS still! 20 years use
> but am forced to migrate now.
[snip]
> The whole database comprises 161 sequential files at present. NO CR
> or LF. Clue handles all formatting by an Alt-127 (little house)
> character followed by a letter/s. Open the file in the Midnight
> Commander editor and you have a one liner wandering 30+k off to the
> right.
>
> One routine moves a field and its data in the record, another adds
> another field in if it doesn't exist already. I do this under Linux.
> The problem is after running the routines, reported successfully by
> various Perl test routines (like 6699 records, 6699 of each field,
> all fields in the correct order), we get problems when the database
> is imported back into Clue - like will not count the records
> correctly under certain circumstances. Strictly speaking the move
> routine causes no problem, it is the add-in one that does.
Pure guesswork at this stage, but my money would be on some sort of indexing
issue, either in a separate file or embedded, perhaps in the letter(s) after
the chr(127), which is causing Clue to be confused about where records start
and end.
Is Clue producing correct data for (at least some of) the records when
queried? How far off is the record count? What if anything is the
significance of the letter(s) after the DEL (the ASCII 127 character)? Is
there any documentation on the Web about Clue and its file formats?
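
One rough way to start answering the question about the letter(s) after the
DEL is to tally them and compare the tallies for a file before and after your
routines have run. This is only a sketch: the names tally_codes and slurp_raw
are made up for illustration, and it assumes a code is a fixed number of bytes
(one by default) immediately after each chr(127), which may not be how Clue
actually works.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count what follows each DEL (chr 127) byte in a string.
# ASSUMPTION: a formatting code is the $len bytes directly after the DEL
# (default 1). The real Clue format may differ.
sub tally_codes {
    my ($data, $len) = @_;
    $len ||= 1;
    my %count;
    while ($data =~ /\x7F/g) {
        # pos() is the offset just past the DEL that was matched.
        my $code = substr($data, pos($data), $len);
        $count{$code}++;
    }
    return \%count;
}

# Slurp a whole file as raw bytes, with no line-ending translation.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;
    my $data = <$fh>;
    close $fh;
    return $data;
}

# When given a filename, print one line per code with its count.
if (@ARGV) {
    my $codes = tally_codes(slurp_raw($ARGV[0]));
    printf "%-8s %d\n", $_, $codes->{$_} for sort keys %$codes;
}
```

Run against the original and the modified copy of the same file, any code
whose count changes unexpectedly would be a good place to start looking.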
> I have spent hours checking and refining the Perl routines, trying
> variants in methodology, more hours spent methodically checking by
> hand the modified Clue files, all to no avail. I can see nothing
> wrong but of course there must be. I run them under DOS, I modify
> them in Linux under Perl 5.8, I copy them back onto the DOS machine,
> and then it goes wrong.
>
> So I don't think the question is necessarily a Perl one at present,
> but it is this. Can anything happen to the files (that don't have CRs
> or LFs in them for Perl to handle in any way and we slurp in a file
> at a time as one string to process =~ s/regex/replace/gx) in the
> transition from one system to another?
I don't think the transfer should alter anything other than line-endings -
which as you say don't occur in the file. Comparing the file sizes before and
after the copy should tell you whether any mystery bytes have been added
somehow, and if anything has changed you can use diff or cmp on the Linux box
to find out what the differences are.
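
If you'd rather do the comparison in Perl, something along these lines would
report the sizes and the offset of the first differing byte. Again a sketch
only: first_difference and slurp_raw are illustrative names, and it assumes
you have kept a copy of the file as it was before the round trip to compare
against.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the offset of the first byte at which two strings differ,
# or -1 if they are identical. If one string is a prefix of the other,
# the offset returned is the length of the shorter one.
sub first_difference {
    my ($x, $y) = @_;
    return -1 if $x eq $y;
    my $min = length($x) < length($y) ? length($x) : length($y);
    # String XOR gives a NUL byte wherever the two strings agree.
    my $xor = substr($x, 0, $min) ^ substr($y, 0, $min);
    return $xor =~ /[^\0]/ ? $-[0] : $min;
}

# Slurp a whole file as raw bytes, with no line-ending translation.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Given two filenames, print their sizes and where they first differ.
if (@ARGV == 2) {
    my @data = map { slurp_raw($_) } @ARGV;
    printf "%s: %d bytes\n", $ARGV[$_], length $data[$_] for 0, 1;
    my $off = first_difference(@data);
    if ($off < 0) {
        print "Files are identical\n";
    }
    else {
        print "First difference at byte offset $off\n";
        # Show up to 16 bytes from each file, in hex, starting there.
        printf "  %s\n", unpack('H*', substr($_, $off, 16)) for @data;
    }
}
```

Running it on the file as copied to Linux and the file as it ends up back on
the DOS machine (with no Perl processing in between) would show whether the
transfer alone changes anything.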
> I need to have that clear in my mind, that it is not an operating
> system thing before I stare at my Perl scripts yet again. It can only
> be something in the files that has been changed but yet ....
How big are the data files and your Perl routines? Provided they aren't too huge
(and of course that your data isn't confidential) you can email a sample of
the data and a copy of your code to me direct and I'd be prepared to take a
look and see if anything jumps out at me.
Sorry, lots of questions but no real answers...
Jonathan