[Za-pm] introduction

Thu Apr 10 02:25:37 PDT 2008

On Wednesday 09 April 2008 16:38, aesop at fables.co.za wrote:

> I have a flat file database (Clue) running on DOS still! 20 years use
> but am forced to migrate now.
[snip]
> The whole database comprises 161 sequential files at present. NO CR 
> or LF. Clue handles all formatting by an Alt-127 (little house)
> character followed by a letter/s. Open the file in the Midnight
> Commander editor and you have a one liner wandering 30+k off to the
> right.
>
> One routine moves a field and its data in the record, another adds
> another field in if it doesn't exist already. I do this under Linux.
> The problem is after running the routines, reported successfully by
> various Perl test routines (like 6699 records, 6699 of each field,
> all fields in the correct order), we get problems when the database
> is imported back into Clue - like will not count the records
> correctly under certain circumstances. Strictly speaking the move
> routine causes no problem, it is the add-in one that does.

Pure guesswork at this stage, but my money would be on some sort of indexing 
issue, either in a separate file or embedded, perhaps in the letter(s) after 
the chr(127), which is causing Clue to be confused about where records start 
and end.

Is Clue producing correct data for (at least some of) the records when 
queried? How far off is the record count? What, if anything, is the 
significance of the letter(s) after the DEL (the ASCII 127 character)? Is 
there any documentation on the Web about Clue and its file formats?
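
One way to get a handle on the codes after the DEL is to tally them in a file as exported by Clue and again in the same file after your add-in routine has run, then compare the two tallies. The following is only a rough sketch: the sub name tally_markers is made up, and since I don't know how long a Clue code is, the code length is a guess that defaults to one character and can be changed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tally whatever follows each DEL (chr 127) marker in a string.
# ASSUMPTION: a code is the $len bytes right after the DEL; $len
# defaults to 1 because the real Clue format is not known here.
sub tally_markers {
    my ($data, $len) = @_;
    $len = 1 unless defined $len;
    my %count;
    my $pos = 0;
    while ((my $i = index($data, chr(127), $pos)) >= 0) {
        my $code = substr($data, $i + 1, $len);
        $count{$code}++ if length $code;   # skip a DEL at the very end
        $pos = $i + 1;
    }
    return \%count;
}

# Print one tally per file named on the command line.
for my $path (@ARGV) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                           # raw bytes, no translation
    my $data = do { local $/; <$fh> };     # slurp the whole file
    close $fh;
    $data = '' unless defined $data;

    my $tally = tally_markers($data);
    print "$path\n";
    printf "  DEL+%s: %d\n", $_, $tally->{$_} for sort keys %$tally;
}
```

If a code shows up in the modified file that never appears in any file Clue wrote itself, or the counts of some code change when only a different field was added, that would be a place to start looking.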

> I have spent hours checking and refining the Perl routines, trying
> variants in methodology, more hours spent methodically checking by
> hand the modified Clue files, all to no avail. I can see nothing
> wrong but of course there must be. I run them under DOS, I modify
> them in Linux under Perl 5.8, I copy them back onto the DOS machine,
> and then it goes wrong.
>
> So I don't think the question is necessarily a Perl one at present,
> but it is this. Can anything happen to the files (that don't have CRs
> or LFs in them for Perl to handle in any way and we slurp in a file
> at a time as one string to process  =~ s/regex/replace/gx) in the
> transition from one system to another?

I don't think the transfer should fiddle about with anything other than 
line-endings - which as you say don't occur in the file. Comparing the 
filesizes before and after the copy should let you know if there are any 
unexpected differences (from mystery bytes being added in somehow), and if 
there are changes you can use diff or cmp on the Linux box to find out what 
they are.
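
If cmp isn't handy, the same check can be sketched in Perl. This is a minimal example, not a polished tool: the sub names (slurp_raw, diff_offsets, suspect_bytes) are made up for illustration, and it assumes you have two copies of the same unmodified file, one taken before the transfer and one after. It also counts CR, LF and Ctrl-Z bytes, on the guess that these are the likeliest bytes for a DOS-to-Linux copy to add.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a whole file as raw bytes; binmode stops any line-ending translation.
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    local $/;                       # slurp mode: read everything at once
    my $data = <$fh>;
    close $fh;
    return defined $data ? $data : '';
}

# Offsets (0-based) of up to $max differing bytes within the common length.
sub diff_offsets {
    my ($x, $y, $max) = @_;
    $max = 10 unless defined $max;
    my $common = length($x) < length($y) ? length($x) : length($y);
    my @offsets;
    for my $i (0 .. $common - 1) {
        next if substr($x, $i, 1) eq substr($y, $i, 1);
        push @offsets, $i;
        last if @offsets >= $max;
    }
    return @offsets;
}

# Count CR, LF and Ctrl-Z bytes (Ctrl-Z is the old DOS text EOF marker).
sub suspect_bytes {
    my ($data) = @_;
    return (
        CR  => ($data =~ tr/\r//),
        LF  => ($data =~ tr/\n//),
        EOF => ($data =~ tr/\x1a//),
    );
}

# Only runs when given exactly two file names on the command line.
if (@ARGV == 2) {
    my ($old, $new) = map { slurp_raw($_) } @ARGV;
    printf "sizes: %d vs %d bytes\n", length($old), length($new);
    printf "differs at byte offset %d\n", $_ for diff_offsets($old, $new);
    my %seen = suspect_bytes($new);
    print "$_ bytes in second file: $seen{$_}\n" for sort keys %seen;
}
```

Run it with the two file names as arguments. Matching sizes, no reported offsets and zero counts would suggest the copy itself is leaving the data alone.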

> I need to have that clear in my mind, that it is not an operating
> system thing before I stare at my Perl scripts yet again. It can only
> be something in the files that has been changed but yet ....

How big are the data files and your Perl routines? Provided they're not too 
huge (and of course that your data isn't confidential) you can email a sample 
of the data and a copy of your code to me direct and I'd be prepared to take 
a look and see if anything jumps out at me.

Sorry, lots of questions but no real answers...

Jonathan