[pm-h] how to split one file into multiple files?

Russell L. Harris rlharris at oplink.net
Mon Jun 23 00:59:56 PDT 2008


My problem:

I have several text source files, each of which contains several
chapters of a book.  I need to split each of these source files into
the component chapter files, with a coherent naming scheme for the
chapter files.

Details:

    => No single file contains every chapter of the book.

    => The chapter numbers must be determined from a hash of the
       chapter names; they cannot be determined from the sequence of the
       chapter in the source file.

    => Chapters are separated from one another by a line consisting
       solely of the word "chapterbreak".

    => Each chapter begins with a unique string which provides the
       chapter title ("LION", "BEAR", "DUCK", "MOOSE", etc.).

Examples:

    source file No. 1 ::

        LIONtext of chapter three\n
        chapterbreak\n
        MOOSEtext of chapter one\n
        chapterbreak\n
        KANGAROOtext of chapter four\n
 
    source file No. 2 ::

        BEARtext of chapter five\n
        chapterbreak\n
        PENGUINtext of chapter six\n
        chapterbreak\n
        DUCKtext of chapter two\n

I wish to split each source file on the pattern "chapterbreak" and
place each chapter into a separate file, with the chapter filename
being of the form "chapternumber.txt".

%%%%%%%%%%%%%%%%%%%%

I know how to create a hash keyed on the first four characters of the
chapter names, with the chapter numbers as the values:

            key: MOOS  DUCK  LION  KANG  BEAR  PENG
   hash element: 1     2     3     4     5     6  
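
A minimal sketch of such a hash, using the keys and numbers from the
table above. The variable name %chapter_number is only an illustrative
choice, not something fixed by the problem.

```perl
use strict;
use warnings;

# Keys are the first four characters of each chapter title;
# values are the corresponding chapter numbers.
# %chapter_number is a hypothetical name chosen for this sketch.
my %chapter_number = (
    MOOS => 1,
    DUCK => 2,
    LION => 3,
    KANG => 4,
    BEAR => 5,
    PENG => 6,
);

# Look up one chapter number by its four-character key.
print "LION is chapter $chapter_number{LION}\n";
```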

----------

I think that I know how to read a file using the <> operator and split
the chapters into an array (but I am a little fuzzy on this).
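
A rough sketch of the read-and-split step. So that it runs without any
files on disk, it reads from an in-memory string standing in for source
file No. 1 (an assumption of this sketch); with real files named on the
command line, <$in> would simply become <>.

```perl
use strict;
use warnings;

# Sample text standing in for source file No. 1 (an assumption, so the
# sketch needs no files on disk).
my $sample = "LIONtext of chapter three\nchapterbreak\n"
           . "MOOSEtext of chapter one\nchapterbreak\n"
           . "KANGAROOtext of chapter four\n";

# Open the string as a read-only filehandle.
open my $in, '<', \$sample or die "cannot open in-memory file: $!";

# Undefine the input record separator locally so that a single read
# returns the whole input as one string.
my $text = do { local $/; <$in> };
close $in;

# Split on any line consisting solely of the word "chapterbreak".
my @chapters = split /^chapterbreak\n/m, $text;

print scalar(@chapters), " chapters found\n";
```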

----------

And I think that I know how to use the match operator to obtain the
first several characters of the chapter name from each string scalar
in the array, which is the needed hash key:

   /(.{4})/
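
A small sketch of applying that match to each element of the array. It
adds a ^ anchor, which the pattern above lacks, so that the captured key
is always taken from the very start of the chapter; the sample strings
are taken from the examples earlier in this message.

```perl
use strict;
use warnings;

# Two chapters as they might appear after splitting a source file.
my @chapters = (
    "LIONtext of chapter three\n",
    "MOOSEtext of chapter one\n",
);

my @keys;
for my $chapter (@chapters) {
    # Capture the first four characters of the chapter as the hash key.
    if ($chapter =~ /^(.{4})/) {
        push @keys, $1;
    }
}

print "@keys\n";   # prints "LION MOOS"
```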

----------

But, after several hours of reading in the O'Reilly Perl books, I
still do not understand how to open a new file using the hash element
(the string "1", "2", "3", etc.) as the filename.
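
For concreteness, the step in question might be sketched as follows,
assuming the three-argument form of open with a lexical filehandle. The
scratch directory from File::Temp and the names %chapter_number, $key,
and $filename are assumptions made for illustration only; this is one
possible approach, not necessarily the best one.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical lookup table and one chapter, as in the examples above.
my %chapter_number = (MOOS => 1, DUCK => 2, LION => 3);
my $chapter = "LIONtext of chapter three\n";

# Write into a scratch directory that is removed on exit (an assumption
# of this sketch; a real script would choose its own output directory).
my $dir = tempdir(CLEANUP => 1);

# Extract the four-character key and look up the chapter number.
my ($key) = $chapter =~ /^(.{4})/;
die "no usable key in chapter\n" unless defined $key;
my $number = $chapter_number{$key};
die "no chapter number for key '$key'\n" unless defined $number;

# Build the filename from the hash value and open it for writing.
my $filename = "$dir/$number.txt";
open my $out, '>', $filename or die "cannot open $filename: $!";
print {$out} $chapter;
close $out or die "cannot close $filename: $!";

print "wrote $filename\n";
```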

%%%%%%%%%%%%%%%%%

If I am trying to do this the hard way, kindly advise.

RLH

