[Kc] converting ms-dos file names to href equivalents

Tom Miller tlgalenson at chatnfiles.com
Sat Dec 13 10:00:12 CST 2003


All of this is in the context of needing to convert at least 2000 text files located in at least 2000 subdirectories into .htm files.  So while it is a one-time programming project, the resulting savings in time and tedium will not be trivial.

I am trying to search for an MS-DOS formatted file name (i.e. 8.3 form).  I would like to limit it to the legal characters for a DOS name rather than any 8 characters followed by a period followed by the letters "zip".  I am presuming very similar code will allow me to also search for .arc and .lhz files.  I want to replace that name with the same name as an href, e.g. tomsfile.zip becomes <a href="tomsfile.zip">tomsfile.zip</a>

The code I am showing is adapted from the code that converts things like tabs into their HTML equivalents in a Perl script called txt2html, which a guy who speaks French wrote.  I am aware the replacement string is wildly wrong.  So here is my 1st approximation of the code:

$TXT =~ s/........\.[zip|ZIP]/<a href="........\.zip">........\.zip</a>/g;

Since I don't want to match against any white space, how about this?

$TXT =~ s/[\d|\w](8)\.[zip|ZIP]/<a href="........\.zip">........\.zip</a>/g;

According to the book I am mumbling around in: \d is the range of numbers, \w is all alphabetic characters, and | lets you put two groups together, so [\d|\w] should allow all legal MS-DOS file name characters.  [\d|\w](8) is supposed to find 8 legal MS-DOS file name characters in a row.  [\d|\w](8)\.[zip|ZIP] should find any legal MS-DOS file name that ends in zip or ZIP?

What (if anything) am I doing wrong on the search string?
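
For what it is worth, here is a minimal sketch of how Perl reads the pieces of that pattern, plus one way the search might be written instead.  It assumes names made only of \w characters (letters, digits and underscore), which is narrower than the full set DOS allows, and base names of 1 to 8 characters rather than exactly 8.  The variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

# [zip|ZIP] is a character class: it matches ONE character out of
# z, i, p, |, Z, I, P, not the three-letter string "zip".
my $one_char = ("tomsfile.z" =~ /^\w{8}\.[zip|ZIP]$/) ? 1 : 0;

# (8) is a group containing the literal digit 8; the repeat count is {8}.
my $literal_8 = ("a8.z" =~ /^[\d|\w](8)\.[zip|ZIP]$/) ? 1 : 0;

# \w already covers letters, digits and underscore, so \d is redundant.
# {1,8} allows base names shorter than eight characters; /i ignores case.
my $name_re = qr/\b\w{1,8}\.zip\b/i;

my @hits = ("see tomsfile.zip and A1.ZIP here" =~ /($name_re)/g);
print "one_char=$one_char literal_8=$literal_8 hits=@hits\n";
```

Both of the first two tests succeed, which shows that the original pattern accepts strings that are not zip file names at all.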

Once the search string is right, I want to move on to the harder question of how do I get this thing to replace the file name with an href to that file name.
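
A minimal sketch of the replacement step, under the same assumption that names consist only of \w characters: parentheses around the whole file name capture it into $1, which can then be reused in the replacement text.  The sample text and the $ext list are hypothetical; only zip, arc and lhz come from the description above.

```perl
use strict;
use warnings;

# Made-up sample text standing in for the contents of one file.
my $TXT = "Get tomsfile.zip or OLD.ARC, then notes.txt.";

# Extensions to link, matched case-insensitively.
my $ext = qr/zip|arc|lhz/i;

# The parentheses capture the matched file name into $1, and the
# replacement uses $1 twice: once as the href, once as the link text.
$TXT =~ s{\b(\w{1,8}\.$ext)\b}{<a href="$1">$1</a>}g;

print "$TXT\n";
```

Using s{...}{...} instead of s/.../.../ avoids having to escape the slash in </a>.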

Once I get past these questions, I have questions about trying to add file information (e.g. size, date/time created) to this conversion.  But right now, I want to struggle with this level of the code.

Respectfully,

Tom Miller

-------------------------------------
If I knew what I was doing, would I be posting here?
Eileen Chat and 160,000 downloads at: www.chatnfiles.com