[za-pm] Text::Format

Anne Wainwright anotheranne at fables.co.za
Fri Aug 31 00:23:53 PDT 2012


Hi,

I have had a closer look at this. 

Particularly obscure has been the reasoning behind the start-of-line and
end-of-line anchors bracketing the regexes for the two words. Without
having looked at the module code, I believe it works as follows.

As each word is processed (and the growing line length calculated), it
is matched against the two regexes. It is the matching of the two words
"Mrs" and "Jones" that, combined, determines whether "Mrs" is wrapped
because "Jones" has been wrapped to a new line. When "Mrs" is at the end
of a line (or anywhere on a line except as the first word), its regex
will fail. The regex for "Jones" at the start of the new line will
succeed. Thus the combination fail/succeed will cause "Mrs" to be
wrapped. The various combinations of fail/succeed form a small truth
table that determines what happens. Thus fail/fail means both words are
happily ensconced on one line and no action is taken, and the same goes
for succeed/fail.

If that is so, then I wonder why the Text::Format docs include the
anchors, since surely no one is expected to reason this out. We should
surely only be expected to enter regexes for the words proper, with the
anchors being added by the module code. That said, the documentation
is a little ambiguous and may be showing what ends up in the hash.
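
A minimal sketch of another possible reading of the anchors, not checked
against the module source: if each word is tested as a separate string,
then ^ and $ would simply mean "the whole word", with the hash key tried
against one word and its value against the word that follows. The helper
keep_together below is hypothetical and is not part of Text::Format.

```perl
use strict;
use warnings;

# The example hash from the Text::Format docs: key = pattern for the
# first word, value = pattern for the word that follows it.
my %no_break = (
    '^Mrs?\.$' => '^\S+$',
    '^\S+$'    => '^(?:S|J)r\.$',
);

# Hypothetical helper, NOT part of Text::Format (assumption): true if some
# key matches $first and its value matches $second, each word being
# treated as its own string.
sub keep_together {
    my ($first, $second, $pairs) = @_;
    for my $key (keys %$pairs) {
        return 1 if $first =~ /$key/ && $second =~ /$pairs->{$key}/;
    }
    return 0;
}

# Try a few adjacent word pairs.
for my $pair (['Mrs.', 'Jones'], ['Jones', 'Jr.'], ['went', 'home']) {
    printf "%-6s %-6s %s\n", @$pair,
        keep_together(@$pair, \%no_break) ? 'keep together' : 'may break';
}
```

Under that assumption the first two pairs are kept together and the
third is not. Whether the module really works this way would still need
to be confirmed from its code.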

Whatever, there is no way I can persuade this to work, with or without
my own anchors. I have even put "Mrs Jones" in my own input file and it
still does not work.
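
Independently of what the module does, the patterns themselves can be
tried against single words in plain Perl. The sketch below uses the
title pattern from the docs and the two isbn patterns from the earlier
message. The sample words and the word_matches helper are made up for
illustration. It only shows which whole words each anchored pattern
accepts, for instance whether the full stop after "Mrs" or the case of
"isbn" matters.

```perl
use strict;
use warnings;

# Patterns as written in the docs example and in the isbn hash from the
# earlier message.
my %pattern = (
    title  => '^Mrs?\.$',
    isbn   => '^isbn$',
    number => '^[0-9]{9,13}X?$',
);

# Illustrative helper: 1 if the word, taken as a whole string, matches
# the named pattern, otherwise 0.
sub word_matches {
    my ($name, $word) = @_;
    return $word =~ /$pattern{$name}/ ? 1 : 0;
}

# Sample words are invented for this check.
for my $case (
    [title  => 'Mrs.'],      [title  => 'Mrs'],
    [isbn   => 'isbn'],      [isbn   => 'ISBN'], [isbn => 'isbn:'],
    [number => '123456789'], [number => '123456789,'],
) {
    my ($name, $word) = @$case;
    printf "%-6s %-12s %s\n", $name, $word,
        word_matches($name, $word) ? 'match' : 'no match';
}
```

With anchors at both ends, a word that differs in case or carries
trailing punctuation is rejected, and "Mrs" without the full stop does
not satisfy the pattern from the docs.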

I would ask the package maintainer (Shlomi Fish) whether there are any
issues with this but do not want to find myself at the receiving end of
some simple thing that means I am wrong. In short, I need a volunteer to
check this out independently.

Any offers?

Anne

On Wed, Aug 29, 2012 at 09:53:36PM +0200, Anne Wainwright wrote:
> Note: Beware! Default reply-to is to the list.
> 
> 
> Hello,
> 
> I am using the Text::Format module. It works well but I cannot get any
> mileage out of one feature. I'll append the section from the pod file
> here:
> 
> The module will take text input from wherever and format it in various
> ways designed to suit any conceivable need.
> 
> This option, in this case, is to prevent breaking a line of text at a
> sensitive point.
> 
> In my case I don't want to break "isbn 123456789", leaving "isbn" at the
> end of the line and "123456789" at the start of the next.
>  
> ------------------------------
> noBreakRegex \%HASH || NOTHING
>            Pass in a reference to your hash that would hold the regexes on
>            which not to break.  Without any arguments, it returns the hash.
>            eg.
> 
>                {'^Mrs?\.$' => '^\S+$','^\S+$' => '^(?:S|J)r\.$'}
> 
>            don't break names such as Mr. Jones, Mrs. Jones, Jones Jr.
> -------------------------
> 
> This isn't really a regex query, but just to note that the above seems a
> little weird, with "Mrs" sandwiched between both start-of-line and
> end-of-line anchors. Still, I can write my regex to suit if that is what
> is needed.
> 
> So I have the following to get my regex into a hash:
> 
>     my %regx = ('^isbn$' => '^[0-9]{9,13}X?$');  # my regex
>     $text->noBreakRegex(\%regx);  # this sets the option
> 
> Well we compile, but I have isbn breaks all over. All reasonable and
> many unreasonable changes have no effect except to break the script.
> 
> Any ideas welcome, please.
> 
> Anne
> 
> PS. I am using the Padre IDE now.
> _______________________________________________
> Za-pm mailing list
> Za-pm at pm.org
> http://mail.pm.org/mailman/listinfo/za-pm
> 
> posts also archived on Mail Archive
> http://www.mail-archive.com/za-pm@pm.org/

