SPUG: Anyone savvy in this?

Colin Meyer cmeyer at helvella.org
Wed Jan 30 12:33:22 CST 2002


On Wed, Jan 30, 2002 at 09:33:22AM -0800, Tim Maher wrote:
> 
> Then I think you want:
> 
> 	$row[4] =~ s|<br */>|<br>|g;
> 
> I won't even ask how you got those messed up line-break tags there
> in the first place! 8-}

Those "messed up" break tags come from xhtml, and in practice they
work with browsers written for the older html standards. Any decent
web browser renders <br /> just as it renders <br>.  The space is
optional in xml; <br/> is valid xml & xhtml, but it can trip up some
older web browsers.
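
As a rough sketch of that point, here is the substitution quoted above
applied to the spellings of the break tag you are likely to meet. The
sample strings and the sub name normalize_br are made up for
illustration, and I use \s* where the original used " *" so that tabs
before the slash are handled too.

```perl
use strict;
use warnings;

# Hypothetical helper: turn any xhtml-style <br/> or <br /> into plain <br>.
sub normalize_br {
    my ($text) = @_;
    $text =~ s|<br\s*/>|<br>|g;   # optional whitespace before the slash
    return $text;
}

# Hypothetical sample data: three spellings of the same line break.
for my $sample ('one<br/>two', 'one<br />two', 'one<br   />two') {
    print "$sample => ", normalize_br($sample), "\n";
}
```

All three samples should come out as one<br>two. Tags with attributes,
or in upper case, are not touched by this pattern.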

For info: http://www.w3.org/TR/xhtml1
Section 4 discusses differences between html & xhtml.  Appendix C gives
some useful html compatibility advice.

Many generated (as opposed to hand-written) webpages are xhtml
compliant.  If you've dealt with parsing html files (to extract data,
add data, or otherwise transform documents), then you'll appreciate
xhtml.  Because xhtml is well-formed xml (every tag is closed, every
attribute value is quoted), it is *much* easier to deal with.

If you are doing anything more complex than replacing <br /> with <br>
or "\n", then you'll probably want to use HTML::Parser, or some combo of
tidy and XML::*. Writing regexes to deal with longer tags, with optional
attributes, possibly spread across two or more lines, is tough!
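
To make the "tough" part concrete, here is a small sketch of how a tag
split across lines slips past a line-at-a-time regex. The input string
and the sub names count_by_line and count_whole are invented for this
example; it is only meant to show the failure mode, not a recommended
way to parse html.

```perl
use strict;
use warnings;

# Hypothetical input: one anchor tag spread over three lines.
my $html = qq{<a\n   href="x.html"\n   class="nav">link</a>};

# Naive approach: test each line on its own.
sub count_by_line {
    my ($text) = @_;
    my @hits = grep { $_ =~ /<a\s[^>]*>/ } split /\n/, $text;
    return scalar @hits;
}

# Match against the whole string; \s and [^>] both accept newlines.
sub count_whole {
    my ($text) = @_;
    my @tags = $text =~ /<a\s[^>]*>/g;
    return scalar @tags;
}

print "line by line: ", count_by_line($html), "\n";
print "whole string: ", count_whole($html), "\n";
```

The line-by-line version never sees the opening tag in one piece, so
it finds nothing, while the whole-string version finds it. Even the
second pattern gives up as soon as an attribute value contains a ">",
which is the sort of case a real parser handles for you.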

Have fun,
-C.

p.s. I guess this post is off topic, but hopefully somewhat useful to 
     web developers, which many of us reading this list are.
