Summary Re: SPUG:escaping HTML

Fred Morris m3047 at inwa.net
Sun May 11 12:02:18 CDT 2003


I originally wrote:

>What are people's recommendations for escaping HTML without importing the
>kitchen sink? It's not hard to do it yourself.
>
>There are about a bajillion HTML:: modules, I'm sure there are more than
>several ways to do this.

For my calendar thingy I escaped the markup myself. I did that because I
wanted to reserve the option of altering the behavior based on the user
login, although I haven't used that capability.


I have received the following responses (in no particular order):

Marc M. Adkins concurs that it's not hard to do yourself, and that it's not
just the kitchen sink but "...the entire kitchen (and the basement and the
garage...)."

I agree, particularly because I am running mod_perl. Although I tolerate
the memory bloat for the wicked fast performance, it's only prudent to
consider how much memory a given piece of functionality costs for the
convenience it buys. I'm writing another gizmo right now (a firewall
analyzer and rule manager), and given its relatively sporadic usage
profile, sucking in loads of stuff that isn't already loaded doesn't make a
lot of sense. I suppose I could run it without mod_perl, and the
performance penalty of compiling the scripts each time they're run might be
reasonable for this application. OTOH, not bothering to escape at all might
also be reasonable for this application, because the data is conditioned in
other ways and access is presumed to be controlled.

Adam Monsen proffers this:

  $ perl -pe 's/</&lt;/g; s/>/&gt;/g' < index.html > outdex.html

I would add that to mitigate HTML injection opportunities you should also
think about the impact of double quotes and ampersands. Ampersands have to
be escaped first, or the entities you just produced get escaped a second
time. (And although I haven't done so to date in my calendar thingy,
perhaps you do want to allow certain users to use certain tags in certain
situations.)
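
The escaping discussed above (angle brackets, ampersands and quotes) could
be sketched in plain Perl with no modules loaded. This is only an
illustration: the helper name escape_html and the decision to escape single
quotes as well are assumptions, not something posted to the list.

```perl
use strict;
use warnings;

# Hypothetical lookup table: each character that is special in HTML
# mapped to the entity that replaces it.
my %ESCAPE = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Hypothetical helper: returns a copy of $text that is safe to place in
# element content or inside a quoted attribute value.
sub escape_html {
    my ($text) = @_;
    return '' unless defined $text;
    # A single pass over the string, so the ampersands introduced by the
    # entities themselves are never escaped a second time.
    $text =~ s/([&<>"'])/$ESCAPE{$1}/g;
    return $text;
}

print escape_html(q{<a href="x">Tom & Jerry's</a>}), "\n";
```

Doing the substitution in one pass sidesteps the ordering problem that a
chain of separate s///g statements has with ampersands.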

Jeremy Kahn believes that HTML::Entities is part of the 5.8 core.

Douglas Kirkland wants to know if I want to strip the HTML out entirely or
display the HTML as plain text. I want to escape it, roughly speaking.
There is also some stuff you just can't allow to be entered, for instance
a double quote in a textfield, or </textarea> in a textarea.
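
If rejecting such input outright is too blunt, one alternative is to escape
the value at the moment it is written back into the form. The sketch below
assumes that approach; text_field, text_area and escape_html are
hypothetical helper names, not code from the calendar application.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    return '' unless defined $text;
    my %escape = (
        '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$escape{$1}/g;
    return $text;
}

# Hypothetical helper: a text input whose value cannot break out of the
# quoted attribute, because any double quote becomes &quot;.
sub text_field {
    my ($name, $value) = @_;
    return sprintf '<input type="text" name="%s" value="%s">',
        escape_html($name), escape_html($value);
}

# Hypothetical helper: a textarea whose content cannot close the element
# early, because a literal </textarea> becomes &lt;/textarea&gt;.
sub text_area {
    my ($name, $value) = @_;
    return sprintf '<textarea name="%s">%s</textarea>',
        escape_html($name), escape_html($value);
}

print text_field('title', 'say "hi"'), "\n";
print text_area('notes', 'oops </textarea><script>'), "\n";
```

The browser decodes the entities when it displays the form, so the user
still sees and resubmits the original characters.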


Anyway, thanks to all of you for your responses.

--

Fred Morris
m3047 at inwa.net






