<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#ffffff">
G'day lovely people,<br>
<br>
I don't know where the next meeting will be, but I'd be happy to
give the following talk thereat (my OPL mini conf talk at LCA this
year).<br>
<br>
<h2> <span class="mw-headline"> Don't hate Unicode </span></h2>
<p>Unicode sneaks into the most unexpected places. Do you ever
wonder if your life would be much, much easier if your default
encoding was not ASCII? Do you know what the difference between
UTF-8 and Unicode strings is? Do you know what your default
encoding is, or how to change it? Does it all seem too hard, and
make you resent anything to do with the locale?
</p>
<p>If 7-bit ASCII was good enough for me, it should be good enough
for you! Have you been left behind by this whole Unicode thing,
to the point that you're confused and resentful of it? I know I
was. When your name and everything you write work wonderfully in
ASCII, it can be hard to summon the enthusiasm to learn about
Unicode, even when you know that you should be handling your data
better.
</p>
<p>Imagine your code is using a logging library that expects
strings. What does it do when you pass it a Unicode object? It'll
probably write it, encoding it in your default encoding (probably
ASCII). And it'll probably work, on all of your test cases, and on
most of your data. Until someone comes along with a non-ASCII
character in their name, and causes your code to throw an
exception. You probably weren't expecting it, and it might not
even be your library. Unicode works implicitly often enough that
it can sneak in well before you realise your code isn't robust
enough to handle it.
</p>
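<p>To make that failure concrete, here is a minimal sketch, assuming
Python 3. The function <code>format_log_line</code> and the sample
names are made up for illustration; they stand in for whatever a real
logging library does when it turns text into bytes.</p>

```python
def format_log_line(name, encoding="ascii"):
    """Hypothetical stand-in for a library that encodes text before writing it."""
    line = "user logged in: " + name + "\n"
    # The text-to-bytes step is where a non-ASCII character causes trouble.
    return line.encode(encoding)


# An ASCII-only name works, so the test cases all pass.
ascii_line = format_log_line("Alice")

# A name with a non-ASCII character (e with diaeresis, U+00EB) does not.
try:
    format_log_line("Zo\u00eb")
    raised = False
except UnicodeEncodeError:
    raised = True

# Choosing an encoding that covers the character avoids the exception.
utf8_line = format_log_line("Zo\u00eb", encoding="utf-8")
```

<p>The same call succeeds or fails depending only on the data, which
is why this kind of bug tends to turn up in production rather than in
testing.</p>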
<p>This talk will cover the essentials of Unicode and how it affects
things like regular expressions.<br>
</p>
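<p>As one small illustration of the regular expression side, here is
a sketch assuming Python 3 and its standard <code>re</code> module
(the sample name is made up). It shows <code>\w</code> matching
differently depending on whether the pattern is Unicode-aware.</p>

```python
import re

name = "Zo\u00eb"  # ends in e with diaeresis, U+00EB

# For str patterns, \w is Unicode-aware by default and matches the whole name.
unicode_match = re.fullmatch(r"\w+", name)

# With re.ASCII, \w only covers a-z, A-Z, 0-9 and underscore.
ascii_match = re.fullmatch(r"\w+", name, flags=re.ASCII)

# Searching rather than full-matching silently truncates the name instead.
ascii_words = re.findall(r"\w+", name, flags=re.ASCII)
```

<p>Here <code>unicode_match</code> is a match object,
<code>ascii_match</code> is <code>None</code>, and
<code>ascii_words</code> holds only the ASCII part of the name.</p>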
<br>
<br>
</body>
</html>