unicoding

I stored the same text file in various encodings with Notepad (a fine tool for Unicode diagnostics), then read those files in Python as plain 8-bit streams. No surprises there:
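The original session is not reproduced here, so this is only a rough sketch of the experiment, assuming Python 3. The sample text, the file names, and the helpers `read_raw` and `dump_all` are all made up for illustration. The byte strings imitate what Notepad is generally understood to write for its "ANSI", "UTF-8", "Unicode" and "Unicode big endian" options.

```python
import codecs
import os
import tempfile

SAMPLE = "h\u00e9llo"  # hypothetical sample text with one non-ASCII character

# Assumed stand-ins for Notepad's "Save as" encodings (file names are invented).
NOTEPAD_LIKE = {
    "ansi.txt": SAMPLE.encode("cp1252"),
    "utf8.txt": codecs.BOM_UTF8 + SAMPLE.encode("utf-8"),
    "unicode.txt": codecs.BOM_UTF16_LE + SAMPLE.encode("utf-16-le"),
    "unicode_be.txt": codecs.BOM_UTF16_BE + SAMPLE.encode("utf-16-be"),
}


def read_raw(path):
    """Return the file's contents as plain bytes, with no decoding."""
    with open(path, "rb") as f:
        return f.read()


def dump_all():
    """Write each variant to a temporary file and read it back as raw bytes."""
    results = {}
    with tempfile.TemporaryDirectory() as folder:
        for name, data in NOTEPAD_LIKE.items():
            path = os.path.join(folder, name)
            with open(path, "wb") as f:
                f.write(data)
            results[name] = read_raw(path)
    return results


if __name__ == "__main__":
    for name, raw in dump_all().items():
        # Print each file as space-separated hex bytes.
        print(name, " ".join(f"{b:02X}" for b in raw))
```

Opening in `"rb"` mode is what keeps Python from decoding anything, so the leading bytes of each file can be inspected directly.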

Here is a breakdown of the BOMs (http://en.wikipedia.org/wiki/Byte-order_mark):

UTF-8	EF BB BF
UTF-16 BE ("unicode big endian")	FE FF
UTF-16 LE ("unicode")	FF FE
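The table above translates fairly directly into a BOM sniffer. This is a sketch assuming Python 3, not necessarily how the unicoder does it; `BOMS` and `sniff_bom` are hypothetical names. The BOM constants come from the standard `codecs` module.

```python
import codecs

# (BOM bytes, Python codec name) pairs, in the same order as the table.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(raw):
    """Return (codec_name, bom_length), or (None, 0) if no known BOM is found."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0


if __name__ == "__main__":
    name, skip = sniff_bom(b"\xff\xfeh\x00i\x00")
    print(name, skip)
```

Like the table, the sketch covers only UTF-8 and UTF-16. The UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM, so a sniffer that also has to handle UTF-32 would need to test the longer marks first.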

Calling decode on the raw bytes works as expected:
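A small illustration of what that looks like, again assuming Python 3 and a made-up sample string. The codec names `utf-8-sig` and `utf-16` are standard Python codecs that consume a leading BOM; the variable names are only for this sketch.

```python
import codecs

text = "h\u00e9llo"  # hypothetical sample text

# Raw bytes as they would sit on disk, BOM included.
utf8_raw = codecs.BOM_UTF8 + text.encode("utf-8")
le_raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
be_raw = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# "utf-8-sig" drops a leading UTF-8 BOM; "utf-16" uses the BOM to pick
# the byte order and then drops it.
decoded = [
    utf8_raw.decode("utf-8-sig"),
    le_raw.decode("utf-16"),
    be_raw.decode("utf-16"),
]

# The plain "utf-8" codec does not strip the BOM: it stays in the result
# as the character U+FEFF.
with_bom = utf8_raw.decode("utf-8")

if __name__ == "__main__":
    print(decoded)
    print(repr(with_bom))
```

The last case is the one to watch: decoding with plain `utf-8` succeeds, but the BOM stays at the front of the string as an invisible character.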

Of course it happened again: the program works okay, but the unit tests suck and must be debugged.

At this point, I have my unicoder (18:36), and the unit tests show that it works.