* Unidecoder
Posted on October 31st, 2009 by John. Filed under programming.
A while back I made a post about ASCIIizing Text. With it was a simple python application that would convert Unicode characters to ASCII equivalents. It doesn’t do a basic conversion but also Latinizes the characters when they are outside of the ASCII range.
The uni2ascii package I made has a few short comings I’ve decided to fix. The three major problems with it are: 1) Very basic permission checking, 2) Only accepts one file, 3) Required all input to be UTF8 encoded, 4) The decoder was a very literal port of a the ruby version.
To fix these issues I’ve written an entirely new script. Problems 1, 2 and 3 are fixed. It has robust error checking, can handle an arbitrary number of files, and the file encoding can be specified. Number 4 is fixed by using the Python port created by Tomaz Solc.
I’ve put the source code for the new decoder into a Launchpad branch:
$ bzr branch lp:~user-none/+junk/unidecoder
Leave a Reply
Tags
Archives
- July 2010 (4)
- June 2010 (1)
- May 2010 (2)
- March 2010 (1)
- January 2010 (8)
- December 2009 (5)
- November 2009 (6)
- October 2009 (4)
- September 2009 (2)
- August 2009 (6)
- July 2009 (6)
- June 2009 (4)
- May 2009 (6)
- April 2009 (4)
- March 2009 (2)
- February 2009 (4)
- January 2009 (4)
- December 2008 (7)
- November 2008 (2)