kconv version 0.3 -- readme

Copyright (C) 1996, Eccosys Co. Ltd. (Sen Nagata)

kconv is a module for Python (tested on 1.3 and 1.4) which converts
text from one Japanese encoding scheme to another. The supported
schemes are JIS, SJIS (Shift JIS), and EUC, so you should be able to
convert from JIS to SJIS or EUC, SJIS to JIS or EUC, and EUC to JIS or
SJIS.

It is based on the Kconv kanji conversion package for perl -- which in
turn is based on nkf (network kanji filter). nkf was originally written
by Itaru Ichikawa of Fujitsu LTD. - see nkf.c for his copyright notice.
This code uses a modified nkf.c written by Ikuo Nakagawa for the perl
Kconv package.
(ftp://ftp.intec.co.jp/pub/utils/ -- the file name should look like
Kconv-1.1.tar.gz)

I am not sure of what terms to apply to this software, but since it
would be silly not to specify any, I will temporarily use a notice
similar to the one I found in the original nkf.c.

  Copyright (C) 1996, Eccosys Co. Ltd. (Sen Nagata)

  For the portion of the software written by Sen Nagata (this excludes
  the included files nkf.c and kconv.h), everyone is permitted to do
  anything on this program including copying, modifying, improving as
  long as you don't try to make money off it, or pretend that you wrote
  it. i.e., the above copyright notice has to appear in all copies.
  THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE.

I think this is consistent with the previous copyright statements.

(Note: I'm not firm on the above copyright statement holding for future
versions (> 0.3) of the software.)

Look in the file INSTALL for installation instructions.

Sen Nagata (sen@eccosys.com)

P.S. If anyone knows of other Japanese conversion packages for Python,
     please let me know!
--------------------------------------------------------------------------

Some Questions
--------------

- It's clear how to define python 'functions' using C libraries...so,
  how do you define 'constants' or 'variables-with-values'? In
  particular, how do I get 'AUTO', 'JIS', etc. to have appropriate
  values through the process of 'import kconv'?

- How useful would it be to have a 'python-native' kanji conversion
  ability? I guess it would save people the trouble of trying to get
  the C code to compile on non-UNIX platforms. Practically speaking,
  how good would a port of jcode for perl be?

- Am I dealing with reference counting correctly in this module?

- Would a code-detection function be useful? What kind of results
  should such a beast return? It seems like a good idea for it to make
  a guess if necessary and if it does guess, to indicate that it is
  guessing, and if possible to give some measure of how confident it is
  of its guess.

- Is there a good way to deal with half-width katakana?

--------------------------------------------------------------------------

Statement of Thanks
-------------------

Thanks go out to (in no particular order):

- Ted Crossman (tedo@eccosys.com) for his help in getting this working.
  You were a great help Ted! I really appreciate it!

- Guido van Rossum for Python - and everyone else for making Python
  what it is today and contributing to the Python home page at
  http://www.python.org/

- KDD Labs for their mirror of the Python home page at
  http://w3.lab.kdd.co.jp/www.python.org/

- Mark Lutz for the 'Programming Python' book.

- Ken Lunde for the 'Japanese Information Processing' book a.k.a. 'the
  Blowfish book'.

- Itaru Ichikawa for nkf.

- Ikuo Nakagawa for Kconv (and modifications to nkf).

- Shohei Takeuchi for his helpful comments about Kconv.c.

- People who document their code well!!!! There aren't enough of you
  out there!
--------------------------------------------------------------------------

Various Encoding Scheme Wishes:
-------------------------------

- Single encoding scheme for all languages.

- Abolishment of half-width katakana.

- Existence of an algorithm for telling EUC apart from SJIS 99.9% of
  the time.

- Existence of a similarly reliable algorithm for the detection of
  half-width katakana.
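
To make the code-detection question and the EUC-versus-SJIS wish a bit
more concrete, here is a rough sketch of what such a function might
return. It is not part of kconv and does not use nkf's detection logic.
It assumes a modern Python 3 and leans on the standard library codecs
(iso2022_jp, euc_jp, shift_jis) for trial decoding. The names Detection
and detect_encoding, the "1 / number of candidates" confidence value,
and the choice to prefer EUC when both decodings succeed are all
assumptions made up for this illustration.

```python
from typing import NamedTuple, Optional


class Detection(NamedTuple):
    """Result of a detection attempt (hypothetical shape, not kconv's)."""
    encoding: Optional[str]  # "ASCII", "JIS", "SJIS", "EUC", or None
    is_guess: bool           # True when more than one scheme fits
    confidence: float        # crude measure: 1 / number of candidates


def _decodes(data: bytes, codec: str) -> bool:
    """Return True if data is valid under the given stdlib codec."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


def detect_encoding(data: bytes) -> Detection:
    # JIS (ISO-2022-JP) is a 7-bit scheme that switches character
    # sets with escape sequences, so handle 7-bit input first.
    if all(b < 0x80 for b in data):
        if b"\x1b" not in data:
            return Detection("ASCII", False, 1.0)
        if _decodes(data, "iso2022_jp"):
            return Detection("JIS", False, 1.0)
        return Detection(None, False, 0.0)

    # 8-bit input: try both schemes and keep the ones that decode.
    # EUC is listed first, so it wins ties. That preference is an
    # assumption (runs of half-width katakana being rarer than
    # ordinary EUC text), not a proven rule.
    candidates = [
        name
        for name, codec in (("EUC", "euc_jp"), ("SJIS", "shift_jis"))
        if _decodes(data, codec)
    ]
    if not candidates:
        return Detection(None, False, 0.0)
    return Detection(candidates[0], len(candidates) > 1, 1.0 / len(candidates))


if __name__ == "__main__":
    for sample in ("日本語".encode("shift_jis"), "あ".encode("euc_jp")):
        print(sample, detect_encoding(sample))
```

The second sample shows why half-width katakana is such a nuisance: the
two EUC bytes for a single hiragana character are also a valid pair of
SJIS half-width characters, so the function can only guess and says so.
A serious detector would need byte-frequency statistics or longer
context to do better than this.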