1. A study on word-based and integral-bit Chinese text compression algorithms
2. [2] P. Charoenpornsawat, B. Kijsirikul, and S. Meknavin, “Feature-based Thai unknown word boundary identification using Winnow,” Proc. IEEE Asia-Pacific Conference on Circuits and Systems (APCCAS-98), pp.547-550, Chiang Mai, Thailand, Nov. 1998.
3. [3] G.C. Ling, M. Asahara, and Y. Matsumoto, “Chinese unknown word identification using character-based tagging and chunking,” Proc. 41st Annual Meeting on Association for Computational Linguistics (ACL-2003), vol.2, pp.197-200, Sapporo, Japan, July 2003.
4. [4] R.K. Ando and L. Lee, “Mostly-unsupervised statistical segmentation of Japanese: Applications to kanji,” Proc. 1st North American Chapter of the Association for Computational Linguistics Conference (ANLP-NAACL), pp.241-248, San Francisco, CA, USA, Morgan Kaufmann Publishers, 2000.
5. [5] T. Theeramunkong and T. Tanhermhong, “Pattern-based features vs. statistical-based features in decision trees for word segmentation,” IEICE Trans. Inf. & Syst., vol.E87-D, no.5, pp.1254-1260, May 2004.