Title:
|
A Framework of Statistical Machine Translator from English to Malayalam |
Author:
|
Santhosh Kumar, G; Mary, Priya Sebastian; Sheena Kurian, K
|
Abstract:
|
In this paper we describe the methodology and the structural design
of a system that translates English into Malayalam using statistical models. A
monolingual Malayalam corpus and a bilingual English/Malayalam corpus are
the main resources in building this Statistical Machine Translator. The training
strategy adopted has been enhanced by PoS tagging, which helps to eliminate
insignificant alignments. Moreover, incorporating units such as a suffix separator
and a stop word eliminator has proven effective in bringing about better
training results. In the decoder, order conversion rules are applied to reduce the
structural difference between the language pair. The quality of the decoder's
statistical output is further improved by applying mending rules.
Experiments conducted on a sample corpus have generated reasonably good
Malayalam translations, and the results are verified with the F-measure, BLEU and
WER evaluation metrics. |
Description:
|
Proceedings of Fourth International Conference on Information Processing, Bangalore, India |
URI:
|
http://dyuthi.cusat.ac.in/purl/4138
|
Date:
|
2010 |