Please use this identifier to cite or link to this item: http://purl.org/purl/4138

Title: A Framework of Statistical Machine Translator from English to Malayalam
Authors: Santhosh Kumar, G; Mary, Priya Sebastian; Sheena Kurian, K
Keywords: Alignment; English Malayalam Translation; PoS Tagging; Statistical Machine Translation; Suffix Separation
Issue Date: 2010
Abstract: In this paper we describe the methodology and structural design of a system that translates English into Malayalam using statistical models. A monolingual Malayalam corpus and a bilingual English/Malayalam corpus are the main resources for building this Statistical Machine Translator. The training strategy is enhanced by PoS tagging, which helps eliminate insignificant alignments. Moreover, incorporating units such as a suffix separator and a stop word eliminator has proven effective in producing better training results. In the decoder, order conversion rules are applied to reduce the structural difference between the language pair. The quality of the decoder's statistical output is further improved by applying mending rules. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU, and WER evaluation metrics.
Description: Proceedings of Fourth International Conference on Information Processing, Bangalore, India
URI: http://dyuthi.cusat.ac.in/purl/4138
Appears in Collections: Dr. Santhosh Kumar G
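
The abstract names WER (word error rate) as one of the evaluation metrics. As a rough illustration of what that metric measures, the following is a minimal sketch of word-level WER computed as the edit distance between a reference translation and a system output, divided by the reference length. The function name `word_error_rate`, the whitespace tokenization, and the sample sentences are assumptions made for illustration; this is not the authors' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Hypothetical sample sentences: one substituted word out of four.
print(word_error_rate("a b c d", "a x c d"))
```

Lower values indicate output closer to the reference. Because insertions are counted, the value can exceed 1.0 when the output is much longer than the reference.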
