

Please use this identifier to cite or link to this item: http://purl.org/purl/4185

Title: A Classification of Sandhi Rules for Suffix Separation in Malayalam
Authors: Santhosh Kumar, G
Sheena Kurian, K
Mary, Priya Sebastian
Keywords: suffix separation
sandhi rules
English Malayalam translation
Issue Date: 2009
Publisher: Cochin University of Science And Technology
Abstract: Suffix separation plays a vital role in improving the quality of training in Statistical Machine Translation (SMT) from English into Malayalam. The morphological richness and agglutinative nature of Malayalam make it necessary to retrieve the root word from its inflected form during training. The suffix separation process accomplishes this by examining Malayalam words and applying sandhi rules. In this paper, various handcrafted rules designed for the suffix separation process in English-Malayalam SMT are presented. The rules are classified according to the Malayalam syllable that precedes the suffix in the inflected form of the word (the check_letter). Suffixes beginning with vowel sounds, such as ആൽ, ഉടെ, ഇൽ, etc., are the main focus. By examining the check_letter in a word, the suffix separation rules can be applied directly to extract the root word. The quick lookup table provided in the paper can be used as a guideline for implementing suffix separation in Malayalam.
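As an illustration only of the check_letter lookup that the abstract describes, the Python sketch below strips a vowel-initial suffix and then uses the syllable left at the end of the stem to restore the root. The function name `separate_suffix` and the two-entry rule table are assumptions made for this example; they are not the rules or the lookup table from the paper. The Malayalam strings are written as Unicode escapes so that the code points are unambiguous.

```python
# Hypothetical sketch: NOT the paper's rule table, just two well-known sandhi cases.

# Suffixes as they appear attached to a word (vowel-sign form),
# mapped to their independent form.
SUFFIXES = {
    "\u0d3f\u0d7d": "\u0d07\u0d7d",              # -il  (locative)
    "\u0d41\u0d1f\u0d46": "\u0d09\u0d1f\u0d46",  # -ute (genitive)
}

# check_letter (syllable preceding the suffix) -> ending restored on the root.
RULES = {
    "\u0d24\u0d4d\u0d24": "\u0d02",  # "tta" came from a root ending in anusvara
    "\u0d2f": "",                    # "ya" is an inserted glide and is dropped
}


def separate_suffix(word):
    """Return (root, suffix); suffix is None if no known suffix is found."""
    for attached, suffix in SUFFIXES.items():
        if not word.endswith(attached):
            continue
        stem = word[: -len(attached)]
        # Look up the check_letter at the end of the remaining stem.
        for check_letter, root_ending in RULES.items():
            if stem.endswith(check_letter):
                return stem[: -len(check_letter)] + root_ending, suffix
        # No rule for this check_letter: return the stem unchanged.
        return stem, suffix
    return word, None


# "marattil" (in the tree) and "kuttiyute" (the child's)
marattil = "\u0d2e\u0d30\u0d24\u0d4d\u0d24\u0d3f\u0d7d"
kuttiyute = "\u0d15\u0d41\u0d1f\u0d4d\u0d1f\u0d3f\u0d2f\u0d41\u0d1f\u0d46"
print(separate_suffix(marattil))
print(separate_suffix(kuttiyute))
```

A full implementation would need one table entry per check_letter class and would have to handle suffixes and stems that this toy table does not cover.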
URI: http://dyuthi.cusat.ac.in/purl/4185
Appears in Collections: Dr. Santhosh Kumar G

Files in This Item:

File: A Classification of Sandhi Rules for Suffix Separation in Malayalam.pdf
Description: pdf
Size: 410.19 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

