Now showing items 1-4 of 4
Abstract: | In this paper a method of copy detection in short Malayalam text passages is proposed. Given two passages one as the source text and another as the copied text it is determined whether the second passage is plagiarized version of the source text. An algorithm for plagiarism detection using the n-gram model for word retrieval is developed and found tri-grams as the best model for comparing the Malayalam text. Based on the probability and the resemblance measures calculated from the n-gram comparison , the text is categorized on a threshold. Texts are compared by variable length n-gram(n={2,3,4}) comparisons. The experiments show that trigram model gives the average acceptable performance with affordable cost in terms of complexity |
URI: | http://dyuthi.cusat.ac.in/purl/4104 |
Files | Size |
---|---|
A Copy detectio ... ntsusing N-grams Model.pdf | (505.4Kb) |
Abstract: | This paper describes about an English-Malayalam Cross-Lingual Information Retrieval system. The system retrieves Malayalam documents in response to query given in English or Malayalam. Thus monolingual information retrieval is also supported in this system. Malayalam is one of the most prominent regional languages of Indian subcontinent. It is spoken by more than 37 million people and is the native language of Kerala state in India. Since we neither had any full-fledged online bilingual dictionary nor any parallel corpora to build the statistical lexicon, we used a bilingual dictionary developed in house for translation. Other language specific resources like Malayalam stemmer, Malayalam morphological root analyzer etc developed in house were used in this work |
URI: | http://dyuthi.cusat.ac.in/purl/4102 |
Files | Size |
---|---|
English-Malayal ... rieval – An Experience.pdf | (195.0Kb) |
Abstract: | On-line handwriting recognition has been a frontier area of research for the last few decades under the purview of pattern recognition. Word processing turns to be a vexing experience even if it is with the assistance of an alphanumeric keyboard in Indian languages. A natural solution for this problem is offered through online character recognition. There is abundant literature on the handwriting recognition of western, Chinese and Japanese scripts, but there are very few related to the recognition of Indic script such as Malayalam. This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using K-NN algorithm. It would help in recognizing Malayalam text entered using pen-like devices. A novel feature extraction method, a combination of time domain features and dynamic representation of writing direction along with its curvature is used for recognizing Malayalam characters. This writer independent system gives an excellent accuracy of 98.125% with recognition time of 15-30 milliseconds |
Description: | 2010 First International Conference on Integrated Intelligent Computing |
URI: | http://dyuthi.cusat.ac.in/purl/4095 |
Files | Size |
---|---|
k-NN based On-L ... cterrecognition system.pdf | (577.8Kb) |
Abstract: | This paper presents a novel approach to recognize Grantha, an ancient script in South India and converting it to Malayalam, a prevalent language in South India using online character recognition mechanism. The motivation behind this work owes its credit to (i) developing a mechanism to recognize Grantha script in this modern world and (ii) affirming the strong connection among Grantha and Malayalam. A framework for the recognition of Grantha script using online character recognition is designed and implemented. The features extracted from the Grantha script comprises mainly of time-domain features based on writing direction and curvature. The recognized characters are mapped to corresponding Malayalam characters. The framework was tested on a bed of medium length manuscripts containing 9-12 sample lines and printed pages of a book titled Soundarya Lahari writtenin Grantha by Sri Adi Shankara to recognize the words and sentences. The manuscript recognition rates with the system are for Grantha as 92.11%, Old Malayalam 90.82% and for new Malayalam script 89.56%. The recognition rates of pages of the printed book are for Grantha as 96.16%, Old Malayalam script 95.22% and new Malayalam script as 92.32% respectively. These results show the efficiency of the developed system |
Description: | (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 3, No. 7, 2012 |
URI: | http://dyuthi.cusat.ac.in/purl/4106 |
Files | Size |
---|---|
An Online Chara ... ha Script to Malayalam.pdf | (548.4Kb) |
Now showing items 1-4 of 4
Dyuthi Digital Repository Copyright © 2007-2011 Cochin University of Science and Technology. Items in Dyuthi are protected by copyright, with all rights reserved, unless otherwise indicated.