Now showing items 1-10 of 10
Abstract: | This paper investigates training methods adopted in a Statistical Machine Translator (SMT) from English to Malayalam. In English-Malayalam SMT, word-to-word translations are determined by training on the parallel corpus. Our primary goal is to improve the alignment model by reducing the number of possible alignments across all sentence pairs in the bilingual corpus. Incorporating morphological information into the parallel corpus with the help of a parts-of-speech tagger has brought about better training results with improved accuracy. |
URI: | http://dyuthi.cusat.ac.in/purl/4140 |
Files | Size |
---|---|
Alignment Model ... m English to Malayalam.pdf | (388.8Kb) |
Abstract: | In this paper, a method of copy detection in short Malayalam text passages is proposed. Given two passages, one as the source text and the other as the copied text, it is determined whether the second passage is a plagiarized version of the source text. An algorithm for plagiarism detection using the n-gram model for word retrieval is developed, and trigrams are found to be the best model for comparing Malayalam text. Based on the probability and resemblance measures calculated from the n-gram comparison, the text is categorized against a threshold. Texts are compared using variable-length n-gram (n = {2, 3, 4}) comparisons. The experiments show that the trigram model gives acceptable average performance at an affordable cost in terms of complexity. |
URI: | http://dyuthi.cusat.ac.in/purl/4104 |
Files | Size |
---|---|
A Copy detectio ... ntsusing N-grams Model.pdf | (505.4Kb) |
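The n-gram comparison described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact probability or resemblance formulas, so this sketch assumes a Jaccard-style resemblance over word n-grams; the function names and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
def word_ngrams(text, n):
    """Return the set of word n-grams (as tuples) in a whitespace-tokenized text."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(source, suspect, n=3):
    """Jaccard-style resemblance (an assumption): shared n-grams / all distinct n-grams."""
    a, b = word_ngrams(source, n), word_ngrams(suspect, n)
    if not a or not b:
        return 0.0  # a passage shorter than n words has no n-grams to compare
    return len(a & b) / len(a | b)


def is_copied(source, suspect, n=3, threshold=0.5):
    """Flag the suspect passage when resemblance reaches a (hypothetical) threshold."""
    return resemblance(source, suspect, n) >= threshold
```

Calling `resemblance` with n = 2, 3 and 4 on the same pair of passages mirrors the variable-length comparison the abstract mentions. Whitespace tokenization is a simplification; Malayalam text would likely need stemming or other normalization first.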
Abstract: | The span of writer identification extends to broad domains such as digital rights administration, forensic expert decision-making systems, and document analysis systems. As the success rate of a writer identification scheme is highly dependent on the features extracted from the documents, the phase of feature extraction, and therefore feature selection, is highly significant for writer identification schemes. In this paper, writer identification in the Malayalam language is addressed using a feature extraction technique, the Scale Invariant Feature Transform (SIFT). The schemes are tested on a test bed of 280 writers and their performance is evaluated. |
Description: | India Conference (INDICON), 2012 Annual IEEE |
URI: | http://dyuthi.cusat.ac.in/purl/4318 |
Files | Size |
---|---|
The Effect of S ... in Malayalam Language.pdf | (833.4Kb) |
Abstract: | This paper describes an English-Malayalam Cross-Lingual Information Retrieval system. The system retrieves Malayalam documents in response to a query given in English or Malayalam; thus monolingual information retrieval is also supported. Malayalam is one of the most prominent regional languages of the Indian subcontinent. It is spoken by more than 37 million people and is the native language of Kerala state in India. Since we had neither a full-fledged online bilingual dictionary nor any parallel corpora to build the statistical lexicon, we used a bilingual dictionary developed in house for translation. Other language-specific resources developed in house, such as a Malayalam stemmer and a Malayalam morphological root analyzer, were also used in this work. |
URI: | http://dyuthi.cusat.ac.in/purl/4102 |
Files | Size |
---|---|
English-Malayal ... rieval – An Experience.pdf | (195.0Kb) |
Abstract: | Handwritten character recognition has always been a frontier area of research in the field of pattern recognition and image processing, and there is a large demand for OCR on handwritten documents. Even though sufficient studies have been performed on foreign scripts such as Chinese, Japanese and Arabic, only very little work can be traced on handwritten character recognition of Indian scripts, especially the South Indian scripts. This paper provides an overview of offline handwritten character recognition in South Indian scripts, namely Malayalam, Tamil, Kannada and Telugu. |
Description: | National Conference on Indian Language Computing, Kochi, Feb 19-20, 2011 |
URI: | http://dyuthi.cusat.ac.in/purl/4191 |
Files | Size |
---|---|
Handwritten Cha ... ndian Scripts A Review.pdf | (189.7Kb) |
Abstract: | On-line handwriting recognition has been a frontier area of research for the last few decades under the purview of pattern recognition. Word processing in Indian languages turns out to be a vexing experience even with the assistance of an alphanumeric keyboard. A natural solution to this problem is offered by online character recognition. There is abundant literature on the handwriting recognition of Western, Chinese and Japanese scripts, but very little on the recognition of Indic scripts such as Malayalam. This paper presents an efficient Online Handwritten Character Recognition System for Malayalam Characters (OHR-M) using the k-NN algorithm. It would help in recognizing Malayalam text entered using pen-like devices. A novel feature extraction method, combining time-domain features with a dynamic representation of writing direction and its curvature, is used for recognizing Malayalam characters. This writer-independent system gives an accuracy of 98.125% with a recognition time of 15-30 milliseconds. |
Description: | 2010 First International Conference on Integrated Intelligent Computing |
URI: | http://dyuthi.cusat.ac.in/purl/4095 |
Files | Size |
---|---|
k-NN based On-L ... cterrecognition system.pdf | (577.8Kb) |
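The combination of writing-direction features and k-NN classification mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: strokes are assumed to be already resampled to the same number of points, only the direction part of the feature set is shown (curvature and other time-domain features are omitted), and all function names are hypothetical rather than taken from the OHR-M system.

```python
import math
from collections import Counter


def direction_features(points):
    """Unit direction (cos, sin) of each pen movement between consecutive points."""
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy) or 1.0  # guard against repeated points
        feats.extend((dx / length, dy / length))
    return feats


def knn_classify(train, query, k=3):
    """Majority vote among the k training vectors closest to the query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Here `train` is a list of `(feature_vector, label)` pairs. Because `math.dist` requires equal-length vectors, the fixed-length resampling assumption matters in practice.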
Abstract: | The development of Malayalam speech recognition systems is in its infancy, although much work has been done in other Indian languages. In this paper we present the first work on a speaker-independent Malayalam isolated speech recognizer based on PLP (Perceptual Linear Predictive) cepstral coefficients and the Hidden Markov Model (HMM). The performance of the developed system has been evaluated with different numbers of HMM states. The system is trained with 21 male and female speakers in the age group ranging from 19 to 41 years. The system obtained an accuracy of 99.5% on unseen data. |
Description: | International Journal of Advanced Information Technology (IJAIT) Vol. 1, No.5, October 2011 |
URI: | http://dyuthi.cusat.ac.in/purl/4214 |
Files | Size |
---|---|
Malayalam Isola ... P cepstral coefficient.pdf | (172.8Kb) |
Abstract: | Optical Character Recognition plays an important role in Digital Image Processing and Pattern Recognition. Even though ample study has been performed on foreign languages like Chinese and Japanese, effort on Indian scripts is still immature. OCR in the Malayalam language is more complex as it is enriched with the largest number of characters among all Indian languages. The challenge of character recognition is even higher in the handwritten domain, due to the varying writing style of each individual. In this paper we propose a system for recognition of offline handwritten Malayalam vowels. The proposed method uses chain code and image centroid for feature extraction and a two-layer feed-forward network trained with scaled conjugate gradient for classification. |
Description: | Emerging Trends in Electrical and Computer Technology (ICETECT), 2011 International Conference on |
URI: | http://dyuthi.cusat.ac.in/purl/4196 |
Files | Size |
---|---|
Offline Handwri ... n Chain Code Histogram.pdf | (1.324Mb) |
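The two features named in the abstract above, chain code and image centroid, can be illustrated with a minimal sketch. It assumes a Freeman 8-direction code over an already traced boundary (a list of 8-connected pixel coordinates), numbered counter-clockwise from east with the y axis pointing up. The paper may use a different convention or zoning scheme, and the function names are hypothetical.

```python
# Freeman 8-direction codes, keyed by the (dx, dy) step between boundary pixels.
# Assumed convention: 0 = east, numbered counter-clockwise, y axis pointing up.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(boundary):
    """Encode each step between consecutive 8-connected boundary points."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(boundary, boundary[1:])]


def chain_code_histogram(codes):
    """Normalized frequency of each of the 8 direction codes."""
    hist = [0] * 8
    for code in codes:
        hist[code] += 1
    total = len(codes)
    return [h / total for h in hist] if total else [0.0] * 8


def centroid(pixels):
    """Mean (x, y) position of a non-empty list of foreground pixel coordinates."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)
```

The histogram and centroid values would then be concatenated into a feature vector for the classifier. Boundary tracing and the neural network itself are outside the scope of this sketch.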
Abstract: | This paper presents a novel approach to recognize Grantha, an ancient script of South India, and convert it to Malayalam, a prevalent language in South India, using an online character recognition mechanism. The motivation behind this work is (i) to develop a mechanism to recognize Grantha script in the modern world and (ii) to affirm the strong connection between Grantha and Malayalam. A framework for the recognition of Grantha script using online character recognition is designed and implemented. The features extracted from the Grantha script comprise mainly time-domain features based on writing direction and curvature. The recognized characters are mapped to the corresponding Malayalam characters. The framework was tested on a bed of medium-length manuscripts containing 9-12 sample lines and on printed pages of a book titled Soundarya Lahari, written in Grantha by Sri Adi Shankara, to recognize the words and sentences. The manuscript recognition rates of the system are 92.11% for Grantha, 90.82% for Old Malayalam and 89.56% for the new Malayalam script. The recognition rates for pages of the printed book are 96.16% for Grantha, 95.22% for Old Malayalam script and 92.32% for the new Malayalam script. These results show the efficiency of the developed system. |
Description: | (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 3, No. 7, 2012 |
URI: | http://dyuthi.cusat.ac.in/purl/4106 |
Files | Size |
---|---|
An Online Chara ... ha Script to Malayalam.pdf | (548.4Kb) |
Abstract: | In this paper, we propose a handwritten character recognition system for the Malayalam language. The feature extraction phase consists of gradient and curvature calculation and dimensionality reduction using Principal Component Analysis. Directional information from the arc tangent of the gradient is used as the gradient feature. The strength of the gradient in the curvature direction is used as the curvature feature. The proposed system uses a combination of gradient and curvature features in reduced dimension as the feature vector. For classification, the discriminative power of the Support Vector Machine (SVM) is evaluated. The results reveal that SVM with a Radial Basis Function (RBF) kernel yields the best performance, with accuracies of 96.28% and 97.96% on two different datasets. This is the highest accuracy ever reported on these datasets. |
Description: | I.J. Image, Graphics and Signal Processing, 2013, 4, 53-59 |
URI: | http://dyuthi.cusat.ac.in/purl/4204 |
Files | Size |
---|---|
A System for Of ... rs in Malayalam Script.pdf | (535.2Kb) |
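The gradient feature described in the abstract above (directional information from the arc tangent of the gradient) can be sketched as a direction histogram. This is an illustration only: it assumes central-difference gradients on a grayscale image stored as a list of rows, 8 direction bins, and magnitude weighting, none of which are confirmed by the abstract. The curvature feature, PCA and SVM stages are omitted.

```python
import math


def gradient_direction_histogram(image, bins=8):
    """Magnitude-weighted histogram of gradient directions over interior pixels."""
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # central difference in x
            gy = image[y + 1][x] - image[y - 1][x]  # central difference in y
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region carries no direction information
            angle = math.atan2(gy, gx) % (2 * math.pi)  # map to [0, 2*pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

In a fuller pipeline, such histograms would typically be computed per image zone, concatenated, reduced with PCA, and passed to an RBF-kernel SVM, as the abstract outlines.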
Dyuthi Digital Repository Copyright © 2007-2011 Cochin University of Science and Technology. Items in Dyuthi are protected by copyright, with all rights reserved, unless otherwise indicated.