Design And Development Of A Named Entity Based Question Answering System For Malayalam Language

Dyuthi/Manakin Repository

dc.contributor.author Bindu, M S
dc.contributor.author Dr. Sumam Mary, Idicula
dc.date.accessioned 2014-04-28T06:47:19Z
dc.date.available 2014-04-28T06:47:19Z
dc.date.issued 2012
dc.identifier.uri http://dyuthi.cusat.ac.in/purl/3698
dc.description Dept. Of Computer Science Cochin University Of Science And Technology en_US
dc.description.abstract This work presents a Named Entity based Question Answering System for the Malayalam language. Although a vast amount of information is available today in digital form, no effective mechanism exists to give humans convenient access to it. Information Retrieval and Question Answering systems are the two mechanisms currently available for information access. Information Retrieval systems typically return a long list of documents in response to a user's query, and the user must skim them to determine whether they contain an answer. A Question Answering System, in contrast, allows the user to state his/her information need as a natural language question and to receive the most appropriate answer as a word, a sentence, or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents, which is then used to find the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which it is to be answered. Various Machine Learning methods are used to tag the documents, and a Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India, with official language status in the state of Kerala, and is spoken by 40 million people. Malayalam is a morphologically rich, agglutinative language with relatively free word order. Its productive morphology also allows the creation of complex words, which are often highly ambiguous. Document tagging tools such as a Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter were developed as part of this research work. No such tools were previously available for the Malayalam language.
Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used in the design of these document preprocessing tools. This research work describes how Named Entities are used to represent the documents. Single-sentence questions are used to test the system. The overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions: the coverage of non-factoid questions can be increased, and the system can be extended to open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as future enhancements. en_US
dc.description.sponsorship Cochin University Of Science And Technology en_US
dc.language.iso en en_US
dc.publisher Cochin University Of Science And Technology en_US
dc.subject Question Answering Systems en_US
dc.subject Basic Word Types en_US
dc.subject Phrase Types en_US
dc.subject Malayalam Question Answering System en_US
dc.subject Compound Word Splitter en_US
dc.title Design And Development Of A Named Entity Based Question Answering System For Malayalam Language en_US
dc.type Thesis en_US


Files in this item

Files Size Format View Description
Dyuthi-T1663.pdf 4.166Mb PDF View/Open PDF