An Alternate Approach for Question Answering system in Bengali Language using Classification Techniques
Abstract
Question Answering (QA) systems are becoming more popular with the introduction of virtual agents and chatbots. The medium of a QA system is generally either text or audio. A QA system differs from a search engine. Search is generally based on keyword matching; in web search, a list of URLs is ranked based on location, user history, search preferences, etc., and sophisticated algorithms such as PageRank are also involved. A QA system, on the other hand, does not rely primarily on keyword matching: the query and the best answer often have few or no terms in common. QA systems in English and other popular languages resolve this issue with the help of ontologies, WordNet, machine-readable dictionaries, etc. QA systems in low-resource languages suffer from a lack of annotation, the absence of a WordNet, and immature ontologies. In this work, a QA system for Bengali is developed using supervised learning algorithms. A collection of Bengali literature, developed during the TDIL (Technology Development for Indian Languages) project funded by the Govt. of India, is used as the repository. Well-known classification techniques, namely ANN, SVM, Naïve Bayes and Decision Tree, are employed. The system achieves 84.33% accuracy in returning the exact answer and 97.13% accuracy in returning a string containing the correct answer. The unavailability of a structured dataset and poor resources were the main challenges in this work. QA systems in Indian languages, especially Bengali, are useful not only for chatbots and virtual agents but also for e-governance and mobile governance in West Bengal and Bangladesh. A QA system in the mother tongue gives more citizens the opportunity to interact with the administration. Although the system is designed for the Bengali language, it can be tuned to work for any language with minimal modification.
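To illustrate how a QA step can be cast as supervised classification, the sketch below trains a small Naïve Bayes classifier that maps a Bengali question to an expected answer type. This is only an assumed framing: the abstract does not state which features or labels the authors used, and the training questions, the labels (PERSON, LOCATION, TIME) and the names `TRAIN`, `tokenize`, `train_naive_bayes` and `predict` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy, hand-written questions (hypothetical; NOT taken from the TDIL corpus).
# Each question is paired with an assumed "expected answer type" label.
TRAIN = [
    ("রবীন্দ্রনাথ কে ছিলেন", "PERSON"),
    ("গীতাঞ্জলি কে লিখেছেন", "PERSON"),
    ("কলকাতা কোথায় অবস্থিত", "LOCATION"),
    ("তাজমহল কোথায় আছে", "LOCATION"),
    ("ভারত কখন স্বাধীন হয়", "TIME"),
    ("যুদ্ধ কখন শুরু হয়", "TIME"),
]

def tokenize(text):
    # Whitespace tokenization only; a real system would need proper
    # Bengali tokenization and normalization.
    return text.split()

def train_naive_bayes(examples):
    """Collect per-label document counts, per-label word counts, and the vocabulary."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict(model, "নজরুল কে"))
```

In this toy setting the question word carries almost all of the signal, which is why the query and the eventual answer need not share any terms. A fuller pipeline would presumably use the predicted type to select candidate answers from the repository, but that step is not described in the abstract and is not sketched here.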
Article Details
Upon receipt of accepted manuscripts, authors will be invited to complete a copyright license to publish the paper. At least the corresponding author must send the signed copyright form for publication. It is a condition of publication that authors grant an exclusive license to the INFOCOMP Journal of Computer Science. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently, and it also allows the article to be disseminated as widely as possible. In assigning the copyright license, authors may use their own material in other publications, provided that the INFOCOMP Journal of Computer Science is acknowledged as the original place of publication.