October 2022 | Volume 01 | Issue 01
Out of vocabulary words handling in morphological analysis
Amit Asthana
Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India
Author
Ganesh Chandra
Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India
Author
📌 DOI: https://doi.org/10.63920/tjths.11001
🔑 Keywords: Natural Language Processing; Morphological Analysis
📅 Publication Date: 4 October, 2022
📜 License:
This work is licensed under a Creative Commons Attribution 4.0 International License
- Share — Copy and Redistribute the material
- Adapt — Remix, Transform, and build upon the material
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Abstract:
Morphological analysis is the first step in Natural Language Processing (NLP) and lays the groundwork for the later stages of analysis. It identifies the morphemes in a sentence by examining each word individually. Out-of-vocabulary (OOV) words are words that occur in a sentence but for which the morphological analyzer cannot identify any morpheme. Identifying OOV words is a challenge in NLP: if they go undetected, the true meaning of the sentence may be difficult to determine. The goal of this study is to provide a mechanism for identifying OOV words in Hindi during morphological analysis.
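As a rough illustration of what identifying an OOV word means here, the following minimal sketch flags any word that cannot be matched to a known root, either directly or after stripping a known suffix. It is not the mechanism proposed in the paper. The lexicon, the suffix list, the romanized spellings, and the function names `analyze` and `find_oov` are all assumptions made up for this example.

```python
# Hypothetical toy resources: romanized Hindi roots and two plural-style suffixes.
# A real analyzer would use a full lexicon and paradigm tables instead.
LEXICON = {"kitab", "ghar", "kalam"}
SUFFIXES = ["en", "on"]


def analyze(word):
    """Return (root, suffix) if the word can be analyzed, otherwise None."""
    if word in LEXICON:
        return (word, "")
    # Try longer suffixes first so the longest match wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            if root in LEXICON:
                return (root, suffix)
    return None


def find_oov(sentence):
    """Return the words in the sentence for which no analysis was found."""
    return [word for word in sentence.split() if analyze(word) is None]


if __name__ == "__main__":
    print(find_oov("kitaben ghar mobile"))
```

In this sketch, `kitaben` is analyzed as the root `kitab` plus the suffix `en`, `ghar` is found directly in the lexicon, and `mobile` is reported as OOV because neither lookup succeeds.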
