Improved Unsupervised Statistical Machine Translation via Unsupervised Word Sense Disambiguation for a Low-Resource and Indic Languages

Shefali Saxena, Uttkarsh Chaurasia, Nitin Bansal & Philemon Daniel
Pages 8848-8858 | Published online: 22 Jul 2022
 

ABSTRACT

Besides word order, word choice is a key stumbling block for machine translation (MT) in morphologically rich languages because of homonymy and polysemy. Untranslated or improperly translated words are likewise a severe issue for Statistical Machine Translation (SMT) models. The limited quantity of parallel training corpora has constrained SMT systems; even so, recent work has successfully trained unsupervised SMT (USMT) systems using monolingual data alone. However, the translation quality of the MT output still needs improvement because of unaligned and improperly sensed words. This work addresses the problem by incorporating unsupervised Word Sense Disambiguation (WSD) into the decoding phase of USMT. It presents SMT systems for five En→Indic translation tasks on the WMT test dataset, evaluated with the BLEU and METEOR metrics. Experiments on the En→Hi, En→Kn, En→Ta, En→Te, and En→Be tasks showed improvements over the baseline model of 2.3, 2.68, 0.78, 2.32, and 1.79 BLEU points and 1.07, 1.34, 0.72, 0.693, and 1.191 METEOR points, respectively.

DISCLOSURE STATEMENT

No potential conflict of interest was reported by the author(s).

Notes on contributors

Shefali Saxena

Shefali Saxena is pursuing her PhD in natural language processing, focusing on statistical machine translation for low-resource languages, at the National Institute of Technology, Hamirpur (Himachal Pradesh). She received her Master of Technology in communication systems from the ECE Department of the National Institute of Technology, Uttarakhand, in 2019 and her BTech (Hons) in ECE from Rajasthan Technical University, Jaipur, Rajasthan, in 2015. Her research interests include computer vision and image processing, signal processing, natural language processing, and deep learning architectures.

Uttkarsh Chaurasia

Uttkarsh Chaurasia is a final-year student at the National Institute of Technology Hamirpur, where he is pursuing a dual degree in computer science and engineering. His research interests concentrate on deep learning and machine learning. Email: [email protected]

Nitin Bansal

Nitin Bansal is a third-year student at the National Institute of Technology Hamirpur, where he is pursuing a dual degree in electronics and communication engineering. He enjoys working in machine learning and data analysis, with a keen interest in natural language processing. Email: [email protected]

Philemon Daniel

Philemon Daniel received his PhD in electronics and communication engineering (NIT Hamirpur), MTech in VLSI Design (VIT Vellore), and BE in E&C Engineering (Bharathidasan University). He has over 13 years of teaching experience at NIT Hamirpur. He previously worked as a design engineer at Sasken Communication Technologies Limited, Bangalore. His research interests include VLSI testing, embedded systems, image and speech processing, natural language processing, and deep learning architectures. He gives regular talks on deep learning architectures, ARM processors and their applications, and similar areas. He was accredited as an ARM-Accredited Microcontroller Engineer (AAME) in 2015 and received the NVIDIA GPU Grant in 2018. Email: [email protected]
