Comparative Study of Neural Machine Translation Approaches for Hindi–Malayalam: Bi-LSTM Baselines, Word2Vec and Attention Enhancements, and mBART Transfer Learning
References
Anand, G. G., et al. (2023). The Effect of Difference in Word Order in Hindi: An Experimental Characterization.
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio (2015). Neural Machine Translation by Jointly Learning to Align and Translate. In International Conference on Learning Representations (ICLR).
Cho, Kyunghyun, Bart van Merriënboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio (2014). Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
Dabre, Raj, et al. (2021). mBART Pre-training and In-Domain Fine-Tuning for Indic Languages.
Gala, Jay, et al. (2023). IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for All 22 Scheduled Indian Languages.
Gogineni, S., G. Suryanarayana, and S. K. Surendran (2020). An Effective Neural Machine Translation for English to Hindi Language. In Proceedings of the International Conference on Smart Electronics and Communication (ICOSEC).
Hochreiter, Sepp, and Jürgen Schmidhuber (1997). Long Short-Term Memory. Neural Computation 9 (8): 1735–1780.
Kudo, Taku, and John Richardson (2018). SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations.
Kumar, A., et al. (2017). Morphological Analysis of the Dravidian Language Family.
Laskar, S. R., A. Dutta, P. Pakray, and S. Bandyopadhyay (2019). Neural Machine Translation: English to Hindi. In IEEE Conference on Information and Communication Technology (CICT).
Lewis, Mike, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer (2020). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
Liu, Yinhan, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, and Luke Zettlemoyer (2020). Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics 8: 726–742.
Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning (2015). Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Mikolov, Tomáš, Kai Chen, Greg Corrado, and Jeffrey Dean (2013). Efficient Estimation of Word Representations in Vector Space. In International Conference on Learning Representations (ICLR).
Moghe, Nikita, et al. (2023). Extrinsic Evaluation of Machine Translation Metrics. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu (2002). BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318.
Popović, Maja (2015). chrF: Character n-gram F-score for Automatic MT Evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, 392–395.
Ram, V. S., and S. Lalitha Devi (2023). Hindi to Dravidian Language Neural Machine Translation Systems. In Recent Advances in Natural Language Processing (RANLP).
Ramesh, Gowtham, et al. (2022). Samanantar: The Largest Publicly Available Parallel Corpus for Indic Languages. Transactions of the Association for Computational Linguistics.
Rei, Ricardo, et al. (2020). COMET: A Neural Framework for Machine Translation Evaluation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Schuster, Mike, and Kuldip K. Paliwal (1997). Bidirectional Recurrent Neural Networks. IEEE Transactions on Signal Processing 45 (11): 2673–2681.
Sebastian, M. P., et al. (2023). Malayalam Natural Language Processing: Challenges and Opportunities.
Sennrich, Rico, Barry Haddow, and Alexandra Birch (2016a). Neural Machine Translation of Rare Words with Subword Units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
Sennrich, Rico, Barry Haddow, and Alexandra Birch (2016b). Improving Neural Machine Translation Models with Monolingual Data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
Snover, Matthew, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul (2006). A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of the Association for Machine Translation in the Americas.
Tang, Yuqing, et al. (2020). Multilingual Translation with Extensible Multilingual Pretraining and Finetuning.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin (2017). Attention Is All You Need. In Advances in Neural Information Processing Systems.
Wolf, Thomas, et al. (2020). Transformers: State-of-the-Art Natural Language Processing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations.
License URL: https://creativecommons.org/licenses/by/4.0/
Informatics Studies | ISSN: 2583-8954 (Online), 2320-530X (Print)