Contextual Semantic Embeddings Based on Transformer Models for Arabic Biomedical Questions Classification
Abstract
DOI: 10.28991/HIJ-2024-05-04-011
Keywords