Arabic Sign Language Recognition with Deep Learning Models and Keypoint Landmarks
Abstract
Communication is fundamental to human interaction, essential for expressing emotions and building relationships. While hearing individuals rely on spoken language, the deaf and mute community communicates through visual gestures and facial expressions, commonly known as sign language. However, communication barriers persist between hearing and non-hearing individuals, especially in regions with limited assistive technologies. To address this gap, we developed a real-time sign language recognition system that converts Arabic sign gestures into text. Unlike most existing systems, which are limited to individual letters or numbers, our model recognizes complete, meaningful words. It was trained on a curated set of 112 Arabic sign language words drawn from the KARSL dataset. Using OpenCV and the MediaPipe framework, we extracted multimodal keypoints from the hands, face, and upper-body pose. MediaPipe Hands generated a 255-dimensional feature vector for each video frame, capturing real-time hand movements. These features were used to train four deep learning models: CNN, GRU, LSTM, and Bi-LSTM. Among these, the Bi-LSTM model performed best, with a training accuracy of 99.89% and a testing accuracy of 99.61%. These results highlight the potential of MediaPipe-based landmark extraction combined with deep learning to support accessible communication for Arabic-speaking deaf communities.
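As an illustration of how per-frame keypoint features might be prepared for sequence models such as GRU, LSTM, or Bi-LSTM, the sketch below pads or truncates a clip to a fixed number of frames. Only the 255-dimensional frame size comes from the abstract. The clip length of 30 frames, the zero-padding strategy, and the function name pad_or_truncate are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence

FEATURE_DIM = 255    # per-frame feature size stated in the abstract
TARGET_FRAMES = 30   # hypothetical fixed clip length (assumption, not from the paper)


def pad_or_truncate(frames: Sequence[Sequence[float]],
                    target_frames: int = TARGET_FRAMES,
                    feature_dim: int = FEATURE_DIM) -> List[List[float]]:
    """Return exactly target_frames rows, each holding feature_dim floats."""
    # Reject frames whose feature vector has an unexpected size.
    for i, frame in enumerate(frames):
        if len(frame) != feature_dim:
            raise ValueError(
                f"frame {i} has {len(frame)} values, expected {feature_dim}"
            )
    # Keep at most target_frames frames from the start of the clip.
    clipped = [list(frame) for frame in frames[:target_frames]]
    # Append all-zero frames when the clip is shorter than the target length.
    missing = target_frames - len(clipped)
    padding = [[0.0] * feature_dim for _ in range(missing)]
    return clipped + padding


if __name__ == "__main__":
    # A synthetic 12-frame clip stands in for real extracted keypoints.
    short_clip = [[0.5] * FEATURE_DIM for _ in range(12)]
    fixed = pad_or_truncate(short_clip)
    print(len(fixed), len(fixed[0]))  # prints: 30 255
```

A fixed clip length lets clips of different durations be stacked into one batch of shape (clips, frames, features), which is the input layout recurrent layers typically expect. Other choices, such as resampling frames uniformly instead of zero-padding, would serve the same purpose.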
Keywords
Arabic Sign Language, KARSL dataset, MediaPipe, LSTM, GRU, Sign language recognition, Deep Learning
Copyright (c) 2026 Farah Alshanik, Saif Aljunidi, Ethar Qawasmeh

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.