We develop Large Language Models (LLMs) and Natural Language Processing (NLP) technologies capable of understanding context, language, and culture, and generating precise, relevant content. Our research aims to create solutions across diverse domains, including intelligent assistants, decision support systems, text and audio content analysis, and multimodal models. We also focus on enhancing personalization and natural human–machine interaction, enabling smoother and more effective experiences while establishing a scientific foundation for the next generation of intelligent solutions.
Multimodal Large Language Models (MLLMs)
Research on AI models that integrate multiple modalities (text, images, audio, video) to study cross-modal reasoning, representation learning, and multimodal understanding.
Speech LLM & Speech Processing
Research on large language models and algorithms specialized for speech recognition, synthesis, and understanding, enabling conversational AI, voice interfaces, and multimodal communication.
Reasoning (Thinking models + Language reasoning + Vision)
Investigating models capable of logical, causal, and commonsense reasoning across modalities, including structured thinking in language, visual reasoning, and cross-domain problem-solving.
AI Agent & Agentic AI
Studying autonomous AI agents that plan, reason, and act over multiple steps, exploring goal-directed behavior, tool use, and adaptive problem-solving.
Other Research Areas
1. Diffusion Large Language Models: Research on generative models that learn to create new data by reversing a process that gradually adds noise to training data (a minimal illustrative sketch follows this list).
2. Physical AI: Research into AI systems that interact with the physical world, covering embodied intelligence and the integration of perception, control, and decision-making.
3. VLA (Vision-Language-Action): Research connecting visual perception and language understanding to actionable outputs, advancing embodied AI and human–AI interaction.
4. WLM (World Large Models): Investigating models that learn and predict environment dynamics, enabling research in planning, model-based reinforcement learning, and simulation-to-real transfer.
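Item 1 above refers to diffusion-based generation, where a model is trained to reverse a gradual noising process. The sketch below is only a minimal, generic illustration of that denoising objective; the toy Denoiser network, the tensor shapes, and the linear noise schedule are assumptions made for illustration, not a description of our diffusion language models.

    # Illustrative only: one training step of a generic denoising diffusion objective.
    # The Denoiser network, tensor shapes, and noise schedule are placeholder assumptions.
    import torch
    import torch.nn as nn

    T = 1000                                  # number of diffusion steps
    betas = torch.linspace(1e-4, 0.02, T)     # simple linear noise schedule
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

    class Denoiser(nn.Module):
        """Toy stand-in for the network that predicts the injected noise."""
        def __init__(self, dim=32):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim + 1, 128), nn.ReLU(), nn.Linear(128, dim))

        def forward(self, x_t, t):
            t_feat = (t.float() / T).unsqueeze(-1)          # normalized timestep as a feature
            return self.net(torch.cat([x_t, t_feat], dim=-1))

    model = Denoiser()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    x0 = torch.randn(16, 32)                  # a batch of "clean" training examples
    t = torch.randint(0, T, (16,))            # a random diffusion step per example
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].unsqueeze(-1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward process: add noise

    loss = nn.functional.mse_loss(model(x_t, t), noise)    # learn to predict (and undo) the noise
    loss.backward()
    opt.step()

Text-oriented diffusion variants typically replace the continuous noise with discrete token corruption, but the train-to-denoise loop keeps the same shape.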
Published Research Papers
1. Arabic Named Entity Recognition with a CRF Model Based on Transformer Architecture
2. Arabic Named Entity Recognition Using Transformer-Based CRF Model
3. Recent Advances in Long Documents Classification Using Deep Learning
4. AraLegal-BERT: A Pretrained Language Model for Arabic Legal Text
5. Leveraging BERT Language Model for Arabic Long Document Classification
6. Deep learning for sign language recognition: Current techniques, benchmarks, and open issues
7. pyStudio: An Open-Source Machine Learning Platform
8. Improving Automated Speech Recognition Using Retrieval-Based Voice Conversion
9. Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down
10. Open Universal Arabic ASR Leaderboard
Granted Patents
1. Method for Accelerated Long Document Search using Hilbert Curve Mapping
2. Method and apparatus for identifying similar data elements using string matching
3. Method and apparatus with Arabic information extraction and semantic search
4. Method and computer readable storage medium for automated speech recognition using retrieval-based voice conversion
5. Method and system for real-time measuring of product reputation