Mohamed bin Zayed University of Artificial Intelligence

Mohamed bin Zayed University of Artificial Intelligence
Short name
MBZUAI
Country, city
UAE, Abu Dhabi
Publications
754
Citations
12,763
h-index
50

Most cited in 5 years

Fan D., Zhou T., Ji G., Zhou Y., Chen G., Fu H., Shen J., Shao L.
2020-08-01 citations by CoLab: 880
Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers great potential for augmenting the traditional healthcare strategy for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues. Further, collecting a large amount of data is impractical within a short time period, inhibiting the training of a deep model. To address these challenges, a novel COVID-19 Lung Infection Segmentation Deep Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices. In our Inf-Net, a parallel partial decoder is used to aggregate the high-level features and generate a global map. Then, implicit reverse attention and explicit edge-attention are utilized to model the boundaries and enhance the representations. Moreover, to alleviate the shortage of labeled data, we present a semi-supervised segmentation framework based on a randomly selected propagation strategy, which requires only a few labeled images and leverages primarily unlabeled data. Our semi-supervised framework can improve the learning ability and achieve higher performance. Extensive experiments on our COVID-SemiSeg dataset and real CT volumes demonstrate that the proposed Inf-Net outperforms most cutting-edge segmentation models and advances the state-of-the-art performance.
Fan D., Ji G., Zhou T., Chen G., Fu H., Shen J., Shao L.
2020-10-02 citations by CoLab: 871
Colonoscopy is an effective technique for detecting colorectal polyps, which are highly related to colorectal cancer. In clinical practice, segmenting polyps from colonoscopy images is of great importance since it provides valuable information for diagnosis and surgery. However, accurate polyp segmentation is a challenging task for two major reasons: (i) polyps of the same type vary in size, color and texture; and (ii) the boundary between a polyp and its surrounding mucosa is not sharp. To address these challenges, we propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images. Specifically, we first aggregate the features in high-level layers using a parallel partial decoder (PPD). Based on the combined feature, we then generate a global map as the initial guidance area for the following components. In addition, we mine the boundary cues using the reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues. Thanks to the recurrent cooperation mechanism between areas and boundaries, our PraNet is capable of calibrating some misaligned predictions, improving the segmentation accuracy. Quantitative and qualitative evaluations on five challenging datasets across six metrics show that our PraNet improves the segmentation accuracy significantly and offers a number of advantages in terms of generalizability and real-time segmentation efficiency (~50 fps).
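The reverse-attention step described in this abstract reduces to a very small operation: weight the features by the complement of the coarse prediction, so the network attends to regions the current map misses. A minimal NumPy sketch of the idea (function names and shapes are illustrative, not the paper's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reverse_attention(features, global_map):
    """Weight features by the complement of the coarse prediction.

    features:   (C, H, W) feature tensor
    global_map: (H, W) coarse prediction logits
    """
    # High weight where the map says "background/uncertain",
    # steering refinement toward missed boundary regions.
    weight = 1.0 - sigmoid(global_map)
    return features * weight[None, :, :]   # broadcast over channels
```

With a zero (maximally uncertain) logit map, every location gets weight 0.5, so the features are uniformly halved; confident foreground logits drive the weight toward 0.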
Wang H., Fu T., Du Y., Gao W., Huang K., Liu Z., Chandak P., Liu S., Van Katwyk P., Deac A., Anandkumar A., Bergen K., Gomes C.P., Ho S., Kohli P., et al.
Nature (Scimago Q1, WoS Q1)
2023-08-02 citations by CoLab: 640
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
Zamir S.W., Arora A., Khan S., Hayat M., Khan F.S., Yang M., Shao L.
2020-11-20 citations by CoLab: 500
With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography and medical imaging. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present an architecture with the collective goals of maintaining spatially precise high-resolution representations through the entire network and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention-based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named MIRNet, achieves state-of-the-art results for image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNet.
Ye M., Shen J., Crandall D.J., Shao L., Luo J.
2020-11-18 citations by CoLab: 297
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Due to the large intra-class variations and cross-modality discrepancy with large amount of sample noise, it is difficult to learn discriminative part features. Existing VI-ReID methods instead tend to learn global representations, which have limited discriminability and weak robustness to noisy images. In this paper, we propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID. We propose an intra-modality weighted-part attention module to extract discriminative part-aggregated features, by imposing the domain knowledge on the part relationship mining. To enhance robustness against noisy samples, we introduce cross-modality graph structured attention to reinforce the representation with the contextual relations across the two modalities. We also develop a parameter-free dynamic dual aggregation learning strategy to adaptively integrate the two components in a progressive joint training manner. Extensive experiments demonstrate that DDAG outperforms the state-of-the-art methods under various settings.
Banabilah S., Aloqaily M., Alsayed E., Malik N., Jararweh Y.
2022-11-01 citations by CoLab: 276
Federated Learning (FL) has been foundational in improving the performance of a wide range of applications since it was first introduced by Google. Some of the most prominent and commonly used FL-powered applications are Android's Gboard for predictive text and Google Assistant. FL can be defined as a setting that makes on-device, collaborative Machine Learning possible. A wide range of literature has studied FL's technical considerations, frameworks, and limitations, with several works presenting surveys of the prominent literature on FL. However, prior surveys have focused on the technical considerations and challenges of FL, and more recent work presenting a comprehensive overview of the status and future trends of FL in applications and markets has been limited. In this survey, we introduce the basic fundamentals of FL, describing its underlying technologies, architectures, system challenges, and privacy-preserving methods. More importantly, the contribution of this work is in scoping a wide variety of current FL applications and future trends in technology and markets. We present a classification and clustering of literature progress in FL in application to technologies including Artificial Intelligence, Internet of Things, blockchain, Natural Language Processing, autonomous vehicles, and resource allocation, as well as in application to market use cases in the domains of Data Science, healthcare, education, and industry. We discuss future open directions and challenges in FL within recommendation engines, autonomous vehicles, IoT, battery management, privacy, fairness, personalization, and the role of FL for governments and public sectors. By presenting a comprehensive review of the status and prospects of FL, this work serves as a reference point for researchers and practitioners to explore FL applications under a wide range of domains.
• Draws the big picture of the fundamentals of federated machine learning.
• Presents the most prominent federated learning applications and shows other potential use cases.
• Provides a range of future applications and directions for research in federated machine learning.
Fan D., Zhai Y., Borji A., Yang J., Shao L.
2020-10-06 citations by CoLab: 258
Multi-level feature fusion is a fundamental topic in computer vision for detecting, segmenting and classifying objects at various scales. When multi-level features meet multi-modal cues, the optimal fusion problem becomes especially challenging. In this paper, we make the first attempt to leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to develop a novel cascaded refinement network. In particular, we 1) propose a bifurcated backbone strategy (BBS) to split the multi-level features into teacher and student features, and 2) utilize a depth-enhanced module (DEM) to excavate informative parts of depth cues from the channel and spatial views. This fuses RGB and depth modalities in a complementary way. Our simple yet efficient architecture, dubbed Bifurcated Backbone Strategy Network (BBS-Net), is backbone-independent and outperforms 18 state-of-the-art methods on seven challenging datasets using four metrics.
Zamir S.W., Arora A., Khan S.H., Munawar H., Khan F.S., Yang M., Shao L.
2023-02-01 citations by CoLab: 244
Given a degraded image, image restoration aims to recover the missing high-quality image content. Numerous applications demand effective image restoration, e.g., computational photography, surveillance, autonomous vehicles, and remote sensing. Significant advances in image restoration have been made in recent years, dominated by convolutional neural networks (CNNs). The widely used CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatial details are preserved but the contextual information cannot be precisely encoded. In the latter case, generated outputs are semantically reliable but spatially less accurate. This paper presents a new architecture with a holistic goal of maintaining spatially precise high-resolution representations through the entire network, and receiving complementary contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing the following key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) a non-local attention mechanism for capturing contextual information, and (d) attention-based multi-scale feature aggregation. Extensive experiments on six real image benchmark datasets demonstrate that our method, named MIRNet-v2, achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
Xu J., Hou Y., Ren D., Liu L., Zhu F., Yu M., Wang H., Shao L.
2020-03-11 citations by CoLab: 220
Retinex theory is developed mainly to decompose an image into illumination and reflectance components by analyzing local image derivatives. In this theory, larger derivatives are attributed to changes in reflectance, while smaller derivatives emerge in the smooth illumination. In this paper, we utilize exponentiated local derivatives (with an exponent γ) of an observed image to generate its structure map and texture map. The structure map is produced by amplifying the derivatives with γ > 1, while the texture map is generated by shrinking them with γ < 1. To this end, we design exponential filters for the local derivatives and demonstrate their capability for extracting accurate structure and texture maps under different choices of the exponent γ. The extracted structure and texture maps are employed to regularize the illumination and reflectance components in Retinex decomposition. A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image. We solve the STAR model by an alternating optimization algorithm, in which each sub-problem is transformed into a vectorized least squares regression with a closed-form solution. Comprehensive experiments on commonly tested datasets demonstrate that the proposed STAR model produces better quantitative and qualitative performance than previous competing methods on illumination and reflectance decomposition, low-light image enhancement, and color correction. The code is publicly available at https://github.com/csjunxu/STAR.
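The exponentiated-derivative idea at the heart of this abstract reduces to a one-line transform: raise the magnitude of each local derivative to a power γ while keeping its sign. A minimal NumPy sketch of that transform (our illustration of the concept, not the authors' released code):

```python
import numpy as np

def exponentiated_derivatives(image, gamma):
    """Signed exponentiation of local derivatives: sign(d) * |d|**gamma.

    gamma > 1 amplifies large (structural) derivatives relative to small ones;
    gamma < 1 boosts small (textural) derivatives instead.
    """
    dy, dx = np.gradient(image.astype(float))  # local derivatives along rows, cols
    return (np.sign(dy) * np.abs(dy) ** gamma,
            np.sign(dx) * np.abs(dx) ** gamma)
```

On a vertical ramp image the row derivative is a constant 1 and the column derivative is 0, so the transform leaves both unchanged for any γ; the effect appears only when derivatives of different magnitudes are present.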
Ye M., Shen J., Shao L.
2021-01-01 citations by CoLab: 219
Matching person images between the daytime visible modality and night-time infrared modality (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Existing methods usually learn the multi-modality features from raw images, ignoring the image-level discrepancy. Some methods apply GAN techniques to generate cross-modality images, but this destroys the local structure and introduces unavoidable noise. In this paper, we propose a Homogeneous Augmented Tri-Modal (HAT) learning method for VI-ReID, where an auxiliary grayscale modality is generated from the homogeneous visible images, without an additional training process. It preserves the structure information of visible images and approximates the image style of the infrared modality. Learning with the grayscale visible images enforces the network to mine structure relations across multiple modalities, making it robust to color variations. Specifically, we solve the tri-modal feature learning from both multi-modal classification and multi-view retrieval perspectives. For multi-modal classification, we learn a multi-modality sharing identity classifier with a parameter-sharing network, trained with a homogeneous and heterogeneous identification loss. For multi-view retrieval, we develop a weighted tri-directional ranking loss to optimize the relative distance across multiple modalities. Incorporated with two invariant regularizers, HAT simultaneously minimizes multiple modality variations. In-depth analysis demonstrates that the homogeneous grayscale augmentation significantly outperforms the current state of the art by a large margin.
Din I.U., Khan K.H., Almogren A., Guizani M.
IEEE Internet of Things Journal (Scimago Q1, WoS Q1)
2025-03-03 citations by CoLab: 0
Balasubramanian V., Aloqaily M., Guizani M., Ouni B.
2025-03-01 citations by CoLab: 1
Zhu J., Xu T., Wang C., Rui Y., Xu W., Huang Y., Guizani M.
2025-03-01 citations by CoLab: 0
Zeng Y., Qiao L., Gao Z., Qin T., Wu Z., Khalaf E., Chen S., Guizani M.
2025-03-01 citations by CoLab: 1
Zhou L., Leng S., Wang Q., Quek T.Q., Guizani M.
IEEE Communications Magazine (Scimago Q1, WoS Q1)
2025-03-01 citations by CoLab: 3
Hu Z., Niu J., Ren T., Liu X., Guizani M.
2025-03-01 citations by CoLab: 0
Luo D., Sun G., Yu H., Guizani M.
IEEE Internet of Things Journal (Scimago Q1, WoS Q1)
2025-03-01 citations by CoLab: 0
Xie B., Cui H., Ho I.W., He Y., Guizani M.
2025-03-01 citations by CoLab: 1
Lin N., Wu T., Zhao L., Hawbani A., Wan S., Guizani M.
2025-03-01 citations by CoLab: 1
Rahman S., Khan S., Barnes N.
2025-03-01 citations by CoLab: 6
Conventional object detection models require large amounts of training data. In comparison, humans can recognize previously unseen objects by merely knowing their semantic description. To mimic similar behavior, zero-shot object detection (ZSD) aims to recognize and localize “unseen” object instances by using only their semantic information. The model is first trained to learn the relationships between visual and semantic domains for seen objects, later transferring the acquired knowledge to totally unseen objects. This setting gives rise to the need for correct alignment between visual and semantic concepts so that the unseen objects can be identified using only their semantic attributes. In this article, we propose a novel loss function called “polarity loss” that promotes correct visual-semantic alignment for improved ZSD. On the one hand, it refines the noisy semantic embeddings via metric learning on a “semantic vocabulary” of related concepts to establish a better synergy between visual and semantic domains. On the other hand, it explicitly maximizes the gap between positive and negative predictions to achieve better discrimination between seen, unseen, and background objects. Our approach is inspired by embodiment theories in cognitive science that claim human semantic understanding to be grounded in past experiences (seen objects), related linguistic concepts (word vocabulary), and visual perception (seen/unseen object images). We conduct extensive evaluations on the Microsoft Common Objects in Context (MS-COCO) and Pascal Visual Object Classes (VOC) datasets, showing significant improvements over the state of the art. Our code and evaluation protocols are available at: https://github.com/salman-h-khan/PL-ZSD_Release.
Chen H., Cui H., Wang J., Cao P., He Y., Guizani M.
2025-02-24 citations by CoLab: 0
Din I.U., Khan K.H., Almogren A., Guizani M.
IEEE Internet of Things Journal (Scimago Q1, WoS Q1)
2025-02-20 citations by CoLab: 0
Peng Y., Wang J., Wang W., Liu L., Atiquzzaman M., Guizani M., Dustdar S.
IEEE Internet of Things Journal (Scimago Q1, WoS Q1)
2025-02-20 citations by CoLab: 0
Yohannes S., Bereketeab L., Aloqaily M., Ouni B., Guizani M., Debbah M.
IEEE Network (Scimago Q1, WoS Q1)
2025-02-20 citations by CoLab: 0

Since 2020

Total publications
754
Total citations
12,763
Citations per publication
16.93
Average publications per year
150.8
Average authors per publication
5.52
h-index
50
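The derived metrics above follow directly from the raw counts. A quick sketch of how they are computed, with a generic h-index routine (illustrative code, not this site's implementation; the five-year window 2020–2024 is assumed from the "Since 2020" heading):

```python
def h_index(citations):
    """Largest h such that at least h publications have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

total_publications = 754
total_citations = 12763
years = 5  # 2020-2024 inclusive, per the "Since 2020" window

print(round(total_citations / total_publications, 2))  # 16.93 citations per publication
print(total_publications / years)                      # 150.8 publications per year
print(h_index([10, 8, 5, 4, 3]))                       # 4, on a toy citation list
```

Both ratios reproduce the figures reported on this page, which confirms the per-publication and per-year averages are simple quotients of the totals.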

Top-30

Fields of science

Computer Networks and Communications, 203, 26.92%
Computer Science Applications, 180, 23.87%
Electrical and Electronic Engineering, 146, 19.36%
Software, 126, 16.71%
Hardware and Architecture, 123, 16.31%
Information Systems, 113, 14.99%
Signal Processing, 87, 11.54%
Artificial Intelligence, 78, 10.34%
Computer Vision and Pattern Recognition, 39, 5.17%
Applied Mathematics, 38, 5.04%
Automotive Engineering, 33, 4.38%
Control and Systems Engineering, 30, 3.98%
Aerospace Engineering, 26, 3.45%
Computational Theory and Mathematics, 25, 3.32%
General Medicine, 23, 3.05%
General Engineering, 23, 3.05%
General Computer Science, 22, 2.92%
General Materials Science, 19, 2.52%
Human-Computer Interaction, 14, 1.86%
Computer Graphics and Computer-Aided Design, 13, 1.72%
Cognitive Neuroscience, 13, 1.72%
Renewable Energy, Sustainability and the Environment, 11, 1.46%
Media Technology, 11, 1.46%
Control and Optimization, 9, 1.19%
Modeling and Simulation, 9, 1.19%
Management Science and Operations Research, 9, 1.19%
Library and Information Sciences, 8, 1.06%
Mechanical Engineering, 7, 0.93%
Management Information Systems, 7, 0.93%
Statistics, Probability and Uncertainty, 6, 0.8%

Journals


Publishers


With other organizations


With foreign organizations


With other countries

China, 373, 49.47%
USA, 157, 20.82%
United Kingdom, 75, 9.95%
Canada, 70, 9.28%
India, 56, 7.43%
Australia, 45, 5.97%
Saudi Arabia, 40, 5.31%
Qatar, 39, 5.17%
Japan, 38, 5.04%
Singapore, 33, 4.38%
Republic of Korea, 26, 3.45%
Russia, 20, 2.65%
Pakistan, 20, 2.65%
Germany, 17, 2.25%
Sweden, 17, 2.25%
Jordan, 14, 1.86%
Netherlands, 14, 1.86%
France, 12, 1.59%
Iraq, 12, 1.59%
Kuwait, 12, 1.59%
Portugal, 11, 1.46%
Italy, 11, 1.46%
Finland, 11, 1.46%
Poland, 9, 1.19%
Switzerland, 9, 1.19%
Lebanon, 8, 1.06%
Spain, 7, 0.93%
Egypt, 6, 0.8%
Palestine, 6, 0.8%
  • We do not take into account publications without a DOI.
  • Statistics are recalculated daily.
  • Publications published before 2020 are ignored in the statistics.
  • The horizontal charts show the top 30 positions.
  • Journal quartile values are current as of today.