ACM Computing Surveys, volume 57, issue 8, pages 1-36

Artificial Intelligence as a Service (AIaaS) for Cloud, Fog and the Edge: State-of-the-Art Practices

Naeem Syed 1, 2
Adnan Anwar 2, 3
Z.A. Baig 2, 4
Sherali Zeadally 5, 6, 7, 8, 9

2. Strategic Centre for Cyber Resilience and Trust (Deakin CYBER), Waurn Ponds, Australia
7. College of Communication and Information, University of Kentucky, Lexington, United States
8. Department of Electronic Engineering, Kyung Hee University, Seoul, Republic of Korea
Publication type: Journal Article
Publication date: 2025-03-23
Scimago: Q1
SJR: 6.280
CiteScore: 33.2
Impact factor: 23.8
ISSN: 0360-0300, 1557-7341
Abstract

Artificial Intelligence (AI) fosters enormous business opportunities that build and utilize private AI models. Implementing AI models at scale and ensuring cost-effective production of AI-based technologies through entirely in-house capabilities is a challenge. The success of the Infrastructure as a Service (IaaS) and Software as a Service (SaaS) Cloud Computing models can be leveraged to facilitate a cost-effective and scalable AI service paradigm, namely, ‘AI as a Service.’ We summarize current state-of-the-art solutions for AI-as-a-Service (AIaaS), and we discuss its prospects for growth and opportunities to advance the concept. To this end, we perform a thorough review of recent research on AI and various deployment strategies for emerging domains, considering both technical and survey articles. Next, we identify various characteristics and capabilities that need to be met before an AIaaS model can be successfully designed and deployed. Based on this, we present a general framework of an AIaaS architecture that integrates the required aaS characteristics with the capabilities of AI. We also compare various approaches for offering AIaaS to end users. Finally, we illustrate several real-world use cases for AIaaS models, followed by a discussion of some of the challenges that must be addressed to enable AIaaS adoption.

Zang Y., Xue Z., Ou S., Chu L., Du J., Long Y.
Asynchronous federated learning (AFL) is a distributed machine learning technique that allows multiple devices to collaboratively train deep learning models without sharing local data. However, AFL suffers from low efficiency due to poor client model training quality and slow server model convergence speed, which are a result of the heterogeneous nature of both data and devices. To address these issues, we propose Efficient Asynchronous Federated Learning with Prospective Momentum Aggregation and Fine-Grained Correction (FedAC). Our framework consists of three key components. The first component is client weight evaluation based on temporal gradient, which evaluates the client weight based on the similarity between the client and server update directions. The second component is adaptive server update with prospective weighted momentum, which uses an asynchronous buffered update strategy and a prospective weighted momentum with an adaptive learning rate to update the global model on the server. The last component is client update with fine-grained gradient correction, which introduces a fine-grained gradient correction term to mitigate client drift and correct the client stochastic gradient. We conduct experiments on real and synthetic datasets, and compare our framework with existing federated learning methods. Experimental results demonstrate effective improvements in model training efficiency and AFL performance by our framework.
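The buffered, similarity-weighted momentum aggregation described above can be sketched as follows. This is an illustrative simplification under assumed parameters, not the paper's actual FedAC implementation: the server's momentum vector stands in for the "server update direction", and all function names, the learning rate `lr`, and the momentum factor `beta` are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two update vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def aggregate_async(global_model, momentum, buffered_updates, lr=0.5, beta=0.9):
    """Weight each buffered client update by its alignment with the
    server momentum, average the updates with those weights, then apply
    a momentum step to the global model."""
    # Misaligned (stale or drifted) updates get near-zero weight.
    weights = [max(cosine(u, momentum), 0.0) + 1e-6 for u in buffered_updates]
    total = sum(weights)
    avg = [sum(w * u[i] for w, u in zip(weights, buffered_updates)) / total
           for i in range(len(global_model))]
    momentum = [beta * m + (1 - beta) * a for m, a in zip(momentum, avg)]
    new_model = [g + lr * m for g, m in zip(global_model, momentum)]
    return new_model, momentum
```

In this sketch, a client update pointing opposite to the server's momentum contributes almost nothing to the aggregate, which is one simple way to down-weight low-quality contributions in an asynchronous buffer.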
Wang Z., Goudarzi M., Gong M., Buyya R.
2024-03-01 citations by CoLab: 48 Abstract  
Edge/fog computing, as a distributed computing paradigm, satisfies the low-latency requirements of an ever-increasing number of IoT applications and has become the mainstream computing paradigm behind IoT applications. However, because a large number of IoT applications require execution on edge/fog resources, the servers may become overloaded. This can disrupt the edge/fog servers and also negatively affect IoT applications’ response time. Moreover, many IoT applications are composed of dependent components, incurring extra constraints on their execution. Besides, edge/fog computing environments and IoT applications are inherently dynamic and stochastic. Thus, efficient and adaptive scheduling of IoT applications in heterogeneous edge/fog computing environments is of paramount importance. However, limited computational resources on edge/fog servers impose an extra burden when applying optimal but computationally demanding techniques. To overcome these challenges, we propose a Deep Reinforcement Learning-based IoT application Scheduling algorithm, called DRLIS, to adaptively and efficiently optimize the response time of heterogeneous IoT applications and balance the load of the edge/fog servers. We implemented DRLIS as a practical scheduler in the FogBus2 function-as-a-service framework for creating an edge-fog-cloud integrated serverless computing environment. Results obtained from extensive experiments show that DRLIS significantly reduces the execution cost of IoT applications by up to 55%, 37%, and 50% in terms of load balancing, response time, and weighted cost, respectively, compared with metaheuristic algorithms and other reinforcement learning techniques.
Almalawi A., Hassan S., Fahad A., Khan A.I.
2024-02-06 citations by CoLab: 6 Abstract  
As Edge AI systems become more prevalent, ensuring data privacy and security in these decentralized networks is essential. In this work, a novel hybrid cryptographic mechanism is presented that combines Ant Lion Optimization (ALO) and Diffie–Hellman-based Twofish cryptography (DHT) for secure data transmission. The developed approach collects data from the created edge AI system and processes it using an Autoencoder, which learns the data patterns and identifies malicious data entries. The Diffie–Hellman (DH) key exchange generates a shared secret key for encryption, while ALO optimizes the key exchange and improves security performance. The Twofish algorithm then encrypts the data using the generated secret key, preventing security threats during transmission. The implementation results of the study show that it achieved a higher accuracy of 99.45%, a lower time consumption of 2 s, a minimum delay of 0.8 s, and a reduced energy consumption of 3.2 mJ.
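The Diffie–Hellman step described above can be sketched in a few lines. This is a minimal illustration, not the paper's DHT scheme: the prime is a deliberately small toy modulus (a Mersenne prime, not a production safe prime), ALO optimization is omitted, and the Twofish encryption step is only noted in a comment since Twofish is not in the standard library.

```python
import secrets

# Toy parameters for illustration only: 2**127 - 1 is a Mersenne prime,
# far too small for real security. A production deployment would use a
# standardized MODP group (e.g., from RFC 3526).
P = 2**127 - 1
G = 5

def dh_keypair():
    """Generate a private exponent and the matching public value G^priv mod P."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)

def dh_shared_secret(my_priv, their_pub):
    """Derive the shared secret from our private key and the peer's public key."""
    return pow(their_pub, my_priv, P)

# Both parties arrive at the same secret, which could then be hashed into
# a key for a symmetric cipher such as Twofish.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()
assert dh_shared_secret(a_priv, b_pub) == dh_shared_secret(b_priv, a_pub)
```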
Qi P., Chiaro D., Guzzo A., Ianni M., Fortino G., Piccialli F.
2024-01-01 citations by CoLab: 137 Abstract  
Federated learning (FL) is a distributed machine learning (ML) approach that enables models to be trained on client devices while ensuring the privacy of user data. Model aggregation, also known as model fusion, plays a vital role in FL. It involves combining locally generated models from client devices into a single global model while maintaining user data privacy. However, the accuracy and reliability of the resulting global model depend on the aggregation method chosen, making the selection of an appropriate method crucial. Initially, the simple averaging of model weights was the most commonly used method. However, due to its limitations in handling low-quality or malicious models, alternative techniques have been explored. As FL gains popularity in various domains, it is crucial to have a comprehensive understanding of the available model aggregation techniques and their respective strengths and limitations. However, there is currently a significant gap in the literature when it comes to systematic and comprehensive reviews of these techniques. To address this gap, this paper presents a systematic literature review encompassing 201 studies on model aggregation in FL. The focus is on summarizing the proposed techniques and the ones currently applied for model fusion. This survey serves as a valuable resource for researchers to enhance and develop new aggregation techniques, as well as for practitioners to select the most appropriate method for their FL applications.
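The "simple averaging of model weights" that the survey identifies as the baseline aggregation method can be sketched as a FedAvg-style weighted mean, where each client's parameters are weighted by its local dataset size. This is an illustrative sketch of the generic technique, not any specific method from the 201 surveyed studies; the function name and flat parameter-vector representation are assumptions.

```python
def fedavg(client_models, client_sizes):
    """FedAvg-style aggregation: average each parameter across clients,
    weighting every client by the number of local training samples."""
    total = sum(client_sizes)
    n_params = len(client_models[0])
    return [sum(m[i] * s for m, s in zip(client_models, client_sizes)) / total
            for i in range(n_params)]
```

The limitation the survey notes is visible here: a low-quality or malicious client still contributes in proportion to its claimed dataset size, which motivates the robust aggregation alternatives the paper reviews.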
Scotti V., Sbattella L., Tedesco R.
ACM Computing Surveys scimago Q1 wos Q1
2023-10-06 citations by CoLab: 9 Abstract  
The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the Natural Language Processing (NLP) area. The approach has quickly evolved in the last ten years, deeply affecting NLP, from low-level text pre-processing tasks, such as tokenisation or POS tagging, to high-level, complex NLP applications like machine translation and chatbots. This article examines recent trends in the development of open-domain data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years have made it possible to realise very engaging open-domain chatbots. Not only do these architectures allow models to directly output the next turn in a conversation but, to some extent, they also allow control over the style or content of the response. To offer a complete view on the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current research status.
Al-Atat G., Fresa A., Behera A.P., Moothedath V.N., Gross J., Champati J.P.
2023-06-18 citations by CoLab: 7 Abstract  
Resource-constrained Edge Devices (EDs), e.g., IoT sensors and microcontroller units, are expected to make intelligent decisions using Deep Learning (DL) inference at the edge of the network. Toward this end, developing tinyML models (DL models with reduced computation and memory storage requirements that can be embedded on these devices) is an area of active research. However, tinyML models have lower inference accuracy. On a different front, DNN partitioning and inference offloading techniques have been studied for distributed DL inference between EDs and Edge Servers (ESs). In this paper, we explore Hierarchical Inference (HI), a novel approach proposed in [19] for performing distributed DL inference at the edge. Under HI, for each data sample, an ED first uses a local algorithm (e.g., a tinyML model) for inference. Depending on the application, the ED offloads the data sample only if the inference provided by the local algorithm is incorrect or further assistance is required from large DL models on the edge or in the cloud. At the outset, HI seems infeasible as the ED, in general, cannot know whether the local inference is sufficient. Nevertheless, we present the feasibility of implementing HI for image classification applications. We demonstrate its benefits using quantitative analysis and show that HI provides a better trade-off between offloading cost, throughput, and inference accuracy compared with alternative approaches.
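The HI decision rule described above (infer locally first, offload only when the local result looks insufficient) can be sketched with a confidence threshold. This is an assumed simplification: the paper's actual criterion for deciding whether local inference suffices is more subtle, and the function names, threshold value, and model interfaces here are illustrative.

```python
def hierarchical_infer(sample, tiny_model, large_model, threshold=0.8):
    """Hierarchical Inference sketch: run the small on-device model first,
    and offload to the larger edge/cloud model only when the local
    confidence falls below a threshold."""
    label, confidence = tiny_model(sample)   # cheap local inference
    if confidence >= threshold:
        return label, "local"                # accept the local prediction
    return large_model(sample), "offloaded"  # pay the offloading cost
```

Raising `threshold` trades more offloading cost for higher accuracy, which is exactly the trade-off the paper quantifies.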
Singh R., Gill S.S.
2023-03-03 citations by CoLab: 179 Abstract  
Artificial Intelligence (AI) at the edge is the utilization of AI in real-world devices. Edge AI refers to the practice of performing AI computations near the users at the network's edge, instead of at a centralised location such as a cloud service provider's data centre. With the latest innovations in AI efficiency, the proliferation of Internet of Things (IoT) devices, and the rise of edge computing, the potential of edge AI has now been unlocked. This study provides a thorough analysis of AI approaches and capabilities as they pertain to edge computing, or Edge AI. Further, a detailed survey of edge computing and its paradigms, including the transition to Edge AI, is presented to explore the background of each variant proposed for implementing edge computing. Furthermore, we discuss the Edge AI approach of deploying AI algorithms and models on edge devices, which are typically resource-constrained devices located at the edge of the network. We also present the technology used in various modern IoT applications, including autonomous vehicles, smart homes, industrial automation, healthcare, and surveillance, and discuss leveraging machine learning algorithms optimized for resource-constrained environments. Finally, important open challenges and potential research directions in the field of edge computing and Edge AI are identified and investigated. We hope that this article will serve as a common goal for a future blueprint that unites important stakeholders and facilitates accelerated development in the field of Edge AI.
Al-Doghman F., Moustafa N., Khalil I., Tari Z., Zomaya A.
2023-03-01 citations by CoLab: 91 Abstract  
The paradigm of edge computing has formed an innovative scope within the domain of IoT by expanding the services of the cloud to the network edge to design distributed architectures and securely enhance decision-making applications. Due to the heterogeneity of edge computing, edge applications need to be developed as a set of lightweight and interdependent modules. As this concept aligns with the objectives of microservice architecture, effective implementation of microservices-based edge applications within IoT networks has the prospect of fully leveraging edge node capabilities. Deploying microservices at the IoT edge faces many challenges associated with security and privacy. Advances in AI, and easy access to resources with powerful computing, provide opportunities for deriving precise models and developing different intelligent applications at the edge of the network. In this study, an extensive survey is presented on securing edge computing-based AI microservices to elucidate the challenges of IoT management and enable secure decision-making systems at the edge. We present recent research studies on edge AI and microservices orchestration and highlight key requirements as well as challenges of securing microservices at the IoT edge. We also propose a microservices-based edge framework that provides secure edge AI algorithms as microservices utilizing containerization technology.
Duan S., Wang D., Ren J., Lyu F., Zhang Y., Wu H., Shen X.
2023-01-01 citations by CoLab: 140 Abstract  
As the computing paradigm shifts from cloud computing to end-edge-cloud computing, it also supports artificial intelligence evolving from a centralized manner to a distributed one. In this paper, we provide a comprehensive survey on the distributed artificial intelligence (DAI) empowered by end-edge-cloud computing (EECC), where the heterogeneous capabilities of on-device computing, edge computing, and cloud computing are orchestrated to satisfy the diverse requirements raised by resource-intensive and distributed AI computation. Particularly, we first introduce several mainstream computing paradigms and the benefits of the EECC paradigm in supporting distributed AI, as well as the fundamental technologies for distributed AI. We then derive a holistic taxonomy for the state-of-the-art optimization technologies that are empowered by EECC to boost distributed training and inference, respectively. After that, we point out security and privacy threats in DAI-EECC architecture and review the benefits and shortcomings of each enabling defense technology in accordance with the threats. Finally, we present some promising applications enabled by DAI-EECC and highlight several research challenges and open issues toward immersive performance acquisition.