Transitioning from TinyML to Edge GenAI: A Review
Generative AI (GenAI) models are designed to produce realistic and natural data, such as images, audio, or written text. Because of their high computational and memory demands, these models traditionally run on powerful remote compute servers. However, there is growing interest in deploying GenAI models at the edge, on resource-constrained embedded devices. Since 2018, the TinyML community has shown that running fixed-topology AI models on edge devices offers several benefits, including independence from internet connectivity, low-latency processing, and enhanced privacy. Nevertheless, deploying resource-intensive GenAI models on embedded devices is challenging because these devices have limited computational, memory, and energy resources. This review paper evaluates the progress made to date in the field of Edge GenAI, an emerging area of research within the broader domain of EdgeAI that focuses on bringing GenAI to edge devices. Papers released between 2022 and 2024 that address the design and deployment of GenAI models on embedded devices are identified and described, and their approaches and results are compared. This manuscript contributes to the understanding of the ongoing transition from TinyML to Edge GenAI and provides valuable insights to the AI research community on this emerging, impactful, and still under-explored field.