CMSL: Cross-modal Style Learning for Few-shot Image Generation

Publication type: Journal Article
Publication date: 2025-03-20
Scimago: Q1
WoS: Q1
SJR: 1.271
CiteScore: 13.2
Impact factor: 8.7
ISSN: 2731-538X, 2731-5398
Abstract
Training generative adversarial networks is data-demanding, which limits the development of these models on target domains with inadequate training data. Recently, researchers have leveraged generative models pretrained on sufficient data and fine-tuned them using small training samples, thus reducing data requirements. However, because these methods lack an explicit focus on target styles and concentrate disproportionately on generative consistency, they do not perform well in diversity preservation, which reflects the adaptation ability of few-shot generative models. To mitigate the diversity degradation, we propose a framework with two key strategies: 1) To obtain more diverse styles from limited training data effectively, we propose a cross-modal module that explicitly obtains the target styles with a style prototype space and text-guided style instructions. 2) To inherit the generation capability from the pretrained model, we constrain the similarity between the generated and source images with a structural discrepancy alignment module that maintains the structure correlation in multiscale areas. Extensive experiments and analyses demonstrate that our method outperforms state-of-the-art methods in mitigating diversity degradation.
Cite this
GOST
Jiang Y., Lyu Y., Peng B., Wang W., Dong J. CMSL: Cross-modal Style Learning for Few-shot Image Generation // Machine Intelligence Research. 2025.
RIS
TY - JOUR
DO - 10.1007/s11633-024-1511-7
UR - https://link.springer.com/10.1007/s11633-024-1511-7
TI - CMSL: Cross-modal Style Learning for Few-shot Image Generation
T2 - Machine Intelligence Research
AU - Jiang, Yue
AU - Lyu, Yueming
AU - Peng, Bo
AU - Wang, Wei
AU - Dong, Jing
PY - 2025
DA - 2025/03/20
PB - Springer Nature
SN - 2731-538X
SN - 2731-5398
ER -
BibTeX
@article{2025_Jiang,
author = {Yue Jiang and Yueming Lyu and Bo Peng and Wei Wang and Jing Dong},
title = {CMSL: Cross-modal Style Learning for Few-shot Image Generation},
journal = {Machine Intelligence Research},
year = {2025},
publisher = {Springer Nature},
month = {mar},
url = {https://link.springer.com/10.1007/s11633-024-1511-7},
doi = {10.1007/s11633-024-1511-7}
}