Publication
Enhancing the Efficiency of Diffusion Models: A Retrieval-Based Approach
| Abstract: | Diffusion models have emerged as a powerful class of generative models capable of producing photorealistic images. However, their practical adoption in resource-constrained environments remains limited by the computational cost of iterative denoising. This thesis explores a novel direction to improve the efficiency of diffusion-based generative models by building upon the Retrieval-based Diffusion (ReDi) framework. It proposes several enhancements targeting retrieval key compression and adaptive step skipping to reduce sampling time while maintaining image quality. The approach introduces Principal Component Analysis (PCA) and Product Quantisation (PQ) to compress the knowledge base without significant loss in retrieval accuracy, and an adaptive skipping technique that dynamically selects the number of denoising steps based on the closeness of retrieved neighbours. Experiments were conducted on two image datasets: COCO-10K, derived from the MS-COCO dataset, and a synthetic interior design dataset (ID-2K). Both datasets include denoising trajectories and text prompts for conditioning. A comprehensive evaluation was conducted using metrics such as FID, Inception Score, CLIP Score, and Pick Score. The results demonstrate that the proposed methods reduce storage requirements for retrieval keys and queries by 99% and improve inference speed, while maintaining high image quality. These findings highlight the potential of retrieval-based strategies to make diffusion models more applicable to efficient content generation in resource-sensitive domains. |
|---|---|
| Main authors: | Kutsenko, Anton |
| Subject: | Diffusion models; Generative AI; Efficient sampling; Retrieval-based diffusion; Deep learning |
| Year: | 2025 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | embargoed access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
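
The abstract describes adaptive skipping only at a high level: the number of denoising steps is chosen from how close the retrieved neighbours are. As a rough illustration of that idea, and not the thesis's actual rule (the full text is under embargo), the sketch below finds the nearest stored key and maps its distance to a skip count. The function names, the linear mapping, and the `max_skip` and `scale` parameters are all assumptions made for this example.

```python
import math


def nearest_neighbour(query, keys):
    """Return (index, Euclidean distance) of the stored key closest to `query`."""
    best_index, best_distance = -1, math.inf
    for index, key in enumerate(keys):
        distance = math.dist(query, key)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


def steps_to_skip(distance, max_skip=20, scale=1.0):
    """Hypothetical linear rule: the closer the neighbour, the more steps are skipped.

    A distance of 0 skips `max_skip` steps; a distance of `scale` or more skips none.
    """
    closeness = max(0.0, 1.0 - distance / scale)
    return int(max_skip * closeness)


# Toy knowledge base of three 2-D retrieval keys (illustrative values only).
keys = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
index, distance = nearest_neighbour((0.9, 0.0), keys)
skip = steps_to_skip(distance)
```

In a retrieval-based sampler, a close match would justify jumping further along the stored trajectory, while a distant match would fall back to running more of the ordinary denoising steps.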