| dc.description.abstract |
Recent advances in diffusion models have dramatically improved image synthesis quality, but at the cost of large model sizes and heavy computation, making direct deployment of state-of-the-art models on consumer-grade GPUs increasingly impractical. This has motivated compression approaches such as structured pruning combined with knowledge distillation (KD), where a lightweight student is trained to mimic a large teacher within a pruning–KD framework. However, we empirically find that, for diffusion models, conventional KD objectives become unstable as the teacher–student capacity gap widens: under high compression ratios they fail to provide reliable guidance, producing degraded or even collapsed students. To address this issue, we analyze the distillation error in diffusion models and observe that it naturally decomposes into simple low-order statistical discrepancies and complex fine residuals. Building on this observation, we propose a "Coarse-to-Fine" distillation framework comprising LInear FiTting-based distillation (LIFT) and Piecewise Local Adaptive Coefficient Estimation (PLACE). LIFT parameterizes the KD objective with a global linear regression module, explicitly separating a coarse alignment of low-order moments (Coarse-Easy errors) from a residual refinement term that targets the remaining Fine-Hard structure, and employs an adaptive schedule that gradually shifts emphasis from coarse to fine components during training. PLACE extends LIFT to spatially non-uniform errors by ranking residual magnitudes, partitioning outputs into difficulty-based groups, and applying LIFT independently within each group, yielding locally adaptive guidance without introducing additional parameters or inference-time overhead. Across pixel- and latent-space diffusion models, and for both U-Net and DiT backbones, our framework consistently improves over existing KD-based compression baselines under their original pruning–KD configurations.
Notably, it achieves stable convergence and strong image quality even under aggressive pruning (e.g., 90% channel reduction), where conventional KD objectives fail, thereby enabling practical lightweight diffusion models on memory-limited hardware. |
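The coarse/fine decomposition and the per-group application of LIFT described in the abstract can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's actual formulation: the closed-form scale/shift fit, the mean/std coarse term, the `alpha` schedule, and the equal-size grouping rule are all stand-ins, and `lift_loss`/`place_loss` are hypothetical names.

```python
import torch


def lift_loss(student, teacher, alpha, eps=1e-8):
    """Hedged sketch of the LIFT idea.
    Coarse-Easy term: low-order statistical discrepancy (mean/std mismatch).
    Fine-Hard term: residual left after the best global linear map a*s + b -> t.
    `alpha` in [0, 1] shifts emphasis from coarse to fine during training."""
    s, t = student.flatten(), teacher.flatten()
    # Coarse alignment of low-order moments
    coarse = (s.mean() - t.mean()) ** 2 + (s.std() - t.std()) ** 2
    # Global least-squares linear fit (closed form for scale a and shift b)
    s_c, t_c = s - s.mean(), t - t.mean()
    a = (s_c * t_c).sum() / (s_c.pow(2).sum() + eps)
    b = t.mean() - a * s.mean()
    # Fine residual: structure the global linear map cannot explain
    fine = ((a * s + b) - t).pow(2).mean()
    return (1 - alpha) * coarse + alpha * fine


def place_loss(student, teacher, alpha, num_groups=4):
    """Sketch of PLACE: rank element-wise residual magnitudes, partition
    elements into equal-size difficulty groups, and apply LIFT per group."""
    s, t = student.flatten(), teacher.flatten()
    order = (s - t).abs().argsort()  # easy -> hard
    losses = [lift_loss(s[g], t[g], alpha) for g in order.chunk(num_groups)]
    return torch.stack(losses).mean()
```

As a sanity check on the decomposition: if the student output is an exact linear transform of the teacher's, the fine term vanishes (the linear fit absorbs the whole error) while the coarse term still registers the moment mismatch, which is the separation the schedule exploits.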