Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Yoo, Jaejun | - |
| dc.contributor.author | Kim, Pum Jun | - |
| dc.date.accessioned | 2026-03-26T22:15:06Z | - |
| dc.date.available | 2026-03-26T22:15:06Z | - |
| dc.date.issued | 2026-02 | - |
| dc.description.abstract | Recent advances in deep generative models in computer vision have extended their capabilities from image generation to diverse domains such as video and 3D object generation. At their core, these advances have been driven by the development of reliable and accurate evaluation metrics. These metrics assess generative models from a human perceptual perspective, measuring how closely the generated data resemble real-world data and effectively highlighting their differences. This thesis investigates recent advances in evaluation metrics by examining the key contributions of Article 1, Article 2, and Article 3. In addition, it identifies open challenges in evaluation that remain critical for the development of more powerful and reliable deep generative models. Article 1 introduces a novel evaluation metric for image generative models that measures realism along two key aspects: fidelity and diversity. Existing metrics typically estimate the distributions of real and generated data in model embedding spaces that reflect human perception and compute scores by comparing these distributions. However, generative models that are not properly trained often produce noisy data, and in the presence of such noise, existing metrics fail to provide reliable and accurate evaluations. To address this issue, this work proposes a robust evaluation approach that estimates statistically and topologically significant supports for both the real and generated data. This support estimation method is sensitive to subtle variations in the data distribution and yields more accurate and reliable evaluation results, even in the presence of noise. Article 2 introduces a novel evaluation metric for video generative models that measures realism along three aspects: fidelity, diversity, and temporal naturalness. Existing video metrics have largely relied on techniques developed for image generative models, which often fail to capture the temporal characteristics inherent in video data, resulting in incomplete or unreliable evaluations. To address this limitation, this work leverages the observation that frame-wise changes in typical videos exhibit amplitude distributions following a power law in the Fourier domain. By estimating this power-law distribution, the proposed metric quantitatively measures the deviation of generated videos from the natural distribution, providing the first principled evaluation of temporal consistency in video generation. Article 3 proposes a benchmark that enables comparison between object recognition models and humans and allows model analysis from a human visual perspective. The existing benchmark, which uses stylized images that blend shape and texture within a single image, suggests that humans primarily rely on shape, whereas models focus on texture. However, this prior work suffers from several limitations: (1) it does not use data representing pure shape and pure texture, (2) it does not consider images in which shape and texture are present in equal proportion (50:50), and (3) it employs evaluation measures that are not well suited to model analysis and comparison. To address these limitations, Article 3 generates disentangled datasets that contain pure shape and texture cues and proposes a new metric that enables reliable and precise evaluation of models. This benchmark provides a clear and unbiased assessment of current object recognition models, enabling accurate measurement of how closely their reliance on shape and texture aligns with human perception. | - |
| dc.description.degree | Doctor | - |
| dc.description | Graduate School of Artificial Intelligence (Artificial Intelligence) | - |
| dc.identifier.uri | https://scholarworks.unist.ac.kr/handle/201301/91046 | - |
| dc.identifier.uri | http://unist.dcollection.net/common/orgView/200000966325 | - |
| dc.language | ENG | - |
| dc.publisher | Ulsan National Institute of Science and Technology | - |
| dc.rights.embargoReleaseDate | 9999-12-31 | - |
| dc.rights.embargoReleaseTerms | 9999-12-31 | - |
| dc.subject | polyimide-glycol hybrid gel; semi-interpenetrating network; photo-thermal imidization; shape-deformable substrate; thermal-mechanical modulation | - |
| dc.title | Reliable and Interpretable Evaluation in Deep Representational Models | - |
| dc.type | Thesis | - |
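The temporal-naturalness idea summarized in the abstract (Article 2) rests on the observation that the amplitude spectrum of frame-wise changes in natural video follows a power law in the Fourier domain. The minimal sketch below illustrates that idea only: it estimates a power-law exponent by a log-log linear fit on the spectrum of mean frame differences. The function name, the reduction to a per-frame mean, and the fitting procedure are assumptions for illustration, not the metric defined in the thesis.

```python
import numpy as np

def temporal_power_law_exponent(frames):
    """Illustrative sketch: estimate the power-law exponent of the
    amplitude spectrum of frame-wise changes in a grayscale video.

    frames: ndarray of shape (T, H, W).
    Returns the slope alpha of log(amplitude) vs. log(frequency);
    a power law |A(f)| ~ f^alpha is linear in log-log coordinates.
    """
    diffs = np.diff(frames, axis=0)                           # frame-wise changes, (T-1, H, W)
    signal = diffs.reshape(diffs.shape[0], -1).mean(axis=1)   # mean change per time step
    amp = np.abs(np.fft.rfft(signal))[1:]                     # amplitude spectrum, DC bin dropped
    freqs = np.fft.rfftfreq(len(signal))[1:]
    # Least-squares line in log-log space; epsilon guards against log(0).
    alpha, _ = np.polyfit(np.log(freqs), np.log(amp + 1e-12), 1)
    return float(alpha)
```

A metric in this spirit would compare the exponent (or the full fitted spectrum) of generated videos against the value estimated from natural video, scoring larger deviations as less temporally natural.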