Related Researcher

Baek, Seungryul (백승렬), UNIST Vision and Learning Lab

Text2Relight: Creative Portrait Relighting with Text Guidance

Author(s)
Cha, Junuk; Ren, Mengwei; Singh, Krishna Kumar; Zhang, He; Hold-Geoffroy, Yannick; Yoon, Seunghyun; Jung, HyunJoon; Yoon, Jae Shin; Baek, Seungryul
Issued Date
2025-02-28
DOI
10.1609/aaai.v39i2.32194
URI
https://scholarworks.unist.ac.kr/handle/201301/88739
Citation
AAAI Conference on Artificial Intelligence, pp. 1980-1988
Abstract
We present a lighting-aware image editing pipeline that, given a portrait image and a text prompt, performs single-image relighting. Our model modifies the lighting and color of both the foreground and background to align with the provided text description. The unbounded creativity of text lets us describe the lighting of a scene through any sensory feature, including temperature, emotion, smell, time of day, and so on. However, modeling the mapping between such unbounded text and lighting is extremely challenging because no scalable dataset provides large numbers of text-relighting pairs; as a result, current text-driven image editing models do not generalize to lighting-specific use cases. We overcome this problem with a novel data synthesis pipeline. First, diverse and creative text prompts describing scenes under various lighting are automatically generated within a crafted hierarchy by a large language model (e.g., ChatGPT). A text-guided image generation model then creates a lighting image that best matches each prompt. Conditioned on these lighting images, we perform image-based relighting of both the foreground and background, using a single portrait image or a set of OLAT (one-light-at-a-time) images captured with a light-stage system. For background relighting in particular, we represent the lighting image as a set of point lights and transfer them to other background images. A generative diffusion model is trained on the synthesized large-scale data with auxiliary task augmentation (e.g., portrait delighting and light positioning) to correlate the latent text and lighting distributions for text-guided portrait relighting. In our experiments, we demonstrate that our model outperforms existing text-guided image generation models, producing high-quality portrait relighting results with strong generalization to unconstrained scenes.
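
The OLAT relighting mentioned in the abstract builds on the standard light-stage identity: image formation is linear in the lights, so a portrait relit under a target lighting image is a weighted sum of its one-light-at-a-time frames, with each weight read from the target lighting at that light's direction. Below is a minimal NumPy sketch of that weighted sum; the function name, array shapes, and toy data are illustrative assumptions, not code from the paper.

    import numpy as np

    def relight_olat(olat_images, light_weights):
        """Relight a light-stage capture as a weighted sum of OLAT frames.

        olat_images:   (N, H, W, 3) array, one frame per light direction.
        light_weights: (N, 3) per-light RGB intensities, e.g. sampled from
                       the target lighting image at each light's direction.
        """
        # Image formation is linear in the lights, so the relit portrait is
        # the intensity-weighted sum over the N OLAT frames.
        return np.einsum('nhwc,nc->hwc', olat_images, light_weights)

    # Toy usage with random arrays standing in for a real capture.
    olat = np.random.rand(8, 64, 64, 3).astype(np.float32)    # 8 lights
    weights = np.random.rand(8, 3).astype(np.float32)         # target lighting
    print(relight_olat(olat, weights).shape)                  # (64, 64, 3)
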
Publisher
Association for the Advancement of Artificial Intelligence
