File Download

There are no files associated with this item.

Related Researcher

김관명

Kim, KwanMyung
Integration and Innovation Design Lab.



Full metadata record

DC Field Value Language
dc.citation.conferencePlace CH -
dc.citation.conferencePlace Songshan Cultural and Creative Park, Taipei -
dc.citation.title IASDR2025 -
dc.contributor.author Sarfraz, Danyal -
dc.contributor.author Mubashar Karim, Raja -
dc.contributor.author Kim, KwanMyung -
dc.date.accessioned 2026-01-12T14:35:58Z -
dc.date.available 2026-01-12T14:35:58Z -
dc.date.created 2026-01-11 -
dc.date.issued 2025-12-04 -
dc.description.abstract Recent advances in generative AI have enabled multiple pathways for high-fidelity video synthesis: text-to-video, image-to-video animation, and video outpainting. However, empirical side-by-side evaluations of how untrained viewers perceive and distinguish these outputs from real footage remain scarce. In this study, we systematically compare human detection accuracy across these three AI generation techniques within three thematic contexts: historical footage, film and media content, and natural environments. We constructed a balanced stimulus set comprising equal numbers of real and AI-generated videos (18 each). The AI clips were evenly distributed across the three generation methods, using Google's Veo 3, Lightricks LTX Video, and Wan VACE. All videos were produced and standardized within the ComfyUI framework to ensure consistent quality and duration. A total of 87 participants judged every clip in a binary forced-choice task ("real" vs. "AI-generated"). Participants correctly identified videos 60% of the time on average. Image-to-video clips were recognized most accurately (79%), followed by real footage (64%), outpainting (49%), and text-to-video (43%). Accuracy also varied by theme: film and historical scenes yielded higher detection rates than environmental clips, which were frequently mistaken for AI. Logistic regression confirmed significant effects of both technique and theme as well as their interaction (p < 0.001), indicating that detection success depends jointly on how the content was generated and what it depicts. Findings reveal a consistent bias toward assuming synthetic origins and highlight that perceptual realism in AI video is shaped more by context than by model type, underscoring the importance of media-literacy approaches and context-aware evaluation tools for navigating increasingly synthetic visual media. -
dc.identifier.bibliographicCitation IASDR2025 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/90247 -
dc.language English -
dc.publisher IASDR (TDRI & CID) -
dc.title.alternative Synthetic Realities: Evaluating Human Ability to Distinguish AI-Generated Videos from Real Footage -
dc.title Synthetic Realities: Evaluating Human Ability to Distinguish AI-Generated Videos from Real Footage -
dc.type Conference Paper -
dc.date.conferenceDate 2025-12-02 -


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.