Context-Aware Sports Highlight Generation Leveraging Large Language Models

Author(s)
Kang, Jeonghun
Advisor
Kim, Taehwan
Issued Date
2024-08
URI
https://scholarworks.unist.ac.kr/handle/201301/84184
http://unist.dcollection.net/common/orgView/200000813212
Abstract
Generating sports highlight videos from broadcast footage presents significant challenges, including the accurate identification of key moments and understanding the broader context of the game. Existing methods often fail to capture the full context and dynamics of various sports, resulting in less effective highlight detection. In this paper, we propose a novel framework that leverages large language models (LLMs) to address these challenges. Our approach involves extracting audio commentary from broadcast videos, converting it into text using a Speech-to-Text (STT) model, and segmenting this text into coherent units. Additionally, we analyze the audio to extract volume data, which, combined with the segmented commentary, serves as input to the LLM. The LLM identifies potential highlight moments by considering both the textual and audio context, assigning pseudo labels to these segments. This process enables the generation of highlight videos that encapsulate the most exciting and significant moments of the game. Our method adapts to different sports without the need for extensive retraining. Experimental results demonstrate the effectiveness of our framework in producing high-quality sports highlights, outperforming traditional methods in both accuracy and contextual relevance.
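The pipeline described in the abstract (commentary segments plus per-segment volume, combined into LLM input) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the Segment class, the function names, the RMS-in-dBFS loudness measure, the prompt wording, and the HIGHLIGHT / NOT_HIGHLIGHT label names are all assumptions made for illustration, and the STT and LLM calls themselves are left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One commentary unit (hypothetical structure, not from the thesis)."""
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # transcript of the commentary, e.g. produced by an STT model


def rms_dbfs(samples):
    """RMS level in dB relative to full scale, for samples in [-1.0, 1.0].

    Assumption: loudness is measured as RMS; the abstract only says
    "volume data" without naming a measure.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def segment_volumes(samples, sample_rate, segments):
    """Slice the waveform by each segment's time span and measure its level."""
    volumes = []
    for seg in segments:
        lo = int(seg.start * sample_rate)
        hi = int(seg.end * sample_rate)
        volumes.append(rms_dbfs(samples[lo:hi]))
    return volumes


def build_prompt(segments, volumes, sport="soccer"):
    """Combine text and volume into one prompt string (illustrative wording)."""
    lines = [
        f"You are labeling {sport} commentary segments.",
        "For each segment, answer HIGHLIGHT or NOT_HIGHLIGHT.",
        "",
    ]
    for i, (seg, vol) in enumerate(zip(segments, volumes)):
        lines.append(
            f"[{i}] {seg.start:.1f}-{seg.end:.1f}s "
            f"volume={vol:.1f} dBFS: {seg.text}"
        )
    return "\n".join(lines)
```

The string returned by build_prompt would then be sent to whichever LLM is in use, and its per-segment answers would serve as the pseudo labels. Changing the sport argument is one simple way to reuse the same prompt across sports without retraining, in line with the adaptability claim in the abstract.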
Publisher
Ulsan National Institute of Science and Technology
Degree
Master
Major
Graduate School of Artificial Intelligence

