Predicting Trading Volume in High-Frequency Financial Data: A Hybrid Machine Learning Approach for Modeling Zero-Inflated and Heavy-Tailed Characteristics

Koo, Dohyeon

Author(s): Koo, Dohyeon

Advisor: Lim, Dongyoung

Issued Date: 2025-02

URI: https://scholarworks.unist.ac.kr/handle/201301/86500 http://unist.dcollection.net/common/orgView/200000865972

Abstract: In financial markets, trading volume is a key indicator because it directly affects market liquidity, price volatility, and trading fees, so forecasting it is an important objective in the financial sector. Volume forecasting matters even more in High-Frequency Trading (HFT), where trades are executed over short time horizons and buy and sell signals must be detected quickly. It is also essential for optimizing volume-based trading strategies and for managing risks arising from volume effects. However, predicting volume is very difficult because of the distributional properties of financial data, such as zero inflation and heavy tails. Existing machine learning approaches often perform unsatisfactorily because they do not reflect these properties well. This paper therefore proposes a new hybrid machine learning approach that addresses the shortcomings of existing methodologies and improves the accuracy of volume prediction in high-frequency trading. The key ideas of the proposed methodology are as follows. First, the total volume over a short time interval can be represented as a compound random sum. Second, the zero-inflated and heavy-tailed distributional properties of total volume are determined by the order frequency and order size in the market, which are the components of the compound random sum, and each component can be modeled under appropriate probability distribution assumptions. Third, neural networks can effectively capture the correlations in complex, high-dimensional financial data through their nonlinear structure, and the parameters of the component distributions can be expressed effectively through the learning process. Finally, we demonstrate the superiority of our methodology by comparing its performance with existing benchmarks in a high-frequency trading environment using real KOSPI200 Futures data.
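The compound random sum idea in the abstract can be illustrated with a small simulation: the total volume in an interval is the sum of the sizes of the orders that arrive in it. The sketch below is only an illustration under assumed distributions, since the abstract does not name the ones the thesis uses. It assumes a zero-inflated Poisson order count and Lomax (Pareto type II) order sizes, and the function name and all parameter values are hypothetical.

```python
import numpy as np


def simulate_total_volume(n_intervals, p_zero, rate, tail_index, scale, seed=0):
    """Simulate per-interval total volume as a compound random sum.

    Assumed for illustration only (not taken from the thesis):
    - order count per interval: zero-inflated Poisson(rate), with extra
      probability p_zero of an interval having no orders at all
    - order size: scale * Lomax(tail_index), a heavy-tailed distribution
    """
    rng = np.random.default_rng(seed)

    # Order frequency: force a zero count with probability p_zero,
    # otherwise draw the count from a Poisson distribution.
    inactive = rng.random(n_intervals) < p_zero
    counts = np.where(inactive, 0, rng.poisson(rate, size=n_intervals))

    # Order size: one heavy-tailed draw per order across all intervals.
    sizes = scale * rng.pareto(tail_index, size=int(counts.sum()))

    # Compound sum: add up the order sizes that belong to each interval.
    interval_ids = np.repeat(np.arange(n_intervals), counts)
    totals = np.bincount(interval_ids, weights=sizes, minlength=n_intervals)
    return counts, totals


if __name__ == "__main__":
    counts, totals = simulate_total_volume(
        n_intervals=10_000, p_zero=0.4, rate=2.0, tail_index=2.5, scale=10.0
    )
    print("share of zero-volume intervals:", np.mean(totals == 0))
    print("median / 99.9th percentile of non-zero volume:",
          np.median(totals[totals > 0]),
          np.quantile(totals[totals > 0], 0.999))
```

In this toy setup the zero inflation of total volume comes entirely from the order-count component and the heavy tail from the order-size component, which mirrors the decomposition described in the abstract. In the proposed approach, the parameters of the component distributions would be produced by a neural network rather than fixed by hand.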

Publisher: Ulsan National Institute of Science and Technology

Degree: Master

Major: Department of Industrial Engineering
