<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="https://scholarworks.unist.ac.kr/handle/201301/45">
    <title>Repository Collection</title>
    <link>https://scholarworks.unist.ac.kr/handle/201301/45</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="https://scholarworks.unist.ac.kr/handle/201301/91321" />
        <rdf:li rdf:resource="https://scholarworks.unist.ac.kr/handle/201301/91320" />
        <rdf:li rdf:resource="https://scholarworks.unist.ac.kr/handle/201301/91319" />
        <rdf:li rdf:resource="https://scholarworks.unist.ac.kr/handle/201301/91318" />
      </rdf:Seq>
    </items>
    <dc:date>2026-05-13T10:46:03Z</dc:date>
  </channel>
  <item rdf:about="https://scholarworks.unist.ac.kr/handle/201301/91321">
    <title>Optimal Coasting Time Determination of a Multi-stage Interceptor Considering Engagement Zone</title>
    <link>https://scholarworks.unist.ac.kr/handle/201301/91321</link>
    <description>Title: Optimal Coasting Time Determination of a Multi-stage Interceptor Considering Engagement Zone
Author(s): Na, Hyungho; Sung, Taehyun; Ahn, Jaemyung
Abstract: This paper proposes a methodology to optimally determine the coasting time of a multi-stage interceptor, considering the engagement zone. Proper coasting time determination is critical for a multi-stage interceptor to extend its engagement boundaries and to engage a target at a specified engagement point at the estimated impact time. Hence, we first define the optimization problem of determining multiple coasting times for a multi-stage interceptor, considering both the radar detection range and the potential homing performance. An analytic formulation for generalized coasting time determination is derived by introducing the ratio of each coasting time to a reference coasting time. With this coasting ratio, the original optimal coasting time determination problem can be simplified to an alternative problem of finding the optimal ratio and the reference coasting time. In addition, considering practical implementation, we present a bi-level approach that utilizes the solution of the dual problem to solve the alternative problem. Various case studies are carried out to evaluate the proposed method and to show its effectiveness and validity.</description>
    <dc:date>2023-01-25T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholarworks.unist.ac.kr/handle/201301/91320">
    <title>Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning</title>
    <link>https://scholarworks.unist.ac.kr/handle/201301/91320</link>
    <description>Title: Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning
Author(s): Na, Hyungho; Seo, Yunkyeong; Moon, Il-Chul
Abstract: In cooperative multi-agent reinforcement learning (MARL), agents aim to achieve a common goal, such as defeating enemies or scoring a goal. Existing MARL algorithms are effective but still require significant learning time and often get trapped in local optima on complex tasks, subsequently failing to discover a goal-reaching policy. To address this, we introduce Efficient episodic Memory Utilization (EMU) for MARL, with two primary objectives: (a) accelerating reinforcement learning by leveraging semantically coherent memory from an episodic buffer and (b) selectively promoting desirable transitions to prevent convergence to local optima. To achieve (a), EMU incorporates a trainable encoder/decoder structure alongside MARL, creating coherent memory embeddings that facilitate exploratory memory recall. To achieve (b), EMU introduces a novel reward structure called the episodic incentive, based on the desirability of states. This reward improves the TD target in Q-learning and acts as an additional incentive for desirable transitions. We provide theoretical support for the proposed incentive and demonstrate the effectiveness of EMU compared to conventional episodic control. The proposed method is evaluated on StarCraft II and Google Research Football, and empirical results indicate further performance improvement over state-of-the-art methods. Our code is available at: https://github.com/HyunghoNa/EMU.</description>
    <dc:date>2024-05-07T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholarworks.unist.ac.kr/handle/201301/91319">
    <title>LAGMA: LAtent Goal-guided Multi-agent Reinforcement Learning</title>
    <link>https://scholarworks.unist.ac.kr/handle/201301/91319</link>
    <description>Title: LAGMA: LAtent Goal-guided Multi-agent Reinforcement Learning
Author(s): Na, Hyungho; Moon, Il-Chul
Abstract: In cooperative multi-agent reinforcement learning (MARL), agents collaborate to achieve common goals, such as defeating enemies or scoring a goal. However, learning goal-reaching paths toward such a semantic goal takes a considerable amount of time in complex tasks, and the trained model often fails to find such paths. To address this, we present LAtent Goal-guided Multi-Agent reinforcement learning (LAGMA), which generates a goal-reaching trajectory in latent space and provides a latent goal-guided incentive for transitions toward this reference trajectory. LAGMA consists of three major components: (a) a quantized latent space constructed via a modified VQ-VAE for efficient sample utilization, (b) goal-reaching trajectory generation via an extended VQ codebook, and (c) latent goal-guided intrinsic reward generation to encourage transitions toward the sampled goal-reaching path. The proposed method is evaluated on StarCraft II, with both dense and sparse reward settings, and on Google Research Football. Empirical results show further performance improvement over state-of-the-art baselines.</description>
    <dc:date>2024-07-21T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholarworks.unist.ac.kr/handle/201301/91318">
    <title>Trajectory-Class-Aware Multi-agent Reinforcement Learning</title>
    <link>https://scholarworks.unist.ac.kr/handle/201301/91318</link>
    <description>Title: Trajectory-Class-Aware Multi-agent Reinforcement Learning
Author(s): Na, Hyungho; Lee, Kwanghyeon; Lee, Sumin; Moon, Il-Chul
Abstract: In the context of multi-agent reinforcement learning, generalization is the challenge of solving various tasks that may require different joint policies or coordination, without relying on policies specialized for each task. We refer to this type of problem as multi-task, and we train agents to be versatile in this multi-task setting through a single training process. To address this challenge, we introduce TRajectory-class-Aware Multi-Agent reinforcement learning (TRAMA). In TRAMA, agents recognize a task type by identifying the class of trajectories they are experiencing through partial observations, and the agents use this trajectory awareness or prediction as additional information for their action policies. To this end, we introduce three primary objectives in TRAMA: (a) constructing a quantized latent space to generate trajectory embeddings that reflect key similarities among them; (b) conducting trajectory clustering using these trajectory embeddings; and (c) building a trajectory-class-aware policy. Specifically for (c), we introduce a trajectory-class predictor that performs agent-wise predictions of the trajectory class, and we design a trajectory-class representation model for each trajectory class. Each agent takes actions based on this trajectory-class representation along with its partial observation for task-aware execution. The proposed method is evaluated on various tasks, including multi-task problems built upon StarCraft II. Empirical results show further performance improvements over state-of-the-art baselines.</description>
    <dc:date>2025-04-24T15:00:00Z</dc:date>
  </item>
</rdf:RDF>

