The initial implementation of MCS on graphics processing unit using CUDA Fortran

Park, Jeehoon

Author(s): Park, Jeehoon

Advisor: Lee, Deokjung

Issued Date: 2024-02

URI: https://scholarworks.unist.ac.kr/handle/201301/82142 http://unist.dcollection.net/common/orgView/200000744162

Abstract: MCS is a neutron/photon transport Monte Carlo (MC) code written in Fortran and developed at the Computational Reactor Physics and Experiment (CORE) laboratory of the Ulsan National Institute of Science and Technology (UNIST) to perform highly precise multiphysics simulations of Pressurized Water Reactors (PWRs). Because MC simulations are computationally demanding, MCS has relied on Central Processing Unit (CPU) parallelization with a hybrid MPI/OpenMP scheme, which has shown excellent scalability. However, relying on CPUs is impractical because of their cost, their physical limitations, and the performance gap between processors and Dynamic Random-Access Memory (DRAM). To overcome these constraints and substantially improve computational efficiency, Graphics Processing Units (GPUs) have been pursued as an alternative. This thesis presents the initial development of MCS GPU, the MCS code implemented on GPU using CUDA Fortran, to accelerate MC transport simulation. Porting the original CPU-based code to the GPU requires modifying its procedures and transferring data from the CPU to the GPU. Subroutines and functions were converted to kernels or device procedures as appropriate. OpenACC directives were applied where derived-type variables had to be transferred and CUDA Fortran could not perform the transfer. The benchmark problems, including a depletion simulation and 4 criticality simulations, were solved on the GPU with 256 threads per block and on the CPU with 1, 2, and 4 cores. Using a single GPU, the GPU code calculated k-effective values with differences of less than 0.002 and was 3.569 times faster than the CPU code using 4 processes across the benchmark. The results demonstrate that an MC simulation on a GPU can outperform the same simulation on a CPU without much optimization while maintaining accuracy. With proper optimization techniques, the GPU code is expected to yield even better speedup.

Publisher: Ulsan National Institute of Science and Technology

Degree: Master

Major: Department of Nuclear Engineering
