OpenMP, OpenMP/MPI, and CUDA/MPI C programs for solving the time-dependent dipolar Gross-Pitaevskii equation

by Vladimir Loncar, et al.

We present new versions of the previously published C and CUDA programs for solving the dipolar Gross-Pitaevskii equation in one, two, and three spatial dimensions, which calculate stationary and non-stationary solutions by propagation in imaginary or real time. The presented programs are improved and parallelized versions of the previous programs, divided into three packages according to the type of parallelization. The first package contains improved and threaded versions of the sequential C programs, using OpenMP. The second package additionally parallelizes the three-dimensional variants of the OpenMP programs using MPI, allowing them to be run on distributed-memory systems. Finally, the previous three-dimensional CUDA-parallelized programs are further parallelized using MPI, in the same way as the OpenMP programs. We also present speedup test results obtained with the new versions of the programs in comparison with the previous sequential C and parallel CUDA programs. The improvements to the sequential version yield a speedup of 1.1 to 1.9, depending on the program. OpenMP parallelization yields a further speedup of 2 to 12 on a 16-core workstation, while the OpenMP/MPI version demonstrates a speedup of 11.5 to 16.5 on a computer cluster using 32 nodes. The CUDA/MPI version shows a speedup of 9 to 10 on a computer cluster with 32 nodes.
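
For readers comparing the figures quoted above, the following is the conventional definition of speedup and parallel efficiency, given as a reading aid rather than as the authors' stated methodology. The symbols T_ref, T_new, S, E, and N are introduced here for illustration and do not appear in the abstract, and the abstract does not say which run serves as the reference for each figure.

\[
S = \frac{T_{\mathrm{ref}}}{T_{\mathrm{new}}}, \qquad E = \frac{S}{N}
\]

Here T_ref is the execution time of the reference run, T_new is the execution time of the improved or parallelized run, and N is the number of processing units (cores or nodes) used by the new run. As a purely illustrative calculation, if the reference for the OpenMP/MPI figure were a single cluster node (an assumption), the upper value S = 16.5 on N = 32 nodes would correspond to E = 16.5 / 32 ≈ 0.52.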







V. L., S. Š., and A. B. acknowledge support by the Ministry of Education, Science, and Technological Development of the Republic of Serbia under projects ON171017, OI1611005, and III43007, as well as SCOPES project IZ74Z0-160453. L. E. Y.-S. acknowledges support by the FAPESP of Brazil under projects 2012/21871-7 and 2014/16363-8. P. M. acknowledges support by the Science and Engineering Research Board, Department of Science and Technology, Government of India under project No. EMR/2014/000644. S. K. A. acknowledges support by the CNPq of Brazil under project 303280/2014-0, and by the FAPESP of Brazil under project 2012/00451-0.