2.13. Parallelization with MPI
Particularly for larger simulation times and system sizes, tkwant calculations are computationally intensive. To keep the actual simulation time manageable, tkwant is parallelized with the Message Passing Interface (MPI). With MPI, tkwant can run in parallel, e.g. on a local computer or on a cluster.
Parallel programming, and MPI in particular, is a vast subject that this tutorial cannot cover; we refer to the dedicated material that can be found on the web. This tutorial does explain, however, the very basic concepts that are sufficient to run tkwant simulations in parallel without deeper knowledge of MPI. This is possible because the compute-intensive routines of tkwant are natively MPI parallelized, such that only minor changes to a simulation script are required. We explain them in the following.
2.13.1. Running code in parallel with MPI
As an example, let us focus on the tkwant example script fabry_perot.py.
To execute this script on 8 cores, the script must be called with the following command:
mpirun -n 8 python3 fabry_perot.py
Calling the script with:
python3 fabry_perot.py
will run it in standard serial mode.
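To check that a script is indeed executed by several MPI processes, one can print the rank and the total number of ranks from within the script. A minimal sketch using tkwant's MPI helper (described in more detail in the MPI communicator section below):
import tkwant

# each MPI process (rank) executes the entire script
comm = tkwant.mpi.get_communicator()
print('running as rank {} of {}'.format(comm.rank, comm.size))
When started with mpirun -n 8, this message is printed eight times, once per rank, whereas a serial run prints it only once.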
2.13.2. Enabling output only from the MPI root rank
Running a script with the prefix mpirun -n x will execute the entire script x times. While tkwant is designed to benefit from this parallelization, any output in the script, such as printing, plotting or writing to a file, will also be repeated x times. This is impractical, as e.g. a print() call in the simulation script would print the information x times instead of only once. Moreover, not all of the x parallel runs are equivalent: the result of a calculation with the tkwant solvers is returned by default only on one MPI rank, the so-called master or root rank, which has rank index 0. It is sufficient, however, to add a few additional lines of code that redirect all plotting and printing to the MPI root rank, such that serial and parallel execution lead to the same result.
As an example, we look again at the script fabry_perot.py, where this redirection is implemented with a few additional lines of code. For plotting and saving the result, the following block of code can be used:
import tkwant

comm = tkwant.mpi.get_communicator()

def am_master():
    return comm.rank == 0

# do the actual tkwant calculation

if am_master():
    # plot or save result
    pass
Similarly, a message can be printed only by the master rank with the following lines of code:
import sys

def print_master(*args, **kwargs):
    if am_master():
        print(*args, **kwargs)
        sys.stdout.flush()
print_master('this message is printed only by the master rank')
this message is printed only by the master rank
Note the flush command to prevent buffering of the messages.
2.13.3. MPI communicator
The following information is not relevant for tkwant users, but is intended for tkwant developers working with MPI.
Tkwant automatically initializes the MPI communicator if needed. The function mpi.get_communicator() returns tkwant's global MPI communicator, which is used by all routines by default:
import tkwant
comm = tkwant.mpi.get_communicator()
print('rank={}, size={}'.format(comm.rank, comm.size))
rank=0, size=1
comm is basically a copy of the MPI COMM_WORLD communicator.
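Since comm behaves like a standard mpi4py communicator, the usual mpi4py operations can be applied to it. As a minimal sketch (the broadcast of a plain Python dictionary here is only an illustration, not a tkwant API), an object that exists on the root rank only can be distributed to all other ranks:
import tkwant

comm = tkwant.mpi.get_communicator()

# the object exists on the root rank only
data = {'result': 42} if comm.rank == 0 else None

# broadcast it from rank 0 to all other ranks
data = comm.bcast(data, root=0)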
If tkwant is to be used as an external library with a different MPI communicator, the routine mpi.communicator_init() allows one to change the default communicator:
import tkwant
from mpi4py import MPI
my_comm = MPI.COMM_WORLD
tkwant.mpi.communicator_init(my_comm)
WARNING:tkwant.mpi:56:rank=0: tkwant MPI communicator cannot be reinitialized.
Note that the MPI communicator must be set after importing the tkwant module and before executing any tkwant code.
2.13.4. Multi-threading
Tkwant does not support multi-threading such as OpenMP. The environment variable OMP_NUM_THREADS must be set to one:
export OMP_NUM_THREADS=1
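Alternatively, and only as a sketch of one possible approach, the variable can also be set from within Python, provided this happens before the numerical libraries are imported:
import os

# must be set before importing numpy, scipy or tkwant
os.environ['OMP_NUM_THREADS'] = '1'

import tkwant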
2.13.5. Examples
The example scripts fabry_perot.py and voltage_raise.py can both be executed in parallel using MPI.
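Analogous to the Fabry-Perot example above, a parallel run of the second script on 8 cores would be started with:
mpirun -n 8 python3 voltage_raise.py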