Posts Tagged ‘MPI’


Charm/Charm++ programming language

December 28, 2010

I have recently started developing a numerical solver using the Charm++ package. Charm++ is designed to abstract parallel computing from the typical MPI paradigm. The programming language is based on asynchronous communication and the construction of multiple objects per processor core (‘chares’). The system implements dynamic load balancing via the migration of chares across processors.

The paradigm itself is efficient and powers some applications which scale to extreme levels (e.g. NAMD & OpenAtom) and has a number of very useful features (visualisation, profiling & debugging), however, it may be a mess to start with for the developer.

The manual on the charm++ website is a great start along with the examples within the charm source directory. Additional resources include the webcasts of tutorial lectures. However the language desperately needs a clear API to support development using charm++.

An API will make development easier especially when using associated libraries and their limitations (i.e. C++ STL support and Barrier support).

More to come as I get further along.


mpi4py parallel IO example

September 23, 2010

For about 9 months I have  been running python jobs in parallel using mpi4py and NumPy. I had to write a new algorithm with MPI  so I decided to do the IO in parallel. Below is a small example of reading data in parallel. Mpi4py is lacking examples. It is not pretty, however, it does work.

import mpi4py.MPI as MPI
import numpy as np
class Particle_parallel():
    """ Particle_parallel - distributed reading of x-y-z coordinates.

    Designed to split the vectors as evenly as possible except for rounding
    ont the last processor.

    File format:
        32bit int :Data dimensions which should = 3
        32bit int :n_particles
        64bit float (n_particles) : x-coordinates
        64bit float (n_particles) : y-coordinates
        64bit float (n_particles) : z-coordinates
    def __init__(self, file_name,comm):
        self.comm = comm
        self.rank = self.comm.Get_rank()
        self.size = self.comm.Get_size()
        self.data_type_size = 8
        self.mpi_file = MPI.File.Open(self.comm, file_name)
        self.data_dim = np.zeros(1, dtype = np.dtype('i4'))
        self.n_particles = np.zeros(1, dtype = np.dtype('i4'))
        self.file_name = file_name
        self.debug = True

    def info(self):
        """ Distrubute the required information for reading to all ranks.

        Every rank must run this funciton.
        Each machine needs data_dim and n_particles.
        # get info on all machines
        self.mpi_file.Read_all([self.data_dim, MPI.INT])
        self.mpi_file.Read_all([self.n_particles, MPI.INT])
        self.data_start = self.mpi_file.Get_position()
    def read(self):
        """ Read data and return the processors part of the coordinates to:
        assert self.data_dim != 0
        # First establish rank's vector sizes
        default_size = np.ceil(self.n_particles / self.size)
        # Rounding errors here should not be a problem unless
        # default size is very small
        end_size = self.n_particles - (default_size * (self.size - 1))
        assert end_size >= 1
        if (self.rank == (self.size - 1)):
            self.proc_vector_size = end_size
            self.proc_vector_size = default_size
        # Create individual processor pointers
        x_start = int(self.data_start + self.rank * default_size *
        y_start = int(self.data_start + self.rank * default_size *
                self.data_type_size +  self.n_particles *
                self.data_type_size * 1)
        z_start = int(self.data_start + self.rank * default_size *
                self.data_type_size + self.n_particles *
                self.data_type_size * 2)
        self.x_proc = np.zeros(self.proc_vector_size)
        self.y_proc = np.zeros(self.proc_vector_size)
        self.z_proc = np.zeros(self.proc_vector_size)
        # Seek to x
        if self.debug:
            print 'MPI Read'
        self.mpi_file.Read([self.x_proc, MPI.DOUBLE])
        if self.rank:
            print 'MPI Read done'
        self.mpi_file.Read([self.y_proc, MPI.DOUBLE])
        self.mpi_file.Read([self.z_proc, MPI.DOUBLE])
        return self.x_proc, self.y_proc, self.z_proc
    def Close(self):

Python for scientific computing

March 27, 2010

I am currently undertaking a Ph.D. where i am researching blood flow using an academic computational fluid dynamics (CFD) code (Viper). Like many numerical investigations the pre-processing and post-processing is as important as the algorithm itself. Up until late early this year MATLAB was my language of choice for processing data, however, I have recently embraced Python (particularly NumPy) as a fantastic alternative.

Matlab is a nice clean language for dealing with vectors, however, even with the alternative of octave the vendor lock-in becomes terrible when you want to scale up your code for production runs on clusters. Python in contrast is completely free and scientific computation implemented by several  fantastic open source packages such as SciPy and  NumPy which easily duplicate the core functionality of Matlab in a pythonic, object orientated fashion. Additionally unlike Matlab the overhead of starting an interpreter is small and Python is almost universally available on *nix systems.

Yet the numerical capacity of Python is not the primary reason I have come to like the language. The core features of the language such as:

  • Clean code and indentation based control
  • Fantastic support for file I/O
  • Huge standard library including integrated support for debugging
  • A fantastic community with plenty of practical resources and superb documentation

All said the transition (despite some moments of pain) has sharpened my programming and extended my abilities appreciably (including writing my first decent MPI code).