Python and VTK

September 8, 2010

I have recently been working on using data gathered in vitro as the geometric basis for some computational fluid dynamics (CFD) simulations I am running. The simulations are solved using OpenFOAM, so I import the geometry as a series of .STL files.

The idea is that the data provided to me describes where the solid–fluid boundary is. From this I should be able to generate an .STL surface. The most reliable way I have found to do this, given the data I am provided, is to generate a volume in which the solid and fluid phases are distinguishable. This allows an iso-surface (and from this an .STL) to be generated.

I can employ Tecplot or ParaView to do this, assuming I have an appropriate data file. Rather than painstakingly duplicate the VTK data format IO for ParaView, I decided to use the VTK Python bindings and generate the files, and later the contours, myself.

VTK is an excellent tool. The Python bindings are comprehensive and, despite the package size, I managed to get things moving without too much trouble. The interface to NumPy arrays lets it work nicely with any Python-based calculations I had. The error messages were informative and the Doxygen documentation has decent descriptions for many classes. All of the classes even have help available in the interpreter, although this is somewhat hidden (you need to use dir to get the available functions and ask for help on each of them individually).
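The dir-then-help routine can be wrapped in a few lines. The following is only a sketch: summarize_methods is a hypothetical helper name, and it is demonstrated on the built-in dict so that it runs without VTK installed. With VTK available, one would presumably pass a class such as vtk.vtkImageData and a prefix such as "Set".

```python
import inspect


def summarize_methods(cls, prefix=""):
    """Map each public callable of cls to the first line of its docstring.

    Hypothetical helper: it simply automates calling dir() and then
    reading the help text of each entry one at a time.
    """
    summary = {}
    for name in dir(cls):
        # Skip private/special names and anything outside the requested prefix.
        if name.startswith("_") or not name.startswith(prefix):
            continue
        member = getattr(cls, name)
        if not callable(member):
            continue
        doc = inspect.getdoc(member) or ""
        summary[name] = doc.splitlines()[0] if doc else ""
    return summary


# Stand-in class so the sketch runs anywhere; swap in a VTK class if installed.
for name, first_line in sorted(summarize_methods(dict, prefix="se").items()):
    print(name + ": " + first_line)
```

For the full text of a single entry, help(cls.method_name) in the interpreter is still the quickest route.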

The downsides: Python examples are sparse compared with C++ / Tcl, and some of the classes have very similar functions with slightly unpredictable behaviour, e.g.

# vtk_data is of type vtk.vtkImageData()

im_FFT = vtk.vtkImageFFT()



The examples (and to some extent the book) presume that you have a compatible data file to start with. There are no examples of how to bring in large quantities of data from another part of a program.

Recommendations: Get yourself a copy of the VTK book for the first few days of working with VTK. It introduces concepts in a straightforward manner and increased my understanding substantially. Once you are familiar with VTK it is no longer required.

Next project: a Tecplot (.plt) to .vt* converter. I have a very limited version working, however, it requires work to be robust.


Homebrew package manager for Mac OS

August 15, 2010

Homebrew is a new package manager for Mac OS rivalling Fink and MacPorts. Based on Git, it is extensible and flexible. Critically, it focuses on minimising the massive duplication of libraries typically observed within Fink and MacPorts.

Conceptually the idea is great, however, it is immature when compared with Fink or MacPorts. Primarily this is due to the limited number of packages available. However, I do want to point out that sometimes it is necessary to have a different version of a piece of software from the system default.

An example is gcc. While it is a pain to install (it takes a LOOONG time to compile) I need 4.3 for some software, and Snow Leopard comes with 4.2. I have a similar issue with Open MPI.

Package duplication is a problem, however, sometimes there is no way to avoid it. Minimisation is the key and I think Homebrew has gone one step too far. With a little more sophistication it may work very well.


Failing Bourne/Bash line continuations

April 8, 2010

When writing any code, line continuations are necessary to keep the line width reasonable. The Unix shell is not my favourite programming environment, however, I wanted to wrap an extremely long line from a colleague's script.

For example

# Without line continuation
my extremely long code line featuring many long file names and directories
# With line continuation
my extremely long code line featuring \
many long file names and directories

This seems relatively simple, however, I ran into some difficulty. The first error was simple: the line continuation sequence is a \ followed immediately by an EOL character. It is easy to leave white space between the \ and the EOL character, which breaks the continuation.

The second problem is one many resources fail to mention. The newline character must be a UNIX EOL character, i.e. a bare LF (see Newline). Unfortunately the script had first been edited under Windows, which ends lines with CR LF. The \ was therefore followed by a CR rather than the LF, and the shell did not recognise the combination as a line continuation.

The fix: run the script through dos2unix or a similar tool.
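Both failure modes can be checked for mechanically. The following Python sketch flags them in a script held as a string; find_continuation_problems and to_unix_newlines are hypothetical helper names, and the latter only approximates what dos2unix does.

```python
import re


def find_continuation_problems(text):
    """Return (line_number, problem) pairs for suspicious line continuations."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if line.endswith("\\\r"):
            # The backslash escapes the CR, not the LF that ends the line.
            problems.append((number, "backslash followed by CR (Windows line ending)"))
        elif re.search(r"\\[ \t]+\r?$", line):
            problems.append((number, "white space after backslash"))
    return problems


def to_unix_newlines(text):
    """Convert CR LF line endings to LF, roughly what dos2unix does."""
    return text.replace("\r\n", "\n")


# Made-up script text: line 1 has a Windows line ending after the backslash,
# line 3 has trailing spaces after the backslash.
script = (
    "cp a_long_file_name \\\r\n"
    "   another_long_file_name\r\n"
    "echo done \\  \n"
    "  again\n"
)

for number, problem in find_continuation_problems(script):
    print("line", number, "-", problem)
```

When reading a real script from disk, open it with newline="" (or in binary mode) so that Python does not silently translate the CR LF pairs before they can be checked. Converting the line endings fixes the first problem only; trailing white space still has to be removed by hand.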


Binary data and Python: Just use NumPy!

April 6, 2010

To post-process some CFD data, I have manipulated binary files generated for Tecplot with Python. The challenge here is how to import large vectors of binary numbers into NumPy ndarrays while processing binary metadata.

This task appears straightforward as shown below:

import numpy as np
import struct

file_in = 'strange_binary_format.dat'
fd = open(file_in,'rb')
# buffer data -- this is bulky
buffer = fd.read()
# read 1000 doubles from the buffer from byte "position" forward
position = 0
no_of_doubles = 1000
read_format = str(no_of_doubles) + 'd'
read_size = struct.calcsize(read_format)
# put data into numpy arrays . . this is very slow and memory intensive
# might be due to struct.unpack returning as a tuple of floats
numpy_data = np.array(struct.unpack(read_format,
                                    buffer[position:position + read_size]))

However, this method is inefficient (and possibly causes memory leaks!). The struct.unpack function returns a tuple of 1000 individual Python floats, which carries significant overhead. The result is both memory intensive and slow. A later attempt is shown below:

import numpy as np
import struct

file_in = 'strange_binary_format.dat'
fd = open(file_in,'rb')
position = 0
no_of_doubles = 1000
# move to position in file
fd.seek(position)
# straight to NumPy data (no intermediate buffer)
numpy_data = np.fromfile(fd, dtype = np.dtype('d'), count = no_of_doubles)

The NumPy function fromfile is significantly more efficient in terms of both time and memory.
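The two tools combine well: struct for the small pieces of metadata, fromfile for the bulk data. The sketch below uses a made-up file layout (a 4-byte tag and an int32 count, followed by that many little-endian doubles); it is not the Tecplot format, and write_demo_file and read_demo_file are hypothetical names.

```python
import os
import struct
import tempfile

import numpy as np

# Made-up header: 4-byte tag + int32 count, little-endian, no padding.
HEADER_FORMAT = "<4si"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def write_demo_file(path, values):
    """Write the made-up format: header, then the values as float64."""
    with open(path, "wb") as fd:
        fd.write(struct.pack(HEADER_FORMAT, b"DEMO", len(values)))
        np.asarray(values, dtype="<f8").tofile(fd)


def read_demo_file(path):
    """Read the header with struct, then the bulk data with np.fromfile."""
    with open(path, "rb") as fd:
        tag, count = struct.unpack(HEADER_FORMAT, fd.read(HEADER_SIZE))
        # fromfile continues from the current file position, so no manual
        # slicing of an in-memory buffer is needed.
        data = np.fromfile(fd, dtype="<f8", count=count)
    return tag, data


demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
write_demo_file(demo_path, np.linspace(0.0, 1.0, 1000))
tag, data = read_demo_file(demo_path)
print(tag, data.shape, data.dtype)
```

Spelling out the byte order in the dtype ("<f8" rather than "d") guards against surprises when the file was written on a machine with a different endianness.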

From this experience I have a rule for numerically intensive computing with Python: NumPy / SciPy functions will almost always be faster!


Jagungal wilderness

March 28, 2010

Jagungal wilderness area in the Kosciuszko national park, NSW, Australia.


Python for scientific computing

March 27, 2010

I am currently undertaking a Ph.D. in which I am researching blood flow using an academic computational fluid dynamics (CFD) code (Viper). As in many numerical investigations, the pre-processing and post-processing are as important as the algorithm itself. Up until early this year MATLAB was my language of choice for processing data, however, I have recently embraced Python (particularly NumPy) as a fantastic alternative.

MATLAB is a nice clean language for dealing with vectors, however, even with the alternative of Octave, the vendor lock-in becomes terrible when you want to scale up your code for production runs on clusters. Python, in contrast, is completely free, and scientific computation is implemented by several fantastic open source packages such as SciPy and NumPy, which easily duplicate the core functionality of MATLAB in a pythonic, object-oriented fashion. Additionally, unlike MATLAB, the overhead of starting an interpreter is small and Python is almost universally available on *nix systems.

Yet the numerical capacity of Python is not the primary reason I have come to like the language. That comes down to the core features of the language, such as:

  • Clean code and indentation based control
  • Fantastic support for file I/O
  • Huge standard library including integrated support for debugging
  • A fantastic community with plenty of practical resources and superb documentation

All said, the transition (despite some moments of pain) has sharpened my programming and extended my abilities appreciably (including writing my first decent MPI code).


NY, NY Photos

March 24, 2010

A few of my favourite photos from New York. They were all taken on my Canon 350D during my visit in November 2009.

City Hall Park

Firestation 10 WTC tribute

At Columbus circle

Battery City Esplanade

Jacqueline Kennedy Onassis reservoir looking towards upper east side

Turtle Pond - Central Park

Turtle Pond - Central Park

Brooklyn bridge

Brooklyn bridge

Brooklyn bridge

Brooklyn bridge

Brooklyn heights promenade

Brooklyn heights promenade

Brooklyn heights

Grand Central Station

Times Square

Times Square