Duy Le | Department of Physics | University of Central Florida

Read complex array from Fortran binary file with Python (06/14/2013)
- Today, I faced a tiny challenge when using Python to read complex arrays from a binary file that is generated by Fortran. Why do I need to use another language to read a binary file generated by Fortran? Why do I not use ASCII instead? The reasons are:
  
  (1) many of computational codes (such as PWSCF, VASP, Gaussian) that I use are written in Fortran
  (2) Python is my favorite language for post-processing tasks
  (3) unformatted binary files are chosen for saving disk space.
  
  I did google for help and found few useful posts, including the FortranFile class by Neil Martinsen-Burrell. None of these seemed to co-operate with me. I decided not to spend times to dig solutions out of the W3 but to find a workaround. My solution consists of 4 steps:
  
  (1) breaking the complex array into two equal-length real arrays
  (2) writing the two arrays in a direct access binary file
  (3) using numpy.fromfile to read the arrays
  (4) reconstructing the original complex array.
  
  Let's first see how to use Python read a real array in a binary file written by Fortran. I noted that numpy.fromfile could read correctly (only?) from a direct access file. Thus, the Fortran code should look like:
```
PROGRAM bin_real 

IMPLICIT NONE
INTEGER , PARAMETER :: DP = 8
REAL ( DP ) , ALLOCATABLE :: x ( : )
INTEGER :: n, i, reclength

! Generate real data for the array x
n = 100
ALLOCATE( x ( n ) )
DO i = 1, n
    x ( i ) = i**2 / 9._DP  ! Who cares?
END DO

INQUIRE ( IOLENGTH = reclength ) x
OPEN ( 100 , FILE = 'DATA' , STATUS = 'UNKNOWN' , &
       FORM = 'UNFORMATTED' , ACCESS = 'DIRECT' , RECL = reclength )
WRITE ( 100 , REC = 1 ) x
CLOSE ( 100 )

END PROGRAM bin_real
```
  The Python code that reads correctly the DATA file created above looks like:
```
import numpy as np

f = open ( 'DATA' , 'rb' )
x = np.fromfile ( f , dtype = np.float64 , count = -1 )
f.close()
```
  Note that the DP parameter and dtype must be compatible.
  
  The Fortran code below will write a complex array to DATA binary file as two real arrays:
```
PROGRAM bin_cmplx 

IMPLICIT NONE
INTEGER , PARAMETER :: DP = 8
COMPLEX ( DP ) , ALLOCATABLE :: wfc ( : )
REAL ( DP )   , ALLOCATABLE :: wfc_real ( : ) , wfc_imag ( : )
INTEGER :: n, i, reclength

n = 100000000
ALLOCATE( wfc ( n ) , wfc_real ( n ) , wfc_imag ( n ) )

! Generate data for the array wfc
DO i = 1, n
    wfc ( i ) = cmplx ( 2*i + 0._DP , 2*i+1._DP )  ! Who cares?
END DO

! Break complex array wfc into two real arrays
wfc_real ( : ) = real ( wfc ( : ) )
wfc_imag ( : ) = imag ( wfc ( : ) )

INQUIRE ( IOLENGTH = reclength ) wfc_real
OPEN ( 100 , FILE = 'DATA' , STATUS = 'UNKNOWN' , &
       FORM = 'UNFORMATTED' , ACCESS = 'DIRECT' , RECL = reclength )
WRITE ( 100 , REC = 1 ) wfc_real 
WRITE ( 100 , REC = 2 ) wfc_imag 
CLOSE ( 100 )

END PROGRAM bin_cmplx
```
  And the Python code:
```
import numpy as np
from datetime import datetime

f = open ( 'DATA' , 'rb' )
x = np.fromfile ( f , dtype = np.float64 , count = -1 )
f.close()

n = len ( x ) / 2

""" break x into two arrays    """
""" or reshape x to ( 2 , n )  """
x = x.reshape ( 2 , n )

""" Reconstruct the original complex array """
wfc = np.zeros ( [ n ] , dtype = np.complex )
wfc.real = x [ 0 ]
wfc.imag = x [ 1 ]
```
  will read the above two real arrays and reconstruct the original complex array.
  
  Final remarks:
  
  (1) Since numpy.fromfile can read all records at once (with count = -1), I wrote the two arrays as two records for convenience.
  (2) For the same reason, each binary file contains data of only one complex array should it be large.
  (3) On a poor AMD node, it took 0.07 and 9.35 seconds for handling a complex array of, respectively, 1 and 100 million elements.
  (4) I guess that there are many other ways to do the same job but I did not invest my whole day for finding solutions. The above method, however, works very consistently. Should you have/know any better method and want to share, do not hesitate to email me.