Reading and writing Fortran direct-access unformatted files with different compilers
I have a section in a program that writes a direct-access binary file as follows:
open (53, file=filename, form='unformatted', status='unknown',
& access='direct',action='write',recl=320*385*8)
write (53,rec=1) ulat
write (53,rec=2) ulng
close(53)
This program is compiled with ifort. However, I cannot reconstruct the data correctly if I read the data file from a different program compiled with gfortran. If the program reading the data is also compiled with ifort, then I can reconstruct the data correctly. Here's the code reading the data file:
OPEN(53, FILE=fname, form="unformatted", status="unknown", access="direct", action="read", recl=320*385*8)
READ(53,REC=2) DAT
I do not understand why this is happening. I can read the first record correctly with both compilers; it's the second record that I cannot reconstruct properly if I mix the compilers.
Solution 1:
Ifort and gfortran do not use the same unit for record length by default. In ifort, the value of recl in your open statement is counted in 4-byte blocks, so your record length isn't 985,600 bytes; it is 3,942,400 bytes. That means the records are written at intervals of about 3.9 million bytes.
gfortran uses a recl block size of 1 byte, so there your record length is 985,600 bytes. When you read the first record, everything works because both compilers start it at offset 0. When you read the second record, gfortran looks 985,600 bytes into the file, but the data is 3,942,400 bytes into the file. This also means you are wasting a lot of space in the file, since the data occupies only 1/4 of its size.
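The arithmetic behind the mismatch can be sketched as a small program. This is only an illustration: the program name and variable names are made up here, and the unit sizes (1 byte and 4 bytes) are the compiler defaults described above.

```fortran
program record_offsets
  use iso_fortran_env, only: int64
  implicit none
  ! recl value taken from the open statements in the question
  integer(int64), parameter :: recl_value = 320_int64 * 385_int64 * 8_int64
  ! bytes per recl unit: 1 (gfortran default), 4 (ifort default)
  integer(int64), parameter :: unit_bytes(2) = [1_int64, 4_int64]
  integer(int64) :: rec
  integer :: k

  do k = 1, size(unit_bytes)
    do rec = 1, 2
      ! record n starts at (n-1) * recl * (bytes per recl unit)
      print '(a,i0,a,i0,a,i0)', 'unit bytes = ', unit_bytes(k), &
            ', record ', rec, ' starts at byte ', &
            (rec - 1) * recl_value * unit_bytes(k)
    end do
  end do
end program record_offsets
```

Record 1 starts at byte 0 in both cases, which is why the first read succeeds, while record 2 starts at byte 985,600 with 1-byte units and at byte 3,942,400 with 4-byte units.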
There are a couple ways to fix this:
- In ifort, specify recl in 4-byte blocks, e.g. 320*385*2 instead of 320*385*8.
- In ifort, use the compile flag -assume byterecl to have recl values interpreted as bytes.
- In gfortran, compensate for the size and use recl=320*385*32 so that your reads are correctly positioned.
A better way, however, is to make the code agnostic to the recl unit size. You can use inquire to find the record length an array needs, in whatever unit the compiler uses. For example:
real(kind=wp), allocatable, dimension(:,:) :: recltest
integer :: reclen
allocate(recltest(320,385))
inquire(iolength=reclen) recltest
deallocate(recltest)
...
open (53, file=filename, form='unformatted', status='unknown', &
access='direct', action='write', recl=reclen)
...
OPEN(53, FILE=fname, form="unformatted", status="unknown", &
access="direct", action="read", recl=reclen)
This will set reclen to the value needed to store a 320x385 array, based on that compiler's base unit for record length. If you use this when both writing and reading, your code will work with both compilers without having to use compile-time flags in ifort or compensate with hardcoded recl differences between compilers.
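Put together, a complete round trip might look like the following sketch. It assumes 8-byte reals (real64), which matches the *8 in the original recl but is not stated in the question. The file name recl_demo.bin, the fill values, and the array names other than ulat, ulng and dat are placeholders.

```fortran
program portable_recl
  use iso_fortran_env, only: real64
  implicit none
  real(real64), allocatable :: ulat(:,:), ulng(:,:), dat(:,:)
  integer :: u, reclen

  allocate(ulat(320,385), ulng(320,385), dat(320,385))
  ulat = 1.0_real64   ! placeholder data
  ulng = 2.0_real64

  ! Ask the compiler for the record length in its own recl units
  inquire(iolength=reclen) ulat

  ! Writer side
  open(newunit=u, file='recl_demo.bin', form='unformatted', status='replace', &
       access='direct', action='write', recl=reclen)
  write(u, rec=1) ulat
  write(u, rec=2) ulng
  close(u)

  ! Reader side: same inquire result, so record 2 is found where it was written
  open(newunit=u, file='recl_demo.bin', form='unformatted', status='old', &
       access='direct', action='read', recl=reclen)
  read(u, rec=2) dat
  close(u)

  if (all(dat == ulng)) then
    print *, 'record 2 read back correctly'
  else
    print *, 'record 2 mismatch'
  end if
end program portable_recl
```

In a real setup the writer and reader would be separate programs, each doing its own inquire on an array of the same type and shape.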
An illustrative example
Testcase 1
program test
use iso_fortran_env
implicit none
integer(kind=int64), dimension(5) :: array
integer :: io_output, reclen, i
reclen = 5*8 ! 5 elements of 8 byte integers.
open(newunit=io_output, file='output', form='unformatted', status='new', &
access='direct', action='write', recl=reclen)
array = [(i,i=1,5)]
write (io_output, rec=1) array
array = [(i,i=101,105)]
write (io_output, rec=2) array
array = [(i,i=1001,1005)]
write (io_output, rec=3) array
close(io_output)
end program test
This program writes an array of 5 8-byte integers 3 times to the file in records 1,2 and 3. The array is 5*8 bytes and I have hardcoded that number as the recl value.
Testcase 1 with gfortran 5.2
I compiled this testcase with the command line:
gfortran -o write-gfortran write.f90
This produces the output file (interpreted with od -A d -t d8):
0000000 1 2
0000016 3 4
0000032 5 101
0000048 102 103
0000064 104 105
0000080 1001 1002
0000096 1003 1004
0000112 1005
0000120
The arrays of 5 8-byte elements are packed contiguously into the file, and record number 2 (101 ... 105) starts where we would expect it to, at offset 40, which is the recl value given in the code, 5*8.
Testcase 1 with ifort 16
This is compiled similarly:
ifort -o write-ifort write.f90
And this, for the exact same code, produces the output file (interpreted with od -A d -t d8):
0000000 1 2
0000016 3 4
0000032 5 0
0000048 0 0
*
0000160 101 102
0000176 103 104
0000192 105 0
0000208 0 0
*
0000320 1001 1002
0000336 1003 1004
0000352 1005 0
0000368 0 0
*
0000480
The data is all there, but the file is full of 0-valued elements. The lines starting with * indicate that every line between the offsets is 0. Record number 2 starts at offset 160 instead of 40. Notice that 160 is 40*4, where 40 is our specified recl of 5*8. By default ifort uses 4-byte blocks, so a recl of 40 means a physical record size of 160 bytes.
If code compiled with gfortran were to read this with recl=40, records 2, 3 and 4 would contain all 0 elements, and a read of record 5 would return the array written as record 2 by ifort. An alternative that lets gfortran read record 2 where it lies in the file is to use recl=160 (5*8*4) so that the physical record size matches what was written by ifort.
Another consequence of this is wasted space. Over-specifying the recl means you are using 4 times the necessary disk space to store your records.
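If you are stuck with a padded file like this and cannot regenerate it, one more option (not tested above, so treat it as a sketch) is to read it with stream access and position the read by byte offset. This assumes the file storage unit for pos is one byte, which I believe holds for both gfortran and ifort, and that the direct-access file contains no record markers, as the dumps above suggest. The file name padded_demo.bin and the 101..., 201..., 301... fill values are made up. The first half only builds a stand-in for the ifort output, with 40 bytes of data in 160-byte records.

```fortran
program stream_read_record
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: rec_bytes = 160   ! physical record size seen in the ifort dump
  integer(int64) :: array(5), check(5)
  integer :: u, reclen, i, j

  ! Stand-in for the ifort-written file. Four times the inquired length is
  ! 160 bytes whether the compiler counts recl in bytes or in 4-byte blocks.
  inquire(iolength=reclen) array
  open(newunit=u, file='padded_demo.bin', form='unformatted', status='replace', &
       access='direct', action='write', recl=4*reclen)
  do i = 1, 3
    array = [(int(100*i + j, int64), j = 1, 5)]
    write(u, rec=i) array
  end do
  close(u)

  ! Read record 2 by byte position. pos is 1-based, so record n starts
  ! at (n-1)*rec_bytes + 1.
  open(newunit=u, file='padded_demo.bin', form='unformatted', status='old', &
       access='stream', action='read')
  read(u, pos=(2 - 1)*rec_bytes + 1) check
  close(u)

  print '(5(i0,1x))', check
end program stream_read_record
```

The read should return the five values written as record 2 (201 through 205 here), regardless of which recl unit the reading compiler uses, because stream access bypasses recl entirely.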
Testcase 1 with ifort 16 and -assume byterecl
This was compiled as:
ifort -assume byterecl -o write-ifort write.f90
And produces the output file:
0000000 1 2
0000016 3 4
0000032 5 101
0000048 102 103
0000064 104 105
0000080 1001 1002
0000096 1003 1004
0000112 1005
0000120
This produces the file as expected. The command line argument -assume byterecl tells ifort to interpret any recl values as bytes rather than double words (4-byte blocks). This will produce writes and reads that match code compiled with gfortran.
Testcase 2
program test
use iso_fortran_env
implicit none
integer(kind=int64), dimension(5) :: array
integer :: io_output, reclen, i
inquire(iolength=reclen) array
print *,'Using recl=',reclen
open(newunit=io_output, file='output', form='unformatted', status='new', &
access='direct', action='write', recl=reclen)
array = [(i,i=1,5)]
write (io_output, rec=1) array
array = [(i,i=101,105)]
write (io_output, rec=2) array
array = [(i,i=1001,1005)]
write (io_output, rec=3) array
close(io_output)
end program test
The only difference in this testcase is that I am inquiring the proper recl to represent my 40-byte array (5 8-byte integers).
The output
gfortran 5.2:
Using recl= 40
ifort 16, no options:
Using recl= 10
ifort 16, -assume byterecl:
Using recl= 40
We see that with the 1-byte blocks used by gfortran, and by ifort with the byterecl assumption, recl is 40, which equals the size of our 40-byte array. We also see that by default ifort uses a recl of 10, which means 10 4-byte blocks or 10 double words, both of which mean 40 bytes. All three of these builds produce identical file output, and reads and writes from either compiler will function properly.
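The same inquire result can also tell you which unit a given build is using, which is handy when debugging a mismatch. This is a small sketch with made-up names. It relies on the storage_size intrinsic (Fortran 2008) to get the array size in bits.

```fortran
program recl_unit_probe
  use iso_fortran_env, only: int64
  implicit none
  integer(int64) :: array(5)
  integer :: reclen, nbytes

  ! Record length for the array, in the compiler's recl units
  inquire(iolength=reclen) array

  ! Size of the same array in bytes (storage_size returns bits per element)
  nbytes = size(array) * storage_size(array) / 8

  print '(a,i0)', 'bytes per recl unit: ', nbytes / reclen
end program recl_unit_probe
```

Given the recl values reported above, this should print 1 for gfortran and for ifort with -assume byterecl, and 4 for ifort with no options.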
Summary
To have record-based, unformatted, direct-access data be portable between ifort and gfortran, the easiest option is to just add -assume byterecl to the flags used by ifort. You really should have been doing this already, since you are specifying record lengths in bytes, so this would be a straightforward change that probably has no consequences for you.
The other alternative is to not worry about the option and use the inquire statement to query the iolength for your array.