How to read data from external .txt file, store them in arrays, filter and write them in a new file in Fortran 90?

Let's get one thing out of the way before moving on to the next part: if this is simply a "filtering" task, treat it as a filtering task.

In Fortran 2018 this could be as simple as

  implicit none
  character(1234) line
  integer iostat, nchars

  do
    read (*,'(A)',iostat=iostat,size=nchars) line
    if (iostat.lt.0) exit
    if (KEEP_LINE) print *, line(:nchars)    ! Implement conditional
  end do
end program

(If your compiler isn't a Fortran 2018 compiler you will need to complicate this.) This program acts as a filter in the Unix-sense:

./program < input_file > output_file

For this question the filter would be something like "pass first two lines; pass later lines where the sixth column as numeric is less than 2". I'll leave the exact specification as an exercise, noting that we can do the job with

awk 'NR<3||$6<2' < input_file > output_file

Note that you can simply extract the sixth column without creating variables for each column - or you can note that it's the first column of line(52:).


That's the filtering out of the way. Let's look at how we can create a data structure and do something with it in the Fortran program.

As High Performance Mark commented, and veryreverie expanded on we can create a derived type for this "data table" (if all columns are the same data type we can possibly get away with just a rank-2 intrinsic type, although even in such cases a derived type helps):

  type asteroids_t
     integer :: num
     character(18) :: name
     integer :: epoch
     real :: a, e, i, w, node, m, h, g
     character(10) :: ref
  end type asteroids_t

(set the kind parameter of each component as desired, but probably double precision for the reals)

We have a format for the input and output:

  character(*), parameter :: FMT='(i6,a,i7,f10.7,f11.8,3f10.5,f12.7,2f6.2,a)'

(Note that we can't use list-directed formatting for the input, because the final column has a space in the character. Again, working around this is an exercise.)

Assuming we have an array appropriately sized (see general questions about reading files with unknown number of rows, or veryreverie's answer here for detail) we're good to go. For clarity here I'm going to use an explicit size.

  type(asteroids_t) asteroids(NUMBER_OF_ASTEROIDS)
  integer, allocatable :: pass(:)
  read FMT, asteroids
  ... ! Work, including setting pass for filter
  print FMT, asteroids(pass)

Putting that all together for a quick-and-dirty program:

  implicit none

  type asteroids_t
     integer :: num
     character(18) :: name
     integer :: epoch
     real(KIND(0d0)) :: a, e, i, w, node, m, h, g
     character(10) :: ref
  end type asteroids_t

  type(asteroids_t) :: asteroids(22)
  character(118) :: header(2)
  character(*), parameter :: FMT='(i6,a,i7,f10.7,f11.8,3f10.5,f12.7,2f6.2,a)'
  integer :: i
  
  read '(A)', header
  print '(A)', header

  read FMT, asteroids
  print FMT, asteroids(PACK([(i,i=1,SIZE(asteroids))], asteroids%i<2))
end program

The key point to note is that we can process our derived type with "normal" input/output: the item asteroids is expanded as array elements and each array element is then expanded as components. Because our derived type has no private, pointer or allocatable component we can use this simple form of processing.


As more advanced material, a note on the "magic numbers" in the example here. We already know how to remove the magic asteroid count (22) and the magic number and length of the header lines (2 and 118). But maybe we're worried about the lengths of those character components (18 and 10).

Our data structure is tightly coupled to the form of the input data file, but what if we have two datasets where the names differ in length? It will indeed be a pain to rewrite or duplicate our derived type to handle this. Let's solve that problem:

  type asteroids_t(len_name, len_ref)
     integer, len :: len_name=18, len_ref=10
     integer :: num
     character(len_name) :: name
     integer :: epoch
     real(KIND(0d0)) :: a, e, i, w, node, m, h, g
     character(len_ref) :: ref
  end type asteroids_t

  type(asteroids_t) asteroids_set_1(22)
  type(asteroids(25,8)) asteroids_set_2(22)

! There's no magic character length in FMT
  read FMT, asteroids_set_1
  read FMT, asteroids_set_2

The column widths can even be deferred to be resolved at run-time (not shown). You can read about these parameterized derived types in more detail elsewhere.


To expand on @HighPerformanceMark's comments, the best thing to do is to define an Asteroid type which holds all of the information about an asteroid, and then to create an array of Asteroids.

The Asteroid type

The Asteroid type should initially just contain the data about an asteroid,

type :: Asteroid
  integer :: num
  character(:), allocatable :: name
  integer :: epoch
  real(dp) :: a
  real(dp) :: e
  real(dp) :: i
  real(dp) :: w
  real(dp) :: node
  real(dp) :: m
  real(dp) :: h
  real(dp) :: g
  character(:), allocatable :: ref_name
  integer :: ref_number
end type

where dp defines double precision.

This allows you to have an array of Asteroids, e.g.

type(Asteroid) :: asteroids(22)

asteroids(1) = Asteroid(1, "Ceres", ..., "JPL", 48)
...
asteroids(22) = Asteroid(22, "Kalliope", ..., "JPL", 111)

write(*,*) asteroids(1)%name ! Writes "Ceres".

Reading and Writing Asteroids

You want to be able to read and write asteroids to and from file, and you can do this using user defined input/output. For this you need a subroutine to read an Asteroid, e.g.

subroutine read_Asteroid(this, unit, iotype, v_list, iostat, iomsg)
  class(Asteroid), intent(inout) :: this
  integer, intent(in) :: unit
  character(*), intent(in) :: iotype
  integer, intent(in) :: v_list(:)
  integer, intent(out) :: iostat
  character(*), intent(inout) :: iomsg
  
  character(100) :: name
  character(100) :: ref_name
  
  read(unit, *, iostat=iostat, iomsg=iomsg) &
     & this%num, &
     & name, &
     & this%epoch, &
     & this%a, &
     & this%e, &
     & this%i, &
     & this%w, &
     & this%node, &
     & this%m, &
     & this%h, &
     & this%g, &
     & ref_name, &
     & this%ref_number
  
  this%name = trim(name)
  this%ref_name = trim(ref_name)
end subroutine

and another to write an Asteroid, e.g.

subroutine write_Asteroid(this, unit, iotype, v_list, iostat, iomsg)
  class(Asteroid), intent(in) :: this
  integer, intent(in) :: unit
  character(*), intent(in) :: iotype
  integer, intent(in) :: v_list(:)
  integer, intent(out) :: iostat
  character(*), intent(inout) :: iomsg
  
  write(unit, *, iostat=iostat, iomsg=iomsg) &
     & this%num, &
     & this%name, &
     & this%epoch, &
     & this%a, &
     & this%e, &
     & this%i, &
     & this%w, &
     & this%node, &
     & this%m, &
     & this%h, &
     & this%g, &
     & this%ref_name, &
     & this%ref_number
end subroutine

You also need to add bindings to the Asteroid type so that it knows to use read_Asteroid and write_Asteroid for reading and writing. This looks like

type :: Asteroid
  integer :: num
  ...
  integer :: ref_number
contains
  ! `read` binding.
  generic :: read(formatted) => read_Asteroid
  procedure :: read_Asteroid
  
  ! `write` binding.
  generic :: write(formatted) => write_Asteroid
  procedure :: write_Asteroid
end type

N.B. because the Asteroid type has allocatable components (name and ref_name), which are not allocated by read statements, care must be taken when writing read_Asteroid. This method can be used to read allocatables; first reading to an over-large buffer, and then copying the data to the allocatable variable. (Thanks @francescalus for pointing out previous problems with my code here).

It's now possible to read and write asteroids directly, e.g.

character(1000) :: line
type(Asteroid) :: Ceres

line = "1 Ceres 59600 2.766 0.07850 10.58 73.63 80.26 291.3 3.54 0.12 JPL 48"
read(line, *) Ceres
write(*, *) Ceres

An example code

Putting this all together, here is an example code which reads a file full of asteroids and then writes those with i < 2:

module asteroid_module
  implicit none
  
  ! Define `dp`, which defines double precision.
  integer, parameter :: dp = selected_real_kind(15, 307)
  
  ! Define the `Asteroid` type.
  type :: Asteroid
    integer :: num
    character(:), allocatable :: name
    integer :: epoch
    real(dp) :: a
    real(dp) :: e
    real(dp) :: i
    real(dp) :: w
    real(dp) :: node
    real(dp) :: m
    real(dp) :: h
    real(dp) :: g
    character(:), allocatable :: ref_name
    integer :: ref_number
  contains
    ! `read` binding.
    generic :: read(formatted) => read_Asteroid
    procedure :: read_Asteroid
    
    ! `write` binding.
    generic :: write(formatted) => write_Asteroid
    procedure :: write_Asteroid
  end type
contains

! Define how to `read` an `Asteroid`.
subroutine read_Asteroid(this, unit, iotype, v_list, iostat, iomsg)
  class(Asteroid), intent(inout) :: this
  integer, intent(in) :: unit
  character(*), intent(in) :: iotype
  integer, intent(in) :: v_list(:)
  integer, intent(out) :: iostat
  character(*), intent(inout) :: iomsg
  
  character(100) :: name
  character(100) :: ref_name
  
  read(unit, *, iostat=iostat, iomsg=iomsg) &
     & this%num, &
     & name, &
     & this%epoch, &
     & this%a, &
     & this%e, &
     & this%i, &
     & this%w, &
     & this%node, &
     & this%m, &
     & this%h, &
     & this%g, &
     & ref_name, &
     & this%ref_number
  
  this%name = trim(name)
  this%ref_name = trim(ref_name)
end subroutine

! Define how to `write` an `Asteroid`.
subroutine write_Asteroid(this, unit, iotype, v_list, iostat, iomsg)
  class(Asteroid), intent(in) :: this
  integer, intent(in) :: unit
  character(*), intent(in) :: iotype
  integer, intent(in) :: v_list(:)
  integer, intent(out) :: iostat
  character(*), intent(inout) :: iomsg
  
  write(unit, *, iostat=iostat, iomsg=iomsg) &
     & this%num, &
     & this%name, &
     & this%epoch, &
     & this%a, &
     & this%e, &
     & this%i, &
     & this%w, &
     & this%node, &
     & this%m, &
     & this%h, &
     & this%g, &
     & this%ref_name, &
     & this%ref_number
end subroutine
end module

program example
  use asteroid_module
  implicit none
  
  character(1000) :: line
  integer :: iostat
  integer :: file_length
  type(Asteroid), allocatable :: asteroids(:)
  integer :: i
  
  ! Count the number of lines in the file.
  file_length = 0
  open(10, file="input.txt")
  do
    read(10, '(A)',iostat=iostat) line
    if (iostat/=0) then
      exit
    endif
    file_length = file_length + 1
  enddo
  close(10)
  
  ! Allocate the array to hold the asteroids.
  allocate(asteroids(file_length-2))
  
  ! Read the asteroids into the array.
  open(10, file="input.txt")
  read(10, '(A)') line
  read(10, '(A)') line
  do i=1,size(asteroids)
    read(10, '(A)') line
    read(line, *) asteroids(i)
  enddo
  close(10)
  
  ! Write the asteroids with `i` < 2 to a file.
  open(10, file="output.txt")
  do i=1,size(asteroids)
    if (asteroids(i)%i < 2.0_dp) then
      write(10,*) asteroids(i)
    endif
  enddo
  close(10)
end program