How to read a data file with some condition faster in Fortran?












I am trying to write a Fortran subroutine that reads data from a file (which is itself a huge data set). The data file contains the location (nx0, ny0, nz0) and the field at that location (Bx, By, Bz).
For example, if nx0, ny0 and nz0 each range over [-15, 15], the number of rows is 31*31*31 = 29791.



-15.00000 -15.00000 -15.00000 700.00000 -590.00000 100.00000
-15.00000 -15.00000 -14.00000 -110.00000 -570.00000 100.00000
-15.00000 -15.00000 -13.00000 -550.00000 -200.00000 100.00000
-15.00000 -15.00000 -12.00000 -540.00000 -230.00000 100.00000
-15.00000 -15.00000 -11.00000 -140.00000 -50.00000 100.00000
. . . . . .
. . . . . .
. . . . . .
15.00000 15.00000 15.00000 140.00000 50.00000 100.000


What I want to do is look up a specific location (xi, yi, zi) in the file, read the field at that location, and use it for further analysis. I need not only the field at the target position itself but also the field at the surrounding points (the three other corners of the grid square around the target point).



      subroutine read_data(xi,yi,zi,Bxij,Byij)
      real*8,intent(in) :: xi,yi,zi !,time
      real*8,intent(out) :: Bxij(4),Byij(4) !,Bzij(4)
      integer,parameter :: step = 1 ,cols = 6, rows = 29791 !!!15,000,000
      real,dimension(rows) :: nx0,ny0,nz0,Bx,By,Bz
      character*15 filein
      character*35 path_file

      path_file = '/home/mehdi/Desktop/'
      filein= 'test-0001'
      open(7,file=trim(path_file)//filein, status='old',action='read')

      xi_1 = xi +step
      yi_1 = yi +step

      do i = 1,rows
         read(7,*) nx0(i),ny0(i),nz0(i),Bx(i),By(i),Bz(i)
c
         if ( xi == nx0(i) .and. yi == ny0(i) .and.
     &        zi == nz0(i)) then
            Bxij(1) = Bx(i)
            Byij(1) = By(i)
            cycle
         endif
c
         if ( xi == nx0(i) .and. yi_1 == ny0(i) .and.
     &        zi == nz0(i)) then
            Bxij(2) = Bx(i)
            Byij(2) = By(i)
            cycle
         endif
c
         if ( xi_1 == nx0(i) .and. yi == ny0(i) .and.
     &        zi == nz0(i)) then
            Bxij(3) = Bx(i)
            Byij(3) = By(i)
            cycle
         endif
c
         if ( xi_1 == nx0(i) .and. yi_1 == ny0(i) .and.
     &        zi == nz0(i)) then
            Bxij(4) = Bx(i)
            Byij(4) = By(i)
            exit
         endif
c
      enddo
c     close the file once, after the search loop has finished
      close(7)
      end


I have done it this way, but it is too slow. Speed is one of the most important things for me, and even for this small fraction of the data set the subroutine is really time consuming.

I know it is slow because it has to read through the whole data set each time to look for the target points. The subroutine is called several times within the code, and later steps do the same thing over and over again, so the cost adds up.

How can I make this code work more efficiently?






























  • 30k values isn't all that much. At 8 bytes per value, that's 240kB per array. Read it all in at the beginning of your program, then work from memory.
    – chw21
    Nov 13 '18 at 5:12










  • Is there a specific reason you're using Fortran 77? Several features that would make your life a lot easier have been implemented in the last 40 years. (plus, is == even Fortran 77, or would that technically need to be .eq.?)
    – chw21
    Nov 13 '18 at 5:20










  • @chw21 Thanks for the response. This small file is 4.5MB by itself! I don't know how you calculated its size. I thought about reading it all in at the beginning, but first, the full data set is too large, and second, I don't know how to do it.
    – Mehdi
    Nov 13 '18 at 5:28












  • Again, in your example, you have 16 bytes per number. In memory, that would be only 8 bytes, for a total of 300MB. My desktop computer is about 6 years old and has 20 times as much memory.
    – chw21
    Nov 13 '18 at 5:32










  • If you do run into problems, I'd recommend changing the file format. Convert it to something like NetCDF, which makes it much easier to read just the specific parts of the file that you need.
    – chw21
    Nov 13 '18 at 5:35
















file-io fortran






edited Nov 13 '18 at 7:34 by Ian Bush

asked Nov 13 '18 at 4:47 by Mehdi




















1 Answer














Before I begin this answer, let me reiterate what I said in the comments to your question:



Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
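As a rough illustration of the read-once approach, here is a minimal sketch. The module name `field_grid`, the routines `load_field` and `corner_fields`, and the fixed -15..15 grid with unit spacing are assumptions taken from the example in the question, not part of the original code:

```fortran
module field_grid
  use iso_fortran_env, only: real64, iostat_end
  implicit none
  private
  public :: load_field, corner_fields

  ! Assumed grid, from the question's example: coordinates -15..15, step 1.
  integer, parameter :: lo = -15, hi = 15
  ! B(component, ix, iy, iz), filled once by load_field.
  real(real64), save :: B(3, lo:hi, lo:hi, lo:hi)

contains

  ! Read the whole text file once; each row is "x y z Bx By Bz".
  subroutine load_field(filename, ios)
    character(len=*), intent(in) :: filename
    integer, intent(out) :: ios
    integer :: u
    real(real64) :: pos(3), val(3)

    open(newunit=u, file=filename, status='old', action='read', iostat=ios)
    if (ios /= 0) return
    do
      read(u, *, iostat=ios) pos, val
      if (ios /= 0) exit
      ! Skip rows that fall outside the assumed grid.
      if (any(nint(pos) < lo) .or. any(nint(pos) > hi)) cycle
      B(:, nint(pos(1)), nint(pos(2)), nint(pos(3))) = val
    end do
    close(u)
    if (ios == iostat_end) ios = 0   ! end of file is the normal way out
  end subroutine load_field

  ! Bx and By at (xi,yi), (xi,yi+1), (xi+1,yi), (xi+1,yi+1), all at zi.
  ! The caller must keep xi and yi below hi so that the +1 neighbours exist.
  subroutine corner_fields(xi, yi, zi, Bxij, Byij)
    real(real64), intent(in) :: xi, yi, zi
    real(real64), intent(out) :: Bxij(4), Byij(4)
    integer :: i, j, k

    i = nint(xi); j = nint(yi); k = nint(zi)
    Bxij = [B(1,i,j,k), B(1,i,j+1,k), B(1,i+1,j,k), B(1,i+1,j+1,k)]
    Byij = [B(2,i,j,k), B(2,i,j+1,k), B(2,i+1,j,k), B(2,i+1,j+1,k)]
  end subroutine corner_fields

end module field_grid
```

With this layout, `load_field` would be called once at program start, and every later lookup is a handful of array accesses instead of a pass over the file.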



But let's assume that the data really gets too big.



Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That is what takes the time.

If you can calculate which line of the data file holds the value you are interested in, it might help to convert the file into an unformatted direct access file.

Here is example code for the conversion. It uses Fortran 2008 features (such as newunit), so if your compiler doesn't support them, you will have to modify it:



program convert

  use iso_fortran_env, only: real64
  implicit none

  ! recl is counted in bytes by gfortran; ifort counts 4-byte words
  ! unless compiled with -assume byterecl.
  integer, parameter :: reclength = 6*8 ! Six 8-byte values

  integer :: ii, ios
  integer :: u_in, u_out
  real(kind=real64) :: pos(3), B(3)

  open(newunit=u_in, file='data.txt', form='formatted', &
       status='old', action='read', access='sequential')
  open(newunit=u_out, file='data.bin', form='unformatted', &
       status='new', action='write', access='direct', recl=reclength)
  ii = 0

  ! Copy each text row into the binary record with the same number.
  do
    ii = ii + 1
    read(u_in, *, iostat=ios) pos, B
    if (ios /= 0) exit
    write(u_out, rec=ii) pos, B
  end do

  close(u_out)
  close(u_in)

end program convert
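The hard-coded record length of 6*8 in the conversion program above assumes that `recl` is measured in bytes. A hedged alternative is to let the compiler report the length with `inquire(iolength=...)`. The sketch below shows that idea; the module name `direct_io` and the routine `open_field_file` are made up for illustration:

```fortran
module direct_io
  use iso_fortran_env, only: real64
  implicit none
contains

  ! Open an unformatted direct-access file whose records hold pos(3), B(3).
  ! inquire(iolength=...) returns the record length in whatever units this
  ! compiler expects for recl, so no byte count has to be hard-coded.
  subroutine open_field_file(filename, mode, u, ios)
    character(len=*), intent(in) :: filename
    character(len=*), intent(in) :: mode       ! 'write', anything else reads
    integer, intent(out) :: u, ios
    real(real64) :: pos(3), B(3)               ! only used to size a record
    integer :: reclength

    inquire(iolength=reclength) pos, B
    if (mode == 'write') then
      open(newunit=u, file=filename, form='unformatted', access='direct', &
           status='replace', action='write', recl=reclength, iostat=ios)
    else
      open(newunit=u, file=filename, form='unformatted', access='direct', &
           status='old', action='read', recl=reclength, iostat=ios)
    end if
  end subroutine open_field_file

end module direct_io
```

This only removes the dependence on the unit of `recl`. The file is still tied to the real kind and byte order of the machine that wrote it.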


Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.



program read_txt
  use iso_fortran_env, only: real64
  implicit none

  ! Grid dimensions and spacing; adjust these to match your data file.
  integer, parameter :: nx=601, ny=181, nz=61
  real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
  real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
  real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
  real(kind=real64), parameter :: x_step = 1.0_real64
  real(kind=real64), parameter :: y_step = 1.0_real64
  real(kind=real64), parameter :: z_step = 1.0_real64

  real(kind=real64) :: request(3), pos(3), B(3)
  integer :: ios, u_in
  integer :: ii, jj, kk, record
  integer, parameter :: reclength = 6 * 8 ! Six 8-byte values

  open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
       status='old', action='read', recl=reclength)
  ! Read requested positions from standard input until it ends.
  mainloop : do
    read(*, *, iostat=ios) request
    if (ios /= 0) exit mainloop
    write(*, '(A, 3F7.2)') 'searching for ', request
    ! Calculate record (z changes fastest, x slowest)
    ii = nint((request(1)-x_min)/x_step)
    jj = nint((request(2)-y_min)/y_step)
    kk = nint((request(3)-z_min)/z_step)
    record = kk + jj * nz + ii * nz * ny + 1
    read(u_in, rec=record, iostat=ios) pos, B
    if (ios /= 0) then
      print *, 'failure to read'
      cycle mainloop
    end if
    write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
  end do mainloop
  close(u_in)
end program read_txt
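The record calculation in the program above can be pulled out and checked against the small 31*31*31 example from the question. The following is a sketch under that assumption (integer coordinates from -15 to 15, z changing fastest). The module name `grid_index` and the function `record_of` are hypothetical:

```fortran
module grid_index
  implicit none
  ! Assumed grid, from the question's example: 31 points per axis,
  ! starting at -15 with a step of 1.
  integer, parameter :: nx = 31, ny = 31, nz = 31
  integer, parameter :: x_min = -15, y_min = -15, z_min = -15
contains

  ! Record number of grid point (x, y, z), or 0 if the point is off the grid.
  pure integer function record_of(x, y, z) result(rec)
    integer, intent(in) :: x, y, z
    integer :: ii, jj, kk

    ii = x - x_min
    jj = y - y_min
    kk = z - z_min
    if (ii < 0 .or. ii >= nx .or. jj < 0 .or. jj >= ny .or. &
        kk < 0 .or. kk >= nz) then
      rec = 0
    else
      rec = kk + jj*nz + ii*nz*ny + 1   ! z fastest, x slowest, 1-based
    end if
  end function record_of

end module grid_index
```

For this grid, `record_of(-15,-15,-15)` is 1, `record_of(-15,-15,-14)` is 2, `record_of(-15,-14,-15)` is 32 and `record_of(15,15,15)` is 29791, which matches the row order of the sample data. The range check matters because an out-of-range coordinate would otherwise map silently onto the record of a different grid point.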


Note that unformatted files are not compiler- or system-independent. A file created on one computer, or by a program built with one compiler, might not be readable by a program built with another compiler or on another computer.



But if you have control over it, it might be a useful way to speed things up.



PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
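To connect this back to the four values the question asks for, the direct-access read can be repeated for the four corners of the grid square. This is only a sketch: the module `corner_reader`, the routine `read_corners`, and the 31-point grid starting at -15 are assumptions based on the question's example, and the unit is expected to be opened already as in the reading program above:

```fortran
module corner_reader
  use iso_fortran_env, only: real64
  implicit none
  ! Assumed grid, from the question's example: -15..15 on each axis, step 1.
  integer, parameter :: n = 31, c_min = -15
contains

  ! Read Bx, By at (xi,yi), (xi,yi+1), (xi+1,yi), (xi+1,yi+1) for fixed zi
  ! from an already opened direct-access unit u. ios is nonzero on failure.
  subroutine read_corners(u, xi, yi, zi, Bxij, Byij, ios)
    integer, intent(in) :: u
    real(real64), intent(in) :: xi, yi, zi
    real(real64), intent(out) :: Bxij(4), Byij(4)
    integer, intent(out) :: ios
    integer, parameter :: dx(4) = [0, 0, 1, 1], dy(4) = [0, 1, 0, 1]
    real(real64) :: pos(3), B(3)
    integer :: m, ii, jj, kk

    Bxij = 0.0_real64
    Byij = 0.0_real64
    kk = nint(zi) - c_min
    do m = 1, 4
      ii = nint(xi) - c_min + dx(m)
      jj = nint(yi) - c_min + dy(m)
      if (min(ii, jj, kk) < 0 .or. max(ii, jj, kk) >= n) then
        ios = -1            ! this corner lies outside the grid
        return
      end if
      ! Four single-record reads instead of a scan over the whole file.
      read(u, rec=kk + jj*n + ii*n*n + 1, iostat=ios) pos, B
      if (ios /= 0) return
      Bxij(m) = B(1)
      Byij(m) = B(2)
    end do
  end subroutine read_corners

end module corner_reader
```

The corner order follows the question's `Bxij(1)` to `Bxij(4)`. The conversion to binary is a one-time step per data file, so each call here costs four record reads.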





























  • Thanks for your kind reply. The problem with this approach is that you still use a "do" loop in order to read all the input: do; ii = ii + 1; read(u_in, *, iostat=ios) pos, B; if (ios /= 0) exit; write(u_out, rec=ii) pos, B; end do. And the problem is that I have to do this at each step of my code, because my step (x,y,z) and the respective Bx, By and Bz change over time, so this loop runs at every step and it is still time consuming.
    – Mehdi
    Dec 22 '18 at 5:09













Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53273986%2fhow-to-read-a-data-file-with-some-condition-faster-in-fortran%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














Before I begin this answer, let me reiterate what I said in the comments to your question:



Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.



But let's assume that the data really gets too big.



Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.



If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.



Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:



program convert

use iso_fortran_env, only: real64
implicit none

integer, parameter :: reclength = 6*8 ! Six 8-byte values

integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)

open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0

do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do

close(u_out)
close(u_in)

end program convert


Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.



program read_txt
use iso_fortran_env, only: real64
implicit none

integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64

real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values

open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
! Calculate record
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
record = kk + jj * nz + ii * nz * ny + 1
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt


Note that the unformatted is not compiler- and system independent. A file created on one computer or with a program compiled by one compiler might not be able to be read with another program or on another computer.



But if you have control over it, it might be a useful way to speed things up.



PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.






share|improve this answer























  • Thanks for your kind reply. the problem with this approach is that u still use a "do" loop in order to read all the input it: do ii = ii + 1 read(u_in, *, iostat=ios) pos, B if (ios /= 0) exit write(u_out, rec=ii) pos, B end do And the problem is that I have to do this in each step of my code because my step (x,y,z) and the respected Bx,By and Bz is changing over time so this loop is running on each step and still its time consuming.
    – Mehdi
    Dec 22 '18 at 5:09


















1














Before I begin this answer, let me reiterate what I said in the comments to your question:



Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.



But let's assume that the data really gets too big.



Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.



If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.



Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:



program convert

use iso_fortran_env, only: real64
implicit none

integer, parameter :: reclength = 6*8 ! Six 8-byte values

integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)

open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0

do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do

close(u_out)
close(u_in)

end program convert


Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.



program read_txt
use iso_fortran_env, only: real64
implicit none

integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64

real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values

open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
! Calculate record
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
record = kk + jj * nz + ii * nz * ny + 1
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt


Note that the unformatted is not compiler- and system independent. A file created on one computer or with a program compiled by one compiler might not be able to be read with another program or on another computer.



But if you have control over it, it might be a useful way to speed things up.



PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.






share|improve this answer























  • Thanks for your kind reply. the problem with this approach is that u still use a "do" loop in order to read all the input it: do ii = ii + 1 read(u_in, *, iostat=ios) pos, B if (ios /= 0) exit write(u_out, rec=ii) pos, B end do And the problem is that I have to do this in each step of my code because my step (x,y,z) and the respected Bx,By and Bz is changing over time so this loop is running on each step and still its time consuming.
    – Mehdi
    Dec 22 '18 at 5:09
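
One possible way to address this concern, in line with the advice to keep everything in memory, is to read the file once per time step into an array and then index it directly instead of searching. This is a rough sketch under assumptions: the module name, the subroutine name and the array layout field(component, z, y, x) are all hypothetical, and the file is assumed to list z fastest and x slowest as in the example data.

```fortran
module field_in_memory
    use iso_fortran_env, only: real64
    implicit none
contains
    ! Read a text file of "x y z Bx By Bz" lines into field(:, k, j, i),
    ! where k, j, i are the one-based z, y, x grid indices.
    ! ios is nonzero if the file cannot be opened or ends too early.
    subroutine load_field(filename, nx, ny, nz, field, ios)
        character(len=*), intent(in) :: filename
        integer, intent(in) :: nx, ny, nz
        real(kind=real64), allocatable, intent(out) :: field(:,:,:,:)
        integer, intent(out) :: ios
        real(kind=real64) :: pos(3)
        integer :: u, i, j, k

        allocate(field(3, nz, ny, nx))
        open(newunit=u, file=filename, status='old', action='read', iostat=ios)
        if (ios /= 0) return
        outer: do i = 1, nx
            do j = 1, ny
                do k = 1, nz
                    ! The coordinates are read but not stored
                    read(u, *, iostat=ios) pos, field(:, k, j, i)
                    if (ios /= 0) exit outer
                end do
            end do
        end do outer
        close(u)
    end subroutine load_field
end module field_in_memory
```

With the zero-based indices ii, jj, kk computed as in the reader, the field at a point is then field(:, kk+1, jj+1, ii+1), and its neighbours are one index away. For a 601 x 181 x 61 grid this array takes roughly 160 MB in double precision.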












































edited Nov 15 '18 at 6:34

























answered Nov 15 '18 at 6:19









chw21
































