How to read a data file with some condition faster in Fortran?
I am trying to write a Fortran subroutine for my code that reads data from a file (which is itself a huge data set). The data file contains the location (nx0, ny0, nz0) and the field at that location (Bx, By, Bz).
(Example: say nx0, ny0 and nz0 each range over [-15, 15],
so the number of rows will be 31*31*31 = 29791.)
-15.00000 -15.00000 -15.00000 700.00000 -590.00000 100.00000
-15.00000 -15.00000 -14.00000 -110.00000 -570.00000 100.00000
-15.00000 -15.00000 -13.00000 -550.00000 -200.00000 100.00000
-15.00000 -15.00000 -12.00000 -540.00000 -230.00000 100.00000
-15.00000 -15.00000 -11.00000 -140.00000 -50.00000 100.00000
. . . . . .
. . . . . .
. . . . . .
15.00000 15.00000 15.00000 140.00000 50.00000 100.000
What I want to do is look up a specific location (xi, yi, zi) in my file, read the field at that location, and then use it for further analysis. I need not only the field at the target position itself but also the field at the surrounding points (the three other corners of the square around the target point).
subroutine read_data(xi,yi,zi,Bxij,Byij)
real*8,intent(in) :: xi,yi,zi !,time
real*8,intent(out) :: Bxij(4),Byij(4) !,Bzij(4)
integer,parameter :: step = 1 ,cols = 6, rows = 29791 !!!15,000,000
real,dimension(rows) :: nx0,ny0,nz0,Bx,By,Bz
character*15 filein
character*35 path_file
path_file = '/home/mehdi/Desktop/'
filein= 'test-0001'
open(7,file=trim(path_file)//filein, status='old',action='read')
xi_1 = xi +step
yi_1 = yi +step
do i = 1,rows
read(7,*) nx0(i),ny0(i),nz0(i),Bx(i),By(i),Bz(i)
c
if ( xi == nx0(i) .and. yi == ny0(i) .and.
& zi == nz0(i)) then
Bxij(1) = Bx(i)
Byij(1) = By(i)
cycle
endif
c
if ( xi == nx0(i) .and. yi_1 == ny0(i) .and.
& zi == nz0(i)) then
Bxij(2) = Bx(i)
Byij(2) = By(i)
cycle
endif
c
if ( xi_1 == nx0(i) .and. yi == ny0(i) .and.
& zi == nz0(i)) then
Bxij(3) = Bx(i)
Byij(3) = By(i)
cycle
endif
c
if ( xi_1 == nx0(i) .and. yi_1 == ny0(i) .and.
& zi == nz0(i)) then
Bxij(4) = Bx(i)
Byij(4) = By(i)
exit
endif
c
enddo
c     close the file once, after the search loop has finished
close(7)
end
I have done it this way, but it is too slow. Speed is one of the most important things for me, and even for this small fraction of the data set the routine is really time-consuming.
I know it is slow because the whole data set has to be read each time to find the target points. This subroutine is called several times within the code, and in later steps the code does the same thing over and over again, so it is time-consuming.
How can I make this code work more efficiently?
file-io fortran
2
30k values isn't all that much. At 8 bytes per value, that's 240 kB per array. Read it all in at the beginning of your program, then work from memory.
– chw21
Nov 13 '18 at 5:12
Is there a specific reason you're using Fortran 77? Several features that would make your life a lot easier have been implemented in the last 40 years. (Plus, is == even Fortran 77, or would that technically need to be .eq.?)
– chw21
Nov 13 '18 at 5:20
@chw21 Thanks for the response. This small file is 4.5 MB by itself! I don't know how you calculated its size. I was thinking about reading it all in at the beginning, but first, the data set is too large, and second, I don't know how to do it.
– Mehdi
Nov 13 '18 at 5:28
Again, in your example, you have 16 bytes per number. In memory, that would be only 8 bytes, for a total of 300MB. My desktop computer is about 6 years old and has 20 times as much memory.
– chw21
Nov 13 '18 at 5:32
If you do run into problems, I'd recommend changing the file format. Convert it to something like NetCDF, which makes it much easier to read just the specific parts of the file that you need.
– chw21
Nov 13 '18 at 5:35
edited Nov 13 '18 at 7:34
Ian Bush
asked Nov 13 '18 at 4:47
Mehdi
1 Answer
Before I begin this answer, let me reiterate what I said in the comments to your question:
Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
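A minimal sketch of that read-once approach, under stated assumptions: the module name field_store, the routines load_field and get_corners, the grid extent n = 15 and the unit spacing are all illustrative choices taken from the example in the question, not from the real data set. The file is parsed a single time, and every later lookup is a plain array access.

```fortran
module field_store
  use iso_fortran_env, only: real64, iostat_end
  implicit none
  private
  public :: load_field, get_corners

  ! Assumed grid: integer coordinates from -n to n in steps of 1
  integer, parameter :: n = 15
  ! Field components indexed directly by the grid coordinates
  real(real64) :: Bx(-n:n, -n:n, -n:n)
  real(real64) :: By(-n:n, -n:n, -n:n)
  real(real64) :: Bz(-n:n, -n:n, -n:n)

contains

  ! Read the whole text file once; call this a single time at start-up.
  subroutine load_field(filename, ios)
    character(len=*), intent(in) :: filename
    integer, intent(out) :: ios
    real(real64) :: pos(3), B(3)
    integer :: u, idx(3)

    open(newunit=u, file=filename, status='old', action='read', iostat=ios)
    if (ios /= 0) return
    do
      read(u, *, iostat=ios) pos, B
      if (ios /= 0) exit
      idx = nint(pos)
      if (any(abs(idx) > n)) cycle        ! skip rows outside the assumed grid
      Bx(idx(1), idx(2), idx(3)) = B(1)
      By(idx(1), idx(2), idx(3)) = B(2)
      Bz(idx(1), idx(2), idx(3)) = B(3)
    end do
    close(u)
    if (ios == iostat_end) ios = 0        ! end of file is the normal way out
  end subroutine load_field

  ! Return Bx and By at (xi,yi), (xi,yi+1), (xi+1,yi), (xi+1,yi+1) for
  ! fixed zi, in the same order as Bxij/Byij in the question.
  subroutine get_corners(xi, yi, zi, Bxij, Byij)
    real(real64), intent(in)  :: xi, yi, zi
    real(real64), intent(out) :: Bxij(4), Byij(4)
    integer :: i, j, k

    i = nint(xi); j = nint(yi); k = nint(zi)
    ! The caller must keep i and j below n so that i+1 and j+1 stay in range
    Bxij = [Bx(i, j, k), Bx(i, j+1, k), Bx(i+1, j, k), Bx(i+1, j+1, k)]
    Byij = [By(i, j, k), By(i, j+1, k), By(i+1, j, k), By(i+1, j+1, k)]
  end subroutine get_corners

end module field_store
```

With this layout the main program would call load_field once and then call get_corners as often as needed, with no further file access.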
But let's assume that the data really gets too big.
Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That is what takes the time.
If you can calculate which line of the data file holds the value you are interested in, it might help to convert the file into an unformatted direct-access file.
Here is example code for the conversion. It uses Fortran 2008 features (such as newunit), so if your compiler does not support them, you will have to modify it:
program convert
use iso_fortran_env, only: real64
implicit none
integer, parameter :: reclength = 6*8 ! Six 8-byte values
integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)
open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0
do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do
close(u_out)
close(u_in)
end program convert
Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.
program read_txt
use iso_fortran_env, only: real64
implicit none
integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64
real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values
open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
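! Calculate the zero-based grid indices of the requested point
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
! Reject points outside the grid; they would map to a wrong record
if (ii < 0 .or. ii >= nx .or. jj < 0 .or. jj >= ny .or. &
kk < 0 .or. kk >= nz) then
print *, 'request outside the grid'
cycle mainloop
end if
! One-based record number: z changes fastest, x slowest
record = kk + jj * nz + ii * nz * ny + 1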
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt
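To connect this back to the subroutine in the question, here is a hedged sketch that fetches the four corner values with four direct-access reads instead of a scan through the file. The module name corner_reader, the routines record_of and read_corners, and the choice to pass the open unit and the grid parameters as arguments are illustrative assumptions; the record layout (pos followed by B, z changing fastest) is the one produced by the convert program, and unit grid spacing is assumed.

```fortran
module corner_reader
  use iso_fortran_env, only: real64
  implicit none
contains

  ! One-based record number for zero-based grid indices (ii, jj, kk),
  ! with z changing fastest and x slowest.
  pure integer function record_of(ii, jj, kk, ny, nz)
    integer, intent(in) :: ii, jj, kk, ny, nz
    record_of = kk + jj * nz + ii * nz * ny + 1
  end function record_of

  ! Read Bx and By at (xi,yi), (xi,yi+1), (xi+1,yi), (xi+1,yi+1) for fixed zi.
  ! Unit u must already be open for unformatted direct access.
  ! The caller must make sure that xi+1 and yi+1 are still inside the grid.
  subroutine read_corners(u, xi, yi, zi, x_min, y_min, z_min, ny, nz, &
                          Bxij, Byij, ios)
    integer, intent(in) :: u, ny, nz
    real(real64), intent(in) :: xi, yi, zi, x_min, y_min, z_min
    real(real64), intent(out) :: Bxij(4), Byij(4)
    integer, intent(out) :: ios
    ! Offsets of the four corners, in the order used in the question
    integer, parameter :: di(4) = [0, 0, 1, 1], dj(4) = [0, 1, 0, 1]
    real(real64) :: pos(3), B(3)
    integer :: ii, jj, kk, c

    ii = nint(xi - x_min)
    jj = nint(yi - y_min)
    kk = nint(zi - z_min)
    do c = 1, 4
      read(u, rec=record_of(ii + di(c), jj + dj(c), kk, ny, nz), &
           iostat=ios) pos, B
      if (ios /= 0) return               ! outputs are incomplete on error
      Bxij(c) = B(1)
      Byij(c) = B(2)
    end do
  end subroutine read_corners

end module corner_reader
```

The file would be opened once at program start and the unit kept open, so that each call costs only four small reads.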
Note that the unformatted file format is not compiler- or system-independent. A file created on one computer, or by a program compiled with one compiler, might not be readable by another program or on another computer.
But if you have control over it, it might be a useful way to speed things up.
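One portability detail can at least be handled inside the program: the unit in which recl is measured is processor-dependent (Intel Fortran, for example, counts 4-byte words by default, while gfortran counts bytes). As a small sketch, assuming a hypothetical helper called record_length in a module called record_size, the compiler can work out the value with inquire(iolength=...) instead of relying on a hard-coded 6*8:

```fortran
module record_size
  use iso_fortran_env, only: real64
  implicit none
contains

  ! Length of one record holding pos(3) and B(3), expressed in whatever
  ! unit this compiler expects for the recl= specifier.
  integer function record_length()
    real(real64) :: pos(3), B(3)
    pos = 0.0_real64
    B = 0.0_real64
    inquire(iolength=record_length) pos, B
  end function record_length

end module record_size
```

Both the converter and the reader would then pass recl=record_length() in their open statements, so the two always agree as long as they are built with the same compiler and options.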
PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
Thanks for your kind reply. The problem with this approach is that you still use a "do" loop to read all the input: "do; ii = ii + 1; read(u_in, *, iostat=ios) pos, B; if (ios /= 0) exit; write(u_out, rec=ii) pos, B; end do". And the problem is that I have to do this at each step of my code, because my (x, y, z) and the respective Bx, By and Bz change over time, so this loop runs at every step and is still time-consuming.
– Mehdi
Dec 22 '18 at 5:09
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53273986%2fhow-to-read-a-data-file-with-some-condition-faster-in-fortran%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Before I begin this answer, let me reiterate what I said in the comments to your question:
Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
But let's assume that the data really gets too big.
Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.
If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.
Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:
program convert
use iso_fortran_env, only: real64
implicit none
integer, parameter :: reclength = 6*8 ! Six 8-byte values
integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)
open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0
do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do
close(u_out)
close(u_in)
end program convert
Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.
program read_txt
use iso_fortran_env, only: real64
implicit none
integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64
real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values
open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
! Calculate record
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
record = kk + jj * nz + ii * nz * ny + 1
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt
Note that the unformatted is not compiler- and system independent. A file created on one computer or with a program compiled by one compiler might not be able to be read with another program or on another computer.
But if you have control over it, it might be a useful way to speed things up.
PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
Thanks for your kind reply. the problem with this approach is that u still use a "do" loop in order to read all the input it: do ii = ii + 1 read(u_in, *, iostat=ios) pos, B if (ios /= 0) exit write(u_out, rec=ii) pos, B end do And the problem is that I have to do this in each step of my code because my step (x,y,z) and the respected Bx,By and Bz is changing over time so this loop is running on each step and still its time consuming.
– Mehdi
Dec 22 '18 at 5:09
add a comment |
Before I begin this answer, let me reiterate what I said in the comments to your question:
Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
But let's assume that the data really gets too big.
Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.
If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.
Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:
program convert
use iso_fortran_env, only: real64
implicit none
integer, parameter :: reclength = 6*8 ! Six 8-byte values
integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)
open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0
do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do
close(u_out)
close(u_in)
end program convert
Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.
program read_txt
use iso_fortran_env, only: real64
implicit none
integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64
real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values
open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
! Calculate record
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
record = kk + jj * nz + ii * nz * ny + 1
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt
Note that the unformatted is not compiler- and system independent. A file created on one computer or with a program compiled by one compiler might not be able to be read with another program or on another computer.
But if you have control over it, it might be a useful way to speed things up.
PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
Thanks for your kind reply. the problem with this approach is that u still use a "do" loop in order to read all the input it: do ii = ii + 1 read(u_in, *, iostat=ios) pos, B if (ios /= 0) exit write(u_out, rec=ii) pos, B end do And the problem is that I have to do this in each step of my code because my step (x,y,z) and the respected Bx,By and Bz is changing over time so this loop is running on each step and still its time consuming.
– Mehdi
Dec 22 '18 at 5:09
add a comment |
Before I begin this answer, let me reiterate what I said in the comments to your question:
Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
But let's assume that the data really gets too big.
Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.
If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.
Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:
program convert
use iso_fortran_env, only: real64
implicit none
integer, parameter :: reclength = 6*8 ! Six 8-byte values
integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)
open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0
do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do
close(u_out)
close(u_in)
end program convert
Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z-coordinate changes fastest and the x-coordinate changes slowest.
program read_txt
    use iso_fortran_env, only: real64
    implicit none
    integer, parameter :: nx=601, ny=181, nz=61
    real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
    real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
    real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
    real(kind=real64), parameter :: x_step = 1.0_real64
    real(kind=real64), parameter :: y_step = 1.0_real64
    real(kind=real64), parameter :: z_step = 1.0_real64
    real(kind=real64) :: request(3), pos(3), B(3)
    integer :: ios, u_in
    integer :: ii, jj, kk, record
    integer, parameter :: reclength = 6 * 8 ! Six 8-byte values

    open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
         status='old', action='read', recl=reclength)
    mainloop : do
        read(*, *, iostat=ios) request
        if (ios /= 0) exit mainloop
        write(*, '(A, 3F7.2)') 'searching for ', request
        ! Calculate record
        ii = nint((request(1)-x_min)/x_step)
        jj = nint((request(2)-y_min)/y_step)
        kk = nint((request(3)-z_min)/z_step)
        record = kk + jj * nz + ii * nz * ny + 1
        read(u_in, rec=record, iostat=ios) pos, B
        if (ios /= 0) then
            print *, 'failure to read'
            cycle mainloop
        end if
        write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
    end do mainloop
    close(u_in)
end program read_txt
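The question also asks for the three other corners of the square around the target point. As a rough sketch, the record calculation above could be wrapped in a small module that returns all four record numbers at once. The module name `grid_index`, the functions `record_of` and `cell_records`, and the convention that record 0 means "outside the grid" are assumptions made for this illustration; the grid sizes are simply copied from `read_txt`.

```fortran
module grid_index
  use iso_fortran_env, only: real64
  implicit none
  private
  public :: record_of, cell_records

  ! Grid layout copied from read_txt; adjust to the real data set.
  integer, parameter :: nx = 601, ny = 181, nz = 61
  real(real64), parameter :: x_min = real(-nx/2, real64)
  real(real64), parameter :: y_min = real(-ny/2, real64)
  real(real64), parameter :: z_min = real(-nz/2, real64)
  real(real64), parameter :: x_step = 1.0_real64
  real(real64), parameter :: y_step = 1.0_real64
  real(real64), parameter :: z_step = 1.0_real64

contains

  ! Record number of the grid point nearest to (x, y, z), with z changing
  ! fastest and x slowest. Returns 0 (hypothetical sentinel) if the point
  ! lies outside the grid, so the caller never reads a wrong record.
  pure integer function record_of(x, y, z) result(rec)
    real(real64), intent(in) :: x, y, z
    integer :: ii, jj, kk

    ii = nint((x - x_min) / x_step)
    jj = nint((y - y_min) / y_step)
    kk = nint((z - z_min) / z_step)
    if (ii < 0 .or. ii >= nx .or. jj < 0 .or. jj >= ny .or. &
        kk < 0 .or. kk >= nz) then
      rec = 0
    else
      rec = kk + jj * nz + ii * nz * ny + 1
    end if
  end function record_of

  ! Records of the four corners, in the order used by Bxij(1:4) in the
  ! question: (x, y), (x, y+step), (x+step, y), (x+step, y+step).
  pure function cell_records(x, y, z) result(recs)
    real(real64), intent(in) :: x, y, z
    integer :: recs(4)

    recs(1) = record_of(x,          y,          z)
    recs(2) = record_of(x,          y + y_step, z)
    recs(3) = record_of(x + x_step, y,          z)
    recs(4) = record_of(x + x_step, y + y_step, z)
  end function cell_records

end module grid_index
```

Each non-zero entry of the result can then be passed as `rec=` to the direct-access `read`, so a lookup costs four small reads instead of a scan of the whole file.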
Note that unformatted files are not compiler- or system-independent. A file created on one computer, or by a program compiled with one compiler, might not be readable by another program or on another computer.
But if you control both the program that writes the file and the one that reads it, it might be a useful way to speed things up.
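One concrete place where compilers differ is the unit of `recl`: some count it in bytes, others in 4-byte words unless told otherwise, so the hard-coded `6*8` is not guaranteed to mean the same thing everywhere. A possible way around this, sketched below, is to let the compiler report the length itself with `inquire(iolength=...)`. The module and function names (`record_length`, `six_real_recl`) are made up for this example.

```fortran
module record_length
  use iso_fortran_env, only: real64
  implicit none
  private
  public :: six_real_recl

contains

  ! Returns the RECL value needed for one record holding pos(3) and B(3),
  ! expressed in whatever unit this compiler uses for unformatted
  ! direct-access files.
  integer function six_real_recl() result(reclength)
    real(real64) :: pos(3), B(3)

    ! Only the types and shapes of the list items matter here,
    ! not their values.
    inquire(iolength=reclength) pos, B
  end function six_real_recl

end module record_length
```

The result would replace the `reclength` parameter in both `open` statements, for example `recl=six_real_recl()`. This only removes the unit ambiguity; the file is still tied to the byte order and floating-point format of the machine that wrote it.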
PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
edited Nov 15 '18 at 6:34
answered Nov 15 '18 at 6:19
chw21
Thanks for your kind reply. The problem with this approach is that you still use a "do" loop to read all the input: do ii = ii + 1 read(u_in, *, iostat=ios) pos, B if (ios /= 0) exit write(u_out, rec=ii) pos, B end do. I have to do this at each step of my code, because my (x,y,z) and the respective Bx, By and Bz change over time, so this loop runs at every step and is still time-consuming.
– Mehdi
Dec 22 '18 at 5:09
30k values isn't all that much. At 8 bytes per value, that's 240kB per array. Read it all in at the beginning of your program, then work from memory.
– chw21
Nov 13 '18 at 5:12
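To make the "read it all in once" suggestion concrete, here is a minimal sketch. It assumes the layout of the example in the question: integer-valued coordinates from -half to +half with a step of 1, and six list-directed numbers per row. The names `field_store` and `load_field` and the choice of a single four-dimensional array are assumptions for illustration only.

```fortran
module field_store
  use iso_fortran_env, only: real64, iostat_end
  implicit none
  private
  public :: load_field

contains

  ! Reads the whole text file once. half = 15 gives the 31*31*31 grid of
  ! the example. Afterwards B(1:3, k, j, i) holds Bx, By, Bz at
  ! x = i, y = j, z = k. ios is 0 on success.
  subroutine load_field(filename, half, B, ios)
    character(len=*), intent(in) :: filename
    integer, intent(in) :: half
    real(real64), allocatable, intent(out) :: B(:,:,:,:)
    integer, intent(out) :: ios
    real(real64) :: pos(3), val(3)
    integer :: u, i, j, k

    allocate(B(3, -half:half, -half:half, -half:half))
    B = 0.0_real64

    open(newunit=u, file=filename, status='old', action='read', iostat=ios)
    if (ios /= 0) return

    do
      read(u, *, iostat=ios) pos, val
      if (ios /= 0) exit
      ! The coordinates stored in the row decide where the values go,
      ! so the row order in the file does not matter.
      i = nint(pos(1))
      j = nint(pos(2))
      k = nint(pos(3))
      if (max(abs(i), abs(j), abs(k)) > half) cycle  ! row outside the grid
      B(:, k, j, i) = val
    end do
    close(u)

    ! Reaching the end of the file is the normal way out of the loop.
    if (ios == iostat_end) ios = 0
  end subroutine load_field

end module field_store
```

After one call to `load_field`, the four corner values needed in the question are plain array lookups, for example `B(1, zi, yi, xi)`, `B(1, zi, yi+1, xi)`, `B(1, zi, yi, xi+1)` and `B(1, zi, yi+1, xi+1)` for Bx, with integer `xi`, `yi`, `zi`. No file access is needed inside the time loop.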
Is there a specific reason you're using Fortran 77? Several features that would make your life a lot easier have been implemented in the last 40 years. (Plus, is == even Fortran 77, or would that technically need to be .eq.?)
– chw21
Nov 13 '18 at 5:20
@chw21 Thanks for the response. This small file is 4.5MB by itself! I don't know how you calculated its size. I thought about reading everything into memory at the beginning, but first, the data set is too large, and second, I don't know how to do it.
– Mehdi
Nov 13 '18 at 5:28
Again, in your example, you have 16 bytes per number. In memory, that would be only 8 bytes, for a total of 300MB. My desktop computer is about 6 years old and has 20 times as much memory.
– chw21
Nov 13 '18 at 5:32
If you do run into problems, I'd recommend changing the file format. Convert it to something like NetCDF, which makes it much easier to read just the specific parts of the file that you need.
– chw21
Nov 13 '18 at 5:35