Python script moving data to next line while reading the data
I wrote a script that counts the number of delimiters in each line. If a line has more or fewer delimiters than expected, that line is printed and copied to another file (Lines_FILE.txt) for analysis. For example:
1,a,b,c,d
2,e,f,g,h
3,r,h,,u,j
Here the third line would be copied to the new file. The script is:
import string
### PLEASE DELETE THE FILE "Lines_FILE.txt" BEFORE RUNNING THIS SCRIPT
k = 0
linecount=0
with open('Mock.txt',encoding="latin1") as myfile: #input file name with extension also if required update file encoding
    for line in myfile:
        k=0
        linecount=linecount+1
        words = line.split()
        for i in words:
            for letter in i:
                #k=line.count('"|"') #Unhash and Update delimiter and Text Qualifier if text qualifier present
                k=line.count(',') #Unhash and Update delimiter if no text qualifier
        print("Lines:",linecount)
        print(k)
        if(k!=94): #Update the number of delimiters present in the first line or the expected delimiters per line.
            print(line)
            f = open("Lines_FILE.txt","a")
            f.write(line)
It was working fine, but for one file I noticed that the script had picked up a line that was not an error and written it to Lines_FILE.txt. In Lines_FILE.txt half of the line had been moved to the next line, whereas in the actual data this was not the case. These were the lines:
10804395,1,10/4/2018 6:45:27 PM,742443,23,2122804,OCT-18,10/4/2018,P,10/4/2018 6:44:34 PM,742443,,,2779094.44,,2779094.44,Reclass since no Physical inventory with Sanmina ,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2
10804396,1,10/4/2018 6:45:27 PM,742443,23,2122805,OCT-18,10/4/2018,P,10/4/2018 6:44:35 PM,742443,,235530.26,,235530.26,,Fresh billing to Jabil against sanmina inventory movement reconciled to open POs from Jabil ,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2
And the extracted lines looked like this:
10804395,1,10/4/2018 6:45:27 PM,742443,23,2122804,OCT-18,10/4/2018,P,10/4/2018 6:44:34 PM,742443,,,2779094.44,,2779094.44,Reclass since no Physical inventory with Sanmina
,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2
10804396,1,10/4/2018 6:45:27 PM,742443,23,2122805,OCT-18,10/4/2018,P,10/4/2018 6:44:35 PM,742443,,235530.26,,235530.26,,Fresh billing to Jabil against sanmina inventory movement reconciled to open POs from Jabil
,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2
The line was broken after the 'with Sanmina' and 'Jabil' text. I noticed the same pattern in a few more lines. I guess it has something to do with the gap after those texts.
To sum up the problem: while reading the data, the script breaks a few lines and treats the pieces as error lines. As I'm new to Python, it would be a great help if someone could guide me on this issue.
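To check what the gap after 'with Sanmina' actually contains, one option is to list the non-printing characters in a line. The following is a minimal sketch, not part of the original script. The helper name `find_hidden_chars` and the sample string are hypothetical, and the carriage return in the sample is only an assumption about what such a gap might hold.

```python
import unicodedata


def find_hidden_chars(text):
    """Return (index, repr, Unicode name) for each non-printing character in text."""
    hits = []
    for index, ch in enumerate(text):
        # Unicode categories starting with "C" include control (Cc) and format (Cf) characters.
        if unicodedata.category(ch).startswith("C"):
            hits.append((index, repr(ch), unicodedata.name(ch, "<unnamed control>")))
    return hits


# Hypothetical sample: a carriage return hidden after the description text.
sample = "2779094.44,Reclass since no Physical inventory with Sanmina \r,,,,2\n"
for hit in find_hidden_chars(sample):
    print(hit)
```

To apply this to the real file, the lines would need to be read in a way that does not translate line endings, for example by opening the file in binary mode and decoding each line manually.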
python csv
asked Nov 11 at 3:15
Siddhartha
1 Answer
The reason could be that you are handling the two files differently: the input file is opened with an explicit encoding (latin1), while the output file is opened with the default encoding. I can offer some improvements to the script you are using.
with open("Mock.txt", "r", encoding="latin1") as infile:
    with open("Lines_FILE.txt", "w", encoding="latin1") as outfile:
        # enumerate keeps the line number in step with the loop
        for line_no, line in enumerate(infile, start=1):
            delim_count = line.count(",")
            print("Line: ", line_no)
            # 94 is the expected number of delimiters per line
            if delim_count != 94:
                print(line)
                outfile.write(line)
This should read and write the files in the same encoding.
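If the file uses a text qualifier, as the commented-out line in the question suggests it sometimes does, counting raw commas can miscount because a quoted field may itself contain the delimiter. A minimal sketch of an alternative, assuming the standard `csv` module is acceptable here, is to count parsed fields instead. The function name `find_bad_rows` and its parameters are hypothetical.

```python
import csv
import io


def find_bad_rows(lines, expected_fields, delimiter=",", quotechar='"'):
    """Yield (row_number, row) for rows whose field count differs from expected_fields."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    for row_number, row in enumerate(reader, start=1):
        if len(row) != expected_fields:
            yield row_number, row


# The three-line example from the question: 5 fields (4 delimiters) expected.
sample = io.StringIO("1,a,b,c,d\n2,e,f,g,h\n3,r,h,,u,j\n")
for row_number, row in find_bad_rows(sample, expected_fields=5):
    print(row_number, row)
```

For the file in the question, 94 delimiters per line would correspond to `expected_fields=95`. When passing a real file object to `csv.reader`, the `csv` documentation recommends opening it with `newline=""`.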
answered Nov 11 at 3:57
Arunmozhi
Hi, should I be using "w"? That would overwrite the other lines that have a delimiter mismatch.
– Siddhartha
Nov 11 at 4:46
I updated my script with yours, but that did not solve the issue. The same lines showed up :/
– Siddhartha
Nov 11 at 4:53
I used "w" so that a new file is created automatically on each run, without having to delete it manually every time. We don't close it until all the lines are written, so every mismatched line in a single run is captured.
– Arunmozhi
Nov 11 at 6:58
If the encoding isn't affecting your output, I am not sure what is affecting it. Sorry.
– Arunmozhi
Nov 11 at 6:59
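One possibility that could be tested, though nothing in the thread confirms it, is a stray carriage return inside a field. In Python's default text mode, `\r`, `\n` and `\r\n` all end a line, so a lone `\r` would split one record into two short ones. The sketch below uses made-up data to compare the default mode with `newline="\n"`, which ends lines only at `\n`.

```python
import os
import tempfile

# Hypothetical data: one record with a lone carriage return inside a field.
raw = b"1,a,text with gap \r,b,c\n2,e,f,g,h\n"

with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Default text mode: "\r", "\n" and "\r\n" all end a line.
    with open(path, encoding="latin1") as infile:
        default_lines = list(infile)

    # newline="\n": only "\n" ends a line, so the "\r" stays inside the record.
    with open(path, encoding="latin1", newline="\n") as infile:
        strict_lines = list(infile)
finally:
    os.remove(path)

print(len(default_lines), [line.count(",") for line in default_lines])
print(len(strict_lines), [line.count(",") for line in strict_lines])
```

If the real file behaves the same way, passing `newline="\n"` to `open` for Mock.txt would keep such records on one line. If the counts do not change, the cause lies elsewhere.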