Python script moving data to next line while reading the data

I wrote a script that counts the delimiters in each line. If a line has more or fewer delimiters than the expected number per line, that line is printed and copied to another file (Lines_FILE.txt) for analysis. For example:

1,a,b,c,d

2,e,f,g,h

3,r,h,,u,j

Here the third line would be copied to the new file. The script is:

### PLEASE DELETE THE FILE "Lines_FILE.txt" BEFORE RUNNING THIS SCRIPT

linecount = 0

# Input file name with extension; update the file encoding if required.
with open('Mock.txt', encoding="latin1") as myfile:
    for line in myfile:
        linecount = linecount + 1
        # k = line.count('"|"')  # Uncomment and update if a text qualifier is present
        k = line.count(',')      # Update the delimiter if there is no text qualifier
        print("Lines:", linecount)
        print(k)
        # Update to the number of delimiters in the first line, i.e. the expected count per line.
        if k != 94:
            print(line)
            # Append mode, so flagged lines from earlier runs stay in the file.
            with open("Lines_FILE.txt", "a") as f:
                f.write(line)

It was working fine, but for one file the script picked up a line that was not an error and wrote it to Lines_FILE.txt. In Lines_FILE.txt half of that line had moved to the next line, whereas in the actual data this was not the case. These were the lines:

10804395,1,10/4/2018 6:45:27 PM,742443,23,2122804,OCT-18,10/4/2018,P,10/4/2018 6:44:34 PM,742443,,,2779094.44,,2779094.44,Reclass since no Physical inventory with Sanmina    ,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2

10804396,1,10/4/2018 6:45:27 PM,742443,23,2122805,OCT-18,10/4/2018,P,10/4/2018 6:44:35 PM,742443,,235530.26,,235530.26,,Fresh billing to Jabil against sanmina inventory movement reconciled to open POs from Jabil    ,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2

And the extracted lines looked like this:

10804395,1,10/4/2018 6:45:27 PM,742443,23,2122804,OCT-18,10/4/2018,P,10/4/2018 6:44:34 PM,742443,,,2779094.44,,2779094.44,Reclass since no Physical inventory with Sanmina

,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2

10804396,1,10/4/2018 6:45:27 PM,742443,23,2122805,OCT-18,10/4/2018,P,10/4/2018 6:44:35 PM,742443,,235530.26,,235530.26,,Fresh billing to Jabil against sanmina inventory movement reconciled to open POs from Jabil

,,,,,,,,,JE_AUTO_FILE_renurana_Sep-18_11_6720973_10-04-2018_104704_36,,,,,,,,,,,,,,,,,,Manual JE File Name,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2

Each line was split after the 'with Sanmina' and 'from Jabil' text. I noticed the same pattern in a few more lines. I guess it has something to do with the gap after those texts.
To sum up the problem: while reading the data, the script breaks a few lines and treats the pieces as error lines. As I'm new to Python, it would be a great help if someone could guide me on this issue.
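One way to test the guess about the gap is to look at the raw characters of the flagged lines. The sketch below is only an assumption about the cause, not something the data above confirms: it supposes a stray carriage return sits inside a field. Opening the file with newline="" keeps line endings untranslated, so repr() shows exactly where each piece ended. The helper name find_suspect_lines and the sample bytes are made up for illustration.

```python
import os
import tempfile


def find_suspect_lines(path, expected, delimiter=",", encoding="latin1"):
    """Return (line_number, repr(line)) for lines whose delimiter count differs."""
    suspects = []
    # newline="" leaves "\r", "\n" and "\r\n" untranslated, so repr() can show them.
    with open(path, encoding=encoding, newline="") as infile:
        for line_no, line in enumerate(infile, start=1):
            if line.count(delimiter) != expected:
                suspects.append((line_no, repr(line)))
    return suspects


# Hypothetical sample: the second record has a lone "\r" inside a field.
sample = b"1,a,b,c,d\r\n2,e,f   \r,g,h\r\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample)

suspects = find_suspect_lines(tmp.name, expected=4)
os.remove(tmp.name)

for line_no, shown in suspects:
    print(line_no, shown)
```

If a flagged piece ends in a different line ending than the rest of the file (for example '\r' where other lines end in '\r\n'), the break is probably in the data itself rather than in the script.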

asked Nov 11 at 3:15

Siddhartha

195


python csv


1 Answer
The reason could be that you handle the two files differently: the input file is opened with a specific encoding (latin1), while the output file is opened with the default encoding. I can also offer some improvements to the script you are using.

with open("Mock.txt", "r", encoding="latin1") as infile:
  with open("Lines_FILE.txt", "w", encoding="latin1") as outfile:
    # enumerate keeps the printed line number in step with the input
    for line_no, line in enumerate(infile, start=1):
      delim_count = line.count(",")
      print("Line: ", line_no)
      if delim_count != 94:
        print(line)
        outfile.write(line)

This should read and write the files in the same encoding.
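A minimal sketch of how the choice of mode plays out when the output file is opened once per run, as in the script above. With "w" the file is cleared at the start of each run and then collects every flagged line of that run, while "a" keeps piling up lines from earlier runs. The helper names, file names and sample lines here are hypothetical.

```python
import os
import tempfile


def write_flagged(lines, out_path, expected, mode):
    """Write lines whose comma count differs from `expected`, opening the file once."""
    with open(out_path, mode, encoding="latin1") as outfile:
        for line in lines:
            if line.count(",") != expected:
                outfile.write(line)


def line_total(path):
    """Count the lines currently stored in `path`."""
    with open(path, encoding="latin1") as f:
        return sum(1 for _ in f)


# Hypothetical input: the last two lines have the wrong number of commas.
sample = ["1,a,b,c,d\n", "2,e,f,g,h\n", "3,r,h,,u,j\n", "4,x\n"]
workdir = tempfile.mkdtemp()
w_path = os.path.join(workdir, "flagged_w.txt")
a_path = os.path.join(workdir, "flagged_a.txt")

for _ in range(2):  # simulate running the script twice
    write_flagged(sample, w_path, expected=4, mode="w")
    write_flagged(sample, a_path, expected=4, mode="a")

print(line_total(w_path))  # 2: only the latest run's flagged lines
print(line_total(a_path))  # 4: both runs' flagged lines accumulated
```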

answered Nov 11 at 3:57

Arunmozhi

451210

Hi, should I be using "w"? Won't that overwrite the other lines that have a delimiter mismatch?
– Siddhartha
Nov 11 at 4:46

I updated my script with yours, but that did not solve the issue. The same lines showed up :/
– Siddhartha
Nov 11 at 4:53

I used "w" so that a new file is created automatically without having to delete it manually every time. We aren't closing it until all the lines are saved, so everything flagged in a single run is kept.
– Arunmozhi
Nov 11 at 6:58

If the encoding isn't affecting your output, I am not sure what is affecting it. Sorry.
– Arunmozhi
Nov 11 at 6:59
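One further possibility, which this thread never confirms, is that some fields contain a stray line-break character, so a single record arrives as two short pieces. Under that assumption, a sketch like the following glues consecutive pieces together until the delimiter count reaches the expected value. The function name rejoin_records and the sample fragments are hypothetical.

```python
def rejoin_records(lines, expected, delimiter=","):
    """Yield records, gluing consecutive fragments until the count reaches `expected`."""
    buffer = ""
    for line in lines:
        candidate = buffer + line
        if candidate.count(delimiter) < expected:
            # Still short: assume the record continues on the next line.
            buffer = candidate.rstrip("\r\n")
        else:
            yield candidate
            buffer = ""
    if buffer:
        # Trailing fragment that never reached the expected count.
        yield buffer


# Hypothetical fragments: the second record is split across two lines.
fragments = ["1,a,b,c,d\n", "2,e,f   \n", ",g,h\n", "3,r,h,,u,j\n"]
for record in rejoin_records(fragments, expected=4):
    print(repr(record))
```

A line that is genuinely short would absorb the line after it, so the rejoined output still needs the delimiter check from the script above. Lines with too many delimiters, such as the last one in the sample, pass through unchanged and are still caught by that check.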
