When processing CSV data, how do I ignore the first line of data?












85














I am asking Python to print the minimum number from a column of CSV data, but the top row is the column number, and I don't want Python to take the top row into account. How can I make sure Python ignores the first line?



This is the code so far:



import csv

with open('all16.csv', 'rb') as inf:
    incsv = csv.reader(inf)
    column = 1
    datatype = float
    data = (datatype(column) for row in incsv)
    least_value = min(data)

print least_value


Could you also explain what you are doing, not just give the code? I am very very new to Python and would like to make sure I understand everything.






























  • 4




    Are you aware that you're just creating a generator that returns a 1.0 for each line in your file and then taking the minimum, which is going to be 1.0?
    – Wooble
    Jul 5 '12 at 17:24












  • @Wooble Technically, it's a big generator of 1.0. :)
    – Dougal
    Jul 5 '12 at 17:26










  • @Dougal: comment fixed.
    – Wooble
    Jul 5 '12 at 17:27










  • @Wooble good catch - ...datatype(row[column]... is what I guess the OP is trying to achieve though
    – Jon Clements
    Jul 5 '12 at 17:36












  • I had someone write up that code for me and didn't catch that, so thanks haha!
    – user1496646
    Jul 5 '12 at 18:41
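To make the comments above concrete, here is a minimal sketch of the difference between datatype(column) and datatype(row[column]). It uses a made-up in-memory sample instead of all16.csv, and the function names buggy_min and fixed_min are hypothetical, chosen only for illustration.

```python
import csv
import io

# Made-up sample standing in for all16.csv (an assumption for illustration).
# The first row holds the column numbers, as in the question.
SAMPLE = "0,1\n10,3.5\n20,2.25\n30,7.0\n"

def buggy_min(text, column=1, datatype=float):
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    # Bug: converts the column *index* itself, so every item is float(1) == 1.0
    return min(datatype(column) for row in reader)

def fixed_min(text, column=1, datatype=float):
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    # Fix: index into each row, then convert that cell
    return min(datatype(row[column]) for row in reader)

print(buggy_min(SAMPLE))  # 1.0, regardless of the data
print(fixed_min(SAMPLE))  # 2.25
```

The buggy version never looks at the row contents, which is why it returns 1.0 for any file.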
















python csv














edited Aug 28 '14 at 15:21









martineau

66.2k989178














asked Jul 5 '12 at 17:20







user1496646



























14 Answers


















91














You could use an instance of the csv module's Sniffer class to deduce the format of a CSV file and detect whether a header row is present along with the built-in next() function to skip over the first row only when necessary:



import csv

with open('all16.csv', 'r', newline='') as file:
    has_header = csv.Sniffer().has_header(file.read(1024))
    file.seek(0)  # Rewind.
    reader = csv.reader(file)
    if has_header:
        next(reader)  # Skip header row.
    column = 1
    datatype = float
    data = (datatype(row[column]) for row in reader)
    least_value = min(data)

print(least_value)


Since datatype and column are hardcoded in your example, it would be slightly faster to process the row like this:



    data = (float(row[1]) for row in reader)


Note: the code above is for Python 3.x. For Python 2.x use the following line to open the file instead of what is shown:



with open('all16.csv', 'rb') as file:
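Since the question asks for an explanation and not just code, here is a step-by-step sketch for the simpler case where you already know the file has a header, so the Sniffer step can be dropped. The function name min_of_column and the sample values are assumptions made for illustration, and the code targets Python 3.

```python
import csv
import io

def min_of_column(lines, column=1, has_header=True):
    """Return the smallest float found in one column of CSV input.

    `lines` is any iterable of text lines (an open file works too).
    """
    reader = csv.reader(lines)
    if has_header:
        # next() consumes one row; the default None avoids StopIteration
        # when the input is empty.
        next(reader, None)
    # Completely empty lines come back as [] and are skipped here.
    return min(float(row[column]) for row in reader if row)

# Hypothetical sample data; the header row holds the column numbers.
sample = io.StringIO("0,1\n10,4.5\n20,-1.5\n30,8.0\n", newline="")
print(min_of_column(sample))  # -1.5
```

With a real file you would pass the object returned by open('all16.csv', newline='') in place of the StringIO.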
























  • 1




    Instead of has_header(file.read(1024)), does it make sense to write has_header(file.readline())? I see that a lot, but I don't understand how has_header() could detect whether or not there's a header from a single line of the CSV file...
    – Anto
    Jan 12 '18 at 17:40






  • 1




    @Anto: The code in my answer is based on the "example for Sniffer use" in the documentation, so I assume it's the prescribed way to do it. I agree that doing it on the basis of one line of data doesn't seem like it would always be enough data to make such a determination, but I have no idea since how the Sniffer works isn't described. FWIW I've never seen has_header(file.readline()) being used and even if it worked most of the time, I would be highly suspicious of the approach for the reasons stated.
    – martineau
    Jan 12 '18 at 18:40












  • Thanks for your input. Nevertheless it seems that using file.read(1024) generates errors in python's csv lib: . See also here for instance.
    – Anto
    Jan 15 '18 at 19:58










  • @Anto: I've never encountered such an error (1024 bytes is not a lot of memory after all), nor has it been a problem for many other folks based on the up-votes this answer has received (as well as the thousands of people who have read and followed the documentation). For those reasons I strongly suspect something else is causing your issue.
    – martineau
    Jan 15 '18 at 20:03












  • I ran into this exact same error as soon as I switched from readline() to read(1024). So far I've only managed to find people who have switched to readline to solve the csv.dialect issue.
    – Anto
    Jan 15 '18 at 20:35
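The exact error discussed in these comments is not preserved here, so the following is only a defensive sketch, not a diagnosis. One failure the Sniffer can produce is csv.Error when it cannot work out a delimiter from the sample; a small wrapper (has_header_safe is a hypothetical name) can treat that case as "no header detected".

```python
import csv

def has_header_safe(sample):
    """Return Sniffer's header guess, or False if it cannot analyse the sample."""
    try:
        return bool(csv.Sniffer().has_header(sample))
    except csv.Error:
        # Raised, for example, when no delimiter can be determined.
        return False

print(has_header_safe(""))  # False: there is nothing to analyse
```

Whether falling back to False is the right choice depends on your data; for files you control, skipping the header unconditionally is usually more predictable than guessing.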



















46














To skip the first line just call:



next(inf)


Files in Python are iterators over lines.
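One caveat, shown as a sketch with made-up data: next() on the file skips one physical line, while next() on the csv reader skips one CSV record. The two differ when the header contains a quoted line break.

```python
import csv
import io

# Hypothetical data whose header contains a quoted line break.
TEXT = 'id,"long\nname"\n1,2.5\n'

# Skipping on the file object drops only the first physical line.
f = io.StringIO(TEXT, newline="")
next(f)
rows_after_file_skip = list(csv.reader(f))

# Skipping on the reader drops the whole first record.
f = io.StringIO(TEXT, newline="")
reader = csv.reader(f)
next(reader)
rows_after_reader_skip = list(reader)

print(rows_after_reader_skip)     # [['1', '2.5']]
print(len(rows_after_file_skip))  # 2: the rest of the header is still there
```

For ordinary single-line headers both approaches give the same result.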



































    20














    You would normally use next(incsv), which advances the iterator one row, so you skip the header. The other option (say you wanted to skip 30 rows) would be:



    from itertools import islice
    for row in islice(incsv, 30, None):
        pass  # process
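A small runnable sketch of the islice approach, using a made-up input with two lines of preamble instead of 30:

```python
import csv
import io
from itertools import islice

# Hypothetical input: two lines to skip, then the data rows.
text = "report,2012\ncol_a,col_b\n1,10\n2,20\n"
reader = csv.reader(io.StringIO(text, newline=""))

# islice(iterable, start, stop): start=2 skips two rows, stop=None runs to the end.
rows = list(islice(reader, 2, None))
print(rows)  # [['1', '10'], ['2', '20']]
```

islice consumes the skipped rows lazily, so nothing is loaded into memory beyond the row currently being read.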


































      18














      In a similar use case I had to skip annoying lines before the line with my actual column names. This solution worked nicely. Skip the unwanted line on the open file first, then pass the file to csv.DictReader.



      import csv

      with open('all16.csv') as tmp:
          # Skip first line (if any)
          next(tmp, None)

          # {line_num: row}
          data = dict(enumerate(csv.DictReader(tmp)))




























      • Thanks Veedrac. Happy to learn here, can you suggest edits that would solve the problems you cite? My solution gets the job done, but it looks like it could be further improved?
        – Maarten
        May 27 '15 at 14:25






      • 1




        I gave you an edit that replaces the code with something that should be identical (untested). Feel free to revert if it's not in line with what you mean. I'm still not sure why you're making the data dictionary, nor does this answer really add anything over the accepted one.
        – Veedrac
        May 27 '15 at 14:42










      • Thanks Veedrac! That looks very efficient indeed. I posted my answer because the accepted one was not working for me (can't remember the reason now). What would be the problem with defining data = dict() and then immediately filling it (as compared to your suggestion)?
        – Maarten
        May 28 '15 at 18:33






      • 1




        It's not wrong to do data = dict() and fill it in, but it's inefficient and not idiomatic. Plus, one should use dict literals ({}) and enumerate even then.
        – Veedrac
        May 28 '15 at 19:46






      • 1




        FWIW, you should reply to my posts with @Veedrac if you want to be sure I'm notified, although Stack Overflow seems to be able to guess from the username alone. (I don't write @Maarten because the answerer will be notified by default.)
        – Veedrac
        May 28 '15 at 19:46





















      8














      Borrowed from the Python Cookbook.

      A more concise template might look like this:



      import csv
      with open('stocks.csv') as f:
          f_csv = csv.reader(f)
          headers = next(f_csv)
          for row in f_csv:
              # Process row ...
              pass


































        6














        Use csv.DictReader instead of csv.reader.
        If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as field names. You would then be able to access field values using row["1"], etc.
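As a sketch of how that looks for the question's task (the sample data is made up; the header row holds the column numbers, so the keys are the strings "0" and "1"):

```python
import csv
import io

# Hypothetical stand-in for all16.csv.
text = "0,1\n10,4.5\n20,-1.5\n"
reader = csv.DictReader(io.StringIO(text, newline=""))

# The header row is consumed as field names and never appears as data.
least_value = min(float(row["1"]) for row in reader)

print(reader.fieldnames)  # ['0', '1']
print(least_value)        # -1.5
```

Note that the keys are strings, so row[1] with an integer would raise KeyError.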



































          2














          The new 'pandas' package might be more relevant than 'csv'. The code below will read a CSV file, by default interpreting the first line as the column header and find the minimum across columns.



          import pandas as pd

          data = pd.read_csv('all16.csv')
          data.min()


























          • and you can write it in one line too: pd.read_csv('all16.csv').min()
            – Finn Årup Nielsen
            Aug 28 '14 at 15:46



















          1














          Well, my mini wrapper library would do the job as well.



          >>> import pyexcel as pe
          >>> data = pe.load('all16.csv', name_columns_by_row=0)
          >>> min(data.column[1])


          Meanwhile, if you know the header name of the column at index one, for example "Column 1", you can do this instead:



          >>> min(data.column["Column 1"])


































            1














            For me the easiest way to go is to use range.



            import csv

            with open('files/filename.csv') as I:
                reader = csv.reader(I)
                fulllist = list(reader)

            # Starting with data skipping header
            for item in range(1, len(fulllist)):
                # Print each row using "item" as the index value
                print (fulllist[item])
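Once the rows are in a list, a slice does the same job as the range-based loop. This is a minimal sketch with made-up data in place of files/filename.csv:

```python
import csv
import io

# Hypothetical sample data.
text = "name,score\nann,3\nbob,5\n"
fulllist = list(csv.reader(io.StringIO(text, newline="")))

# fulllist[0] is the header; fulllist[1:] is everything after it.
header, data_rows = fulllist[0], fulllist[1:]
for row in data_rows:
    print(row)
```

Both variants read the whole file into memory first, which is fine for small files but less suitable for very large ones.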


































              1














              The documentation for the Python 3 CSV module provides this example:



              with open('example.csv', newline='') as csvfile:
                  dialect = csv.Sniffer().sniff(csvfile.read(1024))
                  csvfile.seek(0)
                  reader = csv.reader(csvfile, dialect)
                  # ... process CSV file contents here ...


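              The Sniffer will try to auto-detect many things about the CSV file. You need to explicitly call its has_header() method, passing it a sample of the file's text, to determine whether the file has a header line. If it does, then skip the first row when iterating the CSV rows. You can do it like this:



              with open('example.csv', newline='') as csvfile:
                  sample = csvfile.read(1024)
                  csvfile.seek(0)
                  reader = csv.reader(csvfile, csv.Sniffer().sniff(sample))
                  if csv.Sniffer().has_header(sample):
                      next(reader, None)  # skip the header row
                  for data_row in reader:
                      pass  # do something with the row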




































                0














                I would use tail to get rid of the unwanted first line:



                tail -n +2 $INFIL | whatever_script.py 


































                  0














                  just add [1:]



                  example below:



                  data = pd.read_csv("/Users/xyz/Desktop/xyxData/xyz.csv", sep=',', header=None)[1:]


                  that works for me in iPython





































                    0














                    Python 3.X



                    Handles UTF8 BOM + HEADER



                    It was quite frustrating that the csv module could not easily get the header, there is also a bug with the UTF-8 BOM (first char in file).
                    This works for me using only the csv module:



                    import csv

                    def read_csv(self, csv_path, delimiter):
                        with open(csv_path, newline='', encoding='utf-8') as f:
                            # https://bugs.python.org/issue7185
                            # Remove UTF8 BOM.
                            txt = f.read()[1:]

                        # Remove header line.
                        header = txt.splitlines()[:1]
                        lines = txt.splitlines()[1:]

                        # Convert to list.
                        csv_rows = list(csv.reader(lines, delimiter=delimiter))

                        for row in csv_rows:
                            value = row[INDEX_HERE]
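Note that the slice f.read()[1:] above drops the first character whether or not it is a BOM. An alternative sketch, not the answer author's method, is Python's standard 'utf-8-sig' codec, which removes a leading BOM only when one is present. The helper name read_rows and the temporary file contents are assumptions made for the demo.

```python
import codecs
import csv
import os
import tempfile

def read_rows(csv_path, delimiter=","):
    # 'utf-8-sig' decodes UTF-8 and drops a leading BOM only if one is present.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f, delimiter=delimiter)
        header = next(reader, None)  # first row, or None for an empty file
        return header, list(reader)

# Demo with a throwaway file that starts with a BOM (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as tmp:
    tmp.write(codecs.BOM_UTF8 + b"id,value\r\n1,2.5\r\n")
header, rows = read_rows(path)
os.remove(path)

print(header)  # ['id', 'value']
print(rows)    # [['1', '2.5']]
```

This also keeps the csv reader working on the file object directly, so quoted fields that span lines are still parsed correctly.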




































                      0














                      Because this is related to something I was doing, I'll share here.



                      What if we're not sure if there's a header and you also don't feel like importing sniffer and other things?



                      If your task is basic, such as printing or appending to a list or array, you could just use an if statement:



                      import csv

                      array = []
                      # Let's say there's 4 columns
                      with open('file.csv') as csvfile:
                          csvreader = csv.reader(csvfile)
                          # read first line
                          first_line = next(csvreader)
                          # My headers were just text. You can use any suitable conditional here
                          if len(first_line) == 4:
                              array.append(first_line)
                          # Now we'll just iterate over everything else as usual:
                          for row in csvreader:
                              array.append(row)
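One possible "suitable conditional" is sketched below: treat the first row as a header only if none of its cells parse as a number. The names looks_like_header and read_data_rows are hypothetical, and this heuristic is an assumption that would not work for the file in the question, whose header row holds numeric column numbers.

```python
import csv
import io

def looks_like_header(row):
    """Guess that a row is a header if none of its cells parse as a number."""
    def is_number(cell):
        try:
            float(cell)
            return True
        except ValueError:
            return False
    return bool(row) and not any(is_number(cell) for cell in row)

def read_data_rows(lines):
    reader = csv.reader(lines)
    first = next(reader, None)  # None if the input is empty
    rows = []
    if first is not None and not looks_like_header(first):
        rows.append(first)  # first line was data, so keep it
    rows.extend(reader)
    return rows

print(read_data_rows(io.StringIO("a,b\n1,2\n", newline="")))  # [['1', '2']]
print(read_data_rows(io.StringIO("1,2\n3,4\n", newline="")))  # [['1', '2'], ['3', '4']]
```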


























                        edited Dec 9 '17 at 5:23

























                        answered Jul 5 '12 at 18:11









                        martineau








                        • 1




                          Instead of has_header(file.read(1024)), does it make sense to write has_header(file.readline()) ? I see that a lot, but I don't understand how has_reader() could detect whether or not there's a header from a single line of the CSV file...
                          – Anto
                          Jan 12 '18 at 17:40






                        • 1




                          @Anto: The code in my answer is based on the "example for Sniffer use" in the documentation, so I assume it's the prescribed way to do it. I agree that one line of data doesn't seem like it would always be enough to make such a determination, but I have no idea, since how the Sniffer works isn't described. FWIW I've never seen has_header(file.readline()) being used, and even if it worked most of the time, I would be highly suspicious of the approach for the reasons stated.
                          – martineau
                          Jan 12 '18 at 18:40












                        • Thanks for your input. Nevertheless it seems that using file.read(1024) generates errors in python's csv lib: . See also here for instance.
                          – Anto
                          Jan 15 '18 at 19:58










                        • @Anto: I've never encountered such an error (1024 bytes is not a lot of memory, after all), nor has it been a problem for many other folks, based on the up-votes this answer has received (as well as the thousands of people who have read and followed the documentation). For those reasons I strongly suspect something else is causing your issue.
                          – martineau
                          Jan 15 '18 at 20:03












                        • I ran into this exact same error as soon as I switched from readline() to read(1024). So far I've only managed to find people who have switched to readline to solve the csv.dialect issue.
                          – Anto
                          Jan 15 '18 at 20:35



























                        46














                        To skip the first line just call:



                        next(inf)


                        Files in Python are iterators over lines.
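To show where that call fits, here is a small sketch that reuses the names inf and incsv from the question. The in-memory data stands in for the real file and is made up for illustration.

```python
import csv
import io

# In-memory stand-in for the opened file `inf`; the contents are made up.
inf = io.StringIO("0,1,2\n0.5,4.0,9.1\n0.7,2.5,8.8\n")

next(inf)  # Consume the first line so the csv reader never sees it.
incsv = csv.reader(inf)
least_value = min(float(row[1]) for row in incsv)
print(least_value)
```

Note that next(inf) skips one physical line of the file. If a header field could contain a quoted line break, calling next() on the csv reader instead skips one whole CSV row.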
















                            answered Jul 5 '12 at 18:15









                            jfs























                                20














                                You would normally use next(incsv), which advances the iterator by one row, so you skip the header. The other option (say you wanted to skip 30 rows) would be:



                                from itertools import islice

                                for row in islice(incsv, 30, None):
                                    pass  # process
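A self-contained sketch of the islice approach, assuming made-up data and a hypothetical rows_to_skip value of 3 rather than 30 so the result is easy to check by eye:

```python
import csv
import io
from itertools import islice

rows_to_skip = 3  # Hypothetical value chosen for this example.

# Ten made-up rows of the form "i,i*1.5".
text = "".join("{},{}\n".format(i, i * 1.5) for i in range(10))
incsv = csv.reader(io.StringIO(text))

# islice(iterable, start, stop): start at row index 3, run to the end (stop=None).
remaining = [row for row in islice(incsv, rows_to_skip, None)]
print(remaining[0])
```

islice consumes and discards the skipped rows lazily, so nothing beyond the current row is held in memory until you collect the results yourself.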















                                    answered Jul 5 '12 at 17:26









                                    Jon Clements























                                        18














                                        In a similar use case I had to skip annoying lines before the line with my actual column names. This solution worked nicely: skip the unwanted line on the open file first, then pass the file object to csv.DictReader.



                                        import csv

                                        with open('all16.csv') as tmp:
                                            # Skip first line (if any)
                                            next(tmp, None)

                                            # {line_num: row}
                                            data = dict(enumerate(csv.DictReader(tmp)))




























                                        • Thanks Veedrac. Happy to learn here, can you suggest edits that would solve the problems you cite? My solution gets the job done, but it looks like it could be further improved?
                                          – Maarten
                                          May 27 '15 at 14:25






                                        • 1




                                          I gave you an edit that replaces the code with something that should be identical (untested). Feel free to revert if it's not in line with what you mean. I'm still not sure why you're making the data dictionary, nor does this answer really add anything over the accepted one.
                                          – Veedrac
                                          May 27 '15 at 14:42










                                        • Thanks Veedrac! That looks very efficient indeed. I posted my answer because the accepted one was not working for me (can't remember the reason now). What would be the problem with defining data = dict() and then immediately filling it (as compared to your suggestion)?
                                          – Maarten
                                          May 28 '15 at 18:33






                                        • 1




                                          It's not wrong to do data = dict() and fill it in, but it's inefficient and not idiomatic. Plus, one should use dict literals ({}) and enumerate even then.
                                          – Veedrac
                                          May 28 '15 at 19:46






                                        • 1




                                          FWIW, you should reply to my posts with @Veedrac if you want to be sure I'm notified, although Stack Overflow seems to be able to guess from the username alone. (I don't write @Maarten because the answerer will be notified by default.)
                                          – Veedrac
                                          May 28 '15 at 19:46


























                                        edited May 27 '15 at 14:40









                                        Veedrac










                                        answered Dec 18 '14 at 23:16









                                        Maarten

























                                        8














                                        Borrowed from the Python Cookbook, a more concise template might look like this:



                                        import csv

                                        with open('stocks.csv') as f:
                                            f_csv = csv.reader(f)
                                            headers = next(f_csv)  # First row holds the column names.
                                            for row in f_csv:
                                                # Process row ...
                                                pass
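Keeping the header row instead of throwing it away lets you look a column up by name. A minimal sketch, assuming made-up stock data and a hypothetical column called 'price':

```python
import csv
import io

# Made-up contents standing in for stocks.csv.
text = "symbol,price,volume\nAA,39.48,181800\nAIG,71.38,195500\nAXP,62.58,935000\n"

f_csv = csv.reader(io.StringIO(text))
headers = next(f_csv)            # ['symbol', 'price', 'volume']
column = headers.index('price')  # Position of the column we care about.
least_value = min(float(row[column]) for row in f_csv)
print(least_value)
```

headers.index raises ValueError if the name is missing, which is usually what you want when the file layout is not what you expected.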















                                            answered Mar 31 '18 at 11:02









                                            shin























                                                6














                                                Use csv.DictReader instead of csv.reader.
                                                If the fieldnames parameter is omitted, the values in the first row of the CSV file will be used as field names. You would then be able to access field values using row["1"] etc.
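As a rough illustration, here is a sketch with made-up data whose first row holds the column numbers, as described in the question:

```python
import csv
import io

# Made-up data; the first row is the column number, not a value.
text = "0,1,2\n9.5,4.0,7.1\n8.2,2.5,6.3\n"

reader = csv.DictReader(io.StringIO(text))  # First row becomes the field names.
least_value = min(float(row["1"]) for row in reader)
print(least_value)
```

The keys are strings taken from the header row, which is why the lookup is row["1"] and not row[1].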
















                                                    answered Jul 5 '12 at 17:53









iruvar























                                                        2














The pandas package may be a better fit than csv here. The code below reads a CSV file, by default treating the first line as the column header, and finds the minimum of each column.



                                                        import pandas as pd

                                                        data = pd.read_csv('all16.csv')
                                                        data.min()





                                                        answered Aug 28 '14 at 15:43









Finn Årup Nielsen












                                                        • and you can write it in one line too: pd.read_csv('all16.csv').min()
                                                          – Finn Årup Nielsen
                                                          Aug 28 '14 at 15:46





























                                                        1














My small wrapper library, pyexcel, would do the job as well.



                                                        >>> import pyexcel as pe
                                                        >>> data = pe.load('all16.csv', name_columns_by_row=0)
                                                        >>> min(data.column[1])


Alternatively, if you know the header of the column at index 1, for example "Column 1", you can look the column up by name instead:



                                                        >>> min(data.column["Column 1"])





                                                            answered Dec 1 '14 at 10:18









chfw























                                                                1














                                                                For me the easiest way to go is to use range.



    import csv

    with open('files/filename.csv') as I:
        reader = csv.reader(I)
        fulllist = list(reader)  # read every row, header included, into a list

    # Start at index 1 so the header at index 0 is skipped
    for item in range(1, len(fulllist)):
        # Print each row using "item" as the index value
        print(fulllist[item])





                                                                    answered Mar 12 '18 at 12:44









Clint Hart























                                                                        1














                                                                        The documentation for the Python 3 CSV module provides this example:



    import csv

    with open('example.csv', newline='') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024))
        csvfile.seek(0)
        reader = csv.reader(csvfile, dialect)
        # ... process CSV file contents here ...

The Sniffer will try to auto-detect many things about the CSV file. To find out whether the file has a header line, you need to explicitly call its has_header() method, passing it a sample of the file's text. If it returns True, skip the first row when iterating over the CSV rows. You can do it like this:

    with open('example.csv', newline='') as csvfile:
        sample = csvfile.read(1024)
        has_header = csv.Sniffer().has_header(sample)
        csvfile.seek(0)
        reader = csv.reader(csvfile)
        if has_header:
            next(reader)  # skip the header row
        for data_row in reader:
            pass  # do something with the row





                                                                            edited Nov 13 '18 at 10:37

























                                                                            answered Oct 9 '18 at 18:21









Lassi























                                                                                0














I would use tail to strip the unwanted first line before the data reaches the script, which then reads the CSV from standard input:

    tail -n +2 "$INFIL" | whatever_script.py





                                                                                    answered Sep 13 '15 at 10:26









Karel Adams























                                                                                        0














Just add [1:] to the end of the read_csv call. Example:

    import pandas as pd

    data = pd.read_csv("/Users/xyz/Desktop/xyxData/xyz.csv", sep=',', header=None)[1:]

That works for me in IPython.






                                                                                            edited Feb 29 '16 at 16:37









                                                                                            cricket_007











                                                                                            answered Nov 1 '15 at 0:02









aybuke























                                                                                                0














Python 3.X

Handles UTF-8 BOM + HEADER

It was quite frustrating that the csv module could not easily get the header; there is also a bug with the UTF-8 BOM (the first character in the file).
This works for me using only the csv module:

    import csv

    def read_csv(csv_path, delimiter):
        with open(csv_path, newline='', encoding='utf-8') as f:
            # https://bugs.python.org/issue7185
            # Remove the UTF-8 BOM, but only if it is actually present.
            txt = f.read()
            if txt.startswith('\ufeff'):
                txt = txt[1:]

        # Split off the header line.
        lines = txt.splitlines()
        header = lines[:1]
        lines = lines[1:]

        # Convert to list.
        csv_rows = list(csv.reader(lines, delimiter=delimiter))

        for row in csv_rows:
            value = row[INDEX_HERE]  # replace INDEX_HERE with the column index you need
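
As an alternative sketch (not part of the answer above), Python's built-in `utf-8-sig` codec strips a leading BOM on its own, and calling `next()` on the reader skips the header row. The function name `column_min` and its parameters are made up for illustration, and it assumes every value in the chosen column parses as a float.

```python
import csv

def column_min(csv_path, column, delimiter=','):
    """Return the smallest float found in `column`, skipping the header row."""
    # 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
    with open(csv_path, newline='', encoding='utf-8-sig') as f:
        reader = csv.reader(f, delimiter=delimiter)
        # Consume the header row; the None default avoids StopIteration
        # when the file is completely empty.
        next(reader, None)
        # Convert the chosen field of every remaining row and take the minimum.
        return min(float(row[column]) for row in reader)
```

For the file in the question, `column_min('all16.csv', 1)` would give the smallest value in the second column. Because the reader works on the open file rather than on pre-split lines, quoted fields that contain line breaks are still parsed correctly.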





















































                                                                                                    edited Oct 26 '16 at 9:42

























                                                                                                    answered Oct 26 '16 at 9:32









Christophe Roussy

9,15615457























                                                                                                        0














Because this is related to something I was doing, I'll share it here.

What if you're not sure whether there's a header, and you also don't feel like importing csv.Sniffer and other things?

If your task is basic, such as printing or appending to a list or array, you could just use an if statement:

    import csv

    array = []

    # Let's say there are 4 columns
    with open('file.csv', newline='') as csvfile:
        csvreader = csv.reader(csvfile)
        # Read the first line
        first_line = next(csvreader)
        # My headers were just text, so a first line with 4 fields is treated as data.
        # You can use any suitable conditional here.
        if len(first_line) == 4:
            array.append(first_line)
        # Now we'll just iterate over everything else as usual:
        for row in csvreader:
            array.append(row)
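
If the data columns are numeric, one possible conditional is to check whether the first line parses as numbers. This is only a sketch under that assumption; `looks_like_data` and `read_rows` are hypothetical helper names, not part of the csv module.

```python
import csv

def looks_like_data(row):
    """Return True if every field in `row` parses as a float."""
    try:
        for field in row:
            float(field)
    except ValueError:
        return False
    return True

def read_rows(csv_path):
    """Collect all rows, dropping the first one only if it looks like a header."""
    rows = []
    with open(csv_path, newline='') as csvfile:
        csvreader = csv.reader(csvfile)
        # Read the first line; None means the file was empty.
        first_line = next(csvreader, None)
        # Keep the first line only when it is numeric, i.e. not a text header.
        if first_line is not None and looks_like_data(first_line):
            rows.append(first_line)
        # Everything after the first line is kept as usual.
        rows.extend(csvreader)
    return rows
```

A header made up entirely of numbers would be mistaken for data here, so this check only suits files whose headers contain at least one non-numeric label.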



















































                                                                                                            answered May 1 '18 at 18:06









Roy W.

86





























