How to define a parser when using BS4 in Python











#!/usr/bin/env python

import requests
from bs4 import BeautifulSoup

url = "https://www.youtube.com/channel/UCaKt8dvEIPnEHWSbLYhzrxg/videos"
response = requests.get(url)
# parse html
page = str(BeautifulSoup(response.content))


def getURL(page):
    """

    :param page: html of web page (here: Python home page)
    :return: urls in that page
    """
    start_link = page.find("a href")
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    url = page[start_quote + 1: end_quote]
    return url, end_quote

while True:
    url, n = getURL(page)
    page = page[n:]
    if url:
        print(url)
    else:
        break


I am using the above code to get a list of all YouTube videos on a web page. When I run it, I get the following error:

The code that caused this warning is on line 9 of the file C:/Users/PycharmProjects/ReadCSVFile/venv/Links.py. To get rid of this warning, change code that looks like this:

I made that change and started using html, but then a different error appeared.

I am using Python 3.0 with the PyCharm IDE.

Can someone please help me with this?










      python-3.x beautifulsoup






edited Nov 11 at 11:13 by ewwink

asked Nov 11 at 0:54 by NewtoPython
























          1 Answer













It's not an error but a warning: you didn't specify a parser, which can be, for example, 'html.parser', 'lxml', or 'xml'. Change the line to something like:

page = BeautifulSoup(response.content, 'html.parser')
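
To see the difference for yourself, here is a minimal sketch that builds a soup with and without a parser argument and counts the warnings each call produces. The markup and the helper name `parser_warnings` are made up for illustration, and the exact warning text depends on the installed bs4 version, so the sketch only counts warnings rather than matching their wording.

```python
import warnings

from bs4 import BeautifulSoup

# Made-up markup, used only for this illustration.
HTML = "<html><body><a href='/watch?v=abc'>video</a></body></html>"


def parser_warnings(markup, parser=None):
    """Return the user-facing warnings raised while building the soup."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        if parser is None:
            BeautifulSoup(markup)  # no parser named: bs4 guesses one and warns
        else:
            BeautifulSoup(markup, parser)  # parser named explicitly
    # Keep only UserWarning subclasses, which is where bs4's parser warning lives.
    return [str(w.message) for w in caught if issubclass(w.category, UserWarning)]


without_parser = parser_warnings(HTML)
with_parser = parser_warnings(HTML, "html.parser")
print(len(without_parser), "warning(s) without a parser argument")
print(len(with_parser), "warning(s) with 'html.parser'")
```

'html.parser' ships with Python, so it needs no extra install; 'lxml' has to be installed separately before it can be named.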


Your code above doesn't actually use what BeautifulSoup offers; it only searches the page as a string. Here is an example that uses it:



#!/usr/bin/env python

import requests
from bs4 import BeautifulSoup


def getURL(url):
    """
    :param url: url of web page
    :return: urls in that page
    """
    response = requests.get(url)
    # parse html, naming the parser explicitly to avoid the warning
    page = BeautifulSoup(response.content, 'html.parser')
    # keep only <a> tags that have an href, so the join below never sees None
    link_tags = page.find_all('a', href=True)
    urls = [x['href'] for x in link_tags]
    return urls


url = "https://www.youtube.com/channel/UCaKt8dvEIPnEHWSbLYhzrxg/videos"
all_url = getURL(url)
# print one URL per line
print('\n'.join(all_url))
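
The same link extraction can be tried without a network request. The sketch below runs it on a small hand-written HTML string and turns relative links into absolute ones with `urljoin` from the standard library. The sample markup, the base URL, and the function name `extract_links` are assumptions for illustration, not part of the original code.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hand-written sample page, standing in for a downloaded response.
SAMPLE_HTML = """
<html><body>
  <a href="/watch?v=abc123">First video</a>
  <a name="top">Anchor without href</a>
  <a href="https://example.com/about">About</a>
</body></html>
"""


def extract_links(markup, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(markup, "html.parser")
    # href=True skips anchors that carry no href at all.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


links = extract_links(SAMPLE_HTML, "https://www.youtube.com/")
print("\n".join(links))
```

Passing the markup in as an argument keeps the parsing step separate from the download, which makes it easy to check on a known input before pointing it at a live page.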





answered Nov 11 at 11:05 by ewwink






























                     
