Removing everything from the taglist











I'm trying to understand why everything has to be deleted from the list (del taglist[:]) at the end of the loop.



The task is:
Find the link at position 18 (the first name is 1). Follow that link. Repeat this process 7 times. The answer is the last name that you retrieve.



    # Position / count - 3 variant
    import urllib.request, urllib.parse, urllib.error
    from bs4 import BeautifulSoup
    import ssl

    # Ignore SSL certificate errors
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    taglist=list()
    url=input("Enter URL: ")
    count=int(input("Enter count:"))
    position=int(input("Enter position:"))
    for i in range(count):
        html = urllib.request.urlopen(url, context=ctx).read()
        soup = BeautifulSoup(html, 'html.parser')
        tags=soup('a')
        for tag in tags:
            taglist.append(tag)
        url = taglist[position-1].get('href', None)
        del taglist[:]
        print("Retrieving:",url)









      python html beautifulsoup tags






edited Nov 11 at 8:50 by The Infected Drake
asked Nov 11 at 4:04 by Maria Lavrovskaya

          1 Answer
Although that isn't the way I would do it, the del taglist[:] line is there so that you start with an empty taglist every time. In these lines:

    for tag in tags:
        taglist.append(tag)

you append to the taglist. If you delete the contents of the list, you start fresh on each iteration of the outer for loop.
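To see the accumulation in isolation, here is a minimal sketch that replaces the fetched pages with plain lists of strings. The names pages and collect are made up for the illustration and are not part of the original script.

```python
# Hypothetical stand-ins for the <a> tags found on three successive pages.
pages = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]


def collect(pages, reset):
    """Return the length of taglist after each pass of the outer loop."""
    taglist = list()
    lengths = []
    for tags in pages:
        for tag in tags:
            taglist.append(tag)
        lengths.append(len(taglist))
        if reset:
            del taglist[:]  # empties the same list object in place
    return lengths


print(collect(pages, reset=True))   # [2, 2, 2]
print(collect(pages, reset=False))  # [2, 4, 6]
```

del taglist[:] empties the existing list in place; taglist.clear() has the same effect.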



The code would behave differently when you index into the taglist if it still held all the tags from the previous iterations. The key lines to look at are:

    position=int(input("Enter position:"))

and

    url = taglist[position-1].get('href', None)

If you didn't reset the taglist, the tags from each new page would be appended after the old ones, so position-1 would keep pointing at a tag from the first page instead of the page you just fetched.
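A small simulation makes the indexing problem concrete. It assumes a made-up site (the site dictionary, with page names standing in for URLs) where each page is just a list of hrefs; no network access or BeautifulSoup is involved, and follow is a hypothetical helper.

```python
# Hypothetical site: each "URL" maps to the hrefs found on that page, in order.
site = {
    "start": ["p1", "p2"],
    "p1": ["x", "y"],
    "p2": ["q1", "q2"],
    "q2": ["r1", "r2"],
}


def follow(url, count, position, reset):
    """Follow the link at `position` (1-based) `count` times, as in the question."""
    taglist = list()
    visited = []
    for _ in range(count):
        for href in site[url]:
            taglist.append(href)
        url = taglist[position - 1]
        if reset:
            del taglist[:]
        visited.append(url)
    return visited


print(follow("start", 3, 2, reset=True))   # ['p2', 'q2', 'r2']
print(follow("start", 3, 2, reset=False))  # ['p2', 'p2', 'p2']
```

Without the reset, index 1 always holds the second link of the first page, so the loop never moves past it.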





I'm not sure I would say what you did is wrong, but without actually knowing about the site you are using this for, I would be inclined to use a list comprehension. The second way, which builds a new taglist on every iteration so nothing has to be deleted, seems more Pythonic to me, and I also think it's more efficient.

    # Instead of this
    tags=soup('a')
    for tag in tags:
        taglist.append(tag)
    url = taglist[position-1].get('href', None)
    del taglist[:]

    # I would use this:
    taglist = [tag for tag in soup('a')]
    url = taglist[position-1].get('href', None)
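Putting the pieces together, the loop below builds a fresh list on every pass, so there is nothing to delete. To keep the sketch runnable without network access or third-party packages, it swaps BeautifulSoup for the standard library's html.parser.HTMLParser and reads from an in-memory dictionary of made-up pages; with BeautifulSoup the list would come from soup('a') as above. PAGES, LinkCollector and follow_links are illustrative names, not part of the original script.

```python
from html.parser import HTMLParser

# Made-up pages standing in for what urlopen(url).read() would return.
PAGES = {
    "/start": '<a href="/anna">Anna</a> <a href="/boris">Boris</a>',
    "/boris": '<a href="/carl">Carl</a> <a href="/dina">Dina</a>',
    "/dina": '<a href="/emil">Emil</a> <a href="/fay">Fay</a>',
}


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def follow_links(url, count, position):
    """Follow the link at 1-based `position`, `count` times; return the URLs visited."""
    visited = []
    for _ in range(count):
        parser = LinkCollector()  # new parser, hence a new list, on each pass
        parser.feed(PAGES[url])
        url = parser.hrefs[position - 1]
        visited.append(url)
    return visited


print(follow_links("/start", 3, 2))  # ['/boris', '/dina', '/fay']
```

Because the list of links is created inside the loop body, the old one is simply discarded when the next pass starts.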





answered Nov 11 at 4:15 by Stephen Cowley, edited Nov 11 at 13:43

          • Thank you, got it. How would you do this?
            – Maria Lavrovskaya
            Nov 11 at 9:23










          • @MariaLavrovskaya See addition in answer
            – Stephen Cowley
            Nov 11 at 13:43

















