Removing everything from the taglist
I'm trying to understand why everything has to be deleted from the list in the last line of the loop.
The task is:
Find the link at position 18 (the first name is 1). Follow that link. Repeat this process 7 times. The answer is the last name that you retrieve.
    # Position / count - 3 variant
    import urllib.request, urllib.parse, urllib.error
    from bs4 import BeautifulSoup
    import ssl

    # Ignore SSL certificate errors
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    taglist=list()
    url=input("Enter URL: ")
    count=int(input("Enter count:"))
    position=int(input("Enter position:"))
    for i in range(count):
        html = urllib.request.urlopen(url, context=ctx).read()
        soup = BeautifulSoup(html, 'html.parser')
        tags=soup('a')
        for tag in tags:
            taglist.append(tag)
        url = taglist[position-1].get('href', None)
        del taglist[:]
    print ("Retrieving:",url)
python html beautifulsoup tags
edited Nov 11 at 8:50 by The Infected Drake
asked Nov 11 at 4:04 by Maria Lavrovskaya
1 Answer
Although that isn't the way I would do it, the reason is so that you start with a new taglist every time. In these lines:

    for tag in tags:
        taglist.append(tag)

you append to taglist. If you delete the contents of the list, you start fresh on each iteration of the outer for loop.

The code would behave differently when you index into taglist if it still held all the tags from the previous iterations. The key lines to look at for this are:
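
    position=int(input("Enter position:"))

and

    url = taglist[position-1].get('href', None)

If you didn't reset taglist, position-1 would correspond to a different element: it would still point at a tag collected from the first page instead of one from the current page, so the loop would keep retrieving the same link.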
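
To make the index shift concrete, here is a minimal sketch that runs without any network access. It assumes each page can be stood in for by a plain list of href strings; `pick_link`, `pages`, and the `reset` flag are hypothetical names used only for illustration and are not part of the original script.

```python
def pick_link(pages, position, reset):
    """Mimic the outer loop: 'pages' is a list of per-page href lists."""
    taglist = []
    picked = []
    for hrefs in pages:
        # Same pattern as the question: append every link of the current page.
        for href in hrefs:
            taglist.append(href)
        picked.append(taglist[position - 1])
        if reset:
            # Empty the list so the next page starts from index 0 again.
            del taglist[:]
    return picked


pages = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]
print(pick_link(pages, 2, reset=True))   # ['a2', 'b2']
print(pick_link(pages, 2, reset=False))  # ['a2', 'a2']
```

Without the reset, the second lookup still lands on a link from the first page, because the second page's links were appended after the old ones.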

I wouldn't say what you did is wrong, but without knowing more about the site you are using this for, I would be inclined to use a list comprehension. The second version below seems more Pythonic to me, and I think it is also more efficient, since it builds a new list in one step instead of appending and then clearing.

    # Instead of this
    tags=soup('a')
    for tag in tags:
        taglist.append(tag)
    url = taglist[position-1].get('href', None)
    del taglist[:]

    # I would use this:
    taglist = [tag for tag in soup('a')]
    url = taglist[position-1].get('href', None)
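
One difference between the two versions may be worth spelling out: `del taglist[:]` empties the existing list object in place, whereas assigning to `taglist` binds the name to a brand-new list and leaves the old object alone. For this script the result is the same, since nothing else holds a reference to the list. The sketch below only illustrates that distinction with plain strings; `clear_in_place` and `rebind` are hypothetical helper names.

```python
def clear_in_place(items):
    """Empty the list object itself; every reference sees it empty."""
    alias = items          # second name for the same list object
    del items[:]           # removes all elements, keeps the object
    return alias


def rebind(items, fresh):
    """Point the local name at a new list; the old object is untouched."""
    alias = items          # still refers to the original list
    items = list(fresh)    # new list object bound to the local name
    return alias, items


print(clear_in_place(["a1", "a2", "a3"]))    # []
print(rebind(["a1", "a2", "a3"], ["b1"]))    # (['a1', 'a2', 'a3'], ['b1'])
```

Either approach gives each iteration of the outer loop a list that contains only the current page's tags.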

Thank you, got it. How would you do this? – Maria Lavrovskaya, Nov 11 at 9:23
@MariaLavrovskaya See addition in answer – Stephen Cowley, Nov 11 at 13:43
edited Nov 11 at 13:43
answered Nov 11 at 4:15 by Stephen Cowley