How to get my Python script to go to a URL, download the latest file

I have written this Python script to create a sheet containing only the athletes from our sports club, taken from the national rankings. At the moment I have to download the rankings file manually and then rename it.

# import the writer
import xlwt
# import the reader
import xlrd

# open the rankings spreadsheet
book = xlrd.open_workbook('rankings.xls')
# open the first sheet
first_sheet = book.sheet_by_index(0)
# print the values in the second column of the first sheet
print(first_sheet.col_values(1))

# open the spreadsheet
workbook = xlwt.Workbook()
# add a sheet named "Club BFA ranking"
worksheet1 = workbook.add_sheet("Club BFA ranking")
# in cell 0,0 (first cell of the first row) write "Ranking"
worksheet1.write(0, 0, "Ranking")
# in cell 0,1 (second cell of the first row) write "Name"
worksheet1.write(0, 1, "Name")
# save and create the spreadsheet file
workbook.save("saxons.xls")

name = []
rank = []
for i in range(first_sheet.nrows):
    # print(first_sheet.cell_value(i,3))
    if 'Saxon' in first_sheet.cell_value(i, 3):
        name.append(first_sheet.cell_value(i, 1))
        rank.append(first_sheet.cell_value(i, 8))
        print('a')

for j in range(len(name)):
    worksheet1.write(j + 1, 0, rank[j])
    worksheet1.write(j + 1, 1, name[j])

workbook.save("saxons.xls")

As a next iteration, I would like the script to go to a specific URL and download the latest spreadsheet to use as rankings.xls.

How can I do that?

asked Nov 11 at 11:20

J4G


docs.python-requests.org/en/master
– petezurich
Nov 11 at 11:25

python url xls xlrd xlwt


2 Answers

accepted

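You could use the requests library. For example:

import requests

url = "YOUR_URL"
downloaded_file = requests.get(url)
# stop here with an error if the server did not return the file
downloaded_file.raise_for_status()

# write the raw bytes; 'wb' keeps the binary .xls data intact
with open("YOUR_PATH/rankings.xls", 'wb') as file:
    file.write(downloaded_file.content)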

EDIT: You mentioned that you want to download the latest version of the file. You can use time.strftime as below to fill in the current year and month:

import time

time.strftime("https://www.britishfencing.com/wp-content/uploads/%Y/%m/ranking_file.xls")

Use the result as YOUR_URL to get the latest month's rankings.
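One thing the strftime approach does not cover is the start of a month, when the new file may not have been uploaded yet. A minimal sketch of one way around that, assuming the upload path really follows the year/month pattern above: build the URL for the current month and for a few earlier months, newest first. The helper name candidate_urls and the months_back parameter are made up for illustration, and ranking_file.xls is still the placeholder file name from the answer.

```python
import datetime


def candidate_urls(template, today, months_back=3):
    """Return URLs for this month and the preceding months, newest first."""
    urls = []
    year, month = today.year, today.month
    for _ in range(months_back + 1):
        # Fill %Y and %m in the template for the month being considered.
        urls.append(datetime.date(year, month, 1).strftime(template))
        # Step back one month, wrapping from January to the previous December.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return urls


# Placeholder template taken from the answer; the file name is not a real path.
TEMPLATE = "https://www.britishfencing.com/wp-content/uploads/%Y/%m/ranking_file.xls"

# The date is arbitrary; in the script you would pass datetime.date.today().
for url in candidate_urls(TEMPLATE, datetime.date(2024, 1, 15), months_back=2):
    print(url)
```

Each candidate could then be passed to requests.get in turn, keeping the first one that returns status code 200.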

edited Nov 11 at 22:15

answered Nov 11 at 11:30

Faquarl



I'm not sure what you mean by the "latest" spreadsheet, but you have various options for downloading files from the net. I'd suggest the popular requests library, which is very easy to use.

First run

pip install requests

and then

import requests

url = "http://foobar.com/rankings.xls"
r = requests.get(url)

Then write the contents to a file, opened in binary mode because r.content is bytes:

with open('./rankings.xls', 'wb') as f:
    f.write(r.content)

You could then check whether a newly downloaded rankings.xls differs from a previously downloaded rankings.xls by comparing the two files, for example by their hashes.
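The hash comparison mentioned above could look roughly like the following sketch. It uses SHA-256 from the standard library's hashlib; the helper names file_digest and has_changed are illustrative, not part of any library. Note that a differing hash only shows that the content changed, not which file is newer.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(new_path, old_path):
    """Return True if old_path is missing or its contents differ from new_path."""
    if not Path(old_path).exists():
        return True
    return file_digest(new_path) != file_digest(old_path)
```

For example, the script could download to a temporary name, call has_changed against the existing rankings.xls, and only rebuild saxons.xls when the result is True.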

EDIT: OP asked for a method to extract the latest xls file from the page. I'd suggest parsing the HTML for hrefs containing xls (as the page OP wants to parse provides no common naming format for the xls files to be downloaded).

The best way to do this would be BeautifulSoup:

pip install bs4

from bs4 import BeautifulSoup
import requests

x = requests.get('https://www.britishfencing.com/results-rankings/mens-foil-ranking-archive/')
soup = BeautifulSoup(x.content, 'html.parser')
result = [xls['href'] for xls in soup.find_all('a', href=True) if 'xls' in xls['href']]

print(result[0])
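Two details may need handling before the first link can be downloaded: an href can be relative to the page, and the substring test 'xls' also matches links that merely contain those letters. The sketch below is one way to deal with both, using only urllib.parse from the standard library. The function name first_spreadsheet_url and the example links are hypothetical, and it assumes, as the snippet above does, that the page lists the newest file first.

```python
from urllib.parse import urljoin, urlparse


def first_spreadsheet_url(hrefs, page_url):
    """Return the first href whose path ends in .xls or .xlsx, as an absolute URL."""
    for href in hrefs:
        # Look only at the path so a query string does not hide the extension.
        path = urlparse(href).path.lower()
        if path.endswith((".xls", ".xlsx")):
            # urljoin leaves absolute links alone and resolves relative ones.
            return urljoin(page_url, href)
    return None


# Hypothetical links standing in for the hrefs collected from the page.
example_hrefs = ["/about/", "/files/ranking-latest.xls", "/files/ranking-older.xls"]
print(first_spreadsheet_url(example_hrefs, "https://example.com/archive/"))
```

The returned URL can be passed straight to requests.get and the content written to rankings.xls in binary mode.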

edited Nov 12 at 23:10

answered Nov 11 at 11:32

ferdy


apologies I should have mentioned that the page I want to download from is an archive that has a file added once a month: britishfencing.com/results-rankings/mens-foil-ranking-archive is it possible to download the last uploaded file?
– J4G
Nov 11 at 19:25

I'd go for beautifulsoup to get all links, then parse them for xls files and by order of their entrance, the first one will be the most recent.
– ferdy
Nov 12 at 22:57


how would I do that? I like the sound of it
– J4G
Nov 12 at 22:58

updated my answer. this should be helping you. cheers!
– ferdy
Nov 12 at 23:12


