How to deal with bad HTML when scraping with BeautifulSoup

When I access the table through BeautifulSoup, it is cut off partway through. When I view the page source with Ctrl+U, I can see the complete table.
Screenshots are attached below.

Accessing through Soup

and

Accessing through Ctrl+U or inspecting the element

I access it in Soup like this:

from bs4 import BeautifulSoup

# r is the requests response for the page being scraped
soup = BeautifulSoup(r.text, 'html.parser')

table = soup.find_all('table', {'align': 'center', 'border': '1', 'cellpadding': '1', 'cellspacing': '0', 'width': '800'})
print(table)
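
One way to narrow down where the rows go missing is to count the `<tr>` tags in the raw response text and compare that with the number of rows found in the parsed table. The sketch below uses only the standard library and assumes that `r.text` holds the same markup that Ctrl+U shows; `TagCounter` and `count_start_tags` are hypothetical helper names, and the sample string is made up for illustration.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening tags by name in raw HTML, without building a tree."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from HTMLParser
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_start_tags(html, tag):
    """Return how many times <tag ...> opens in the given HTML string."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.counts.get(tag, 0)


# Deliberately malformed sample: the first row and its cell are never closed
sample = "<table><tr><td>a<tr><td>b</td></tr></table>"
print(count_start_tags(sample, "tr"))  # prints 2
```

With the question's variables, `count_start_tags(r.text, 'tr')` could be compared against `len(table[0].find_all('tr'))`. If the raw count is noticeably higher, the download is probably complete and the rows are being lost while the parser builds its tree.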


Website Link is










web-scraping python-requests






edited 15 hours ago

asked 16 hours ago

aftab qaisrani
13












  • I solved it by changing the parser to lxml: soup = BeautifulSoup(r.text, 'lxml')
    – aftab qaisrani
    13 hours ago
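
The fix in the comment depends on the third-party `lxml` package being installed. As a rough sketch of the same idea, the helper below picks `lxml` when it is available, then `html5lib`, and otherwise falls back to the built-in `html.parser`. The names `choose_parser` and `find_tables` are hypothetical and not part of BeautifulSoup; only the parser names themselves come from the library.

```python
from importlib.util import find_spec

# Parser names BeautifulSoup accepts, in the order tried here. "lxml" and
# "html5lib" are third-party packages that must be installed separately.
PREFERRED_PARSERS = ("lxml", "html5lib")


def choose_parser(candidates=PREFERRED_PARSERS):
    """Return the first installed candidate, else the built-in 'html.parser'."""
    for name in candidates:
        if find_spec(name) is not None:
            return name
    return "html.parser"


def find_tables(html, attrs=None):
    """Parse html with the chosen parser and return all matching <table> tags."""
    # Imported lazily so choose_parser() works without beautifulsoup4 installed
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, choose_parser())
    return soup.find_all("table", attrs=attrs or {})
```

With the question's response object this might be called as `find_tables(r.text, {'width': '800'})`. Matching on one or two attributes rather than all five is a design choice here: different parsers repair broken markup differently, so a looser filter is less likely to miss the table after the repair.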



















