Unable to extract the table from the PDF below using Camelot's stream flavor

I am not able to extract the table from the PDF below. I am using the stream flavor because the data is free-flowing text; the table has no ruling lines.

Adding output:
output.csv file


edited Nov 11 at 13:08 by Zoe

asked Nov 11 at 12:28 by Abhishek Bisht

Can you share the code snippet that you're using to extract the tables?
– Vinayak Mehta
Nov 11 at 12:36

tables = camelot.read_pdf('plc.pdf', pages='76', flavor='stream'). I am not doing any additional configuration, as the table's structure looks clean.
– Abhishek Bisht
Nov 11 at 12:40

Currently, Stream treats the whole page as a table, as written in the docs here: camelot-py.readthedocs.io/en/master/user/…. So you'll have to specify a table area to get these tables out. You should check out the advanced guide for how to do that.
– Vinayak Mehta
Nov 11 at 15:19
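
To illustrate the suggestion in the comment above, here is a minimal sketch of passing an explicit table area to the stream flavor. It assumes a Camelot release where the keyword is spelled `table_areas` (some older releases may have used `table_area`, so check your installed version). Each area is a string "x1,y1,x2,y2", with (x1, y1) the top-left and (x2, y2) the bottom-right corner in PDF coordinates, where the origin is the bottom-left of the page. The helper names `area_string` and `read_table_in_area` and the coordinates in the usage comment are made up for illustration; they are not measured from plc.pdf.

```python
def area_string(x1, y1, x2, y2):
    """Format a bounding box as the 'x1,y1,x2,y2' string that Camelot expects.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner,
    in PDF coordinates (origin at the bottom-left of the page).
    """
    return f"{x1},{y1},{x2},{y2}"


def read_table_in_area(pdf_path, page, area):
    """Run the stream parser on one region of one page (hypothetical helper)."""
    import camelot  # imported here so the helper above works without Camelot

    # Assumption: the installed Camelot accepts the `table_areas` keyword.
    return camelot.read_pdf(
        pdf_path, pages=page, flavor="stream", table_areas=[area]
    )


# Usage (placeholder coordinates, not taken from the real PDF):
# tables = read_table_in_area("plc.pdf", "76", area_string(50, 700, 550, 100))
# tables[0].to_csv("output.csv")
```

The coordinates still have to be found by inspecting the page, which is exactly the limitation raised in the next comment.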

Thanks. However, I want to do this dynamically. Providing a table area does not make the solution generic: the area differs from table to table, and we cannot know a table's coordinates in advance without opening the PDF.
– Abhishek Bisht
Nov 12 at 5:39

You can track this issue: github.com/socialcopsdev/camelot/issues/102. Once it is resolved, table detection should be generic.
– Vinayak Mehta
Nov 13 at 8:18
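
As a stopgap for the dynamic case discussed in the comments, one rough heuristic is to guess the table area from word positions and pass the result on as a table area. This is not Camelot's algorithm and not what the linked issue implements; it is only a sketch under stated assumptions. It assumes you already have word bounding boxes (x0, y0, x1, y1) in PDF coordinates from some text-extraction step, and that table rows are lines whose words fall into at least `min_cols` chunks separated by wide horizontal gaps. The function name `guess_table_area` and all thresholds are hypothetical.

```python
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in PDF coordinates: origin bottom-left, y grows upward.
Word = Tuple[float, float, float, float]


def guess_table_area(words: List[Word], min_cols: int = 3,
                     row_tol: float = 3.0, gap: float = 15.0) -> Optional[str]:
    """Guess one table's bounding box as 'x1,y1,x2,y2' (top-left, bottom-right).

    Heuristic only: returns the longest run of consecutive text rows that
    each look like they have at least `min_cols` columns, or None.
    """
    # 1. Group words into text rows by their bottom edge, top of page first.
    rows: List[List[Word]] = []
    for w in sorted(words, key=lambda w: -w[1]):
        if rows and abs(rows[-1][0][1] - w[1]) <= row_tol:
            rows[-1].append(w)
        else:
            rows.append([w])

    # 2. Count column-like chunks in a row: a wide gap starts a new chunk.
    def n_chunks(row: List[Word]) -> int:
        row = sorted(row, key=lambda w: w[0])
        return 1 + sum(1 for a, b in zip(row, row[1:]) if b[0] - a[2] > gap)

    # 3. Find the longest run of consecutive rows with enough chunks.
    flags = [n_chunks(r) >= min_cols for r in rows]
    best_start, best_len, start = 0, 0, None
    for i, ok in enumerate(flags + [False]):  # sentinel closes the last run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_len == 0:
        return None

    # 4. Bounding box of every word in that run.
    block = [w for r in rows[best_start:best_start + best_len] for w in r]
    x1 = min(w[0] for w in block)
    y1 = max(w[3] for w in block)  # top edge
    x2 = max(w[2] for w in block)
    y2 = min(w[1] for w in block)  # bottom edge
    return f"{x1:.1f},{y1:.1f},{x2:.1f},{y2:.1f}"


# Synthetic page: one paragraph line, three table rows, one footer line.
demo_words = [(50, 700, 500, 710), (50, 100, 300, 110)]
for y in (650, 630, 610):
    demo_words += [(50, y, 100, y + 10), (200, y, 250, y + 10), (400, y, 450, y + 10)]
demo_area = guess_table_area(demo_words)
```

The thresholds will need tuning per document, and pages with several tables or multi-column body text can easily fool this, so treat the result as a starting guess to verify rather than a reliable detector.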

Tags: python, python-camelot

