Unable to extract a table from a PDF using Camelot's stream flavor











I am unable to extract the table from the PDF below. I am using the stream flavor because the data is free-flowing text; there are no ruling lines.



Adding output:
output.csv file



enter image description here










  • Can you share the code snippet that you're using to extract the tables?
    – Vinayak Mehta
    Nov 11 at 12:36










  • I am using tables = camelot.read_pdf('plc.pdf', pages='76', flavor='stream'). I am not doing any additional configuration, as the table looks well structured.
    – Abhishek Bisht
    Nov 11 at 12:40










  • Currently, Stream treats the whole page as a table, as written in the docs here: camelot-py.readthedocs.io/en/master/user/…. So you'll have to specify a table area to get these tables out. You should check out the advanced guide for how to do that.
    – Vinayak Mehta
    Nov 11 at 15:19
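
To illustrate the suggestion above, here is a minimal sketch of passing a table area to Stream. It assumes the keyword is table_areas, as in recent Camelot releases (check your installed version), and that each area is an "x1,y1,x2,y2" string in PDF coordinates with the origin at the bottom-left of the page. The helper names to_table_area and extract_table are hypothetical, and the coordinates are placeholders, not values measured from plc.pdf.

```python
def to_table_area(x1, y1, x2, y2):
    """Format a bounding box as an "x1,y1,x2,y2" string.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner.
    PDF coordinates start at the bottom-left of the page, so y1 > y2.
    """
    if not (x1 < x2 and y1 > y2):
        raise ValueError("expected top-left (x1, y1) and bottom-right (x2, y2)")
    return ",".join(str(v) for v in (x1, y1, x2, y2))


def extract_table(pdf_path, page, area):
    """Run Stream on one region of one page and return the detected tables."""
    import camelot  # imported lazily so the helper above works without it

    return camelot.read_pdf(
        pdf_path,
        pages=str(page),
        flavor="stream",
        table_areas=[area],  # assumed keyword name; verify for your version
    )


# Placeholder coordinates (an assumption), NOT measured from plc.pdf.
area = to_table_area(50, 700, 550, 400)

# Uncomment once the real coordinates are known:
# tables = extract_table("plc.pdf", 76, area)
# tables[0].to_csv("output.csv")
```

Restricting Stream to the region keeps the surrounding paragraph text from being parsed as table rows, which is likely what is happening in the output above.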










  • Thanks. However, I want to do this dynamically. If we have to provide the table area, the solution cannot be generic, because the area differs from table to table and we don't know the table coordinates in advance without opening the PDF.
    – Abhishek Bisht
    Nov 12 at 5:39










  • You can keep track of this issue: github.com/socialcopsdev/camelot/issues/102. Once it is resolved, this would work generically.
    – Vinayak Mehta
    Nov 13 at 8:18
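
For the dynamic case raised in the comments, one possible workaround is to estimate the table area from word positions and pass the result to Camelot. The sketch below is only a heuristic and not how Camelot itself works: it assumes you already have word bounding boxes as (x0, y0, x1, y1) tuples in PDF coordinates (from any text-extraction library; that step is not shown), and it treats the longest run of consecutive lines with several widely spaced word groups as the table. All function names and thresholds are hypothetical and would need tuning per document.

```python
def group_into_lines(words, y_tol=3.0):
    """Group word boxes (x0, y0, x1, y1) into text lines, top to bottom."""
    lines = []
    for word in sorted(words, key=lambda w: -(w[1] + w[3]) / 2):
        y_mid = (word[1] + word[3]) / 2
        if lines and abs(lines[-1]["y"] - y_mid) <= y_tol:
            lines[-1]["words"].append(word)
        else:
            lines.append({"y": y_mid, "words": [word]})
    return [line["words"] for line in lines]


def count_columns(line, min_gap=15.0):
    """Count groups of words separated by wide horizontal gaps."""
    ordered = sorted(line, key=lambda w: w[0])
    columns = 1
    for left, right in zip(ordered, ordered[1:]):
        if right[0] - left[2] > min_gap:
            columns += 1
    return columns


def estimate_table_area(words, min_cols=3, pad=5.0):
    """Return "x1,y1,x2,y2" around the longest run of multi-column lines."""
    best, run = [], []
    for line in group_into_lines(words):
        if count_columns(line) >= min_cols:
            run.append(line)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []  # a prose line ends the current candidate table
    if not best:
        return None
    boxes = [w for line in best for w in line]
    x1 = min(w[0] for w in boxes) - pad
    y1 = max(w[3] for w in boxes) + pad  # top edge (origin is bottom-left)
    x2 = max(w[2] for w in boxes) + pad
    y2 = min(w[1] for w in boxes) - pad  # bottom edge
    return f"{x1:g},{y1:g},{x2:g},{y2:g}"


# Made-up word boxes: one prose line, three 3-column rows, one footer line.
sample_words = [
    (50, 700, 120, 710), (124, 700, 200, 710), (204, 700, 300, 710),
    (50, 650, 100, 660), (200, 650, 250, 660), (400, 650, 450, 660),
    (50, 630, 100, 640), (200, 630, 250, 640), (400, 630, 450, 640),
    (50, 610, 100, 620), (200, 610, 250, 620), (400, 610, 450, 620),
    (50, 100, 150, 110),
]
print(estimate_table_area(sample_words))
```

The returned string has the same "x1,y1,x2,y2" shape as a manually specified table area, so it could be passed to read_pdf in place of hand-measured coordinates. Expect it to fail on pages with multi-column prose or tables with fewer than three columns.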















python python-camelot






edited Nov 11 at 13:08









Zoe

asked Nov 11 at 12:28









Abhishek Bisht
