Not able to get the table from the below PDF using stream
I am unable to extract the table from the PDF below. I am using the stream flavor because the data is free-flowing text with no ruling lines.
Adding output:
output.csv file
python python-camelot
Can you share the code snippet that you're using to extract the tables?
– Vinayak Mehta
Nov 11 at 12:36
tables = camelot.read_pdf('plc.pdf', pages='76', flavor='stream') is all I am using. I am not doing any additional configuration, as the table looks well structured.
– Abhishek Bisht
Nov 11 at 12:40
Currently, Stream treats the whole page as a table, as written in the docs here: camelot-py.readthedocs.io/en/master/user/…. So you'll have to specify a table area to get these tables out. You should check out the advanced guide for how to do that.
– Vinayak Mehta
Nov 11 at 15:19
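As a rough illustration of the table-area suggestion above, here is a minimal sketch. The coordinates are placeholders, not values measured from plc.pdf, and the helper names `area_string` and `read_stream_tables` are made up for this example. The keyword is `table_areas` in recent camelot releases; I believe older releases spelled it `table_area`, so check the installed version. Camelot expects each area as an "x1,y1,x2,y2" string, with (x1, y1) the top-left and (x2, y2) the bottom-right corner in PDF coordinate space.

```python
def area_string(x1, y1, x2, y2):
    """Format a region as camelot's "x1,y1,x2,y2" string.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner,
    in PDF points with the origin at the bottom-left of the page.
    """
    return f"{x1},{y1},{x2},{y2}"


def read_stream_tables(pdf_path, pages, areas):
    """Run the stream flavor restricted to the given regions.

    `areas` is a list of (x1, y1, x2, y2) tuples. Hypothetical helper,
    not part of camelot itself.
    """
    import camelot  # imported lazily so area_string works without camelot

    return camelot.read_pdf(
        pdf_path,
        pages=pages,
        flavor="stream",
        table_areas=[area_string(*a) for a in areas],
    )


# Example call (placeholder coordinates, adjust to the real table):
# tables = read_stream_tables("plc.pdf", "76", [(50, 700, 550, 400)])
# print(tables[0].df)
```

The coordinates still have to be found by inspecting the page once, which is exactly the limitation discussed in this thread.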
Thanks. However, I want to do this dynamically. Providing the table area by hand cannot be made generic, because the area differs from table to table and we do not know the coordinates in advance without opening the PDF.
– Abhishek Bisht
Nov 12 at 5:39
You can keep track of this issue: github.com/socialcopsdev/camelot/issues/102. Once it is resolved, this would be generic.
– Vinayak Mehta
Nov 13 at 8:18
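For the dynamic case raised in the comments, one possible workaround is to guess the table region from the layout of the text itself. The sketch below is a hypothetical heuristic, not camelot's method and not what the linked issue implements: it assumes you already have word bounding boxes grouped by text line (from whatever layout extractor you use, that step is not shown), treats a line as "tabular" when its words fall into several widely separated groups, and returns the bounding box of the longest run of such lines in the "x1,y1,x2,y2" form that camelot accepts for a table area. The function names and thresholds are assumptions and will need tuning per document.

```python
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in PDF points, origin at the bottom-left of the page
Box = Tuple[float, float, float, float]


def count_columns(words: List[Box], min_gap: float) -> int:
    """Count word groups on one line separated by gaps wider than min_gap."""
    if not words:
        return 0
    ordered = sorted(words, key=lambda b: b[0])
    columns = 1
    right_edge = ordered[0][2]
    for x0, _, x1, _ in ordered[1:]:
        if x0 - right_edge > min_gap:
            columns += 1
        right_edge = max(right_edge, x1)
    return columns


def guess_table_area(
    lines: List[List[Box]],
    min_gap: float = 15.0,
    min_cols: int = 3,
    min_rows: int = 3,
) -> Optional[str]:
    """Return "x1,y1,x2,y2" for the longest run of column-like lines.

    `lines` must be ordered top to bottom. Returns None when no run of
    at least min_rows tabular lines is found.
    """
    best: List[List[Box]] = []
    run: List[List[Box]] = []
    for words in lines:
        if count_columns(words, min_gap) >= min_cols:
            run.append(words)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []  # a non-tabular line ends the current run
    if len(best) < min_rows:
        return None
    boxes = [b for words in best for b in words]
    left = min(b[0] for b in boxes)
    right = max(b[2] for b in boxes)
    top = max(b[3] for b in boxes)
    bottom = min(b[1] for b in boxes)
    return f"{left},{top},{right},{bottom}"
```

The result can be passed as a table area to the stream flavor. A heuristic like this will misfire on multi-column body text or on tables with only two columns, so treat it as a starting point only.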
edited Nov 11 at 13:08
Zoe
asked Nov 11 at 12:28
Abhishek Bisht