Google cloud function with wand stopped working
I have set up 3 Google Cloud Storage buckets and 3 Cloud Functions (one per bucket), each triggered when a PDF file is uploaded to its bucket. The functions convert the PDF to a PNG image and do further processing.
When I create a 4th bucket with a similar function, it does not work. Even a copy of one of the 3 existing functions fails with this error:
    Traceback (most recent call last):
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 333, in run_background_function
        _function_handler.invoke_user_function(event_object)
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 199, in invoke_user_function
        return call_user_function(request_or_event)
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 196, in call_user_function
        event_context.Context(**request_or_event.context))
      File "/user_code/main.py", line 27, in pdf_to_img
        with Image(filename=tmp_pdf, resolution=300) as image:
      File "/env/local/lib/python3.7/site-packages/wand/image.py", line 2874, in __init__
        self.read(filename=filename, resolution=resolution)
      File "/env/local/lib/python3.7/site-packages/wand/image.py", line 2952, in read
        self.raise_exception()
      File "/env/local/lib/python3.7/site-packages/wand/resource.py", line 222, in raise_exception
        raise e
    wand.exceptions.PolicyError: not authorized `/tmp/tmphm3hiezy' @ error/constitute.c/ReadImage/412
I am baffled that the same functions work on the existing buckets but not on the new one.
UPDATE:
Even this minimal version does not work (it fails with a "cache resources exhausted" error):
In requirements.txt:
    google-cloud-storage
    wand
In main.py:
    import tempfile

    from google.cloud import storage
    from wand.image import Image

    storage_client = storage.Client()


    def pdf_to_img(data, context):
        file_data = data
        pdf = file_data['name']
        # Skip files that this function produced itself (prefixed with "v-").
        if pdf.startswith('v-'):
            return
        bucket_name = file_data['bucket']
        blob = storage_client.bucket(bucket_name).get_blob(pdf)
        _, tmp_pdf = tempfile.mkstemp()
        _, tmp_png = tempfile.mkstemp()
        tmp_png = tmp_png+".png"
        blob.download_to_filename(tmp_pdf)
        with Image(filename=tmp_pdf) as image:
            image.save(filename=tmp_png)
        print("Image created")
        new_file_name = "v-"+pdf.split('.')[0]+".png"
        blob.bucket.blob(new_file_name).upload_from_filename(tmp_png)
The code above is only supposed to create a PNG copy of the image file uploaded to the bucket.
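Separately from the policy error, the temp-file and naming logic in the function can be sketched in a more defensive form. The version below is only a sketch under assumptions: fetch, convert and store are hypothetical callables standing in for the GCS download, the Wand conversion and the GCS upload, and the naming rule (prefix applied to the base name, folder part kept) is a design choice that differs from the original code. It does not address the PolicyError or CacheError itself; it only makes sure temporary files are removed, which matters because files under /tmp in Cloud Functions count against the function's memory.

```python
import os
import tempfile
from pathlib import PurePosixPath


def output_name(object_name, prefix="v-", ext=".png"):
    """Derive the output object name, keeping any folder part and inner dots."""
    path = PurePosixPath(object_name)
    return str(path.with_name(prefix + path.stem + ext))


def is_own_output(object_name, prefix="v-"):
    """True if the object's base name carries the output prefix."""
    return PurePosixPath(object_name).name.startswith(prefix)


def convert_in_tempdir(object_name, fetch, convert, store):
    """Run fetch -> convert -> store inside a directory that is always removed.

    fetch(src_path), convert(src_path, dst_path) and store(dst_path, name)
    are hypothetical stand-ins for the download, conversion and upload steps.
    """
    if is_own_output(object_name):
        return None  # avoid re-processing files this function created
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "input")
        dst = os.path.join(workdir, "output.png")
        fetch(src)
        convert(src, dst)
        name = output_name(object_name)
        store(dst, name)
    # The directory and both files are gone once the with-block exits.
    return name
```

With this naming rule, "reports/q3.v2.pdf" maps to "reports/v-q3.v2.png", whereas pdf.split('.')[0] would cut the name at the first dot.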
python-3.x google-cloud-functions
None of the Wand (ImageMagick) functionality is working. I tried cropping an image and got this error: wand.exceptions.CacheError: cache resources exhausted `/tmp/tmpt7_1dq6i' @ error/cache.c/OpenPixelCache/3984
– Naveed
Nov 14 '18 at 15:39
I do not know if this is related, but if the server was updated for imagemagick, it could have added the policy restriction on PDF files for security, due to a bug in Ghostscript that has now been fixed. If you relax the policy restriction, it might work again. See stackoverflow.com/questions/52861946/…
– fmw42
Nov 20 '18 at 17:45
@fmw42 What you said is true, but if you look at the code I posted above, the Wand module is not even creating a copy of a PNG file. I also tried editing policy.xml from within the cloud function, but it didn't work.
– Naveed
Nov 20 '18 at 18:28
@Naveed Did you manage to get this working? I'm trying to write a very similar function (convert each page of a PDF to JPEG) and I'm getting the same wand.exceptions.PolicyError: not authorized
– RogB
Jan 17 at 16:40
@RogB No, it's still not working. I am doing the PDF to PNG conversion (you can do JPEG as well) on my own computer using pdf2image (with concurrency set to 3 for faster processing) and then sending the images to the cloud bucket for further processing.
– Naveed
Jan 17 at 20:20
edited Nov 14 '18 at 22:29 by Dustin Ingram
asked Nov 14 '18 at 9:09 by Naveed
3 Answers
While we wait for the issue to be resolved in Ubuntu, I followed @DustinIngram's suggestion and created a virtual machine in Compute Engine with an ImageMagick installation. The downside is that I now have a second API that my API in App Engine has to call, just to generate the images. Having said that, it's working fine for me. This is my setup:
Main API:
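When a PDF file is uploaded to Cloud Storage, I call the following:
    response = requests.post('http://xx.xxx.xxx.xxx:5000/makeimages', data=data)
Here data carries the file name in the form {"file_name": file_name}. Because the handler on the VM reads request.form, data should be passed as a dict (which requests sends form-encoded) rather than as a serialized JSON string.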
On the API that is running on the VM, the POST request gets processed as follows:
    import os
    import tempfile

    from flask import Flask, jsonify, request
    from google.cloud import storage
    from wand.image import Image

    # Setup assumed here; the original snippet omits imports and these definitions.
    app = Flask(__name__)
    storage_client = storage.Client()
    # bucket_name is assumed to be defined elsewhere.


    @app.route('/makeimages', methods=['POST'])
    def pdf_to_jpg():
        file_name = request.form['file_name']
        blob = storage_client.bucket(bucket_name).get_blob(file_name)
        _, temp_local_filename = tempfile.mkstemp()
        temp_local_filename_jpeg = temp_local_filename + '.jpg'
        # Download file from bucket.
        blob.download_to_filename(temp_local_filename)
        print('Image ' + file_name + ' was downloaded to ' + temp_local_filename)
        with Image(filename=temp_local_filename, resolution=300) as img:
            pg_num = 0
            image_files = {}
            image_files['pages'] = []
            # Save each page as a JPEG and upload it under its own name.
            for img_page in img.sequence:
                img_page_2 = Image(image=img_page)
                img_page_2.format = 'jpeg'
                img_page_2.compression_quality = 70
                img_page_2.save(filename=temp_local_filename_jpeg)
                new_file_name = file_name.replace('.pdf', 'p') + str(pg_num) + '.jpg'
                new_blob = blob.bucket.blob(new_file_name)
                new_blob.upload_from_filename(temp_local_filename_jpeg)
                print('Page ' + str(pg_num) + ' was saved as ' + new_file_name)
                image_files['pages'].append({'page': pg_num, 'file_name': new_file_name})
                pg_num += 1
        try:
            os.remove(temp_local_filename)
        except (ValueError, PermissionError):
            print('Could not delete the temp file!')
        return jsonify(image_files)
This downloads the PDF from Cloud Storage, creates an image for each page, and saves the images back to Cloud Storage. The API then returns a JSON response listing the image files created.
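Since the handler reads request.form['file_name'], the calling side has to send a form-encoded body. The sketch below uses only the standard library to build such a request without sending it; the URL is a placeholder (the real VM address is elided in this answer) and build_makeimages_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address; the real VM IP is not given in the answer.
VM_URL = "http://vm.example.invalid:5000/makeimages"


def build_makeimages_request(file_name, url=VM_URL):
    """Build (without sending) a POST whose body Flask exposes via request.form."""
    body = urlencode({"file_name": file_name}).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it. With the requests library, passing a dict as data produces the same form encoding.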
So, not the most elegant solution, but at least I don't need to convert the files manually.
This appears to be a showstopper for any ImageMagick functionality that involves the PDF format. Similar code that we deployed on Google App Engine via a custom Docker image fails with the same "not authorized" error.
I am not sure how to edit the policy.xml file on GAE or GCF, but the PDF line there has to be changed to:
    <policy domain="coder" rights="read|write" pattern="PDF" />
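To check what a given policy.xml currently says about PDF before changing anything, a small parser is enough. This is a minimal sketch under assumptions: SAMPLE_POLICY is made-up content for illustration, the function names are hypothetical, only exact pattern strings are matched (glob-style patterns are not handled), and only the first matching entry is reported.

```python
import xml.etree.ElementTree as ET

# Made-up policy.xml content for illustration; a real file has more entries.
SAMPLE_POLICY = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="coder" rights="none" pattern="PDF"/>
  <policy domain="coder" rights="none" pattern="PS"/>
</policymap>"""


def coder_rights(policy_xml, pattern):
    """Return the rights of the first coder policy with this exact pattern, or None."""
    for policy in ET.fromstring(policy_xml).iter("policy"):
        if policy.get("domain") == "coder" and policy.get("pattern") == pattern:
            return policy.get("rights")
    return None


def can_read(policy_xml, pattern):
    """True if no coder entry names the pattern, or its rights include 'read'."""
    rights = coder_rights(policy_xml, pattern)
    return rights is None or "read" in rights.split("|")
```

For the sample above, coder_rights(SAMPLE_POLICY, "PDF") returns "none", which corresponds to the restriction behind the "not authorized" error.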
@Dustin: Do you have a bug link where we can follow the progress?
Update:
I fixed it in my Google App Engine container by adding a line to the Dockerfile. It rewrites policy.xml directly after ImageMagick is installed:
    RUN sed -i 's/rights="none"/rights="read|write"/g' /etc/ImageMagick-6/policy.xml
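The sed expression replaces every rights="none" entry in the file, not only the PDF one. If only the PDF coder should be opened up, a narrower rewrite could look like the sketch below. It is an illustration under assumptions: allow_coder is a hypothetical helper, SAMPLE_POLICY is made-up content, and the rewrite works on the text of each policy tag so that comments and layout in the file stay untouched.

```python
import re

# Made-up policy.xml excerpt used only to exercise the function.
SAMPLE_POLICY = '''<policymap>
  <policy domain="coder" rights="none" pattern="PS" />
  <policy domain="coder" rights="none" pattern="PDF" />
</policymap>'''

_POLICY_TAG = re.compile(r"<policy\b[^>]*>")


def allow_coder(policy_xml, pattern, rights="read|write"):
    """Rewrite rights only on coder policies whose pattern equals `pattern`."""
    def fix(match):
        tag = match.group(0)
        if 'domain="coder"' in tag and f'pattern="{pattern}"' in tag:
            return re.sub(r'rights="[^"]*"', f'rights="{rights}"', tag)
        return tag  # every other policy entry is left as it was

    return _POLICY_TAG.sub(fix, policy_xml)
```

Calling allow_coder(SAMPLE_POLICY, "PDF") changes the PDF line and leaves the PS line at rights="none". Reading the real file and writing the result back is left out here.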
edited Nov 20 '18 at 10:23
answered Nov 20 '18 at 9:02 by Hasan Rafiq
Thanks for your inputs. Unfortunately I can't use app engine as it is not suitable for long running background processes. I am processing thousands of PDFs. I tried AWS lambda function but the complexity of AWS turned me off.
– Naveed
Nov 20 '18 at 10:35
I am using Cloud Functions with App engine for huge volume of data, works perfectly :)
– Hasan Rafiq
Nov 21 '18 at 12:22
@Naveed: As of August 2018 you can use Docker images on serverless containers (Cloud Functions): cloud.google.com/blog/products/gcp/…. Try signing up for the alpha program at services.google.com/fb/forms/serverlesscontainers
– Hasan Rafiq
Nov 21 '18 at 14:52
This is an upstream bug in Ubuntu. We are working on a workaround for App Engine and Cloud Functions.
edited Jan 17 at 19:21
answered Nov 14 '18 at 20:25
Dustin Ingram
While we wait for the issue to be resolved in Ubuntu, I followed @DustinIngram's suggestion and created a virtual machine in Compute Engine with an ImageMagick installation. The downside is that I now have a second API that my API in App Engine has to call, just to generate the images. Having said that, it's working fine for me. This is my setup:
Main API:
When a pdf file is uploaded to Cloud Storage, I call the following:
response = requests.post('http://xx.xxx.xxx.xxx:5000/makeimages', data=data)
Where data is a dict of the form {"file_name": file_name}; requests sends it form-encoded, which is what request.form reads on the VM side.
On the API that is running on the VM, the POST request gets processed as follows:
import os
import tempfile

from flask import Flask, jsonify, request
from google.cloud import storage
from wand.image import Image

app = Flask(__name__)
storage_client = storage.Client()
# bucket_name is assumed to be defined elsewhere (e.g. a module-level constant).

@app.route('/makeimages', methods=['POST'])
def pdf_to_jpg():
    file_name = request.form['file_name']
    blob = storage_client.bucket(bucket_name).get_blob(file_name)
    fd, temp_local_filename = tempfile.mkstemp()
    os.close(fd)  # only the path is needed, not the open descriptor
    temp_local_filename_jpeg = temp_local_filename + '.jpg'

    # Download file from bucket.
    blob.download_to_filename(temp_local_filename)
    print('Image ' + file_name + ' was downloaded to ' + temp_local_filename)

    image_files = {'pages': []}
    with Image(filename=temp_local_filename, resolution=300) as img:
        pg_num = 0
        # Render each PDF page as a JPEG and upload it next to the source file.
        for img_page in img.sequence:
            with Image(image=img_page) as img_page_2:
                img_page_2.format = 'jpeg'
                img_page_2.compression_quality = 70
                img_page_2.save(filename=temp_local_filename_jpeg)
            new_file_name = file_name.replace('.pdf', 'p') + str(pg_num) + '.jpg'
            new_blob = blob.bucket.blob(new_file_name)
            new_blob.upload_from_filename(temp_local_filename_jpeg)
            print('Page ' + str(pg_num) + ' was saved as ' + new_file_name)
            image_files['pages'].append({'page': pg_num, 'file_name': new_file_name})
            pg_num += 1

    # Clean up the downloaded PDF once all pages are uploaded.
    try:
        os.remove(temp_local_filename)
    except (ValueError, PermissionError):
        print('Could not delete the temp file!')
    return jsonify(image_files)
This will download the pdf from Cloud Storage, create an image for each page, and save them back to cloud storage. The API will then return a JSON file with the list of image files created.
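To make the shape of that response concrete, here is a small sketch that reproduces only the naming rule and the dictionary layout from the handler above, without touching Cloud Storage or Wand. The helper names page_file_name and build_response and the file name invoice.pdf are made up for illustration; they are not part of my actual service.

```python
import json


def page_file_name(file_name, pg_num):
    """Apply the same naming rule as the handler: 'x.pdf' -> 'xp<N>.jpg'."""
    return file_name.replace('.pdf', 'p') + str(pg_num) + '.jpg'


def build_response(file_name, page_count):
    """Build the dictionary that the handler passes to jsonify()."""
    return {
        'pages': [
            {'page': pg_num, 'file_name': page_file_name(file_name, pg_num)}
            for pg_num in range(page_count)
        ]
    }


if __name__ == '__main__':
    # Hypothetical two-page PDF called invoice.pdf.
    print(json.dumps(build_response('invoice.pdf', 2), indent=2))
```

Note that pages are numbered from 0, so a two-page invoice.pdf would yield invoicep0.jpg and invoicep1.jpg.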
So, not the most elegant solution, but at least I don't need to convert the files manually.
answered Jan 21 at 18:52
RogB
None of the Wand (ImageMagick) functionality is working. I tried cropping an image and got this error: wand.exceptions.CacheError: cache resources exhausted `/tmp/tmpt7_1dq6i' @ error/cache.c/OpenPixelCache/3984
– Naveed
Nov 14 '18 at 15:39
I do not know if this is related, but if ImageMagick was updated on the server, the update could have added the policy restriction on PDF files for security, due to a bug in Ghostscript that has now been fixed. If you relax the policy restriction, it might work again. See stackoverflow.com/questions/52861946/…
– fmw42
Nov 20 '18 at 17:45
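On a machine where the policy file is readable, one way to see whether this kind of restriction is in place is to look for coder entries with rights="none". The sketch below is only an illustration: SAMPLE_POLICY is a made-up fragment in the policy.xml format, and blocked_coders is a hypothetical helper, not part of Wand or ImageMagick.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the policy.xml format; a real file will differ.
SAMPLE_POLICY = """<policymap>
  <policy domain="coder" rights="none" pattern="PDF" />
  <policy domain="coder" rights="read|write" pattern="PNG" />
  <policy domain="resource" name="memory" value="256MiB" />
</policymap>"""


def blocked_coders(policy_xml):
    """Return the coder patterns whose rights are set to "none"."""
    root = ET.fromstring(policy_xml)
    return [
        policy.get("pattern")
        for policy in root.iter("policy")
        if policy.get("domain") == "coder"
        and policy.get("rights", "").lower() == "none"
    ]


if __name__ == "__main__":
    print(blocked_coders(SAMPLE_POLICY))
```

If PDF shows up in the result, reading PDFs would be refused with a PolicyError like the one in the question. This only diagnoses the restriction; it does not help where the file cannot be edited.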
@fmw42 What you said is true, but if you look at the code I posted above, the Wand module is not even creating a copy of a PNG file. I also tried editing policy.xml from within the Cloud Function, but it didn't work.
– Naveed
Nov 20 '18 at 18:28
@Naveed Did you manage to get this working? I'm trying to write a very similar function (convert each page of a PDF to JPEG) and I'm getting the same wand.exceptions.PolicyError: not authorized
– RogB
Jan 17 at 16:40
@RogB No, it's still not working. I am doing the PDF to PNG conversion (you can do JPEG as well) on my own computer using pdf2image (with concurrency set to 3 for faster processing) and then sending the images to the cloud bucket for further processing.
– Naveed
Jan 17 at 20:20
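For anyone who wants to try the same local workaround, a minimal sketch could look like the following. It assumes pdf2image and its poppler dependency are installed locally. The function names, the output naming scheme and the 300 dpi default (borrowed from the resolution used in the Wand code) are assumptions for illustration, not the exact script described above.

```python
import os


def page_image_name(pdf_name, page_number):
    """Hypothetical naming scheme: report.pdf, page 2 -> report-p2.png."""
    stem, _ = os.path.splitext(os.path.basename(pdf_name))
    return f"{stem}-p{page_number}.png"


def convert_pdf_locally(pdf_path, out_dir, dpi=300, threads=3):
    """Render every page of pdf_path to a PNG in out_dir and return the paths."""
    # Imported lazily so the naming helper works without pdf2image installed.
    from pdf2image import convert_from_path

    # thread_count is pdf2image's concurrency setting.
    pages = convert_from_path(pdf_path, dpi=dpi, thread_count=threads)
    saved = []
    for number, page in enumerate(pages):
        target = os.path.join(out_dir, page_image_name(pdf_path, number))
        page.save(target, "PNG")  # each page is a PIL image
        saved.append(target)
    return saved
```

The resulting PNG files can then be uploaded to the bucket with the usual google-cloud-storage calls, so the Cloud Function never has to open a PDF.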