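【Posted】: 2021-03-21 11:46:42
【Question】:
I'm trying to convert a PDF to JPEG on Google Cloud Functions, using the Python module pdf2image. However, I can't figure out how to resolve the errors "No such file or directory: 'pdfinfo'" and "Unable to get page count. Is poppler installed and in PATH?" on Cloud Functions.
The error is very similar to the one in this question. pdf2image is a wrapper around poppler's "pdftoppm" and "pdftocairo". But how do I install the poppler package on Google Cloud Functions and add it to PATH? I couldn't find any relevant references. Is it even possible? If not, what can I do instead?
There is also this question, but it didn't help.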
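Since pdf2image only wraps the poppler command-line tools, one way to confirm the diagnosis is to check whether those executables are visible on PATH inside the runtime. The following is a minimal sketch using only the standard library. The helper names find_poppler_tools and missing_tools are hypothetical, and the tool list assumes the three binaries named in this question and its traceback.

```python
import shutil

# Binaries mentioned in this question: pdf2image calls pdfinfo to count pages
# and pdftoppm / pdftocairo to render them.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm", "pdftocairo")


def find_poppler_tools(tools=POPPLER_TOOLS, path=None):
    """Map each tool name to its resolved location, or None if it is not found.

    `path` is an optional search path string; None means the current PATH.
    """
    return {tool: shutil.which(tool, path=path) for tool in tools}


def missing_tools(status):
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, location in status.items() if location is None)
```

Logging missing_tools(find_poppler_tools()) at the start of process_image would show whether the runtime image ships poppler at all, before pdf2image is ever called.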
The code is shown below. The entry point is process_image.
import requests
from pdf2image import convert_from_path

def process_image(event, context):
    # Download sample pdf file
    url = 'https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf'
    r = requests.get(url, allow_redirects=True)
    open('/tmp/sample.pdf', 'wb').write(r.content)

    # Error occurs on this line
    pages = convert_from_path('/tmp/sample.pdf')

    # Save pages to /tmp
    for idx, page in enumerate(pages):
        output_file_path = f"/tmp/{idx}.jpg"
        page.save(output_file_path, 'JPEG')
        # To be saved to cloud storage
requirements.txt:
requests==2.25.1
pdf2image==1.14.0
This is the error I get:
Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 441, in pdfinfo_from_path
    proc = Popen(command, env=env, stdout=PIPE, stderr=PIPE)
  File "/opt/python3.8/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/python3.8/lib/python3.8/subprocess.py", line 1706, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'pdfinfo'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/functions_framework/__init__.py", line 149, in view_func
    function(data, context)
  File "/workspace/main.py", line 11, in process_image
    pages = convert_from_path('/tmp/sample.pdf')
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 97, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pdf2image/pdf2image.py", line 467, in pdfinfo_from_path
    raise PDFInfoNotInstalledError(
pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?
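The two chained tracebacks point to one root cause: subprocess.Popen could not find an executable named pdfinfo, and pdf2image then reported that as PDFInfoNotInstalledError. The sketch below reproduces the underlying failure with the standard library alone, so it can be tried without pdf2image. The function name missing_executable_error is hypothetical, and the binary name in the usage note is made up so that the lookup is guaranteed to fail.

```python
import subprocess


def missing_executable_error(command):
    """Return the FileNotFoundError raised when command[0] cannot be found.

    Returns None when the executable exists and the process could be started.
    """
    try:
        proc = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except FileNotFoundError as exc:
        # Same exception type as the first traceback above.
        return exc
    # The executable was found: let it finish and discard its output.
    proc.communicate()
    return None
```

Calling missing_executable_error(["hypothetical-missing-pdfinfo", "-v"]) returns a FileNotFoundError, which suggests the problem lies in the runtime image rather than in the Python code of process_image.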
Thanks in advance for your help.
【Discussion】:
Tags: python image pdf google-cloud-functions poppler