[Posted]: 2011-12-31 15:52:22
[Question]:
I have the following implementation for uploading a PDF file to Google Docs (taken from the gdata API samples):
def UploadResourceSample():
  """Upload a document, and convert to Google Docs."""
  client = CreateClient()
  doc = gdata.docs.data.Resource(type='document', title='My Sample Doc')
  # This is a convenient MS Word doc that we know exists
  path = _GetDataFilePath('test.0.doc')
  print 'Selected file at: %s' % path
  # Create a MediaSource, pointing to the file
  media = gdata.data.MediaSource()
  media.SetFileHandle(path, 'application/msword')
  # Pass the MediaSource when creating the new Resource
  doc = client.CreateResource(doc, media=media)
  print 'Created, and uploaded:', doc.title.text, doc.resource_id.text
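Now I would like to run OCR text recognition on the uploaded file, but I am not sure how to enable OCR in the gdata docs Python API. So my question is: is there a way to enable OCR on a PDF file using the gdata Python v3.0 API?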
[Comments]:
Tags: python pdf ocr gdata gdata-api