angela-dba

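Requirement: read the text in an image. The image may be supplied either as a URL or as an image file.

Approach: call the Tencent Cloud OCR API from Python. See the official Tencent documentation: https://cloud.tencent.com/document/product/866/17596

Steps: calling the API requires a request header, the header must carry an authentication signature, and generating the signature requires an API key.

Authentication signature: https://cloud.tencent.com/document/product/866/17734

Getting the API key: log in to Tencent Cloud at https://console.cloud.tencent.com/cam/capi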
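
As a minimal sketch of what the signature step amounts to in the script that follows: the signature is the Base64 encoding of the HMAC-SHA1 digest of the plaintext string (keyed with the SecretKey), followed by the plaintext string itself. The helper names make_signature and split_signature and the dummy credentials are illustrative assumptions, not part of the Tencent API; the field layout (a, b, k, e, t, r, f) is taken from the script. Decoding the result and splitting off the first 20 bytes is a quick way to check that the plaintext was assembled as intended.

```python
import base64
import hashlib
import hmac
from typing import Tuple

SHA1_DIGEST_SIZE = 20  # an HMAC-SHA1 digest is always 20 bytes


def make_signature(secret_key: str, plain: str) -> str:
    """Return base64(HMAC-SHA1(secret_key, plain) + plain) as text."""
    digest = hmac.new(secret_key.encode("utf-8"), plain.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest + plain.encode("utf-8")).decode("ascii")


def split_signature(signature: str) -> Tuple[bytes, str]:
    """Decode a signature back into (digest, plaintext) for inspection."""
    raw = base64.b64decode(signature)
    return raw[:SHA1_DIGEST_SIZE], raw[SHA1_DIGEST_SIZE:].decode("utf-8")


# Dummy values for illustration only (hypothetical appid, SecretID and key)
plain = "a=1250000000&b=&k=AKIDexample&e=1702592000&t=1700000000&r=12345&f="
signature = make_signature("example-secret-key", plain)

digest, recovered = split_signature(signature)
expected = hmac.new(b"example-secret-key", plain.encode("utf-8"), hashlib.sha1).digest()
print(recovered == plain)                     # plaintext survives the round trip
print(hmac.compare_digest(digest, expected))  # digest matches a fresh HMAC
```

Both checks should print True; if the first one does not, the plaintext string was altered between signing and encoding.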

The code is as follows:

import time
import base64
import hmac
import hashlib
import requests
import json
import random

# API credentials, obtained by logging in to Tencent Cloud
appid = '12******25'
SecretID = 'AKIDGS************************NPpyNp'
SecretKey = 'Xt*************************iwybH'

# Initialize the remaining signature parameters
currentTime = int(time.time())  # current Unix timestamp
expiredTime = currentTime + 2592000  # signature expiry time: now + 30 days
bucket = ''  # organizational unit for image resources; legacy field, may be left empty
rand = random.randint(1, 9999999999)  # random number, e.g. 7648353324
fileid = ''  # unique identifier of the stored resource; required for one-time signatures, optional for multi-use signatures (if set, it is checked against the file path being operated on)

# Build the string to sign
src_str = 'a=' + appid + '&b=' + bucket + '&k=' + SecretID + '&e=' + str(expiredTime) + '&t=' + str(currentTime) + '&r=' + str(rand) + '&f=' + fileid

# Function that generates the signature
def hash_hmac(ac_key, original):
    # HMAC-SHA1 digest of the string to sign, keyed with the secret key
    SignTmp = hmac.new(bytes(ac_key, 'utf-8'), bytes(original, 'utf-8'), hashlib.sha1).digest()
    # Append the plaintext string to the digest, then Base64-encode the result
    Sign = base64.b64encode(SignTmp + original.encode())
    return Sign

# Generate the signature
authorization = hash_hmac(SecretKey, src_str)

# Request URL
url = 'https://recognition.image.myqcloud.com/ocr/handwriting'

# Request headers
headers = {
    'Authorization': authorization,
    'Host': 'recognition.image.myqcloud.com'
}

# Example request using an image URL
url_img = 'https://images.jiandaoyun.com/Fm0I5jLH9zGFpYn5SLoEP-EhWOmC'  # +'.png'
data_img = {'appid': appid, 'url': url_img}

# Example request using a local image file (uncomment to use it instead of the URL form)
# url_img = 'kuaiji-5-243.jpg', open('D:\\python\\kuaiji\\kuaiji-5-243.jpg', 'rb'), 'image/jpeg'
# data_img = {'appid': appid, 'image': url_img}

r = requests.post(url, files=data_img, headers=headers)
data = json.loads(r.content.decode('utf-8'))

try:
    # Each recognized item carries its coordinates and its text
    for item in data['data']['items']:
        x = item['itemcoord']['x']
        y = item['itemcoord']['y']
        content = item['itemstring']
        print(content)
except (KeyError, TypeError):
    # The response did not have the expected structure (for example, an error response)
    print('wrong')
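
The request and parsing steps at the end of the script can be sketched as two small helpers. This is only an illustration under assumptions: the names build_payload and extract_lines are hypothetical, the sample response is made-up data that merely mirrors the data, items, itemstring structure the script reads, and no network call is made.

```python
import io
from typing import Any, BinaryIO, Dict, List, Optional


def build_payload(appid: str,
                  url: Optional[str] = None,
                  image: Optional[BinaryIO] = None,
                  filename: str = "image.jpg",
                  mime: str = "image/jpeg") -> Dict[str, Any]:
    """Build the dict passed as files= to requests.post, for either input form."""
    if (url is None) == (image is None):
        raise ValueError("provide exactly one of url or image")
    if url is not None:
        return {"appid": appid, "url": url}
    # requests accepts a (filename, file object, content type) tuple for uploads
    return {"appid": appid, "image": (filename, image, mime)}


def extract_lines(data: Any) -> List[str]:
    """Return the recognized text lines, or [] if the structure is unexpected."""
    try:
        return [item["itemstring"] for item in data["data"]["items"]]
    except (KeyError, TypeError):
        return []


# Made-up sample response, shaped like the one the script parses
sample = {"data": {"items": [
    {"itemcoord": {"x": 10, "y": 20}, "itemstring": "first line"},
    {"itemcoord": {"x": 10, "y": 48}, "itemstring": "second line"},
]}}

print(extract_lines(sample))
print(extract_lines({"code": 9, "message": "error"}))
print(build_payload("123", url="https://example.com/a.png"))
print(build_payload("123", image=io.BytesIO(b"fake bytes"), filename="a.jpg")["image"][0])
```

Returning an empty list instead of printing 'wrong' lets the caller decide how to report a failed recognition.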

Result of running the code

 
