niceshoot

Summary: A friend in operations needed the follower-attention rankings for over ten thousand WeChat public accounts. I implemented this in Python, cutting the workload from at least three days down to about two hours.

Introduction: This post uses Python with the requests library to call the site's API, extracts the required fields via dictionary access, and also reads from and writes to Excel. OK, enough talk, straight to the code. PS: This scraper was written to get a friend out of a jam; it isn't organized into functions or otherwise cleaned up, so please don't take it as a model~~~

Program outline:

1. Import the requests module and call the API

2. Read the values of specific rows and columns from an Excel file

3. Extract the needed fields and write them back to Excel
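The heart of steps 1 and 3 is picking fields out of the API's JSON response with plain dictionary indexing. A minimal sketch, using a hard-coded payload whose shape is assumed from the scraper code below (the real data would come from a requests.post call to the aiweibang endpoint):

```python
import json

# Assumed response shape, inferred from the scraper below; in the real
# program this text comes back from the HTTP API, not a literal string.
sample = '{"data": {"data": [{"Id": 42, "Name": "demo"}]}}'

payload = json.loads(sample)                # parse the JSON text into a dict
user_id = payload['data']['data'][0]['Id']  # drill down to the field we need
print(user_id)  # → 42
```

The same nested-indexing pattern (`dict['data']['data'][0][...]`) is what the full script uses to pull the account id and its statistics.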

Code:

# Module imports

import requests
import json
import xlrd, xlwt
from xlutils.copy import copy

# Main body

print("Start".center(40, "*"))

wx = xlrd.open_workbook("123.xls")
table = wx.sheet_by_name('test')
nrows = table.nrows  # number of rows
ncols = table.ncols  # number of columns
for i in range(1, 3):
    cell_C2 = table.col(2)[i].value  # account keyword from column C
    data1 = {"PageIndex": 1, "PageSize": 10, "Kw": cell_C2}
    try:
        # Search for the account and take the first hit's id
        r1 = requests.post('http://top.aiweibang.com/user/getsearch', data=data1)
        dic1 = json.loads(r1.content)
        dic3 = dic1['data']['data']
        userId = dic3[0]['Id']
        # Fetch the read/like statistics for that account
        data2 = {"id": userId}
        r = requests.post("http://top.aiweibang.com/statistics/readnum", data=data2)
        dic4 = json.loads(r.content)
        s = dic4['data']
        l = [cell_C2, s[0]['ArticleCount'], s[0]['ReadNumAvg'], s[0]['ReadNumMax'],
             s[0]['LikeNumAvg'], s[0]['LikeNumMax']]
        # Header: address/ID, article count, avg reads, max reads, avg likes, max likes
        n = ['地址/ID', '篇数', '阅读平均', '阅读最高', '点赞平均', '点赞最高']
    except Exception:
        continue  # if either request fails, l/n are undefined, so skip this account

    rb = xlrd.open_workbook('test02.xls')
    wb = copy(rb)  # xlwt cannot edit a file in place, so copy the workbook first
    ws = wb.get_sheet(0)
    for k in range(len(n)):
        ws.write(0, k, n[k])  # header row (rewritten on every pass)
    for j in range(len(l)):
        try:
            ws.write(i, j, str(l[j]))
        except Exception:
            pass
    wb.save('test02.xls')

print("End".center(40, "*"))
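The open-copy-save dance above exists because xlwt cannot modify an existing .xls file in place. For readers without the xlrd/xlwt/xlutils stack, a rough standard-library alternative is to write the same table as CSV, which Excel opens directly; the account name and the statistics here are made-up sample values, not real scraped data:

```python
import csv

# Same header row as the scraper; the data row is an illustrative sample.
header = ['地址/ID', '篇数', '阅读平均', '阅读最高', '点赞平均', '点赞最高']
row = ['demo_account', 100, 2500, 9800, 35, 120]

# utf-8-sig adds a BOM so Excel detects the encoding of the Chinese headers
with open('result.csv', 'w', newline='', encoding='utf-8-sig') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(row)
```

Unlike the xlwt route, CSV appends naturally, so there is no need to reopen and copy the workbook on every loop iteration.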

Program output:

The data written to Excel after scraping looks like the following, for reference only~

