wangyuhangboke

1. Good work starts with good tools: download Fiddler

https://www.telerik.com/fiddler

2. Run the exe installer (just keep clicking Next and accept the defaults)

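3. After installation, configure Fiddler (otherwise links fail to load while Fiddler is running, because its certificate is not yet installed)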

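(1) Open Fiddler -----> Tools -----> Options -----> HTTPS -----> Actions -----> Export Root Certificate to Desktop. The Fiddler certificate (FiddlerRoot.cer) then appears on the desktop.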

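(2) (Firefox) Open the browser, go to Options, search for "certificates", click View Certificates, and import the certificate (FiddlerRoot.cer), ticking every trust checkbox.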

(Google Chrome) Open Settings, go to Advanced, click Manage certificates, and import the certificate in the same way.

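(3) App: first make sure the phone and the computer are on the same network. In the phone's network settings, set the proxy to Manual, enter the computer's IP address as the proxy host and 8888 as the port, then open the system security settings and install the certificate mentioned above.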
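Step (3) needs the computer's IP address on the shared network. The following is a minimal sketch for finding it from Python, not part of the original setup. The function name guess_lan_ip and the probe address 8.8.8.8 are assumptions chosen for illustration; the port 8888 is the one given above.

```python
import socket


def guess_lan_ip(probe_host="8.8.8.8", probe_port=80):
    """Best-effort guess of this machine's LAN IPv4 address.

    probe_host is only used to ask the OS which local interface it would
    route through; it is an arbitrary public address (an assumption).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packet; it just selects a route.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print("Phone proxy host:", guess_lan_ip())
    print("Phone proxy port:", 8888)
```

On a machine with several network adapters the result may not be the adapter the phone can reach, so it is worth comparing it with the address shown by ipconfig.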

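That's it: you can now capture packets with Fiddler. Here is a short piece of code (it downloads the images on the first page of the wawayaya reading app).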

 

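Most important: before running the code below, remember to trust the certificate: Fiddler -----> Tools -----> Options -----> HTTPS -----> Actions -----> Trust Root Certificate (otherwise the images will fail to load).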

# -*- coding: UTF-8 -*-
from urllib.request import urlretrieve
import requests
import os


def book_imgs_download(books_url, header):
    # Request the category list (the URL captured with Fiddler) and decode the JSON body.
    req = requests.get(url=books_url, headers=header).json()
    a = req["retinfo"]
    book_num = len(a['libList'])
    print(book_num)
    print('There are %d categories in total' % book_num)
    book_images_path = 'books_images'
    # Create the output directory once, before downloading anything.
    os.makedirs(book_images_path, exist_ok=True)
    for each_book in a['libList']:
        # Each entry carries its cover image URL and its Chinese name.
        book_photo_url = each_book['imageEn']
        book_name = each_book['cname'] + '.jpg'
        filename = os.path.join(book_images_path, book_name)
        urlretrieve(url=book_photo_url, filename=filename)


if __name__ == '__main__':
    # Headers copied from the request captured in Fiddler.
    headers = {'Accept-Charset': 'UTF-8',
               'Accept-Encoding': 'gzip',
               'User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 6.0.1; Redmi 4 MIUI/V8.5.3.0.MBECNED)',
               'X-Requested-With': 'XMLHttpRequest',
               'Content-type': 'application/x-www-form-urlencoded',
               'Connection': 'Keep-Alive',
               'Host': 'duba.wawayaya.com'}
    books_url = "http://api3-joyreader.wawayaya.com/api/server/book/getLibHomeList?deviceType=phone&client_lang=zh&appVer=3.8.6&userId=610378&uuid=android_fe087c86fc37533980072e3aa76c244e&platform=android&osVer=23&mac=02:00:00:00:00:00&token=9bdebc55f4f3465def7f3dabd0a0727c&sig=009ff5f1&countryCode=CN&appId=2209410&app_ver=3.8.6&imei=863968035553662&timestamp=1513081106655"
    # Debugging aid: uncomment to inspect a single entry of the response.
    # req = requests.get(url=books_url, headers=headers).json()
    # a = req["retinfo"]
    # b = a["libList"]
    # c = b[1]
    # print(c)
    book_imgs_download(books_url, headers)

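  Note: the link expires quickly, after roughly a few minutes; when that happens, capture a fresh one with Fiddler. Also remember to replace the values in headers with your own ('User-Agent' and 'Host').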
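Because the captured link goes stale within minutes, it can help to check how old it is before running the downloader. The sketch below reads the timestamp parameter from the URL. It assumes the 13-digit value is a Unix time in milliseconds; whether the server actually uses this field to expire the link is not confirmed. The names capture_time and age_in_minutes are hypothetical helpers, and the URL in the example is abbreviated to two of the captured parameters.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def capture_time(url):
    """Return the UTC time encoded in the URL's `timestamp` parameter.

    Assumption: the value is a Unix timestamp in milliseconds.
    """
    query = parse_qs(urlsplit(url).query)
    millis = int(query["timestamp"][0])
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def age_in_minutes(url, now=None):
    """Minutes elapsed between the URL's timestamp and `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return (now - capture_time(url)).total_seconds() / 60


if __name__ == "__main__":
    # Abbreviated form of the captured link; the other parameters are omitted here.
    url = ("http://api3-joyreader.wawayaya.com/api/server/book/getLibHomeList"
           "?token=9bdebc55f4f3465def7f3dabd0a0727c&timestamp=1513081106655")
    print("Captured at:", capture_time(url).isoformat())
    print("Age in minutes: %.0f" % age_in_minutes(url))
```

If the age is more than a few minutes, the request will probably be rejected and a new capture is needed.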

 
