No Connection Adapters Using ThreadPool: Answers

【Question title】: No Connection Adapters Using ThreadPool
【Posted】: 2019-09-30 19:56:34
【Question】:

I am running a ThreadPool over a list of URLs that currently lives in a CSV file. When the code runs, it returns the error below:

InvalidSchema: No connection adapters were found for '['http://axiomglobal.sharepoint.com/sites/MarkandElena-Strategic']'

It looks like Python is treating the brackets as part of the URL, even though each URL string in the list does not contain them:

[['http://axiomglobal.sharepoint.com/sites/123456'], ['http://axiomglobal.sharepoint.com/sites/SharePointo365globaltestsite'], ['http:3456789']

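Is there something in the code below that makes it look up each site with the brackets included?

I tried passing a single URL to the ThreadPool part of the code instead of the whole list, and it produced the correct result.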

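# Imports assumed; the original post omitted them.
import requests
from multiprocessing.pool import ThreadPool

def get_site_status(site):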
    try:
        response = requests.get(site)
    except requests.exceptions.ConnectionError:
        print('Connection Refused')
        return 1
    if response.status_code == 401:
        print('web site exists, permission needed')
    elif response.status_code == 404:
        print('web site does not exist')
    elif response.status_code == 400:
        print('web site does not exist')
    elif response.status_code == 403:
        print('web site forbidden')
    elif response.status_code == 423:
        print('web site locked')
    elif response.status_code == 200:
        print('web site exists and is available')
    else:
        print('other')
    return 0
pool = ThreadPool(processes=1)

results = pool.map_async(get_site_status, Row_list)

print('Results: {}'.format(results.get()))

I expect the ThreadPool part of the code to populate a list with one result per row (roughly 1,500 rows of URLs).

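【Comments】:

Tags: python url

【Answer 1】:

gevent handles the requests faster. You also were not catching errors correctly or supplying an SSL certificate bundle for when a site upgrades to HTTPS.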

from gevent import monkey, spawn, joinall
monkey.patch_all()  # patch the standard library before requests is imported
import requests, certifi
from time import time

t0 = time()
# A flat list of URL strings, not a list of one-element lists.
Row_list = ['http://axiomglobal.sharepoint.com/sites/123456', 'http://axiomglobal.sharepoint.com/sites/SharePointo365globaltestsite', 'http:3456789', 'https://www.yahoo.com']

def get_site_status(site):
    try:
        # Verify HTTPS certificates against certifi's CA bundle.
        response = requests.get(site, verify=certifi.where())
        status = response.status_code
    except requests.exceptions.RequestException:
        # Base class for requests errors: connection failures, invalid URLs, etc.
        print('Connection Refused')
        return False
    if status == 401:
        print('web site exists, permission needed')
    elif status == 404:
        print('web site does not exist')
    elif status == 400:
        print('web site does not exist')
    elif status == 403:
        print('web site forbidden')
    elif status == 423:
        print('web site locked')
    elif status == 200:
        print('web site exists and is available')
    else:
        print('other')
    return True

# Spawn one greenlet per URL and wait for all of them to finish.
threads = []
for row in Row_list:
    threads.append(spawn(get_site_status, row))
joinall(threads)

for thread in threads:
    print(thread.value)

print(time() - t0)
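
The status-code branching above could also be expressed as a lookup table, which makes it easy to return a message per URL instead of only printing it. This is a minimal sketch, not part of the original answer; `STATUS_MESSAGES` and `describe_status` are hypothetical names introduced for illustration.

```python
# Hypothetical lookup table mirroring the if/elif chain in the answer above.
STATUS_MESSAGES = {
    200: 'web site exists and is available',
    400: 'web site does not exist',
    401: 'web site exists, permission needed',
    403: 'web site forbidden',
    404: 'web site does not exist',
    423: 'web site locked',
}


def describe_status(status_code):
    """Return the message for a status code, or 'other' if it is not listed."""
    return STATUS_MESSAGES.get(status_code, 'other')
```

Returning `describe_status(response.status_code)` from the worker function would let the pool or greenlet results hold one message per URL, which is closer to what the question asks for.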

【Comments】:

【Answer 2】:

I figured out the problem: earlier in my code I had built a list of lists, so the loop was looking up "[url]" instead of "url".
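
A minimal sketch of that cause and one possible fix, assuming the rows were kept as one-element lists (for example, as produced when each CSV row is appended whole). `check` is a hypothetical stand-in for `get_site_status` so the example runs without network access, and `row_list`, `as_text`, and `urls` are illustrative names.

```python
from multiprocessing.pool import ThreadPool

# Shape of the data when each CSV row is kept as a list (as in the question).
row_list = [
    ['http://axiomglobal.sharepoint.com/sites/123456'],
    ['http://axiomglobal.sharepoint.com/sites/SharePointo365globaltestsite'],
]

# Converting a one-element list to text keeps the brackets and quotes,
# which matches the text shown in the InvalidSchema message.
as_text = str(row_list[0])

# Fix: take the first column of every row so each item is a plain string.
urls = [row[0] for row in row_list]


def check(site):
    """Hypothetical stand-in for get_site_status; makes no network request."""
    return site.startswith('http')


# map() returns one result per URL, in the same order as the input list.
with ThreadPool(processes=4) as pool:
    results = pool.map(check, urls)
```

With the flattened `urls` list, each worker call receives a plain string, so the real `get_site_status` would no longer be handed a bracketed list.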

【Comments】: