【Question title】: No Connection Adapters Using ThreadPool
【Posted】: 2019-09-30 19:56:34
【Question】:

I am running a ThreadPool over a list of URLs that currently lives in a CSV file. When the code runs, it returns the error below:

InvalidSchema: No connection adapters were found for '['http://axiomglobal.sharepoint.com/sites/MarkandElena-Strategic']'

It looks like Python is treating the brackets as part of the URL, even though each URL inside the list does not contain them:

[['http://axiomglobal.sharepoint.com/sites/123456'], ['http://axiomglobal.sharepoint.com/sites/SharePointo365globaltestsite'], ['http:3456789']

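Is something in the code below causing it to request the sites with the list brackets included?

I tried passing a single URL to the ThreadPool part of the code instead of the whole list, and that produces the correct result.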

# ThreadPool here is assumed to be multiprocessing.pool.ThreadPool.
from multiprocessing.pool import ThreadPool

import requests

# Row_list is built from the CSV earlier in the script.

def get_site_status(site):
    try:
        response = requests.get(site)
    except requests.exceptions.ConnectionError:
        print('Connection Refused')
        return 1
    if response.status_code == 401:
        print('web site exists, permission needed')
    elif response.status_code == 404:
        print('web site does not exist')
    elif response.status_code == 400:
        print('web site does not exist')
    elif response.status_code == 403:
        print('web site forbidden')
    elif response.status_code == 423:
        print('web site locked')
    elif response.status_code == 200:
        print('web site exists and is available')
    else:
        print('other')
    return 0

pool = ThreadPool(processes=1)

# Each element of Row_list is handed to get_site_status as-is.
results = pool.map_async(get_site_status, Row_list)

print('Results: {}'.format(results.get()))

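I expect the ThreadPool part of the code to fill a list with one result per row (about 1500 rows of URLs).

【Discussion】:

    Tags: python url


    【Solution 1】:

    gevent can handle many requests faster. Your code also does not catch errors properly, and it does not pass an SSL certificate bundle for when the request is upgraded to HTTPS.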

    from gevent import monkey, spawn, joinall
    monkey.patch_all()  # patch the standard library before requests is imported
    import requests, certifi
    from time import time

    t0 = time()
    Row_list = ['http://axiomglobal.sharepoint.com/sites/123456', 'http://axiomglobal.sharepoint.com/sites/SharePointo365globaltestsite', 'http:3456789', 'https://www.yahoo.com']

    def get_site_status(site):
        try:
            # Verify HTTPS connections against certifi's CA bundle.
            response = requests.get(site, verify=certifi.where())
            status = response.status_code
        except requests.exceptions.RequestException:
            # Base class for connection errors, invalid URLs, SSL failures, etc.
            print('Request failed')
            return False
        if status == 401:
            print('web site exists, permission needed')
        elif status == 404:
            print('web site does not exist')
        elif status == 400:
            print('web site does not exist')
        elif status == 403:
            print('web site forbidden')
        elif status == 423:
            print('web site locked')
        elif status == 200:
            print('web site exists and is available')
        else:
            print('other')
        return True

    # Spawn one greenlet per URL and wait for all of them to finish.
    threads = []
    for row in Row_list:
        threads.append(spawn(get_site_status, row))
    joinall(threads)

    for thread in threads:
        print(thread.value)

    print(time() - t0)
    

    【Discussion】:

      【Solution 2】:

      I figured out the problem: earlier in my code I had built a list of lists, so the loop was requesting "[url]" instead of "url".
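
A minimal sketch of that fix, assuming every CSV row is a one-element list as shown in the question. `flatten_rows` and `describe` are hypothetical helpers introduced only for illustration; `describe` stands in for `get_site_status` and makes no network call, it just reports what the worker actually receives.

```python
from multiprocessing.pool import ThreadPool


def flatten_rows(rows):
    """Turn [['url1'], ['url2']] into ['url1', 'url2'], skipping empty rows."""
    return [row[0] for row in rows if row]


def describe(site):
    # Hypothetical stand-in for get_site_status: shows the type and
    # string form of the argument passed to each worker.
    return '{}: {}'.format(type(site).__name__, site)


# Two rows taken from the question, in the nested shape csv.reader produces.
rows = [['http://axiomglobal.sharepoint.com/sites/123456'], ['http:3456789']]

with ThreadPool(processes=2) as pool:
    nested = pool.map(describe, rows)              # workers receive lists
    flat = pool.map(describe, flatten_rows(rows))  # workers receive strings

print(nested)
print(flat)
```

With the nested rows, each worker is handed a list, and its string form includes the brackets and quotes, which matches the URL shown in the InvalidSchema message. After flattening, each worker gets a plain string, so the same list can be passed to `pool.map_async(get_site_status, ...)`.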

      【Discussion】:
