【Question title】: Why do threads occasionally quit in my Python program?
【Posted】: 2015-07-14 00:05:34
【Question description】:

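The program uses the function thread_monitor to create a monitor thread. The monitor thread then creates several hundred threads that run the function valid_proxy, plus one thread that counts the number of active threads every second.

After running for a few hours, however, I found that the number of active threads was decreasing, for example from 500 to 472. The longer the program runs, the further the count drops.

I don't know what is wrong in the valid_proxy function that makes threads exit abnormally. Could you help me point out the potential bug?

All the code is at: https://github.com/iaston/proxy_checker. Below are some code snippets.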

import threading
import time
import requests
import requests.exceptions
import requests.adapters
import datetime
import queue
import re

def update_gui():
    while True:
        time.sleep(1)
        num_of_threads = 0
        for each_thread in threading.enumerate():
            if each_thread.name.find("Verify_Proxy_") == 0:
                num_of_threads += 1                 
        print("\nNumber of running threads is %d.\n" % num_of_threads)


def get_a_proxy():
    global g_b_stop, g_all_statu, g_proxy_queue, lock_get_proxy
    if g_b_stop:  
        return ""
    proxy_now = ""
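    # "with" releases the lock even if an unexpected exception escapes.
    with lock_get_proxy:
        try:
            while proxy_now == "":
                # Raw string avoids invalid-escape warnings; the pattern drops
                # everything from the first char that is not a digit, ":" or ".".
                proxy_now = re.sub(r"[^\d:.].+", "", str(g_proxy_queue.get(block=False)))
        except Exception:
            # queue.Empty also lands here: no proxies left, so "" is returned.
            proxy_now = ""
    return proxy_now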


def valid_proxy(check_site_info, success_try):
    global lock_valided_list, g_tree_proxies, g_all_statu, proxies_valided_list
    i_error_limit = success_try[1] - success_try[0]
    if i_error_limit < 0:
        i_error_limit = 0
    i_error_now = 0
    proxy_now = get_a_proxy()
    while proxy_now != "":
        test_num = 0
        for each_check_site in check_site_info:  

            i_error_now = 0
            proxy_speed_recorder = {each_check_site: []}
            for iCounter in range(0, success_try[1]):  
                test_num += 1
                if (datetime.datetime.now() - g_gui_last_update_time).seconds > g_gui_update_interval:
                    redraw_gui_event_finished.clear()
                redraw_gui_event_finished.wait()  
                try:
                    read_timeout = int(check_site_info[each_check_site]['timeout'])
                    connect_timeout = read_timeout / 2
                    if connect_timeout < 2:
                        connect_timeout = 2
                    start_test_time = time.time()
                    req_headers = {
                        "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
                        "Referer": check_site_info[each_check_site]['url']}
                    req_result = requests.get(check_site_info[each_check_site]['url'],
                                              timeout=(connect_timeout, read_timeout),
                                              proxies={'http': proxy_now, 'https': proxy_now}, headers=req_headers)
                    used_time_seconds = (time.time() - start_test_time) * 1000

                    html_result = req_result.text
                except Exception as e:
                    used_time_seconds = -1
                    html_result = ""
                    print("\nError in proxy %s:\n%r" % (proxy_now, e))
                if html_result.find(check_site_info[each_check_site]['keyword']) < 0:
                    i_error_now += 1
                    if i_error_now > i_error_limit:
                        print(('\nInvalided proxy: ' + proxy_now))
                        break  
                else:
                    proxy_speed_recorder[each_check_site].append((iCounter, used_time_seconds))

                    if iCounter + 1 - i_error_now >= success_try[0]:
                        break
            if i_error_now > i_error_limit:
                break  
        print("Proxy: " + proxy_now + " test number:" + str(test_num))
        if i_error_now <= i_error_limit:

            all_used_time = 0
            all_test_time = 0
            for each_check_site in proxy_speed_recorder:
                for each_test in proxy_speed_recorder[each_check_site]:
                    all_used_time += each_test[1]
                    all_test_time += 1
            avarge_time = round(all_used_time / all_test_time) if all_test_time != 0 else 0
            lock_valided_list.acquire()
            proxies_valided_list.append((proxy_now, avarge_time))
            try:
                g_all_statu["text_proxy_valid_append"] += proxy_now + "&" + str(avarge_time) + "\n"  
                g_tree_proxies.add_data_treeview([(proxy_now, avarge_time)], skip_datebase=True)  
            except Exception as e2:
                print("""g_all_statu["text_proxy_valid_append"] is wrong:\n""" + repr(e2))
            lock_valided_list.release()
        time.sleep(1)
        proxy_now = get_a_proxy()
    print("Finish, thread exit")


def thread_monitor(check_site_info, success_try, th_num):
    for each_proxy in proxies_unvalided_list:
        g_proxy_queue.put(each_proxy)

    th_gui = threading.Thread(target=update_gui, daemon=True)
    th_gui.start()

    ths_verify = []
    for iCounter in range(0, th_num):
        # name= and daemon= replace the deprecated setName()/setDaemon() calls.
        t = threading.Thread(target=valid_proxy, args=(check_site_info, success_try),
                             name="Verify_Proxy_" + str(iCounter), daemon=True)
        t.start()
        ths_verify.append(t)

    for iCounter in range(0, th_num):
        ths_verify[iCounter].join()


thread_monitor(check_site_info, success_try, 500)

【Discussion】:

    Tags: python multithreading python-3.x


    【Solution 1】:

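    I can't troubleshoot your function valid_proxy, because it mostly refers to code you haven't provided and I'm not clairvoyant. What I can tell you is that an uncaught exception raised anywhere in the function will cause the thread to exit. If you do this: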

    def forever_valid_proxy(*x, **y):
        try:
            valid_proxy(*x, **y)
        except Exception:
            pass
    

    and replace the line:

    t = threading.Thread(target=valid_proxy, args=(check_site_info, success_try))
    

    with:

    t = threading.Thread(target=forever_valid_proxy, args=(check_site_info, success_try))
    

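    the number of threads will never decrease. I don't know whether this is what you really want or need from your program, but you will never see the thread count go down.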
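
    One caveat worth illustrating: a wrapper that only catches the exception still returns afterwards, so the thread ends anyway. The sketch below is one possible variant, not the answerer's code. It logs the traceback and calls the target again, up to a limit. The names keep_alive, max_restarts and flaky_worker are hypothetical, and flaky_worker merely stands in for valid_proxy.

```python
import logging
import threading

logging.basicConfig(level=logging.ERROR)


def keep_alive(target, max_restarts=3):
    """Return a wrapper that logs exceptions from target and calls it again."""
    def wrapper(*args, **kwargs):
        restarts = 0
        while True:
            try:
                target(*args, **kwargs)
                return "finished"  # normal return: the work is done
            except Exception:
                # logging.exception records the full traceback for diagnosis.
                logging.exception("%s crashed", threading.current_thread().name)
                restarts += 1
                if restarts > max_restarts:
                    return "gave up"  # stop retrying after too many failures
    return wrapper


calls = []


def flaky_worker():
    # Hypothetical stand-in for valid_proxy: fails twice, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("simulated failure")


t = threading.Thread(target=keep_alive(flaky_worker), name="Verify_Proxy_demo")
t.start()
t.join()
print("flaky_worker was called %d times" % len(calls))
```

    Because the traceback is logged rather than discarded, the underlying bug stays visible while the thread keeps working. The return values are ignored by threading.Thread and are there only to make the wrapper easy to check directly.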

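    【Comments】:

    • Thanks for your help. Wrapping all the code in try/except would work, but it would hide the errors in my program and might cause other problems I can't control. I've added more information about the program. Could you take some time to see where the bug is? Thanks.
    • @maston It won't, if you put some logging in the exception handler instead of pass. So something like except Exception as ex: followed by your logging code. That way you can diagnose the problem.
    • @PaulCornelius Are you sure the thread won't exit? It looks like it would just catch the exception and then exit.
    • @PaulCornelius After trying again, I found that there is a resource conflict when multiple threads print at the same time. Thanks.
    • If you want to avoid writing a custom print function, protect the library print with some kind of lock.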
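
    The lock suggestion in the last comment could be sketched roughly as follows. This is a minimal illustration rather than code from the project: safe_print, _print_lock and worker are hypothetical names, and the output goes to an in-memory buffer only so the result is easy to inspect.

```python
import io
import threading

_print_lock = threading.Lock()


def safe_print(*args, **kwargs):
    """print() wrapper that lets only one thread write at a time."""
    with _print_lock:
        print(*args, **kwargs)


def worker(index, out):
    # Each thread writes 100 lines through the shared lock.
    for _ in range(100):
        safe_print("thread %d reporting" % index, file=out)


buffer = io.StringIO()
threads = [threading.Thread(target=worker, args=(i, buffer)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = buffer.getvalue().splitlines()
print("%d intact lines written" % len(lines))
```

    Holding the lock for the whole print call keeps a message and its trailing newline together, so output from different threads cannot interleave within a line.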