python的线程一直被称为鸡肋,所以它也没有亲生的线程池,但是竟然被我发现了野生的线程池,简直不能更幸运~~~于是,我开始啃源码,实在是虐心,在啃源码的过程中,我简略的了解了python线程的相关知识,感觉还是很有趣的,于是写博客困难症患者一夜之间化身写作小能手,完成了一系列线程相关的博客,然后恍然发现,python的多线程是一个鸡肋哎。。。这里换来了同事们的白眼若干→_→。嘻嘻,但是鸡肋归鸡肋,看懂了一篇源码给我带来的收获和成就感还是不能小视,所以还是分享下~~~
别人的线程池
首先介绍别人写的线程池模块,野生threadpool,直接到pypi上去搜,或者pip安装,都可以get到。这里还是先贴上来:
1 # -*- coding: UTF-8 -*- 2 """Easy to use object-oriented thread pool framework. 3 4 A thread pool is an object that maintains a pool of worker threads to perform 5 time consuming operations in parallel. It assigns jobs to the threads 6 by putting them in a work request queue, where they are picked up by the 7 next available thread. This then performs the requested operation in the 8 background and puts the results in another queue. 9 10 The thread pool object can then collect the results from all threads from 11 this queue as soon as they become available or after all threads have 12 finished their work. It's also possible, to define callbacks to handle 13 each result as it comes in. 14 15 The basic concept and some code was taken from the book "Python in a Nutshell, 16 2nd edition" by Alex Martelli, O'Reilly 2006, ISBN 0-596-10046-9, from section 17 14.5 "Threaded Program Architecture". I wrapped the main program logic in the 18 ThreadPool class, added the WorkRequest class and the callback system and 19 tweaked the code here and there. Kudos also to Florent Aide for the exception 20 handling mechanism. 21 22 Basic usage:: 23 24 >>> pool = ThreadPool(poolsize) 25 >>> requests = makeRequests(some_callable, list_of_args, callback) 26 >>> [pool.putRequest(req) for req in requests] 27 >>> pool.wait() 28 29 See the end of the module code for a brief, annotated usage example. 30 31 Website : http://chrisarndt.de/projects/threadpool/ 32 33 """ 34 __docformat__ = "restructuredtext en" 35 36 __all__ = [ 37 'makeRequests', 38 'NoResultsPending', 39 'NoWorkersAvailable', 40 'ThreadPool', 41 'WorkRequest', 42 'WorkerThread' 43 ] 44 45 __author__ = "Christopher Arndt" 46 __version__ = '1.3.2' 47 __license__ = "MIT license" 48 49 50 # standard library modules 51 import sys 52 import threading 53 import traceback 54 55 try: 56 import Queue # Python 2 57 except ImportError: 58 import queue as Queue # Python 3 59 60 61 # exceptions 62 class NoResultsPending(Exception): 63 """All work requests have been processed.""" 64 pass 65 66 class NoWorkersAvailable(Exception): 67 """No worker threads available to process remaining requests.""" 68 pass 69 70 71 # internal module helper functions 72 def _handle_thread_exception(request, exc_info): 73 """Default exception handler callback function. 74 75 This just prints the exception info via ``traceback.print_exception``. 76 77 """ 78 traceback.print_exception(*exc_info) 79 80 81 # utility functions 82 def makeRequests(callable_, args_list, callback=None, 83 exc_callback=_handle_thread_exception): 84 """Create several work requests for same callable with different arguments. 85 86 Convenience function for creating several work requests for the same 87 callable where each invocation of the callable receives different values 88 for its arguments. 89 90 ``args_list`` contains the parameters for each invocation of callable. 91 Each item in ``args_list`` should be either a 2-item tuple of the list of 92 positional arguments and a dictionary of keyword arguments or a single, 93 non-tuple argument. 94 95 See docstring for ``WorkRequest`` for info on ``callback`` and 96 ``exc_callback``. 97 98 """ 99 requests = [] 100 for item in args_list: 101 if isinstance(item, tuple): 102 requests.append( 103 WorkRequest(callable_, item[0], item[1], callback=callback, 104 exc_callback=exc_callback) 105 ) 106 else: 107 requests.append( 108 WorkRequest(callable_, [item], None, callback=callback, 109 exc_callback=exc_callback) 110 ) 111 return requests 112 113 114 # classes 115 class WorkerThread(threading.Thread): 116 """Background thread connected to the requests/results queues. 117 118 A worker thread sits in the background and picks up work requests from 119 one queue and puts the results in another until it is dismissed. 120 121 """ 122 123 def __init__(self, requests_queue, results_queue, poll_timeout=5, **kwds): 124 """Set up thread in daemonic mode and start it immediatedly. 125 126 ``requests_queue`` and ``results_queue`` are instances of 127 ``Queue.Queue`` passed by the ``ThreadPool`` class when it creates a 128 new worker thread. 129 130 """ 131 threading.Thread.__init__(self, **kwds) 132 self.setDaemon(1) 133 self._requests_queue = requests_queue 134 self._results_queue = results_queue 135 self._poll_timeout = poll_timeout 136 self._dismissed = threading.Event() 137 self.start() 138 139 def run(self): 140 """Repeatedly process the job queue until told to exit.""" 141 while True: 142 if self._dismissed.isSet(): 143 # we are dismissed, break out of loop 144 break 145 # get next work request. If we don't get a new request from the 146 # queue after self._poll_timout seconds, we jump to the start of 147 # the while loop again, to give the thread a chance to exit. 148 try: 149 request = self._requests_queue.get(True, self._poll_timeout) 150 except Queue.Empty: 151 continue 152 else: 153 if self._dismissed.isSet(): 154 155 # we are dismissed, put back request in queue and exit loop 156 self._requests_queue.put(request) 157 break 158 try: 159 result = request.callable(*request.args, **request.kwds) 160 self._results_queue.put((request, result)) 161 except: 162 request.exception = True 163 self._results_queue.put((request, sys.exc_info())) 164 165 def dismiss(self): 166 print '**********dismiss***********' 167 """Sets a flag to tell the thread to exit when done with current job. 168 """ 169 self._dismissed.set() 170 171 172 class WorkRequest: 173 """A request to execute a callable for putting in the request queue later. 174 175 See the module function ``makeRequests`` for the common case 176 where you want to build several ``WorkRequest`` objects for the same 177 callable but with different arguments for each call. 178 179 """ 180 181 def __init__(self, callable_, args=None, kwds=None, requestID=None, 182 callback=None, exc_callback=_handle_thread_exception): 183 """Create a work request for a callable and attach callbacks. 184 185 A work request consists of the a callable to be executed by a 186 worker thread, a list of positional arguments, a dictionary 187 of keyword arguments. 188 189 A ``callback`` function can be specified, that is called when the 190 results of the request are picked up from the result queue. It must 191 accept two anonymous arguments, the ``WorkRequest`` object and the 192 results of the callable, in that order. If you want to pass additional 193 information to the callback, just stick it on the request object. 194 195 You can also give custom callback for when an exception occurs with 196 the ``exc_callback`` keyword parameter. It should also accept two 197 anonymous arguments, the ``WorkRequest`` and a tuple with the exception 198 details as returned by ``sys.exc_info()``. The default implementation 199 of this callback just prints the exception info via 200 ``traceback.print_exception``. If you want no exception handler 201 callback, just pass in ``None``. 202 203 ``requestID``, if given, must be hashable since it is used by 204 ``ThreadPool`` object to store the results of that work request in a 205 dictionary. It defaults to the return value of ``id(self)``. 206 207 """ 208 #__init__( callable_, args=None, kwds=None, callback=None, exc_callback=_handle_thread_exception) 209 #WorkRequest(callable_, item[0], item[1], callback=callback,exc_callback=exc_callback) 210 if requestID is None: 211 self.requestID = id(self) 212 else: 213 try: 214 self.requestID = hash(requestID) 215 except TypeError: 216 raise TypeError("requestID must be hashable.") 217 self.exception = False 218 self.callback = callback 219 self.exc_callback = exc_callback 220 self.callable = callable_ 221 self.args = args or [] 222 self.kwds = kwds or {} 223 224 def __str__(self): 225 return "<WorkRequest id=%s args=%r kwargs=%r exception=%s>" % \ 226 (self.requestID, self.args, self.kwds, self.exception) 227 228 class ThreadPool: 229 """A thread pool, distributing work requests and collecting results. 230 231 See the module docstring for more information. 232 233 """ 234 235 def __init__(self, num_workers, q_size=0, resq_size=0, poll_timeout=5): 236 """Set up the thread pool and start num_workers worker threads. 237 238 ``num_workers`` is the number of worker threads to start initially. 239 240 If ``q_size > 0`` the size of the work *request queue* is limited and 241 the thread pool blocks when the queue is full and it tries to put 242 more work requests in it (see ``putRequest`` method), unless you also 243 use a positive ``timeout`` value for ``putRequest``. 244 245 If ``resq_size > 0`` the size of the *results queue* is limited and the 246 worker threads will block when the queue is full and they try to put 247 new results in it. 248 249 .. warning: 250 If you set both ``q_size`` and ``resq_size`` to ``!= 0`` there is 251 the possibilty of a deadlock, when the results queue is not pulled 252 regularly and too many jobs are put in the work requests queue. 253 To prevent this, always set ``timeout > 0`` when calling 254 ``ThreadPool.putRequest()`` and catch ``Queue.Full`` exceptions. 255 256 """ 257 self._requests_queue = Queue.Queue(q_size) 258 self._results_queue = Queue.Queue(resq_size) 259 self.workers = [] 260 self.dismissedWorkers = [] 261 self.workRequests = {} 262 self.createWorkers(num_workers, poll_timeout) 263 264 def createWorkers(self, num_workers, poll_timeout=5): 265 """Add num_workers worker threads to the pool. 266 267 ``poll_timout`` sets the interval in seconds (int or float) for how 268 ofte threads should check whether they are dismissed, while waiting for 269 requests. 270 271 """ 272 for i in range(num_workers): 273 self.workers.append(WorkerThread(self._requests_queue, 274 self._results_queue, poll_timeout=poll_timeout)) 275 276 def dismissWorkers(self, num_workers, do_join=False): 277 """Tell num_workers worker threads to quit after their current task.""" 278 dismiss_list = [] 279 for i in range(min(num_workers, len(self.workers))): 280 worker = self.workers.pop() 281 worker.dismiss() 282 dismiss_list.append(worker) 283 284 if do_join: 285 for worker in dismiss_list: 286 worker.join() 287 else: 288 self.dismissedWorkers.extend(dismiss_list) 289 290 def joinAllDismissedWorkers(self): 291 """Perform Thread.join() on all worker threads that have been dismissed. 292 """ 293 for worker in self.dismissedWorkers: 294 worker.join() 295 self.dismissedWorkers = [] 296 297 def putRequest(self, request, block=True, timeout=None): 298 """Put work request into work queue and save its id for later.""" 299 assert isinstance(request, WorkRequest) 300 # don't reuse old work requests 301 assert not getattr(request, 'exception', None) 302 import time 303 self._requests_queue.put(request, block, timeout) 304 self.workRequests[request.requestID] = request 305 306 def poll(self, block=False): 307 """Process any new results in the queue.""" 308 while True: 309 # still results pending? 310 if not self.workRequests: 311 raise NoResultsPending 312 # are there still workers to process remaining requests? 313 elif block and not self.workers: 314 raise NoWorkersAvailable 315 try: 316 # get back next results 317 318 request, result = self._results_queue.get(block=block) 319 320 # has an exception occured? 321 if request.exception and request.exc_callback: 322 323 request.exc_callback(request, result) 324 325 # hand results to callback, if any 326 if request.callback and not \ 327 (request.exception and request.exc_callback): 328 request.callback(request, result) 329 del self.workRequests[request.requestID] 330 except Queue.Empty: 331 break 332 333 def wait(self): 334 """Wait for results, blocking until all have arrived.""" 335 while 1: 336 try: 337 self.poll(True) 338 except NoResultsPending: 339 break 340 341 342 ################ 343 # USAGE EXAMPLE 344 ################ 345 346 if __name__ == '__main__': 347 import random 348 import time 349 350 # the work the threads will have to do (rather trivial in our example) 351 def do_something(data): 352 time.sleep(random.randint(1,5)) 353 result = round(random.random() * data, 5) 354 # just to show off, we throw an exception once in a while 355 if result > 5: 356 raise RuntimeError("Something extraordinary happened!") 357 return result 358 359 # this will be called each time a result is available 360 def print_result(request, result): 361 print("**** Result from request #%s: %r" % (request.requestID, result)) 362 363 # this will be called when an exception occurs within a thread 364 # this example exception handler does little more than the default handler 365 def handle_exception(request, exc_info): 366 if not isinstance(exc_info, tuple): 367 # Something is seriously wrong... 368 print(request) 369 print(exc_info) 370 raise SystemExit 371 print("**** Exception occured in request #%s: %s" % \ 372 (request.requestID, exc_info)) 373 374 # assemble the arguments for each job to a list... 375 data = [random.randint(1,10) for i in range(20)] 376 # ... and build a WorkRequest object for each item in data 377 requests = makeRequests(do_something, data, print_result, handle_exception) 378 # to use the default exception handler, uncomment next line and comment out 379 # the preceding one. 380 #requests = makeRequests(do_something, data, print_result) 381 382 # or the other form of args_lists accepted by makeRequests: ((,), {}) 383 data = [((random.randint(1,10),), {}) for i in range(20)] 384 requests.extend( 385 makeRequests(do_something, data, print_result, handle_exception) 386 #makeRequests(do_something, data, print_result) 387 # to use the default exception handler, uncomment next line and comment 388 # out the preceding one. 389 ) 390 391 # we create a pool of 3 worker threads 392 print("Creating thread pool with 3 worker threads.") 393 main = ThreadPool(3) 394 395 # then we put the work requests in the queue... 396 for req in requests: 397 main.putRequest(req) 398 print("Work request #%s added." % req.requestID) 399 # or shorter: 400 # [main.putRequest(req) for req in requests] 401 402 # ...and wait for the results to arrive in the result queue 403 # by using ThreadPool.wait(). This would block until results for 404 # all work requests have arrived: 405 # main.wait() 406 407 # instead we can poll for results while doing something else: 408 i = 0 409 while True: 410 try: 411 time.sleep(0.5) 412 main.poll() 413 print("Main thread working...") 414 print("(active worker threads: %i)" % (threading.activeCount()-1, )) 415 if i == 10: 416 417 main.createWorkers(3) 418 if i == 20: 419 420 main.dismissWorkers(2) 421 i += 1 422 except KeyboardInterrupt: 423 print("**** Interrupted!") 424 break 425 except NoResultsPending: 426 print("**** No pending results.") 427 break 428 if main.dismissedWorkers: 429 print("Joining all dismissed worker threads...") 430 main.joinAllDismissedWorkers()