How do I use threads in Python?

import queue
import threading
import urllib.request

# Called by each thread: fetch the URL and put the response body on the queue
def get_url(q, url):
    q.put(urllib.request.urlopen(url).read())

theurls = ["http://google.com", "http://yahoo.com"]

q = queue.Queue()

for u in theurls:
    t = threading.Thread(target=get_url, args=(q, u))
    t.daemon = True
    t.start()

# Block until the first thread to finish has put its result on the queue
s = q.get()
print(s)

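This is a case where threading is used as a simple optimization. Each subthread waits for a URL to resolve and respond so that it can put the contents on the queue. Each thread is a daemon (it won't keep the process alive if the main thread ends, which is the more common setup). The main thread starts all subthreads, does a "get" on the queue to wait until one of them has done a "put", then emits the result and terminates (which takes down any subthreads that might still be running, since they are daemon threads).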
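The code above prints only the first response to arrive. If you want every result, one option is to call get() once per thread you started. Here is a minimal Python 3 sketch of that idea; fake_fetch, the job names, and the delays are all made up for illustration, and time.sleep stands in for the network wait so the example runs offline.

```python
import queue
import threading
import time

def fake_fetch(q, name, delay):
    """Hypothetical stand-in for a blocking network call."""
    time.sleep(delay)              # simulates waiting on I/O
    q.put((name, delay))

# Made-up job names and delays, chosen only for illustration.
jobs = [("slow", 0.3), ("fast", 0.1), ("medium", 0.2)]

q = queue.Queue()
for name, delay in jobs:
    t = threading.Thread(target=fake_fetch, args=(q, name, delay))
    t.daemon = True
    t.start()

# One get() per thread started: each call blocks until another result is ready.
results = [q.get() for _ in jobs]
finish_order = [name for name, _ in results]
print(finish_order)
```

Results come back in completion order, not in the order the threads were started, so carry an identifier (here the name) along with each result if you need to match them up.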

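Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is to avoid blocking the process while it waits for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, and they are intrinsically thread-safe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.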
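To make the "farm out work and collect results" pattern concrete, here is a minimal Python 3 sketch with one queue for tasks and one for results. The worker function, the NUM_WORKERS value, and the use of None as a stop signal are my own choices for illustration, and squaring a number is just a placeholder for whatever I/O-bound work you would really do.

```python
import queue
import threading

def worker(tasks, results):
    """Pull numbers from tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:           # sentinel: no more work for this worker
            break
        results.put((item, item * item))   # placeholder for real I/O-bound work

NUM_WORKERS = 3                    # arbitrary choice for this sketch
tasks = queue.Queue()
results = queue.Queue()

threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)                # one sentinel per worker
for t in threads:
    t.join()                       # every worker has exited after this loop

# No producers are left, so draining until empty is safe here.
squares = {}
while not results.empty():
    n, sq = results.get()
    squares[n] = sq
print(sorted(squares.items()))
```

Note that no explicit lock appears anywhere: the two queues do all of the synchronization between the main thread and the workers.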

Michael Aaron Safyan · Answer 2 · 2010-05-17T04:35:11+00:00

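NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel. Because of the global interpreter lock, Python threads provide interleaving but are in fact executed serially, not in parallel, so they are only useful for interleaving I/O operations.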

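However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel: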

import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        super().__init__()
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        # Sum the integers in the half-open range [low, high)
        for i in range(self.low, self.high):
            self.total += i

thread1 = SummingThread(0, 500000)
thread2 = SummingThread(500000, 1000000)
thread1.start()  # This actually causes the thread to run
thread2.start()
thread1.join()   # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print(result)

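Note that the above is a rather silly example: it does no I/O at all, so in CPython it will be executed serially, albeit interleaved (with the added overhead of context switching), because of the global interpreter lock.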
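For contrast, here is a minimal Python 3 sketch of the case where threads do help: several threads that each spend their time waiting. WaitingThread and the DELAY value are hypothetical, and time.sleep stands in for a blocking I/O call (like a real I/O wait, it releases the GIL while blocked).

```python
import threading
import time

class WaitingThread(threading.Thread):
    """Hypothetical thread that simulates a blocking I/O call with sleep."""
    def __init__(self, delay):
        super().__init__()
        self.delay = delay
        self.done = False

    def run(self):
        time.sleep(self.delay)     # the GIL is released while sleeping
        self.done = True

DELAY = 0.2                        # arbitrary wait, in seconds
start = time.monotonic()
threads = [WaitingThread(DELAY) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

print("all done:", all(t.done for t in threads))
print("elapsed: %.2fs (serial would be about %.2fs)" % (elapsed, 4 * DELAY))
```

Because the four waits overlap, the elapsed time should be close to a single DELAY and not four times that, though the exact figure depends on your machine.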

Kai · Answer 3 · 2012-03-08T22:22:17+00:00

As others have mentioned, because of the GIL, CPython threads only help while waiting on I/O. If you want CPU-bound tasks to benefit from multiple cores, use multiprocessing:

from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    # Run f('bob') in a separate process and wait for it to finish
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
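To tie this back to the range-summing example from the earlier answer, here is a rough Python 3 sketch that splits the range across worker processes with multiprocessing.Pool. The helper names sum_range and split_range and the choice of 4 chunks are mine, not from any answer above, so treat it as one possible layout and not a recommended recipe.

```python
from multiprocessing import Pool

def sum_range(bounds):
    """Sum the integers in [low, high); runs inside a worker process."""
    low, high = bounds
    return sum(range(low, high))

def split_range(low, high, parts):
    """Split [low, high) into `parts` contiguous (low, high) chunks."""
    step = (high - low) // parts
    edges = [low + i * step for i in range(parts)] + [high]
    return list(zip(edges[:-1], edges[1:]))

if __name__ == '__main__':
    chunks = split_range(0, 1000000, 4)     # 4 chunks is an arbitrary choice
    with Pool(processes=4) as pool:
        partial_sums = pool.map(sum_range, chunks)
    print(sum(partial_sums))
```

The worker function is defined at module level and the pool is created under the `__main__` guard so that child processes can import the module safely. For a job this small, the cost of starting processes may well outweigh the gain; the pattern pays off with heavier CPU-bound work.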