Multi-threaded vs. single-threaded calculations

Date: 2016-07-25 03:03:53

Tags: python multithreading python-3.x

The multi-threaded version (4 threads):

import random
import threading
import time

def dowork():
  y = []
  z = []
  ab = 0
  start_time = time.time()
  t = threading.current_thread()

  for x in range(0,1500):
    y.append(random.randint(0,100000))
  for x in range(0,1500):
    z.append(random.randint(0,1000))
  for x in range(0,100):
    for k in range(0,len(z)):
      ab += y[k] ** z[k]
  print(" %.50s..." % ab)
  print("--- %.6s seconds --- %s" % (time.time() - start_time, t.name))

#do the work!
threads = []
for x in range(0,4): #4 threads
  threads.append(threading.Thread(target=dowork))

for x in threads:
  x.start() # and they are off

Results:

 23949968699026357507152486869104218631097704347109...
--- 11.899 seconds --- Thread-2
 10632599432628604090664113776561125984322566079319...
--- 11.924 seconds --- Thread-4
 20488842520966388603734530904324501550532057464424...
--- 12.073 seconds --- Thread-1
 17247910051860808132548857670360685101748752056479...
--- 12.115 seconds --- Thread-3
[Finished in 12.2s]

Now let's do it in one thread:

import random
import threading
import time

def dowork():
  y = []
  z = []
  ab = 0
  start_time = time.time()
  t = threading.current_thread()

  for x in range(0,1500):
    y.append(random.randint(0,100000))
  for x in range(0,1500):
    z.append(random.randint(0,1000))
  for x in range(0,100):
    for k in range(0,len(z)):
      ab += y[k] ** z[k]
  print(" %.50s..." % ab)
  print("--- %.6s seconds --- %s" % (time.time() - start_time, t.name))

# print(threadtest())
threads = []
for x in range(0,4):
  threads.append(True)

for x in threads:
  dowork()

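Why do the single-threaded and multi-threaded scripts take the same processing time? Shouldn't the multi-threaded implementation take only 1/(number of threads) of the time? (I know the returns diminish once you reach the maximum number of CPU threads.)

Did I mess up my implementation?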

4 answers:

Answer 0 (score: 3)

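Multithreading in Python does not work the way it does in other languages; if I recall correctly, this has to do with the global interpreter lock. There are many workarounds, though. For example, you can use gevent's coroutine-based "threads". For work that needs to run concurrently, I personally prefer dask. For example: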

import time
import dask.bag as db

# dowork() is the function defined in the question
start = time.time()
(db.from_sequence(range(4), npartitions=4)
     .map(lambda _: dowork())
    .compute())
print('total time: {} seconds'.format(time.time() - start))

start = time.time()
threads = []
for x in range(0,4):
  threads.append(True)

for x in threads:
  dowork()
print('total time: {} seconds'.format(time.time() - start))

and the output:

 19016975777667561989667836343447216065093401859905...
--- 2.4172 seconds --- MainThread
 32883203981076692018141849036349126447899294175228...
--- 2.4685 seconds --- MainThread
 34450410116136243300565747102093690912732970152596...
--- 2.4901 seconds --- MainThread
 50964938446237359434550325092232546411362261338846...
--- 2.5317 seconds --- MainThread
total time: 2.5557193756103516 seconds
 10380860937556820815021239635380958917582122217407...
--- 2.3711 seconds --- MainThread
 13309313630078624428079401365574221411759423165825...
--- 2.2861 seconds --- MainThread
 27410752090906837219181398184615017013303570495018...
--- 2.2853 seconds --- MainThread
 73007436394172372391733482331910124459395132986470...
--- 2.3136 seconds --- MainThread
total time: 9.256525993347168 seconds

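In this case dask uses multiprocessing to do the work, which may or may not be what you want.

Besides CPython, you can also try other Python implementations, such as PyPy or Stackless Python, which claim to offer workarounds or solutions to this problem.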
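If pulling in dask is not an option, the same process-based approach can be sketched with only the standard library. This is a minimal sketch, not the answer's own method: `cpu_bound` and `run_in_processes` are hypothetical names, and `cpu_bound` is a simplified stand-in for `dowork()` from the question.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def cpu_bound(n):
    """Hypothetical stand-in for dowork(): sum of i*i for i < n, in pure Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(inputs, workers=4):
    # Each task runs in a separate worker process with its own interpreter,
    # so the tasks do not compete for a single GIL.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, inputs))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on
    # platforms that start workers by importing the main module.
    start = time.time()
    results = run_in_processes([2_000_000] * 4)
    print("total time: {:.3f} seconds".format(time.time() - start))
```

The function passed to the pool has to be defined at module level so that it can be pickled and sent to the workers; a lambda would not work here.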

Answer 1 (score: 2)

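Here is a link to presentations about the GIL: http://www.dabeaz.com/GIL/

The author of these presentations explains the GIL in detail, with examples. He has also posted some videos on YouTube.

Besides threads, you may also be interested in asynchronous programming. In Python 3, a library was added to Python that provides asynchronous concurrency.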
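A minimal sketch of asynchronous concurrency follows, assuming the library meant is the standard-library `asyncio` (the answer does not name it here). The names `fake_io`, `run_all`, and `timed_run` are hypothetical, and `asyncio.sleep` stands in for a real network or disk wait. Note that this overlaps waiting, not computation: it would not speed up the CPU-bound `dowork()` from the question.

```python
import asyncio
import time


async def fake_io(delay):
    # asyncio.sleep stands in for an I/O wait (an assumption for illustration).
    await asyncio.sleep(delay)
    return delay


async def run_all(delays):
    # gather runs the coroutines concurrently on a single thread.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed_run(delays):
    start = time.perf_counter()
    results = asyncio.run(run_all(delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run([0.1] * 4)
    # The four waits overlap, so the total is close to one wait, not four.
    print("results: %s, elapsed: %.3f s" % (results, elapsed))
```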

Answer 2 (score: 0)

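In CPython, threads do not run in parallel because of the Global Interpreter Lock. From the Python wiki (https://wiki.python.org/moin/GlobalInterpreterLock):

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe.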
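The effect described in the quote can be observed with a small timing sketch. This is an illustration under assumptions, not a benchmark: `count_down`, `time_sequential`, and `time_threaded` are hypothetical names, and the numbers depend on the machine and on running a standard CPython build that has the GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    # Pure-Python CPU-bound loop; a thread needs the GIL to execute it.
    while n > 0:
        n -= 1
    return n


def time_sequential(n, repeats):
    # Run the loop `repeats` times, one after another, in the calling thread.
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats):
    # Run the same work in `repeats` threads at once.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=repeats) as pool:
        list(pool.map(count_down, [n] * repeats))
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print("sequential: %.3f s" % time_sequential(n, 4))
    print("threaded:   %.3f s" % time_threaded(n, 4))
```

With the GIL in place, the two timings would be expected to come out roughly equal, which matches what the question observed.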

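Answer 3 (score: 0)

Here is a complete test and example comparing multi-threading and multi-processing with a single thread/process.

The computation (you can substitute any calculation you like):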

import time, threading, random, multiprocessing

def dowork():
  total = 0
  start_time = time.time()
  t = threading.current_thread()
  p = multiprocessing.current_process()
  for x in range(0,100):
    total += random.randint(1000000-1,1000000) ** random.randint(37000-1,37000)
  print("--- %.6s seconds DONE --- %s | %s" % (time.time() - start_time, p.name, t.name))

The test:

t, p = [], []
for x in range(0,4):
  #create thread
  t.append(threading.Thread(target=dowork))
  #create child process
  p.append(multiprocessing.Process(target=dowork))
#multi-thread
start_time = time.time()
for l in t:
  l.start()

for l in t:
  l.join()

print("===== %.6s seconds Multi-Threads =====" % (time.time() - start_time))
start_time = time.time()
#multi-process
for l in p:
  l.start()
for l in p:
  l.join()

print("===== %.6s seconds Multi-Processes =====" % (time.time() - start_time))
start_time = time.time()
# Sequential
for l in p:
  dowork()
print("===== %.6s seconds Single Process/Thread  =====" % (time.time() - start_time))

Here is the sample output:

#Sample Output:

--- 2.6412 seconds DONE --- MainProcess | Thread-1
--- 2.5712 seconds DONE --- MainProcess | Thread-2
--- 2.5774 seconds DONE --- MainProcess | Thread-3
--- 2.5973 seconds DONE --- MainProcess | Thread-4
===== 10.388 seconds Multi-Threads =====
--- 2.4816 seconds DONE --- Process-4 | MainThread
--- 2.4841 seconds DONE --- Process-3 | MainThread
--- 2.4965 seconds DONE --- Process-2 | MainThread
--- 2.5182 seconds DONE --- Process-1 | MainThread
===== 2.5241 seconds Multi-Processes =====
--- 2.4624 seconds DONE --- MainProcess | MainThread
--- 2.6447 seconds DONE --- MainProcess | MainThread
--- 2.5716 seconds DONE --- MainProcess | MainThread
--- 2.4369 seconds DONE --- MainProcess | MainThread
===== 10.115 seconds Single Process/Thread  =====
[Finished in 23.1s]
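To make the comparison explicit, the speedups can be computed from the three totals in the sample output above. This is just arithmetic on the reported numbers; the `speedup` helper is a hypothetical name, and the answer does not state how many cores the machine had.

```python
# Wall-clock totals copied from the sample output (seconds).
threads_total = 10.388
processes_total = 2.5241
sequential_total = 10.115


def speedup(baseline, measured):
    # How many times faster `measured` is than `baseline`.
    return baseline / measured


thread_speedup = speedup(sequential_total, threads_total)
process_speedup = speedup(sequential_total, processes_total)

print("threads:   %.2fx" % thread_speedup)    # prints "threads:   0.97x"
print("processes: %.2fx" % process_speedup)   # prints "processes: 4.01x"
```

In other words, four threads were slightly slower than running the work sequentially, while four processes were about four times faster.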