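Gevent is a third-party library that makes it easy to write concurrent code in either a synchronous or an asynchronous style. The main building block in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. All greenlets run inside the operating-system process of the main program, but they are scheduled cooperatively.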
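To make "scheduled cooperatively" concrete, here is a minimal sketch that uses plain Python generators instead of gevent. The names runner and run_round_robin are hypothetical and exist only for this illustration; gevent's real scheduler is an event loop, not this toy queue. The point is only that a task keeps running until it voluntarily gives up control.

```python
from collections import deque


def runner(name, laps, log):
    """A toy task that hands control back to the scheduler after each lap."""
    for lap in range(1, laps + 1):
        log.append(f"{name} lap {lap}")
        yield  # voluntarily give up control (loosely analogous to gevent.sleep(0))


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them are exhausted."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run the task until its next yield
        except StopIteration:
            continue  # the task has finished, so drop it
        queue.append(task)  # otherwise put it at the back of the queue


log = []
run_round_robin([runner("Ashley", 2, log), runner("Sarah", 1, log)])
print(log)
```

Nothing here interrupts a task from the outside: if runner never reached a yield, no other task would ever get a turn. The same holds for greenlets that never reach a switching point.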
Example
import gevent  # gevent switches between greenlets automatically


def func1():
    print('Ashley starts running.')      # 1
    gevent.sleep(2)                      # yield control; func2 runs next
    print('Ashley tumbled over Sarah.')  # 6


def func2():
    print('Sarah starts running.')       # 2
    gevent.sleep(1)                      # yield control; func3 runs next
    print('Sarah tumbled over Nancy.')   # 5


def func3():
    print('Nancy starts running.')       # 3
    gevent.sleep(0)                      # yield control, but with no waiting time
    print('Nancy tumbled on a hurdle.')  # 4


gevent.joinall([
    gevent.spawn(func1),  # spawn creates a greenlet and schedules it to run
    gevent.spawn(func2),
    gevent.spawn(func3),
])
Output
Ashley starts running.
Sarah starts running.
Nancy starts running.
Nancy tumbled on a hurdle.
Sarah tumbled over Nancy.
Ashley tumbled over Sarah.
Now consider the following example:
import gevent


def task(pid):
    gevent.sleep(0.5)
    print('Task %s done' % pid)


def synchronous():
    # Serial: call task 9 times (pids 1-9), one after another
    for i in range(1, 10):
        task(i)


def asynchronous():
    # Spawn one greenlet per call to task: 10 greenlets in total
    # (pids 0-9), stored in the list threads
    threads = [gevent.spawn(task, i) for i in range(10)]
    gevent.joinall(threads)


print('Synchronous:')
synchronous()   # synchronous, serial execution

print('Asynchronous:')
asynchronous()  # asynchronous, concurrent execution
Output
Synchronous:    # each task sleeps 0.5 s, one after another
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
Asynchronous:   # all tasks sleep concurrently and finish together after about 0.5 s
Task 0 done
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
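The key part of the program above is gevent.spawn, which wraps the task function in a greenlet. The resulting greenlets are stored in the list threads, which is then passed to gevent.joinall. That call blocks the current flow of execution and runs all of the given greenlets; execution continues past it only after every greenlet has finished.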
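As a small illustration of the spawn and joinall pattern described above, the sketch below also collects each greenlet's return value through its value attribute once joinall has returned. The function square and the numbers used are made up for this example.

```python
import gevent


def square(n):
    """Pretend to do some work, then return a result."""
    gevent.sleep(0.01)  # yield to the other greenlets while "working"
    return n * n


# spawn returns a Greenlet object immediately; the function runs later,
# once the current greenlet yields (here, inside joinall).
greenlets = [gevent.spawn(square, i) for i in range(5)]

# Block until every greenlet in the list has finished.
gevent.joinall(greenlets)

# After joinall, each finished greenlet exposes its return value as .value.
results = [g.value for g in greenlets]
print(results)
```

Reading value before a greenlet has finished gives None, which is one reason to wait with joinall first.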
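When a greenlet blocks on I/O, gevent automatically switches to another task. Below is a simple example that fetches a few web pages, so we can compare the synchronous and asynchronous approaches: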
from gevent import monkey
# On its own, gevent cannot see the blocking I/O done through the standard
# socket module. patch_all() replaces those blocking calls with cooperative
# versions; call it as early as possible, before the modules that use them
# are imported.
monkey.patch_all()

import time
from urllib import request

import gevent


def f(url):
    print('GET: %s' % url)
    resp = request.urlopen(url)
    data = resp.read()  # I/O: gevent switches automatically whenever a greenlet blocks on I/O
    print('%d bytes received from %s.' % (len(data), url))
    # print('Data from %s is: %s.' % (url, data))


urls = ['https://www.python.org/',
        'https://www.yahoo.com/',
        'https://github.com/']

sync_start_time = time.time()
for url in urls:
    f(url)
print('Sync cost:', time.time() - sync_start_time)

async_start_time = time.time()
gevent.joinall([
    gevent.spawn(f, 'https://www.python.org/'),  # start a greenlet that runs f with this url
    gevent.spawn(f, 'https://www.yahoo.com/'),
    gevent.spawn(f, 'https://github.com/'),
])
print('Async cost:', time.time() - async_start_time)
Output
GET: https://www.python.org/
48981 bytes received from https://www.python.org/.
GET: https://www.yahoo.com/
334724 bytes received from https://www.yahoo.com/.
GET: https://github.com/
132874 bytes received from https://github.com/.
Sync cost: 25.191901445388794
GET: https://www.python.org/
GET: https://www.yahoo.com/
GET: https://github.com/
48981 bytes received from https://www.python.org/.
132874 bytes received from https://github.com/.
334179 bytes received from https://www.yahoo.com/.
Async cost: 5.816762208938599
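Because gevent switches greenlets whenever one of them blocks on I/O, the three print('GET: %s' % url) lines appear first. Each greenlet prints its GET line and then yields as soon as it blocks on the network: first inside request.urlopen(url), and again at data = resp.read(), since reading the response is also an I/O operation.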
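The same switching behavior can be observed without any network access. The sketch below assumes that monkey.patch_all() also replaces time.sleep with a cooperative version, so an ordinary blocking-style call stands in for the slow I/O; fake_download and its arguments are hypothetical names used only here.

```python
from gevent import monkey
monkey.patch_all()  # patch early, before importing modules that block

import time

import gevent


def fake_download(name, seconds, log):
    """Stand-in for a slow request: record start, 'block', record finish."""
    log.append(f"start {name}")
    time.sleep(seconds)  # patched: yields to other greenlets instead of blocking
    log.append(f"done {name}")


log = []
start = time.perf_counter()
gevent.joinall([
    gevent.spawn(fake_download, "a", 0.2, log),
    gevent.spawn(fake_download, "b", 0.1, log),
])
elapsed = time.perf_counter() - start

print(log)
print("elapsed: %.2f s" % elapsed)
```

Both greenlets start before either one finishes, and the shorter wait completes first, which mirrors the way the three GET lines are printed before any response is received.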
Original post: https://www.cnblogs.com/infinitecodes/p/12134529.html