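Gevent is a third-party library that makes concurrent programming, synchronous or asynchronous, straightforward. The main abstraction in gevent is the Greenlet, a lightweight coroutine made available to Python as a C extension module. All greenlets run inside the operating system process of the main program, but they are scheduled cooperatively.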
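To make "scheduled cooperatively" concrete, here is a minimal sketch that uses plain Python generators instead of gevent. The names runner and run_round_robin are hypothetical, and this is not how gevent is implemented internally; it only illustrates tasks that give up control voluntarily rather than being preempted.

```python
from collections import deque


def runner(name, steps, log):
    """A hypothetical task that yields control after every step."""
    for step in range(steps):
        log.append(f"{name} step {step}")
        yield  # hand control back to the scheduler voluntarily


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them finish."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)      # run the task until its next yield
        except StopIteration:
            continue        # the task finished, so drop it
        queue.append(task)  # otherwise schedule it again


log = []
run_round_robin([runner("A", 2, log), runner("B", 1, log)])
print(log)
```

A task that never yields would keep every other task waiting, which is the main trade-off of cooperative scheduling.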
Example

import gevent

# gevent switches between greenlets automatically.
# The numbers in the comments give the order in which the lines are printed.
def func1():
    print('Ashley starts running.')      # 1
    gevent.sleep(2)                      # yield; func2 runs next
    print('Ashley tumbled over Sarah.')  # 6

def func2():
    print('Sarah starts running.')       # 2
    gevent.sleep(1)                      # yield; func3 runs next
    print('Sarah tumbled over Nancy.')   # 5

def func3():
    print('Nancy starts running.')       # 3
    gevent.sleep(0)                      # yield, but without waiting
    print('Nancy tumbled on a hurdle.')  # 4

gevent.joinall([
    gevent.spawn(func1),  # spawn creates a greenlet and schedules it to run
    gevent.spawn(func2),
    gevent.spawn(func3),
])

Result:
Ashley starts running.
Sarah starts running.
Nancy starts running.
Nancy tumbled on a hurdle.
Sarah tumbled over Nancy.
Ashley tumbled over Sarah.
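Consider the following example:

import gevent

def task(pid):
    gevent.sleep(0.5)
    print('Task %s done' % pid)

def synchronous():
    for i in range(1, 10):
        task(i)  # serial: calls task 9 times (1 through 9), one after another

def asynchronous():
    # Spawn 10 greenlets (0 through 9), each running task, and keep them in
    # the list threads. Despite the name, these are greenlets, not OS threads.
    threads = [gevent.spawn(task, i) for i in range(10)]
    gevent.joinall(threads)

print('Synchronous:')
synchronous()   # runs the tasks serially
print('Asynchronous:')
asynchronous()  # runs the tasks concurrently

Result: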
Synchronous:    # each task sleeps 0.5 s in turn
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
Asynchronous:   # all tasks sleep at the same time and finish together
Task 0 done
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Task 8 done
Task 9 done
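The important part of the program above is gevent.spawn, which wraps the task function in a Greenlet. The spawned greenlets are stored in the list threads, which is passed to gevent.joinall. That call blocks the current flow of execution and runs all the given greenlets; execution continues past it only after every greenlet has finished.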
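As a small illustration of the spawn and joinall pattern, the sketch below also collects return values. The function square is a hypothetical stand-in for real work; the value attribute of a Greenlet holds what the function returned once the greenlet has finished.

```python
import gevent


def square(n):
    gevent.sleep(0.01)  # yield to other greenlets, as blocking I/O would
    return n * n


# spawn returns a Greenlet object immediately; the function runs later,
# once the current greenlet yields (here, inside joinall).
greenlets = [gevent.spawn(square, n) for n in range(5)]

# Block until every greenlet has finished, or until 5 seconds have passed.
gevent.joinall(greenlets, timeout=5)

results = [g.value for g in greenlets]
print(results)
```

Reading value before a greenlet has finished gives None, so it makes sense to collect results only after joinall returns.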
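When a greenlet blocks on I/O, gevent automatically switches to another task. Below is a simple example that fetches a few web pages, so that we can compare the synchronous and asynchronous approaches: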
# On its own, gevent cannot tell when the standard socket module blocks on I/O.
# patch_all() replaces blocking standard-library calls with cooperative
# versions; it is best called before other modules are imported.
from gevent import monkey
monkey.patch_all()

import time

import gevent
from urllib import request

def f(url):
    print('GET: %s' % url)
    resp = request.urlopen(url)
    data = resp.read()  # I/O: gevent switches greenlets whenever this blocks
    print('%d bytes received from %s.' % (len(data), url))
    # print('Data from %s is: %s.' % (url, data))

urls = ['https://www.python.org/',
        'https://www.yahoo.com/',
        'https://github.com/']

sync_start_time = time.time()
for url in urls:
    f(url)
print('Synchronous cost:', time.time() - sync_start_time)

async_start_time = time.time()
gevent.joinall([
    gevent.spawn(f, 'https://www.python.org/'),  # start a greenlet that runs f(url)
    gevent.spawn(f, 'https://www.yahoo.com/'),
    gevent.spawn(f, 'https://github.com/'),
])
print('Asynchronous cost:', time.time() - async_start_time)

Result:
GET: https://www.python.org/
48981 bytes received from https://www.python.org/.
GET: https://www.yahoo.com/
334724 bytes received from https://www.yahoo.com/.
GET: https://github.com/
132874 bytes received from https://github.com/.
Synchronous cost: 25.191901445388794
GET: https://www.python.org/
GET: https://www.yahoo.com/
GET: https://github.com/
48981 bytes received from https://www.python.org/.
132874 bytes received from https://github.com/.
334179 bytes received from https://www.yahoo.com/.
Asynchronous cost: 5.816762208938599
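Because gevent switches greenlets whenever one of them blocks on I/O, the three print('GET: %s' % url) lines appear first: each greenlet prints its line and then yields as soon as it blocks on the network, first while request.urlopen(url) connects and sends the request, and again at data = resp.read(), since reading the response is also an I/O operation. The responses are then printed in the order in which the downloads complete.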
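The effect of patching can be seen without any network access. The sketch below is an illustration only, using sleeping as a stand-in for blocking I/O; the helper timed is hypothetical. It assumes a fresh interpreter in which monkey.patch_all() has not been called, so time.sleep is still the ordinary blocking version.

```python
import time

import gevent


def timed(sleep_func, count=3, delay=0.1):
    """Run `count` greenlets that each call sleep_func(delay); return elapsed seconds."""
    start = time.perf_counter()
    gevent.joinall([gevent.spawn(sleep_func, delay) for _ in range(count)])
    return time.perf_counter() - start


# Unpatched time.sleep blocks the whole process, so the greenlets run one by one.
blocking = timed(time.sleep)

# gevent.sleep yields to the event loop, so the greenlets wait concurrently.
cooperative = timed(gevent.sleep)

print('time.sleep: %.2fs, gevent.sleep: %.2fs' % (blocking, cooperative))
```

With the blocking call the total should be roughly count times delay, while the cooperative call should take roughly one delay. An unpatched socket behaves like the first case, which is why the example above calls monkey.patch_all().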
Original: https://www.cnblogs.com/infinitecodes/p/12134529.html