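This post is about websocket, but it does not walk through the wire format. Anyone who can read an HTTP header can follow the negotiation websocket performs before the protocol switch, and anyone who can read an IP header will find the framing used after the switch familiar.
websocket sits on top of reliable TCP transport and offers a message transport service: when the application on one end asks websocket to send a message of a given size, the websocket on the other end hands a message of the same size to its protocol handler. The format of the message itself is up to the user.
This post strings together the HTTP protocol documents to understand which problem websocket sets out to solve.
If you need the libwebsockets examples on Windows, see the previous post, "Building libwebsocket with libev and libuv on Windows".
The protocol documents referenced in this post are: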
RFC 1945, "Hypertext Transfer Protocol -- HTTP/1.0", May 1996
  1.3 Overall Operation
  7.2.2 Length
  8. Method Definitions
RFC 2068, "Hypertext Transfer Protocol -- HTTP/1.1", January 1997
  19.7.1 Compatibility with HTTP/1.0 Persistent Connections
RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1", June 1999
  8.1 Persistent Connections
  10.1.2 101 Switching Protocols
  14.10 Connection
RFC 2817, "Upgrading to TLS Within HTTP/1.1", May 2000
  3. Client Requested Upgrade to HTTP over TLS
RFC 1945 dates from May 1996, so HTTP/1.0 has been around for decades.
Let's start with the HTTP/1.0 document.
1.3 Overall Operation
[page 5-6] The HTTP protocol is based on a request/response paradigm. A client establishes a connection with a server and sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content. The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possible body content.
[page 7] On the Internet, HTTP communication generally takes place over TCP/IP connections. The default port is TCP 80 [15], but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used, and the mapping of the HTTP/1.0 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.
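The overview tells us that HTTP is based on a request/response paradigm. The client opens a connection to the server and sends a request in a fixed format; the server answers that request. In other words, the server never sends data to the client on its own initiative.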
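The shape of one request/response exchange can be sketched in a few lines of Python. This is only an illustration of the message layout quoted above, not a full HTTP implementation; the helper names `build_request` and `parse_status_line` and the `User-Agent` value are made up for this sketch.

```python
def build_request(method, uri, headers=None, version="HTTP/1.0"):
    """Serialize a request: request line, header lines, then a blank line."""
    lines = [f"{method} {uri} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(raw_response):
    """Split the first line of a response into (version, code, reason)."""
    first_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


# The client speaks first ...
request = build_request("GET", "/test.html", {"User-Agent": "sketch/0.1"})
# ... and the server only ever answers (hand-written sample response).
response = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"

print(request.decode("ascii"))
print(parse_status_line(response))
```

Nothing in this exchange gives the server a way to start a message by itself, which is the limitation the rest of this post keeps coming back to.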
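The document then states that HTTP only presumes a reliable transport. On the Internet the usual choice is TCP, with the well-known default port 80. So we fix the underlying transport to TCP, and whenever this post says socket it means a TCP socket in the INET protocol family.
Split the word websocket and you get web and socket: the idea is to take the socket connection that HTTP communication has already set up and open up new territory with it.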
[page 7] Except for experimental applications, current practice requires that the connection be established by the client prior to each request and closed by the server after sending the response. Both clients and servers should be aware that either party may close the connection prematurely, due to user action, automated time-out, or program failure, and should handle such closing in a predictable fashion. In any case, the closing of the connection by either or both parties always terminates the current request, regardless of its status.
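Except for some experimental applications, HTTP/1.0 requires the client to establish a connection for every request and the server to close it after sending the response. During this exchange either the client or the server may close the connection, which aborts the current request (or response).
HTTP/1.0 already defines and uses a length to describe the size of the message body, yet a connection still cannot be reused. When Content-Length is absent, the server marks the end of the response by closing the connection, and the client has to detect that.
The relevant part of the document follows.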
7.2.2 Length [Page 29] When an Entity-Body is included with a message, the length of that body may be determined in one of two ways. If a Content-Length header field is present, its value in bytes represents the length of the Entity-Body. Otherwise, the body length is determined by the closing of the connection by the server. Closing the connection cannot be used to indicate the end of a request body, since it leaves no possibility for the server to send back a response. Therefore, HTTP/1.0 requests containing an entity body must include a valid Content-Length header field. If a request contains an entity body and Content-Length is not specified, and the server does not recognize or cannot calculate the length from other fields, then the server should send a 400 (bad request) response.
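The two ways of delimiting a body can be sketched as follows. This is a simplified illustration under some assumptions: `io.BytesIO` stands in for the socket, end-of-stream stands in for the server closing the connection, and the `headers` dictionary with lowercase keys is a convention of this sketch only.

```python
import io


def read_body(stream, headers):
    """Read an entity body using the two rules of RFC 1945 section 7.2.2."""
    length = headers.get("content-length")
    if length is not None:
        # Rule 1: Content-Length gives the body size in bytes.
        return stream.read(int(length))
    # Rule 2: no Content-Length, so the body runs until the connection closes.
    return stream.read()


# With Content-Length the reader stops after exactly 5 bytes.
with_length = read_body(io.BytesIO(b"helloEXTRA"), {"content-length": "5"})
# Without it the reader can only stop at end-of-stream.
until_close = read_body(io.BytesIO(b"everything until close"), {})

print(with_length)
print(until_close)
```

The second rule is why an HTTP/1.0 connection cannot simply be reused: once closing the connection is the end-of-body marker, there is no connection left for the next request.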
What can HTTP do? It defines several methods, of which GET is the one we use most.
8.1 GET [Page 30] The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response and not the source text of the process, unless that text happens to be the output of the process.
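Through HTTP we retrieve resources from the Internet, with the URI identifying the target. The target may be a data-producing process rather than a stored entity, in which case what we get back is the output of that process.
HTTP/1.1 adds the Connection header field. The document describes its use as follows.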
14.10 Connection
The Connection general-header field allows the sender to specify options that are desired for that particular connection and MUST NOT be communicated by proxies over further connections.
The Connection header has the following grammar:
    Connection = "Connection" ":" 1#(connection-token)
    connection-token = token
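Servers that implement HTTP/1.1 support persistent connections: once a request/response exchange has finished, the server does not close the connection. An implementation that does not want to keep the connection open has to say so explicitly with "Connection: close".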
8.1 Persistent Connections
8.1.1 Purpose
Persistent HTTP connections have a number of advantages:
- HTTP requests and responses can be pipelined on a connection. Pipelining allows a client to make multiple requests without waiting for each response, allowing a single TCP connection to be used much more efficiently, with much lower elapsed time.
8.1.2.1 Negotiation
An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to maintain a persistent connection unless a Connection header including the connection-token "close" was sent in the request. If the server chooses to close the connection immediately after sending the response, it SHOULD send a Connection header including the connection-token close.
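The document lists the purposes of persistent connections. The most prominent one is that a connection can be reused for more than one request/response exchange, and that both ends can use the Connection field to control when the persistent connection ends.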
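The negotiation rule can be written as a small decision function. This is a sketch rather than a complete implementation: the name `is_persistent` and the lowercase-keyed `headers` dictionary are assumptions of this example, and the HTTP/1.0 branch follows the keep-alive compatibility convention described in RFC 2068 section 19.7.1.

```python
def is_persistent(version, headers):
    """Decide whether the connection stays open after this exchange."""
    raw = headers.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",") if token.strip()}
    if "close" in tokens:
        # Either side can end persistence explicitly.
        return False
    if version == "HTTP/1.1":
        # HTTP/1.1 assumes persistence unless told otherwise.
        return True
    # Older peers are persistent only if they asked for keep-alive.
    return "keep-alive" in tokens


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {}))
print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
```

The design point is the flipped default: HTTP/1.0 closes unless asked to stay open, while HTTP/1.1 stays open unless asked to close.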
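Imagine a web page that references several images, scripts, text files and other resources. With HTTP/1.0 every one of those requests needs its own connection, whereas with HTTP/1.1 the requests can be pipelined onto a single connection.
Here is a packet capture of the libwebsockets test example, the simple HTTP server:
(Red boxes mark requests, green boxes mark responses, and blue lines link packets to the browser port.)
Port 7681 is the HTTP server port. test.html references two images, logo.png and favicon.ico, and all the resources referenced by the test.html document were fetched with several HTTP requests over a single TCP connection.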
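To make the reuse concrete, the sketch below writes three requests back to back and then separates three responses that arrive on the same byte stream. It assumes every response carries a Content-Length header (no chunked encoding), and the response bodies are tiny hand-written stand-ins, not the real contents served by the test server.

```python
def split_responses(buffer):
    """Split back-to-back responses in one byte stream using Content-Length."""
    responses = []
    while buffer:
        head, separator, rest = buffer.partition(b"\r\n\r\n")
        if not separator:
            break  # header block incomplete; a real client would read more
        length = 0
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(rest) < length:
            break  # body incomplete; a real client would read more
        status_line = head.split(b"\r\n")[0].decode("ascii")
        responses.append((status_line, rest[:length]))
        buffer = rest[length:]  # whatever follows belongs to the next response
    return responses


# Three requests sent on one connection without reconnecting.
pipelined_requests = b"".join(
    f"GET {path} HTTP/1.1\r\nHost: localhost:7681\r\n\r\n".encode("ascii")
    for path in ("/test.html", "/logo.png", "/favicon.ico")
)

# What might come back on that same connection (sample data).
stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 4\r\n\r\nhtml"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\npng"
    b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
)

for status_line, body in split_responses(stream):
    print(status_line, body)
```

Because the connection no longer closes after each response, the length header is the only thing that tells the client where one response stops and the next begins.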
Let's keep reading the document and see what else is new in HTTP/1.1.
10.1.2 101 Switching Protocols
The server understands and is willing to comply with the client's request, via the Upgrade message header field (section 14.42), for a change in the application protocol being used on this connection. The server will switch protocols to those defined by the response's Upgrade header field immediately after the empty line which terminates the 101 response. The protocol SHOULD be switched only when it is advantageous to do so. For example, switching to a newer version of HTTP is advantageous over older versions, and switching to a real-time, synchronous protocol might be advantageous when delivering resources that use such features.
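With protocol switching available since HTTP/1.1, every ingredient websocket needs was already in place. Seen this way websocket is nothing exotic; it was bound to appear.
Let's look at a protocol that made use of this feature much earlier: TLS (RFC 2817, Upgrading to TLS Within HTTP/1.1).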
3. Client Requested Upgrade to HTTP over TLS
When the client sends an HTTP/1.1 request with an Upgrade header field containing the token "TLS/1.0", it is requesting the server to complete the current HTTP/1.1 request after switching to TLS/1.0.
3.1 Optional Upgrade
A client MAY offer to switch to secured operation during any clear HTTP request when an unsecured response would be acceptable:
    GET http://example.bank.com/acct_stat.html?749394889300 HTTP/1.1
    Host: example.bank.com
    Upgrade: TLS/1.0
    Connection: Upgrade
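From the client's point of view, the key detail of a 101 response is that everything after the terminating empty line already belongs to the new protocol. The sketch below shows that split. The function name `parse_upgrade_response`, the sample response and the three trailing bytes are all illustrative assumptions, not data from a real server.

```python
def parse_upgrade_response(data):
    """Return (protocol, leftover) for a 101 response, else (None, leftover)."""
    head, separator, leftover = data.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("incomplete response header")
    lines = head.decode("ascii").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if status != 101:
        return None, leftover  # still plain HTTP; leftover is the HTTP body
    # After the empty line the connection speaks the upgraded protocol.
    return headers.get("upgrade"), leftover


sample = (
    b"HTTP/1.1 101 Switching Protocols\r\n"
    b"Upgrade: TLS/1.0, HTTP/1.1\r\n"
    b"Connection: Upgrade\r\n"
    b"\r\n"
    b"\x16\x03\x01"  # placeholder bytes standing in for new-protocol data
)

protocol, leftover = parse_upgrade_response(sample)
print(protocol, leftover)
```

A caller would hand `leftover`, and every byte read afterwards, to the handler for the upgraded protocol instead of to the HTTP parser.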
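Now let's review the documents we have gone through.
1. HTTP relies on a reliable transport; on the Internet the usual choice is TCP.
2. HTTP is based on the request/response paradigm; the server does not send data to the client on its own initiative.
3. HTTP/1.0 does not support persistent connections; every request/response needs a new TCP connection.
4. HTTP/1.1 makes persistent connections the default, so one connection can be reused for many request/response exchanges. It still cannot leave the request/response paradigm, though, so the client has to fall back on polling.
5. HTTP/1.1 introduces protocol switching.
Device hardware has improved enormously, software technology offers strong support, and Web applications keep demanding more.
And so websocket appeared.
The principle is to use HTTP's protocol switch to free the TCP connection (what everyone calls the socket) from HTTP, so that application data can be exchanged more efficiently.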
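The websocket protocol provides a message (frame or message) transport service, on top of which users can define an application protocol of any form. Before websocket switches away from HTTP, client and server must negotiate the parameters. The most important one is which user-defined application protocol to bind to. The client also asks whether the HTTP server can switch to websocket at all, and expects a 101 response confirming that the switch has happened and that the protocol and extensions it named are supported. All of this is negotiated in a single request/response exchange through HTTP header fields, after which the connection switches to websocket communication.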
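The negotiation can be sketched from the client side as below. The header names, the fixed GUID and the SHA-1 plus Base64 computation come from RFC 6455, which this post does not quote. The subprotocol name `my-app-protocol`, the helper names and the host are hypothetical values chosen for this sketch.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_for(key):
    """Value the server must return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host, path, subprotocol, key=None):
    """Build the HTTP request that asks for a switch to websocket."""
    if key is None:
        key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        f"Sec-WebSocket-Protocol: {subprotocol}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )
    return key, request.encode("ascii")


# "my-app-protocol" is a made-up name for the user-defined application protocol.
key, request = build_handshake("localhost:7681", "/", "my-app-protocol")
print(request.decode("ascii"))
print("expected Sec-WebSocket-Accept:", accept_for(key))
```

If the server agrees, it answers with status 101, echoes the chosen subprotocol in Sec-WebSocket-Protocol and returns the value computed by `accept_for`; from the empty line onward the connection carries websocket frames instead of HTTP.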
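The test examples that ship with the libwebsockets library show the kind of problem websocket aims at and is good at solving. The server example implements a very basic HTTP service plus several ws-based test services, so you can see how one HTTP server supports both HTTP and ws at the same time. Everything test.html uses is served by the server test program: a basic web page service, POST data handling, a ws-based service that pushes an incrementing number, and a mirror service that forwards and broadcasts messages much like a chat room (except that it carries drawing paths rather than text). There are also several client test programs that show how to consume the server's services over ws and how to exchange data with the browser in real time through the mirror service.
That's all for this post. Thanks for reading.
Original: http://www.cnblogs.com/bbqzsl/p/5967966.html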