March 28, 2013  Prashant Gurav

Websocket Handshaking

1) Websocket:

WebSocket is a technology that provides a full-duplex connection between a client and a server. It was designed mainly for web browsers and web servers, but it can be used in any client-server application.

Using it, we set up a full-duplex connection between an EPUB file (the client) and our PHP script (the server).

2) Web Socket Handshake Protocol:

To establish this full-duplex connection, the client sends a WebSocket handshake request and the server replies with a handshake response.


Client handshake request:

GET /mychat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat
Sec-WebSocket-Version: 13
Origin: http://example.com

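Here the client sends a Sec-WebSocket-Key, which is a random 16-byte value encoded in Base64. The server appends the magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 to this key, hashes the resulting string with SHA-1, and returns the Base64-encoded hash as Sec-WebSocket-Accept.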
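As a rough illustration of the client side, the request above could be assembled as follows. This is only a sketch: the function names make_key and build_handshake_request are hypothetical, and a real client would also need to open the TCP connection and read the response.

```python
import base64
import os


def make_key() -> str:
    """Return a fresh Sec-WebSocket-Key: 16 random bytes, Base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def build_handshake_request(host: str, path: str, origin: str,
                            protocol: str = "chat") -> str:
    """Assemble the HTTP upgrade request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {make_key()}",
        f"Sec-WebSocket-Protocol: {protocol}",
        "Sec-WebSocket-Version: 13",
        f"Origin: {origin}",
    ]
    # HTTP header lines end with CRLF; an empty line terminates the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_handshake_request("server.example.com", "/mychat",
                                  "http://example.com"))
```

The key is different on every call, so the printed request will not contain the exact key shown in the example above.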

The server response looks like this:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

How Sec-WebSocket-Key becomes Sec-WebSocket-Accept:

  • The string x3JJHMbDL1EzLkh9GBhXDw==258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (the key followed by the magic string), hashed with SHA-1, gives the hexadecimal value 0x1d29ab734b0c9585240069a6e4e3e91b61da1969.
  • Encoding the raw bytes of the SHA-1 hash with Base64 yields HSmrc0sMlYUkAGmm5OPpG2HaGWk=, which is the Sec-WebSocket-Accept value.
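The two steps above can be sketched in a few lines of Python. The function name compute_accept is only illustrative; our actual server is a PHP script, so treat this as a sketch of the calculation rather than the code we run.

```python
import base64
import hashlib

# Magic string that the server appends to the client's key.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive Sec-WebSocket-Accept from Sec-WebSocket-Key."""
    combined = (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    digest = hashlib.sha1(combined).digest()  # raw 20 bytes, not the hex text
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    print(compute_accept("x3JJHMbDL1EzLkh9GBhXDw=="))
```

Run with the key from the example request, this should reproduce the Sec-WebSocket-Accept value shown in the server response. Note that Base64 is applied to the raw hash bytes, not to the hexadecimal string.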

Once the connection is established, the client and server can send WebSocket data frames, such as text frames, in either direction at the same time. The data is minimally framed.
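To give an idea of how small the framing is, here is a sketch of how a server could wrap a text message in a single frame for the handshake with Sec-WebSocket-Version: 13. The layout follows RFC 6455, which this post does not cover in detail; the function name encode_text_frame is hypothetical. Frames sent by the client must additionally be masked, which this sketch leaves out.

```python
import struct


def encode_text_frame(message: str) -> bytes:
    """Build one unmasked (server-to-client) text frame."""
    payload = message.encode("utf-8")
    header = bytearray([0x81])  # FIN bit set, opcode 0x1 = text frame
    length = len(payload)
    if length < 126:
        header.append(length)                # length fits in 7 bits
    elif length < 65536:
        header.append(126)                   # marker: 16-bit length follows
        header += struct.pack(">H", length)
    else:
        header.append(127)                   # marker: 64-bit length follows
        header += struct.pack(">Q", length)
    return bytes(header) + payload


if __name__ == "__main__":
    print(encode_text_frame("Hello").hex())
```

For a short message the overhead is just two bytes in front of the payload.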

The request and response differ depending on the client environment. Some clients implement an earlier draft of the protocol (draft-hixie-thewebsocketprotocol-76), which uses a different handshake, as in this second example:

Client Request :

GET /directory_path/websocket_server.php HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: www.example.com:3060
Origin: ibooks://4a065c45ca207c1d8f39b7d2bdd2360b
Sec-WebSocket-Key1: 18x 6]8vM;54 *(5:  {   U1]8  z [  8
Sec-WebSocket-Key2: 1_ tx7X d  <  nw  334J702) 7]o}` 0

mJôúÕ¶~

Here the client sends two keys, Sec-WebSocket-Key1 and Sec-WebSocket-Key2. To prove that the handshake was received, the server has to take three pieces of information and combine them to form a response. The first two pieces of information come from the Sec-WebSocket-Key1 and Sec-WebSocket-Key2 fields in the client handshake:

Sec-WebSocket-Key1: 18x 6]8vM;54 *(5:  {   U1]8  z [  8
Sec-WebSocket-Key2: 1_ tx7X d  <  nw  334J702) 7]o}` 0

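For each of these fields, the server takes the digits from the value to obtain a number (in this case 1868545188 and 1733470270 respectively), then divides that number by the number of space characters in the value (in this case 12 and 10) to obtain a 32-bit number (155712099 and 173347027). The third piece of information is the eight bytes that follow the header fields at the end of the client handshake. The server concatenates the two 32-bit numbers, each in big-endian form, with those eight bytes and sends the MD5 sum of the resulting 16 bytes as the body of its response.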
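The calculation for the two keys can be sketched as below. The function names key_number and challenge_response are hypothetical. The final MD5 step follows the draft specification (two big-endian 32-bit numbers followed by the eight bytes sent after the headers), and the eight bytes used in the usage example are only a sample value, not data from our capture.

```python
import hashlib
import struct


def key_number(value: str) -> int:
    """Digits of the key value, read as one number, divided by its space count."""
    digits = "".join(ch for ch in value if ch in "0123456789")
    spaces = value.count(" ")
    if not digits or spaces == 0:
        # Rejecting this case avoids the division by zero discussed below.
        raise ValueError("key must contain digits and at least one space")
    number = int(digits)
    if number % spaces != 0:
        raise ValueError("number is not a multiple of the space count")
    return number // spaces


def challenge_response(key1: str, key2: str, key3: bytes) -> bytes:
    """16-byte response body: MD5 over both key numbers and the 8 trailing bytes."""
    if len(key3) != 8:
        raise ValueError("the third key must be exactly 8 bytes")
    packed = struct.pack(">II", key_number(key1), key_number(key2))
    return hashlib.md5(packed + key3).digest()


if __name__ == "__main__":
    key1 = "18x 6]8vM;54 *(5:  {   U1]8  z [  8"
    key2 = "1_ tx7X d  <  nw  334J702) 7]o}` 0"
    print(key_number(key1), key_number(key2))
    # Sample 8-byte third key (an assumption for illustration only).
    print(challenge_response(key1, key2, b"Tm[K T2u").hex())
```

The key values are passed without the single space that follows the colon in the header line, since that space is not part of the field value.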

Server Response:

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: ibooks://4a065c45ca207c1d8f39b7d2bdd2360b
Sec-WebSocket-Location: ws://www.example.com:3060/directory_path/websocket_server.php

The counting of spaces is intended to make it impossible to smuggle this field into the resource name; making this even harder is the presence of two such fields, and the use of a newline as the only reliable indicator that the end of the key has been reached. The use of random characters interspersed with the spaces and the numbers ensures that the implementor actually looks for spaces and newlines, instead of treating any character like a space, which would again make it easy to smuggle the fields into the path and trick the server.

Finally, dividing by the number of spaces is intended to make sure that even the most naive of implementations will check for spaces, since if the server does not verify that there are some spaces, the server will try to divide by zero, which is usually fatal (a correct handshake will always have at least one space).

We customized our implementation considerably to make it compatible with these devices, and it is now working fine.

The image below helps to illustrate the handshake:

[Figure: Handshaking in websocket…]

3) Data Transfer:

After the handshake, we send requests to the server. Each request is encoded as JSON (see point 4 for details); the server processes it and sends back a response that is also JSON-encoded. We decode the responses before displaying them to the user.
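A minimal sketch of this request and response cycle is shown below. The field names ("action", "params", "status") and both function names are made up for illustration; they are not the message format our PHP server actually uses.

```python
import json


def encode_request(action: str, params: dict) -> str:
    """Client side: serialize a request before sending it over the socket."""
    return json.dumps({"action": action, "params": params})


def handle_request(raw: str) -> str:
    """Server side: decode the request, process it, and encode a reply."""
    request = json.loads(raw)
    reply = {"status": "ok", "action": request["action"]}
    return json.dumps(reply)


if __name__ == "__main__":
    wire_request = encode_request("get_page", {"page": 3})
    wire_reply = handle_request(wire_request)
    # Client side: decode the reply before showing it to the user.
    print(json.loads(wire_reply))
```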

4) What is JSON:

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language.

JSON is built on two structures:

  • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
  • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures.
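As a small illustration of the two structures, the snippet below parses a JSON text in Python, where a JSON object becomes a dict and a JSON array becomes a list. The document content is invented for the example.

```python
import json

# A JSON object (name/value pairs) that contains a JSON array (ordered list).
text = '{"title": "Sample book", "chapters": ["Intro", "Handshake"], "pages": 42}'

data = json.loads(text)

print(type(data).__name__)              # the object maps to a dict
print(type(data["chapters"]).__name__)  # the array maps to a list
print(data["chapters"][1])              # order of the list is preserved
```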
