Introduction
Are you looking to improve real-time communication between clients and servers across millions of active connections? If so, you're not alone. Many companies that rely on agents installed in customer environments struggle to scale server-initiated communication, especially when connections need to be persistent.
At Druva, a data protection solutions company, we face this challenge every day. With millions of clients per customer and thousands of customers, we understand the need for long-running, bi-directional, synchronous, and asynchronous connections.
WebSockets are a communication protocol that enables real-time, bi-directional data transfer between clients and servers over a single connection. They're becoming more popular because, compared with repeated HTTP requests, they can reduce latency and per-message overhead and improve scalability. In this post, we'll explore how to scale millions of active, persistent connections with WebSockets while keeping the server footprint small. Say goodbye to communication headaches and hello to seamless, efficient client-server communication!
Possible Approaches
One option is frequent polling, where clients repeatedly ask the server if there are any updates. However, this quickly becomes inefficient with a large number of active connections: most polls return nothing new, yet each one still costs a full HTTP request and response.
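To make the cost of polling concrete, here is a rough back-of-the-envelope sketch. The client count and poll interval are illustrative assumptions, not Druva figures, and `polling_request_rate` is a hypothetical helper name.

```python
def polling_request_rate(clients: int, poll_interval_s: float) -> float:
    """Average requests per second when every client polls once per interval."""
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    return clients / poll_interval_s


# Assumed numbers for illustration: 1 million clients, each polling every 10 seconds.
rate = polling_request_rate(1_000_000, 10)
print(f"{rate:,.0f} requests/second")  # 100,000 requests/second
```

The server has to absorb this load even when there is nothing to deliver, which is why polling scales poorly as the number of clients grows.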
Another approach is HTTP long-polling, where the server holds each client request open until there's new data to return. While this method works for some use cases, it has its limitations. For instance, it's not full-duplex: the server can only answer a request that the client has already opened, so it cannot initiate communication on its own. Additionally, authorizing every connection and the higher server footprint can pose challenges. We can also run into runtime issues, such as server disconnects being detected only after a delay.
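The hold-until-data behavior of long-polling can be sketched without any network code. This is a minimal simulation, assuming an in-memory `asyncio.Queue` stands in for the server's pending updates; `long_poll`, `demo`, and the `"update-1"` payload are hypothetical names used only for illustration.

```python
import asyncio


async def long_poll(updates: asyncio.Queue, timeout: float):
    """Hold one simulated request open until an update arrives or the timeout expires."""
    try:
        return await asyncio.wait_for(updates.get(), timeout)
    except asyncio.TimeoutError:
        # Nothing arrived in time: a real client would re-issue the request.
        return None


async def demo():
    updates = asyncio.Queue()

    # No update is pending, so the held request times out empty-handed.
    first = await long_poll(updates, timeout=0.05)

    # An update arrives while the next request is being held open.
    asyncio.get_running_loop().call_later(0.01, updates.put_nowait, "update-1")
    second = await long_poll(updates, timeout=1.0)
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Note that every delivered update ends the held request, so the client must open a new one before the server can reach it again. That reconnect cycle is the overhead a persistent connection avoids.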
WebSockets provide an alternative approach to tackling this problem. With WebSockets, either side can send a message at any time, so the server can initiate communication with clients, and the connection remains open until explicitly closed. This makes it a suitable option for real-time, bi-directional, synchronous, and asynchronous communication with millions of active connections.
WebSockets
A WebSocket is a protocol that enables bi-directional, real-time communication between a client and a server over a single, long-lived connection. Unlike traditional HTTP requests that require the client to constantly poll the server for updates, a WebSocket connection allows the server to push data to the client whenever new information becomes available.
Setup
The initial setup of a WebSocket connection involves a handshake protocol between the client and the server. Here are the steps involved in this process:
1. The client sends an HTTP GET request to the server with an "Upgrade" header field set to "websocket" and a "Connection" header field set to "Upgrade". The request also includes a "Sec-WebSocket-Version" header field (13 for the current protocol version) and a "Sec-WebSocket-Key" header field, a Base64-encoded, randomly generated 16-byte value that the server will use to prove that it can speak the WebSocket protocol.
2. If the server supports the WebSocket protocol, it responds with a status code of "101 Switching Protocols". The response includes an "Upgrade" header field set to "websocket", a "Connection" header field set to "Upgrade", and a "Sec-WebSocket-Accept" header field. The server calculates that value by appending a fixed GUID defined in RFC 6455 to the client's "Sec-WebSocket-Key", hashing the result with SHA-1, and Base64-encoding the digest. The server may also include additional header fields in the response.
3. Once the client receives the server's response and confirms that the "Sec-WebSocket-Accept" value matches what it expects, the WebSocket connection is established and both the client and server can begin sending data to each other in real time.
The WebSocket protocol also supports additional options during the initial handshake, including negotiating subprotocols (the "Sec-WebSocket-Protocol" header field) and extensions (the "Sec-WebSocket-Extensions" header field). However, the basic steps outlined above are the core components of the WebSocket handshake protocol.
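The handshake described above can be sketched with Python's standard library alone. This is a minimal illustration rather than a full client: it only builds the request text and computes the expected "Sec-WebSocket-Accept" value. The function names are hypothetical; the GUID and the sample key are the ones published in RFC 6455.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for deriving Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def generate_key() -> str:
    """Client side: a Base64-encoded random 16-byte nonce for Sec-WebSocket-Key."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def compute_accept(key: str) -> str:
    """Server side: SHA-1 of key + GUID, Base64-encoded, for Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Client side: the opening HTTP request, with CRLF line endings."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Sample key from RFC 6455; the client compares the server's header to this value.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client would call `generate_key()` once per connection attempt, send the request, and reject the connection if the server's "Sec-WebSocket-Accept" differs from `compute_accept(key)`. In practice a WebSocket library handles all of this, so the sketch is mainly useful for understanding or debugging the exchange.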