I am building a small chat application for friends, but unsure about how to get information in a timely manner that is not as manual or as rudimentary as forcing a page refresh.
Currently, I am implementing this using simple AJAX, but this has the disadvantage of regularly hitting the server when a short timer elapses.
In researching long/short polling, I ran across HTML5 WebSockets. This seems easy to implement, but I'm not sure if there are some hidden disadvantages. For example, I think WebSockets is only supported by certain browsers. Are there other disadvantages to WebSockets that I should be aware of?
Since it seems like both technologies do the same thing, in what sorts of scenarios would one prefer to use one over the other? More specifically, has HTML5 WebSockets made AJAX long/short polling obsolete, or are there compelling reasons to prefer AJAX over WebSockets?
WebSockets is definitely the future now.
Long polling is a dirty workaround to prevent creating connections for each request like AJAX does - but long polling was created when WebSockets didn't exist. Now due to WebSockets, long polling is going away no more.
WebRTC allows for peer-to-peer communication.
I recommend learning WebSockets.
Comparison:
of different communication techniques on the web
AJAX - request → response. Creates a connection to the server, sends request headers with optional data, gets a response from the server, and closes the connection. Supported in all major browsers.
Long poll - request → wait → response. Creates a connection to the server like AJAX does, but maintains a keep-alive connection open for some time (not long though). During connection, the open client can receive data from the server. The client has to reconnect periodically after the connection is closed, due to timeouts or data eof. On server side it is still treated like an HTTP request, same as AJAX, except the answer on request will happen now or some time in the future, defined by the application logic. support chart (full) | wikipedia
WebSockets - client ↔ server. Create a TCP connection to the server, and keep it open as long as needed. The server or client can easily close the connection. The client goes through an HTTP compatible handshake process. If it succeeds, then the server and client can exchange data in both directions at any time. It is efficient if the application requires frequent data exchange in both ways. WebSockets do have data framing that includes masking for each message sent from client to server, so data is simply encrypted. support chart (very good) | wikipedia
WebRTC - peer ↔ peer. Transport to establish communication between clients and is transport-agnostic, so it can use UDP, TCP or even more abstract layers. This is generally used for high volume data transfer, such as video/audio streaming, where reliability is secondary and a few frames or reduction in quality progression can be sacrificed in favour of response time and, at least, some data transfer. Both sides (peers) can push data to each other independently. While it can be used totally independent from any centralised servers, it still requires some way of exchanging endPoints data, where in most cases developers still use centralised servers to "link" peers. This is required only to exchange essential data for establishing a connection, after which a centralised server is not required. support chart (medium) | wikipedia
Server-Sent Events - client ← server. Client establishes persistent and long-term connection to server. Only the server can send data to a client. If the client wants to send data to the server, it would require the use of another technology/protocol to do so. This protocol is HTTP compatible and simple to implement in most server-side platforms. This is a preferable protocol to be used instead of Long Polling. support chart (good, except IE) | wikipedia
Advantages:
The main advantage of WebSockets server-side, is that it is not an HTTP request (after handshake), but a proper message based communication protocol. This enables you to achieve huge performance and architecture advantages. For example, in node.js, you can share the same memory for different socket connections, so they can each access shared variables. Therefore, you don't need to use a database as an exchange point in the middle (like with AJAX or Long Polling with a language like PHP). You can store data in RAM, or even republish between sockets straight away.
Security considerations
People are often concerned about the security of WebSockets. The reality is that it makes little difference or even puts WebSockets as better option. First of all, with AJAX, there is a higher chance of MITM, as each request is a new TCP connection that is traversing through internet infrastructure. With WebSockets, once it's connected it is far more challenging to intercept in between, with additionally enforced frame masking when data is streamed from client to server as well as additional compression, which requires more effort to probe data. All modern protocols support both: HTTP and HTTPS (encrypted).
P.S.
Remember that WebSockets generally have a very different approach of logic for networking, more like real-time games had all this time, and not like http.
One contending technology you've omitted is Server-Sent Events / Event Source. What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet? has a good discussion of all of these. Keep in mind that some of these are easier than others to integrate with on the server side.
For chat applications or any other application that is in constant conversation with the server, WebSockets
are the best option. However, you can only use WebSockets
with a server that supports them, so that may limit your ability to use them if you cannot install the required libraries. In which case, you would need to use Long Polling
to obtain similar functionality.
https://i.stack.imgur.com/0XFxb.png
XHR polling A Request is answered when the event occurs (could be straight away, or after a delay). Subsequent requests will need to made to receive further events. The browser makes an asynchronous request of the server, which may wait for data to be available before responding. The response can contain encoded data (typically XML or JSON) or Javascript to be executed by the client. At the end of the processing of the response, the browser creates and sends another XHR, to await the next event. Thus the browser always keeps a request outstanding with the server, to be answered as each event occurs. Wikipedia
Server Sent Events Client sends request to server. Server sends new data to webpage at any time. Traditionally, a web page has to send a request to the server to receive new data; that is, the page requests data from the server. With server-sent events, it's possible for a server to send new data to a web page at any time, by pushing messages to the web page. These incoming messages can be treated as Events + data inside the web page. Mozilla
WebSockets After the initial handshake (via HTTP protocol). Communication is done bidirectionally using the WebSocket protocol. The handshake starts with an HTTP request/response, allowing servers to handle HTTP connections as well as WebSocket connections on the same port. Once the connection is established, communication switches to a bidirectional binary protocol which does not conform to the HTTP protocol. Wikipedia
Success story sharing