Saturday, June 11, 2011

HTTP Long Polling (aka Comet) with Nginx

Say you need to have live updates on your web site, like receiving chat messages. In this article, I will explain how to use the HTTP push module for the Nginx web server to do this. There are several other ways to accomplish it, but most of them have serious drawbacks.

Short Polling
Probably the easiest approach to implement is simple polling - just fire off an AJAX request every 5, 10, or 15 seconds to an action on your server that checks for new messages, and if there are any, they are sent back as the response. Either send the data back as JSON and have Javascript on the client end take care of what to do with it, or send back Javascript instructions for what to do with the data (a sketch of such an action follows this list). This has two big issues that really preclude it from being used on a large scale:
  1. Clients don't get real-time updates. The delay before seeing a new message averages half of your poll interval.
  2. A lot of server and client overhead. Every 5, 10, or 15 seconds an entirely new HTTP connection is established, which uses up resources on the server once you have more than just a few people. And the server resources spent executing the check for messages in your Rails code every few seconds add up.
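
For reference, the polled action can be about as trivial as this rough sketch (the Message model, current_user helper, and since parameter are made-up names, not from any real app):

# app/controllers/messages_controller.rb - sketch only, names are made up
class MessagesController < ApplicationController
  # Hit by the client every 5, 10, or 15 seconds.
  def poll
    # Return anything newer than the timestamp the client last saw.
    since = Time.at(params[:since].to_i)
    render :json => current_user.messages.where("created_at > ?", since)
  end
end
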
Web Sockets
Maybe in 4 years this will be a real option, but right now (2011) it isn't. The standard is not yet finalized. Firefox 4 and Chrome support it, but it's disabled in Firefox due to security concerns, and IE doesn't support it at all yet.

Long Poll (aka Comet) As A Controller Action
Another pretty simple thing to set up is long polling. With this, the client issues an AJAX request to an action, and instead of returning immediately when there is no data, the server sleeps for a second or two and checks for messages again, looping until a message shows up or a timeout period is hit. On the client end, when a response is received, it's processed, and then a connection is immediately re-established to get more messages. This is more responsive than short polling (see the sketch after this list for what such an action might look like), but it has two huge problems:
  1. Server resources for open connections. For each client that connects, you need to hold an HTTP connection open. With a framework like Ruby on Rails served by Passenger or FastCGI, that means a separate process for each connection, using up lots of memory. This severely limits the number of clients that can be on your server at once. Using JRuby and a Java server can help here, but you'll still run into problems.
  2. Server resources for checking for messages every 1 or 2 seconds. If every open connection is checking the database for new messages this often, lots of clients connected at once can really eat up the processor on your server.
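
For comparison, here is a rough sketch of what such a long-polling action might look like in a Rails controller. The model and parameter names are made up, and the timeout has to stay under your web server's request timeout.

# Sketch of a long-polling controller action - names are illustrative.
class MessagesController < ApplicationController
  def long_poll
    since = Time.at(params[:since].to_i)
    # Check for new messages once a second, for up to 30 seconds.
    30.times do
      messages = current_user.messages.where("created_at > ?", since)
      if messages.any?
        render :json => messages
        return
      end
      sleep 1  # the whole Rails process is tied up while we wait
    end
    # Nothing arrived before the timeout; the client will just reconnect.
    head :no_content
  end
end
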
Nginx Push Module for Long Polling
While researching a better way to do this for a web site that I work on where traffic is ramping up, I stumbled upon the HTTP Push Module for Nginx. This is a module for the Nginx web server that acts as a message queue. Clients wishing to receive messages open an HTTP connection to a path on your server configured for "subscribing" and leave the connection open. Anyone wishing to send a message then issues an HTTP POST to another path on the server configured for publishing, and the listening clients receive the data as the response to their open connection. Any number of different channels can be used, so you can have a channel for each user of your system. Messages are queued up, so the client doesn't have to actually be listening at the moment of publishing; it can check in at a later point and still get the message.

This works similarly to the long polling I described above, except that you don't write the code that checks for messages, and the open connections are all handled by Nginx - no Ruby. Nginx is extremely efficient at handling tons of open connections at once. I was able to test 5000 (yes, that's five THOUSAND) concurrent clients listening for messages, with a message published to EACH client once every minute, on an Amazon EC2 micro instance, and the server didn't even break a sweat. The entire server (not just Nginx, but every process on the system) was only using 220 megs of memory and averaging 1 or 2% CPU usage.

Server Configuration
To install Nginx with the push module, I did these steps on an Ubuntu server:
  1. sudo apt-get install libpcre3 libpcre3-dev
  2. wget http://nginx.org/download/nginx-0.8.54.tar.gz
  3. wget http://pushmodule.slact.net/downloads/nginx_http_push_module-0.692.tar.gz
  4. tar -xzf nginx-0.8.54.tar.gz
  5. tar -xzf nginx_http_push_module-0.692.tar.gz
  6. cd nginx-0.8.54
  7. ./configure --prefix=/usr/local/nginx-0.8.54 --add-module=/home/ubuntu/nginx_http_push_module-0.692
  8. make
  9. sudo make install
  10. sudo ln -s /usr/local/nginx-0.8.54 /usr/local/nginx
  11. sudo ln -s /usr/local/nginx/sbin/nginx /usr/local/sbin/nginx
  12. Add /etc/init.d/nginx from http://wiki.nginx.org/Nginx-init-ubuntu
  13. sudo chmod a+x /etc/init.d/nginx
  14. sudo /usr/sbin/update-rc.d -f nginx defaults
  15. sudo mkdir /var/log/nginx
Note that to use this, you have to actually re-compile Nginx with the push module. If you're using the Passenger module as well, you'll have to compile that in too. I have to confess that I'm just going to use this Nginx server as a load balancer and long poll server, so it will forward actual application requests (other than long poll) on to another server that will execute the Rails code. So I don't know off the top of my head how to compile the Passenger module in as well. Maybe passenger-install-nginx-module has an option to include other modules?

That's not all, though - you'll also need to configure the endpoints for subscribing and publishing. Add the following to your nginx.conf file, inside the server block:
# internal publish endpoint (keep it private / protected)
location /publish {
  set $push_channel_id $arg_id;   # /?id=239aff3 or somesuch
  push_publisher;
  push_store_messages on;         # enable message queueing
  push_message_timeout 2h;        # messages expire after 2 hours, set to 0 to never expire
  push_message_buffer_length 10;  # store 10 messages
}

# public long-polling endpoint
location /subscribe {
  push_subscriber;
  # how multiple listener requests to the same channel id are handled
  # - last: only the most recent listener request is kept, 409 for others.
  # - first: only the oldest listener request is kept, 409 for others.
  # - broadcast: any number of listener requests may be long-polling.
  push_subscriber_concurrency broadcast;
  set $push_channel_id $arg_id;
  default_type text/plain;
}
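
Once Nginx is running with this configuration, something like the following short Ruby script should let you sanity-check both endpoints before writing any client code. It assumes Nginx is listening on localhost port 80; the channel ID "test" is arbitrary.

require 'net/http'

channel = "test"
http    = Net::HTTP.new("localhost", 80)

# Publish a message. The push module answers with a short XML summary
# of how many messages are queued and how many subscribers received it.
res = http.post("/publish?id=#{channel}", "hello from ruby")
puts "publish:   #{res.code} #{res.body}"

# Subscribe to the same channel. Because a message is already queued,
# this returns immediately; on an empty channel it would hang until
# something is published or the request times out.
res = http.get("/subscribe?id=#{channel}")
puts "subscribe: #{res.code} #{res.body}"
puts "Last-Modified: #{res['Last-Modified']}, Etag: #{res['Etag']}"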

If you really want to have a ton of clients connected, you'll need to edit some system settings to allow lots of open files. First, edit /etc/security/limits.conf and add:
* soft nofile 50000
* hard nofile 50000

Then edit /etc/sysctl.conf and add:
fs.file-max = 100000

Client Code
So that should be it for configuring the server. Here is an example page with full Javascript, using Prototype, that listens in the background for messages and displays them with a timestamp. Enter a message and click Send Message, and you should see that message show up (as long as you serve this file from a server with the Nginx HTTP push module configured as I describe above).

<html>
<head>
<script src="javascripts/prototype.js" type="text/javascript"></script>
<script type="text/javascript">
// Code to listen for messages from an Nginx server with
// the HTTP push module installed, and publish messages
// to this server.
// See http://rails.brentsowers.com/2011/06/http-long-polling-aka-comet-with-nginx.html
// for details on how to set the server up.
// Note that for everything to work properly, the Nginx
// server has to be the same server that this file is
// on. If you set up a separate server to handle this,
// the Javascript won't quite work as expected because
// of the Same Origin Policy.

// Just use a generic channel ID here, this can be any
// text string, for a real system you'll want this to be
// some sort of identifier for the client.
var channelId = "asdf";

// Default initial values
var etag=0;
var lm='Thu, 1 Jan 1970 00:00:00 GMT';

function doRequest() {
  new Ajax.Request('/subscribe?id=' + channelId, {
    method: 'GET',
    onSuccess: handleResponse,
    onFailure: handleFailure,
    // Custom HTTP headers have to be sent, based on
    // the HTTP response from the previous request.
    // This tells the server at which point to look
    // for messages after. If these aren't included,
    // the server will just return the first message
    // in the queue.
    requestHeaders: [
      'If-None-Match', etag,
      'If-Modified-Since', lm
    ]
  });
}

function handleResponse(response) {
  var txt = response.responseText.stripScripts().stripTags();
  addMessage(txt);
  // Read the headers from the server response. The
  // response will contain a Last-Modified header that
  // is the date/time of the message we just received.
  // This time will be specified on the next request, so
  // we get messages after this time. There is no
  // acknowledgement of messages; messages stay in the
  // queue on the server until the limits set in the
  // server config are met.
  etag = response.getHeader("Etag") || 0;
  lm = response.getHeader("Last-Modified") ||
       'Thu, 1 Jan 1970 00:00:00 GMT';
  doRequest();
}

function handleFailure(response) {
  // Called when the subscribe request fails or times out.
  addMessage("Error: got a " + response.status + " response from the server");
}

function publishMessage() {
  var txt = $F('pubtext').stripScripts().stripTags();
  if (txt.length == 0) {
    alert("You must enter text to publish");
  } else {
    // The response is XML with how many messages are
    // queued up, no point in looking at it here.
    new Ajax.Request('/publish?id=' + channelId, {
      method: 'POST',
      postBody: txt
    });
  }
}

function addMessage(msg) {
  var d = new Date();
  msg = d.toString() + ": " + msg;
  $('data').insert(msg + "<br />");
}
</script>
</head>
<body onload="doRequest()">
Messages:
<div id="data">
</div>

<input type="text" name="pubtext" id="pubtext" />
<input type="button" value="Send Message" onclick="publishMessage()" />
</body>
</html>


At some point I'll set up an example page on my server so you can actually see this in action.

Integration with your web app
This is a simple example, but you could use the same mechanism in a complex system, as I am. When one part of your app wants to send a message to a user, simply issue an HTTP POST request to the long poll server from within your controller action (or rake task, or whatever else), using net/http, HTTParty, or any other Ruby HTTP library. As long as the long poll server is on the same network as your app server, the response time for this will be extremely fast.
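
For example, with plain net/http a minimal helper could look something like this (PUSH_HOST, PUSH_PORT, and the method name are placeholders, not from my actual app):

require 'net/http'

PUSH_HOST = "localhost"  # the Nginx long poll server (placeholder)
PUSH_PORT = 80

# Call this from a controller action, a rake task, or anywhere else in
# the app that needs to notify a listening client.
def push_message_to(channel_id, text)
  Net::HTTP.start(PUSH_HOST, PUSH_PORT) do |http|
    http.post("/publish?id=#{channel_id}", text)
  end
end

push_message_to("asdf", "you have a new chat message")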

One big downside to the HTTP push module is that there is no authentication out of the box, so theoretically anyone could listen for messages to a user (by default, any number of clients can be listening for messages on the same channel). A way to get around this is to dynamically generate a random channel ID for each subscriber every time they log in, and store the mapping of your user ID to the current random channel ID. You'll also need to set the push_authorized_channels_only setting to on (see the module's documentation for this setting), so that a subscriber cannot create a channel. Then, when the user authenticates, issue a POST to the publish endpoint to create the channel. I haven't implemented this yet, but I know it can be done.
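
Here is a rough sketch of what that might look like at login time, reusing the hypothetical push_message_to helper from the example above (the push_channel_id column is also made up):

require 'securerandom'

# Hypothetical login hook. Give the user an unguessable channel ID and
# create the channel on the push server by publishing to it, so that
# subscribers never create channels themselves (that is what
# push_authorized_channels_only on enforces).
def assign_push_channel(user)
  channel_id = SecureRandom.hex(16)
  user.update_attribute(:push_channel_id, channel_id)
  push_message_to(channel_id, "channel created")
  channel_id
end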

Useful Links