Beej's Bit Bucket

 ⚡ Tech and Programming Fun

WebSockets: Writing the Server

2015-11-25
The websockets logo. I think.

We're going to do a little three-part introduction to WebSockets, which provide a way to communicate back and forth with the web server without all the overhead of a standard HTTP connection.

And, in the course of it, we'll be writing a simple chat server and client.

We're going to be piggybacking on the previous blog entry, on writing A NodeJS Webserver. In particular, that webserver will be used for generic webserving on this project, and will also be what the WebSockets run on.

So you might as well start there, if you haven't already.

Part 0: A NodeJS Webserver

Part 1: Intro to WebSockets, conceptually

Part 2: Writing the client side chat code for the browser in JavaScript

Part 3: Writing the server-side chat code in NodeJS

The Server Side

All righty! At this point we have a couple of components already written. We have the NodeJS webserver from Part 0 and the client-side chat code from Part 2.

What's missing is the WebSockets server portion of the entire affair.

There are many WebSockets implementations for NodeJS. For this we're going to be using the websocket package.

Get the Source For All This

We're going to start a little bit backward, and you should download the complete source for this project, in beej-websockets-demo.zip.

Unpack the zip file, and switch to that directory:

$ unzip beej-websockets-demo.zip
$ cd beej-websockets-demo

Installing the NodeJS WebSockets library

If you have NodeJS installed, you should already have npm installed. (If you don't have NodeJS installed, install NodeJS.)

(And if you don't have npm, install npm, as well.)

Now, still in the beej-websockets-demo directory from above, run this:

Shell
$ npm install websocket

It should crunch for a while, and then say everything is beautiful and wonderful and butterflies.

And then we'll be ready to go!

This is exactly what it will be like.

Starting the Server

Assuming you're in the beej-websockets-demo directory above, run the main server script, and it should tell you it's running:

Shell
$ node server.js
Listening on port 3490

Running Clients

The server has set up a normal webserver on port 3490 that you can connect to. So point your browser to URL http://localhost:3490/. And then open another window and do the same thing. And say hi:

International, global, interplanetary, intergalactic chat.
On localhost.

You're going to notice a pile of output from the server, as well, logging what's going on (file paths snipped for brevity):

Listening on port 3490
Sending file: /.../beej-websockets-demo/index.html
Sending file: /.../beej-websockets-demo/client.js
Not Found: /.../beej-websockets-demo/favicon.ico
Not Found: /.../beej-websockets-demo/favicon.ico
Websocket: accepted connection from ::1
Websocket: message: ::1: chat-join
Sending file: /.../beej-websockets-demo/index.html
Sending file: /.../beej-websockets-demo/client.js
Websocket: accepted connection from ::1
Websocket: message: ::1: chat-join
Websocket: message: ::1: chat-message
Websocket: closed: ::1: 1001: Remote peer is going away
Websocket: closed: ::1: 1001: Remote peer is going away

You can see the browsers looking for the favicon.ico file to show as the page icon in the task bar. I didn't include one, so they 404.

The remote IP address is showing up in the logs as "::1". What's that? Turns out it's the equivalent of 127.0.0.1 for IPv6, also known as localhost.
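If you want to poke at those address strings yourself, Node's built-in net module can classify them. This is just a quick check, not part of the chat server:

```javascript
// Classify address strings with Node's built-in net module.
const net = require("net");

console.log(net.isIP("::1"));        // 6 (IPv6 loopback)
console.log(net.isIP("127.0.0.1"));  // 4 (IPv4 loopback)
console.log(net.isIP("localhost"));  // 0 (a hostname, not an IP address)
```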

Stopping the Server

Just hit ^C (that's CTRL-C for the uninitiated) in the window running the server. It'll quit in a fiery blaze of Viking glory.

The Server, Overall

We're going to make use of the exact same HTTP server from the zeroth part of this series. It can be run standalone (as we did then), or it can be included by another NodeJS script and used more like a library (like we're going to do now).

There's another script in the beej-websockets-demo directory called server.js. This is the main one to run. It sets up the HTTP server, and then starts up the WebSockets server on top of it.

NodeJS, server.js
#!/usr/bin/env node

/**
 * This starts the HTTP server, and then starts the Websocket server on
 * top of that.
 */

"use strict";

var httpServer = require("./httpserver.js"),
    wsServer = require("./wsserver.js");

var server = httpServer.start();
wsServer.start(server);

The httpServer is just a reference to the standard HTTP server code we made before. Except this time we've loaded it as a NodeJS module with require(). It's not doing anything at first, so we call .start() to fire it off, and save the returned server instance.

Then we take that server and pass it in to the WebSockets server with wsServer.start(). It's not the best-architected code in the world, but it's good for this tutorial. At least we got to reuse the HTTP server without modification, right?

What's that #!/usr/bin/env node at the top of the file? That's the shebang that tells the shell (likely Bash or some other POSIX shell) what command is supposed to run all the following text. In this case, node should be the one. The upshot is that you can just run it outright:

Unix shell
$ chmod 755 server.js    # make executable
$ ./server.js

And BAM! It's running.

This works great at a Unix command line on any Unix variant, including OSX.

Windows, not so much. As far as I know, Windows users have to run it explicitly:

Windows shell
C:\beej-websockets-demo> node server.js

The Server, WebSockets Madness

You'll notice the server.js code, above, pulled in both httpserver.js (which we wrote earlier) and wsserver.js. It's high time we talked about the second of these.

This WebSockets implementation piggybacks on top of an existing HTTP server. So we first launch an HTTP server, and then we say, "Hey, WebSockets Server: attach to this HTTP server we already made."

This way they're both running in the same process, which can be convenient. No reason you couldn't write it standalone and run them in different processes, but I didn't do it that way because... I'm sorry, I don't have a compelling reason, but that's what I did.

When the start() function is called, it indirectly calls the startWSServer() function. This takes the HTTP server as an argument. We'll make a new WebSocket server (again from the websocket package), pass it the HTTP server to attach to, and stick some listeners on request-connection and new-connection events:

NodeJS, wsserver.js
/**
 * Start the websockets server, attached to this HTTP server
 */
function startWSServer(httpServer) {
    wsServer = new websocket.server({
        httpServer: httpServer
    });

    wsServer.on('request', onServerRequest);
    wsServer.on('connect', onServerConnect);
    //wsServer.on('close', onServerClose); // unused
}

The connection is a two step process. First of all, a request event comes in; its handler can accept or reject the connection. If accepted and the connection is set up successfully, a connect event is soon to follow, possibly delivered by Santa, depending on the season. He's in the RFC somewhere, isn't he?
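To see the ordering without any real network traffic, here's a tiny simulation built on Node's EventEmitter. Everything in it (fakeServer, simulateIncoming(), the fake request object) is made up for illustration; it only imitates the request-then-connect sequence and is not the websocket package itself.

```javascript
// A stand-in for the WebSocket server, built on Node's EventEmitter.
// It only imitates the request -> accept -> connect ordering.
const EventEmitter = require("events");

const fakeServer = new EventEmitter();
const eventLog = [];

fakeServer.on("request", function (request) {
    eventLog.push("request from " + request.host);

    if (request.host === "localhost:3490") {
        request.accept();
    } else {
        request.reject();
    }
});

fakeServer.on("connect", function (connection) {
    eventLog.push("connect: " + connection.remoteAddress);
});

// Hypothetical helper that plays the part of an incoming handshake.
function simulateIncoming(host, remoteAddress) {
    fakeServer.emit("request", {
        host: host,
        accept: function () {
            // Only accepted requests ever produce a connect event
            fakeServer.emit("connect", { remoteAddress: remoteAddress });
        },
        reject: function () {
            eventLog.push("rejected " + host);
        }
    });
}

simulateIncoming("localhost:3490", "::1");
simulateIncoming("evil.example", "203.0.113.9");

console.log(eventLog);
```

The accepted request is followed by a connect entry in the log; the rejected one never gets that far.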

Chat Server Logic Overview

Not much to it.

When the server gets a new connection, it adds the connection to its connection list. When a connection closes, it removes the connection from its connection list. When data arrives from a connection, it sends that data to all connections (the sender included).

This actually led to an interesting diversion. I wanted to be able to find the connection in the connection list if the connection was closed (so I could remove it). I wanted to do this quickly and not have to do a linear search through the list. So the best way was to store the connection list as an Object, and use the connections as keys (properties) to look them up in that object.

But you can't use the connection itself as a property, because property names are converted to strings, and this means every connection would be "[object Object]". They're all the same when converted to strings.
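You can watch that collision happen with a couple of throwaway objects standing in for connections:

```javascript
// Two distinct objects, used as property names on a plain object.
var a = { name: "first connection" };
var b = { name: "second connection" };

var list = {};
list[a] = "A";
list[b] = "B";   // overwrites the entry above: both keys become the same string

console.log(Object.keys(list));   // [ '[object Object]' ]
console.log(list[a]);             // B
```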

So what to do? I needed a way to uniquely identify the connection as a string so I could refer to it in the connection list. I could just make up an ID, and then add it to the connection as some property:

NodeJS, wsserver.js
var globalChatID = 0;
var connectionList = {};

function onNewConnection(connection) {
    var id = globalChatID++;
    connection.beejsChatID = id;
    connectionList[id] = connection;
}

function onClose() {
    var id = this.beejsChatID; // this is the connection
    delete connectionList[id];
}

but now I've polluted the connection with beejsChatID, and, while technically allowed, that object doesn't really belong to me. I feel it's Wrong to mess with it in that way.

The safest way I can think of is to make a property name that's a UUID to very-probably guarantee there will not be a collision, e.g.:

NodeJS
// Generated with uuidgen on the command line:
var BEEJS_CHAT_ID = '6c2efc56-7e76-4f52-83ab-39888e9aa07d';

// Then NOT this:
connection.beejsChatID = id; // NOT this

// But this, instead:
connection[BEEJS_CHAT_ID] = id;

But is there a better way? Does the connection object already expose some uniquely identifying information we can use? Let's look at the documentation.

There's a remoteAddress property... but is that really unique? Maybe. Maybe not. The docs don't say.

But that gives me an idea. Turns out every TCP connection on the entire Internet can be uniquely identified by the quad (local port, local IP address, remote port, remote IP address). Can we get that information?

The connection object has a socket property that gives access to the underlying net.Socket NodeJS object that is the actual connection.

And look there, that object has properties for localAddress, localPort, remoteAddress, and remotePort! Just what we need!

Now since all WebSockets are connecting to this server on a particular port, we know that the local port and address (that is, the server port and address) will always be the same for all connections. Only the remote port and address will differ.

With all this in mind, we can generate a unique key for any particular connection and use that to refer to it in the connection list. We'll just make the key the string concatenation of the address and port:

NodeJS, wsserver.js
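/**
 * Get the object key for a connection.
 */
function getConnectionKey(connection) {
    var socket = connection.socket; // The underlying socket

    // The ":" keeps the address and port from running together, e.g.
    // 10.0.0.1 port 23456 vs. 10.0.0.12 port 3456
    return socket.remoteAddress + ":" + socket.remotePort;
}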

And now we can get the property key ID like this, instead of using the hacked-in property:

NodeJS
var connectionList = {};

function onNewConnection(connection) {
    var id = getConnectionKey(connection);
    connectionList[id] = connection;
}

function onClose() {
    var id = getConnectionKey(this); // this is the connection
    delete connectionList[id];
}

Much cleaner and happier.
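Here's the same add-and-remove dance run against fake connections, so you can try it without a server. The fakeConnection() helper is invented for this sketch, and connectionKey() is a stand-in that joins address and port with a ":" (an assumption on my part, so the digits of one can't bleed into the other).

```javascript
// Fake connections: only the fields the key function reads.
function fakeConnection(address, port) {
    return { socket: { remoteAddress: address, remotePort: port } };
}

// Assumption: a ":" separator between address and port.
function connectionKey(connection) {
    var socket = connection.socket;
    return socket.remoteAddress + ":" + socket.remotePort;
}

var connectionList = {};

// Same remote address, different remote ports: two browser windows
var first = fakeConnection("192.168.1.2", 50000);
var second = fakeConnection("192.168.1.2", 50001);

connectionList[connectionKey(first)] = first;
connectionList[connectionKey(second)] = second;
console.log(Object.keys(connectionList).length);   // 2

// Closing a connection removes just that entry, no searching needed
delete connectionList[connectionKey(first)];
console.log(Object.keys(connectionList));          // [ '192.168.1.2:50001' ]
```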

Handling Incoming Connections

The server has a lot of choices about what it can do when it gets an incoming connection. For instance, it can look at the remote host and make sure it wants to allow that host to connect. (Often it will only want web pages it has served to connect back, not just anyone anywhere.)

Or it can reject a connection because the client wants to use an unsupported protocol.

But if it likes everything it sees, it can reply to the client, "Yes, let's chat, and let's use this particular protocol."

Because the server has this opportunity to reject an incoming connection, setting up the connection is actually a two-event process on the server: first a request event fires, at which point the server can accept or reject, and then a connect event fires which signifies the connection has been established.

(There's also a general error event for when something goes hideously awry.)

So you can fire up the WebSocket server like this, passing in a reference to the HTTP server that we should attach to:

NodeJS, wsserver.js
/**
 * Start the websockets server, attached to this HTTP server
 */
function startWSServer(httpServer) {
    wsServer = new websocket.server({
        httpServer: httpServer
    });

    wsServer.on('request', onServerRequest);
    wsServer.on('connect', onServerConnect);
    //wsServer.on('close', onServerClose);
}

That just makes the new server, attaches it to the HTTP server, and sets up the event handlers. onServerRequest() will be called when a brand new incoming connection arrives to be accepted (or rejected), and onServerConnect() will be called when the connection is accepted and fully established.

You might see we're ignoring the close event. This fires whenever a WebSocket connection closes, but each individual connection also fires its own close event, so we'll handle our closing duties there, instead.

onServerRequest()—New Incoming Connections

Like I mentioned, we're going to accept or reject connections based on two things: (1) URL of the page the connection is coming from and (2) if the client supports one of our supported protocols. You can actually reject based on anything you'd like: time of day, phase of moon, whether or not your kid is playing hooky, whatever.

But you should reject unknown protocols simply because you won't know how to speak them.

And you (most probably) should reject connections that were served from other websites for server load and security reasons.

Firstly, let's see how we whitelist connections. The host will come to us in the form "hostname" or "hostname:port". So let's put together a quick function to whitelist every domain you will be serving this from on your webserver. Since I'm testing on localhost, which is also a machine called goat, which is 192.168.1.2 on my LAN, and I'm running the server on port 3490, I whitelist all that. If you have a production machine that will be serving the files, e.g. example.com, whitelist it here, too.

There's an argument to be made that you shouldn't ship with localhost and all your test entries in the whitelist. One could say that since every computer in the world is localhost, a spoofer could serve the HTML from their own computer, and then when it connected to the WebSocket server, it would pass the whitelist test.

We avoid that in our client code by declaring the WebSocket server to be on the same host that the HTML came from. But, of course, that client code can be modified.

But a counter-argument is that the whitelist test is inherently spoofable, since an attacker can spoof the origin in the WebSockets HTTP request with custom-written code, anyway. And that preventing this kind of spoofing isn't what the whitelist test is for.

It's actually there to keep innocent third parties from falling victim to cross-site attacks, where a malicious page gets the victim's browser to open the WebSocket and then uses it as if it were the victim. This is known as Cross-Site WebSocket Hijacking.

NodeJS, wsserver.js
/**
 * Returns true if a particular host is in the whitelist
 */
function isWhitelisted(host) {
    // This should contain the URL of the site you're hosting the server
    var whitelist = [
        "localhost",
        "localhost:3490",
        "goat:3490",
        "192.168.1.2:3490",
        "example.com"
    ];

    // Return true if we're in the whitelist
    return whitelist.indexOf(host) != -1;
}

And now let's write the rest of the handler that verifies and accepts the connection:

NodeJS, wsserver.js
/**
 * Server: Handle new connection requests
 */
function onServerRequest(request) {
    if (!isWhitelisted(request.host)) {
        request.reject(403, "Forbidden");
        console.log("Websocket: denying connection from " + request.host);
        return;
    }

    // Make sure the protocol matches
    if (request.requestedProtocols[0] != 'beej-chat-protocol') {
        request.reject(400, "Unknown protocol");
        console.log("Websocket: unknown protocol");
        return;
    }

    // Ok, we're golden. Accept and specify the protocol.
    request.accept('beej-chat-protocol', request.origin);
    console.log("Websocket: accepted connection from " + request.remoteAddress);

    // at this point, onServerConnect() will be called
}

A couple quick notes:

You can see we're only looking at requestedProtocols[0] and not looping through them all. Technically, we should loop through them all to see if there's a match with our desired protocol (beej-chat-protocol). But in this case, we wrote the client, and we know the client is only going to request a single protocol, so we're only checking the first one.

Secondly, the 403 and 400 codes in the reject() call are HTTP client error codes. Use whichever ones are appropriate for your error handling.

If the handler ends up calling accept(), then the connect event handler, onServerConnect(), will be called.
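On the first note above: if you did want to loop through everything the client asked for, it might look something like this. It's a sketch, and chooseProtocol() and SUPPORTED_PROTOCOLS are names I made up for it; the idea is that the result would be passed to accept(), or the request rejected if it comes back null.

```javascript
// Protocols this server knows how to speak (hypothetical list).
var SUPPORTED_PROTOCOLS = ["beej-chat-protocol"];

// Returns the first requested protocol we support, or null if none match.
function chooseProtocol(requestedProtocols) {
    for (var i = 0; i < requestedProtocols.length; i++) {
        if (SUPPORTED_PROTOCOLS.indexOf(requestedProtocols[i]) !== -1) {
            return requestedProtocols[i];
        }
    }

    return null;
}

console.log(chooseProtocol(["some-other-protocol", "beej-chat-protocol"]));
console.log(chooseProtocol(["some-other-protocol"]));
```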

onServerConnect()—We Finally Got One!

Once the connection is accept()ed and all set up properly, we get the connect event, and its handler is called. In our case, we want to add the new connection to our connection list, and set up event handlers on the connection for message, error, and close:

NodeJS, wsserver.js
/**
 * Server: Handle new connections (after being accepted in onServerRequest())
 */
function onServerConnect(connection) {
    var k = getConnectionKey(connection);

    connectionList[k] = {
        'connection': connection
    };

    connection.on('message', onMessage);
    connection.on('error', onError);
    connection.on('close', onClose);
}

As you can see, we're actually storing another object in the connection list with a property connection (that holds the actual connection object). We've done this since later we're going to use the same object to also hold the username associated with the connection.

Handling Events on the Connections

There's a new connection object for each incoming connection to the WebSockets server, and, as you saw, we store them all in the connectionList.

And each one gets event handlers attached to it for message, error, and close events.

So let's tackle those connection handlers!

Connection onError() Handler

All this guy does is print out the error that has occurred. It doesn't close the connection. (If the error is lethal, the connection will close, and we'll get a close event later.)

NodeJS, wsserver.js
/**
 * Connection: Handle errors
 */
function onError(error) {
    console.log("Websocket: error: " + this.remoteAddress + ": " + error);
}

Connection onClose() Handler

The connection's close event handler just needs to do a few things.

  1. Get the username associated with the closed connection.
  2. Delete the connection from the connection list.
  3. Make a chat-leave packet.
  4. Broadcast that packet to everyone.

NodeJS, wsserver.js
/**
 * Connection: Handle close
 */
function onClose(reason, description) {
    var k = getConnectionKey(this);

    // Get the username so we can tell everyone else
    var username = connectionList[k].username;

    // Remove this connection from the list
    delete connectionList[k];

    console.log("Websocket: closed: " + this.remoteAddress + ": " + reason +
        ": " + description);

    // Tell everyone this user has left
    var response = {
        'type': 'chat-leave',
        'payload': {
            'username': username
        }
    };

    broadcast(response);
}

If you notice where we're building the response packet, we do it so that it's compliant with the beej-chat-protocol. (See discussion in Part 1 and Part 2.)

And how about that broadcast() call? We'll get to it later, but for now, just understand that it sends the response to all connections in the connectionList.

Connection onMessage() Handler: Where All the Action Happens

This is it! This is the handler that gets called for the message event, when new messages arrive to the server. The server has to decide what to do with them, and then respond (or whatever) in an appropriate manner.

In this case, it's a chat server, so it'd better behave like one!

The beej-chat-protocol defines what kinds of messages will arrive. We have them organized in JSON like so:

JSON
{
    "type": [type],
    "payload": [payload]
}

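The type is a string that tells us what kind of packet this is. The type of packet then defines what will be contained in the payload. Defined packet types are chat-join (the payload holds the username), chat-message (the payload holds the username and the message), and chat-leave (the payload holds the username).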

These can be sent from the client to the server, or server to the client.
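To make that concrete, here's one packet of each kind that shows up in the server code (chat-join, chat-message, and chat-leave), pushed through JSON the way it would be on the wire. The username and message text are just sample values.

```javascript
// One sample packet per type, with the payload fields the server code uses.
var packets = [
    { type: "chat-join",    payload: { username: "Beej" } },
    { type: "chat-message", payload: { username: "Beej", message: "Hello!" } },
    { type: "chat-leave",   payload: { username: "Beej" } }
];

packets.forEach(function (packet) {
    var wire = JSON.stringify(packet);   // what goes over the WebSocket
    var decoded = JSON.parse(wire);      // what the receiver ends up with

    console.log(decoded.type + " -> " + wire);
});
```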

So what happens when we get one on the server?

The first thing to do is check the type on the packet because that's going to dictate how we process the payload.

You can do that with a switch or if-else, but in this case I used the type as a property name into an object that held the handlers for each packet type. Do whatever makes the most sense and is clearest, of course. :)

NodeJS, wsserver.js
/**
 * Message type handlers
 */
var messageHandler = {
    "chat-join": function (message) {
        // Handle it here
    },

    "chat-message": function (message) {
        // Handle it here
    }
};

And then we call that magical stuff from our connection's message event handler. This handler will parse the JSON message, then look in the messageHandler object and find the property for this message type. Then call it.

NodeJS, wsserver.js
/**
 * Connection: Handle incoming messages
 */
function onMessage(message) {
    message = JSON.parse(message.utf8Data);

    storeUsername(this, message);

    console.log("Websocket: message: " + this.remoteAddress + ": " + message.type);

    // Try to find the message handler for this message type
    if (message.type in messageHandler) {
        messageHandler[message.type](message, this);
    } else {
        console.log("Websocket: unknown payload type: " + this.remoteAddress +
            ": " + message.type);
    }
}

And yes, you caught me sneaking a call to storeUsername() in there. This function looks for a username in the payload and then stores it in the connection list entry for this connection. This is useful later when a client quits, and we want to broadcast to everyone else that "Beej left the chat".
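The article doesn't show storeUsername() itself, so what follows is only a guess at its shape based on the description above: find the connection's entry in the list and, if the payload carries a username, remember it there. The getConnectionKey() here is a local stand-in (joining with ":") so the sketch runs on its own with a fake connection.

```javascript
var connectionList = {};

// Stand-in key function for this sketch; joins address and port with ":".
function getConnectionKey(connection) {
    return connection.socket.remoteAddress + ":" + connection.socket.remotePort;
}

// Guess at storeUsername(): remember the username, if the payload has one.
function storeUsername(connection, message) {
    var entry = connectionList[getConnectionKey(connection)];
    var payload = message.payload;

    if (entry && payload && typeof payload.username === "string") {
        entry.username = payload.username.trim();
    }
}

// Fake connection with just the fields the key function reads
var conn = { socket: { remoteAddress: "::1", remotePort: 51234 } };
connectionList[getConnectionKey(conn)] = { connection: conn };

storeUsername(conn, { type: "chat-join", payload: { username: "  Beej " } });
console.log(connectionList[getConnectionKey(conn)].username);
```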

This is actually a totally hackish way of tracking the username, but I wanted to keep the demo simple.

A much more correct way would be to not let the user even join the chat until they'd logged in, sending a username packet at that time. Or, if they changed their user name on the fly, a new username packet would be sent then, too. And the server could broadcast this to all the other clients, and we'd all know the user's new name.

But right now, the client doesn't send a username packet at all. It just bundles the username along with chat-join and chat-message packets as they arrive.

So to fix this, it would mean more complexity in the server, client, and UI. Which you should totally do as an exercise. :)
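While we're listing improvements: the onMessage() handler above trusts that every incoming message is valid JSON with a sensible type. A more defensive parse-and-dispatch step might look like the sketch below. The dispatch() helper and its return values are invented for illustration, and the handlers just return strings so the result is easy to see.

```javascript
// Toy handler table in the same style as messageHandler.
var handlers = {
    "chat-join": function (message) {
        return "join:" + message.payload.username;
    },
    "chat-message": function (message) {
        return "message:" + message.payload.message;
    }
};

// Hypothetical helper: returns the handler's result, or null if the text
// is not valid JSON or names a type we have no handler for.
function dispatch(rawText) {
    var message;

    try {
        message = JSON.parse(rawText);
    } catch (e) {
        return null; // malformed JSON from the client
    }

    // Only consult the table's own properties, so a type like "toString"
    // doesn't match something inherited from Object.prototype.
    if (message === null || typeof message !== "object" ||
            !Object.prototype.hasOwnProperty.call(handlers, message.type)) {
        return null;
    }

    return handlers[message.type](message);
}

console.log(dispatch('{"type":"chat-join","payload":{"username":"Beej"}}'));
console.log(dispatch("this is not JSON"));
console.log(dispatch('{"type":"toString"}'));
```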

chat-join Handler

Let's take a look at the chat-join handler in the messageHandler code, above. This is what will do the right thing when a chat-join message comes in from a client.

What we're going to do is simply rebroadcast the same message to all the connected clients so they can print "So-and-so joined the chat" on the screen.

We do take a slight liberty and clean up the username by trimming whitespace off either end before sending it back.

Then we call broadcast() (which we haven't talked about yet) to send the result to all connected clients:

NodeJS, wsserver.js
[...]

    "chat-join": function (message) {
        var response = {
            'type': 'chat-join',
            'payload': {
                'username': message.payload.username.trim()
            }
        };

        broadcast(response);
    },

[...]

chat-message Handler—Finally, someone is talking to us!

We're going to do the same thing again: just broadcast the incoming message to every connected client, so they can put "So-and-so just said: Hello, World or whatever!" on their screens.

And, again, we're going to do some sanitizing. If the message is empty, we don't send anything. And we clean up the whitespace on the username just like with the chat-join packet.

NodeJS, wsserver.js
[...]

    "chat-message": function (message) {
        var payload = message.payload;
        var text = payload.message.trim();

        // ignore empty messages
        if (text === '') { return; }

        // make a new chat message to broadcast to everyone
        var response = {
            'type': 'chat-message',
            'payload': {
                'username': payload.username.trim(),
                'message': text
            }
        };

        broadcast(response);
    }

[...]

And that's it for handling incoming messages!

If you're keenly observant, you'll see that we're not handling incoming chat-leave messages. But only because the client never sends them. :) The server is the only one who generates these when the connection closes.

broadcast()—Sending data to all the clients

Lastly, but not leastly, we have to talk about how we're going to send these messages out to all the clients.

First a quick primer on how to send data over a connection with the send() method:

JavaScript
connection.send("Hello, World!");

Pretty self-explanatory, actually, I guess.

So we want to loop through our connection list, and send our data over to each connection.

Our data should be a JSON string (as agreed upon in beej-chat-protocol), so we need to convert it from a JS Object to that with the built-in JSON.stringify() call.

NodeJS, wsserver.js
/**
 * Broadcast a chat message to all connected clients
 */
function broadcast(response) {
    for (var k in connectionList) if (connectionList.hasOwnProperty(k)) {
        var destConnection = connectionList[k].connection;

        destConnection.send(JSON.stringify(response));
    }
}
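If you'd like to try the broadcast logic without any sockets, here's a variation that runs against fake connections which simply record what they're sent. The fakeConnection() helper and the keys are invented for the sketch, and this version serializes the response once up front rather than once per connection.

```javascript
var connectionList = {};

// Fake connection: records everything passed to send().
function fakeConnection() {
    return {
        sent: [],
        send: function (data) { this.sent.push(data); }
    };
}

connectionList["::1:50000"] = { connection: fakeConnection() };
connectionList["::1:50001"] = { connection: fakeConnection() };

function broadcast(response) {
    var data = JSON.stringify(response); // serialize once, send many times

    Object.keys(connectionList).forEach(function (k) {
        connectionList[k].connection.send(data);
    });
}

broadcast({ type: "chat-leave", payload: { username: "Beej" } });

console.log(connectionList["::1:50000"].connection.sent);
console.log(connectionList["::1:50001"].connection.sent);
```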

And that's the game!

Future Directions

Of course there's more to it! There's always more to it!

You can look into sending binary data, for one.

Or think about how you'd structure data if you wanted the user to be able to join different chat rooms.
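As one possible starting point for the chat-room idea, you could keep a map from room name to the set of connection keys in that room. Everything here (rooms, joinRoom(), leaveRoom(), membersOf()) is hypothetical and not part of the demo code; broadcasting to a room would then mean looping over membersOf(room) instead of the whole connection list.

```javascript
// room name -> Set of connection keys
var rooms = new Map();

function joinRoom(room, key) {
    if (!rooms.has(room)) {
        rooms.set(room, new Set());
    }
    rooms.get(room).add(key);
}

function leaveRoom(room, key) {
    var members = rooms.get(room);
    if (!members) { return; }

    members.delete(key);
    if (members.size === 0) {
        rooms.delete(room); // drop empty rooms
    }
}

function membersOf(room) {
    return rooms.has(room) ? Array.from(rooms.get(room)) : [];
}

joinRoom("lobby", "::1:50000");
joinRoom("lobby", "::1:50001");
joinRoom("goats", "::1:50001");
leaveRoom("lobby", "::1:50000");

console.log(membersOf("lobby"));
console.log(membersOf("goats"));
console.log(membersOf("nowhere"));
```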

Or how you'd do things if users could chat, but could also, say, submit game moves. How would you multiplex those messages over the same WebSocket? Or would it be better to open another WebSocket?

Wrap-Up

What a slog! I like how I could summarize the server in three sentences, and then the exploration exploded into a book.

But I hope that even though there are a number of little pieces, each little piece on its own makes sense, and they (relatively) clearly come together well enough.

For summary purposes, here's the code: beej-websockets-demo.zip.

See you next Wednesday!

License

The code attached to this article is licensed under the MIT open source license.
