2015-11-25
WebSockets: Writing the Server
The websockets logo. I think.
We're going to do a little three-part introduction to WebSockets, which provide a way to communicate back and forth with the web server without all the overhead of a standard HTTP connection.
And, in the course of it, we'll be writing a simple chat server and client.
We're going to be piggybacking on the previous blog entry, on writing A NodeJS Webserver. In particular, that webserver will be used for generic webserving on this project, and will also be what the WebSockets run on.
So you might as well start there, if you haven't already.
- Part 0: A NodeJS Webserver
- Part 1: Intro to WebSockets, conceptually
- Part 2: Writing the client side chat code for the browser in JavaScript
- Part 3: Writing the server-side chat code in NodeJS
The Server Side
All righty! At this point we have a couple components already written. We have:
What's missing is the WebSockets server portion of the entire affair.
There are many WebSockets implementations for NodeJS. For this we're going to be using the websocket package.
Get the Source For All This
We're going to start a little bit backward, and you should download the complete source for this project from GitHub.
$ git clone https://github.com/beejjorgensen/websockets-demo.git
$ cd websockets-demo
Installing the NodeJS WebSockets library
If you have NodeJS installed, you should already have npm
installed.
(If you don't have NodeJS installed, install
NodeJS. I recommend finding packages for your
specific system.)
(And if you don't have npm, install npm, as well.)
Now, still in the websockets-demo
directory from above, run this:
$ npm install
It should crunch for a while, and then say everything is beautiful and wonderful and butterflies.
And then we'll be ready to go!
This is exactly what it will be like.
Starting the Server
Assuming you're in the websockets-demo
directory above, run the main
server script, and it should tell you it's running:
$ node server.js
Listening on port 3490
---------------------------------------------------------
Point your web browser windows to: http://localhost:3490/
---------------------------------------------------------
Running Clients
The server has set up a normal webserver on port 3490 that you can
connect to. So point your browser to URL
http://localhost:3490/
. And then open
another window and do the same thing. And say hi:
International, global, interplanetary, intergalactic chat. On localhost.
You're going to notice a pile of output from the server, as well, logging what's going on (file paths snipped for brevity):
Listening on port 3490
Sending file: /.../websockets-demo/index.html
Sending file: /.../websockets-demo/client.js
Not Found: /.../websockets-demo/favicon.ico
Not Found: /.../websockets-demo/favicon.ico
Websocket: accepted connection from ::1
Websocket: message: ::1: chat-join
Sending file: /.../websockets-demo/index.html
Sending file: /.../websockets-demo/client.js
Websocket: accepted connection from ::1
Websocket: message: ::1: chat-join
Websocket: message: ::1: chat-message
Websocket: closed: ::1: 1001: Remote peer is going away
Websocket: closed: ::1: 1001: Remote peer is going away
You can see the browsers looking for the favicon.ico file to show as
the page icon in the task bar. I didn't include one, so they 404.
The remote IP address is showing up in the logs as "::1". What's
that? Turns out it's the IPv6 equivalent of 127.0.0.1, also known as
localhost.
Stopping the Server
Just hit ^C (that's CTRL-C for the uninitiated) in the
window running the server. It'll quit in a fiery blaze of Viking glory.
The Server, Overall
We're going to make use of the exact same HTTP server from the zeroth part of this series. It can be run standalone (as we did then), or it can be included by another NodeJS script and used more like a library (like we're going to do now).
There's another script in the websockets-demo
directory called
server.js
. This is the main one to run. It sets up the HTTP server,
and then starts up the WebSockets server on top of it.
#!/usr/bin/env node
// server.js
/**
 * This starts the HTTP server, and then starts the Websocket server on
 * top of that.
 */
"use strict";

const httpServer = require("./httpserver.js"),
    wsServer = require("./wsserver.js");

let server = httpServer.start();
wsServer.start(server);
The httpServer
is just a reference to the standard HTTP server code we
made before. Except this time we've loaded it as a NodeJS module with
require()
. It's not doing anything at first, so we call .start()
to
fire it off, and save the returned server instance.
Then we take that server
and pass it in to the WebSockets server with
wsServer.start()
. It's not the best-architected code in the world, but
it's good for this tutorial. At least we got to reuse the HTTP server
without modification, right?
What's that #!/usr/bin/env node at the top of the file? That's the shebang that tells the shell (likely Bash or some other POSIX shell) what command is supposed to run all the following text. In this case, node should be the one. The upshot is that you can just run it outright:
$ chmod 755 server.js # make executable
$ ./server.js
And BAM! It's running.
This works great at a Unix command line on any Unix variant, including OSX.
Windows, not so much. As far as I know, Windows users have to run it explicitly:
C:\websockets-demo> node server.js
The Server, WebSockets Madness
You'll notice the server.js
code, above, pulled in both
httpserver.js
(which we wrote earlier) and wsserver.js
. It's high
time we talked about the second of these.
This WebSockets implementation piggybacks on top of an existing HTTP server. So we first launch an HTTP server, and then we say, "Hey, WebSockets Server: attach to this HTTP server we already made."
This way they're both running in the same process, which can be convenient. No reason you couldn't write it standalone and run them in different processes, but I didn't do it that way because... I'm sorry, I don't have a compelling reason, but that's what I did.
When the start()
function is called, it indirectly calls the
startWSServer()
function. This takes the HTTP server as an argument.
We'll make a new WebSocket server (again from the
websocket package),
pass it the HTTP server to attach to, and stick some listeners on
request-connection and new-connection events:
// wsserver.js
/**
 * Start the websockets server, attached to this HTTP server
 */
function startWSServer(httpServer) {
    wsServer = new websocket.server({
        httpServer: httpServer
    });

    wsServer.on('request', onServerRequest);
    wsServer.on('connect', onServerConnect);
    //wsServer.on('close', onServerClose); // unused
}
The connection is a two step process. First of all, a request
event
comes in; its handler can accept or reject the connection. If accepted
and the connection is set up successfully, a connect
event is soon to
follow, possibly delivered by
Santa, depending on the
season. He's in the RFC
somewhere, isn't he?
Chat Server Logic Overview
Not much to it.
When the server gets a new connection, it adds the connection to its connection list. When a connection closes, it removes the connection from its connection list. When data arrives from a connection, it sends that data to all other connections.
This actually led to an interesting diversion. I wanted to be able to find the connection in the connection list if the connection was closed (so I could remove it). I wanted to do this quickly and not have to do a linear search through the list. So the best way was to store the connection list as an Object, and use the connections as keys (properties) to look them up in that object.
But you can't use the connection itself as a property, because property names are converted to strings, and this means every connection would be "[object Object]". They're all the same when converted to strings.
So what to do? I needed a way to uniquely identify the connection as a string so I could refer to it in the connection list. I could just make up an ID, and then add it to the connection as some property:
// wsserver.js
let globalChatID = 0;
let connectionList = {};

function onNewConnection(connection) {
    let id = globalChatID++;
    connection.beejsChatID = id;
    connectionList[id] = connection;
}

function onClose() {
    let id = this.beejsChatID; // this is the connection
    delete connectionList[id];
}
but now I've polluted the connection with beejsChatID, and, while technically allowed, that object doesn't really belong to me. I feel it's Wrong to mess with it in that way.
The safest way I can think of is to make a property name that's a UUID to very-probably guarantee there will not be a collision, e.g.:
// Generated with uuidgen on the command line:
let BEEJS_CHAT_ID = '6c2efc56-7e76-4f52-83ab-39888e9aa07d';

// Then NOT this:
connection.beejsChatID = id; // NOT this

// But this, instead:
connection[BEEJS_CHAT_ID] = id;
But is there a better way? Does the connection object already expose some uniquely identifying information we can use? Let's look at the documentation.
There's a remoteAddress property... but is that really unique? Maybe. Maybe not. The docs don't say.
But that gives me an idea. Turns out every TCP connection on the entire Internet can be uniquely identified by the quad (local port, local IP address, remote port, remote IP address). Can we get that information?
The connection object has a socket property that gives access to the underlying net.Socket NodeJS object that is the actual connection.
And look there, that object has properties for localAddress, localPort, remoteAddress, and remotePort! Just what we need!
Now since all WebSockets are connecting to this server on a particular port, we know that the local port and address (that is, the server port and address) will always be the same for all connections. Only the remote port and address will differ.
With all this in mind, we can generate a unique key for any particular connection and use that to refer to it in the connection list. We'll just make the key the string concatenation of the address and port:
// wsserver.js
/**
 * Get the object key for a connection.
 */
function getConnectionKey(connection) {
    let socket = connection.socket; // The underlying socket
    return socket.remoteAddress + socket.remotePort;
}
And now we can get the property key ID like this, instead of using the hacked-in property:
let connectionList = {};

function onNewConnection(connection) {
    let id = getConnectionKey(connection);
    connectionList[id] = connection;
}

function onClose() {
    let id = getConnectionKey(this); // this is the connection
    delete connectionList[id];
}
Much cleaner and happier.
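As an aside, here's a minimal sketch (not what the demo code does) of two things from the discussion above: why plain objects fail when you use connections as keys, and how an ES2015 Map, which accepts objects themselves as keys, would sidestep the problem. The fakeConnA and fakeConnB objects are made-up stand-ins for real connection objects.

```javascript
// Stand-ins for two WebSocket connection objects (hypothetical).
const fakeConnA = { name: "A" };
const fakeConnB = { name: "B" };

// Plain object: both keys get converted to the same string,
// so the second assignment overwrites the first.
const asObject = {};
asObject[fakeConnA] = "first";
asObject[fakeConnB] = "second";
console.log(Object.keys(asObject)); // [ '[object Object]' ]

// Map: keys are compared by identity, so each connection gets its own entry.
const asMap = new Map();
asMap.set(fakeConnA, { username: "alice" });
asMap.set(fakeConnB, { username: "bob" });
console.log(asMap.size); // 2

// Removing a closed connection needs no made-up ID at all.
asMap.delete(fakeConnA);
console.log(asMap.has(fakeConnA)); // false
```

The string-key approach used in the rest of this article works fine too; the Map is just another option if your version of NodeJS supports it.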
Handling Incoming Connections
The server has a lot of choices about what it can do when it gets an incoming connection. For instance, it can look at the remote host and make sure it wants to allow that host to connect. (Often it will only want web pages it has served to connect back, not just anyone anywhere.)
Or it can reject a connection because the client wants to use an unsupported protocol.
But if it likes everything it sees, it can reply to the client, "Yes, let's chat, and let's use this particular protocol."
Because the server has this opportunity to reject an incoming
connection, setting up the connection is actually a two-event process on
the server: first a request
event fires, at which point the server can
accept or reject, and then a connect
event fires which signifies the
connection has been established.
(There's also a general error
event for when something goes hideously
awry.)
So you can fire up the WebSocket server like this, passing in a reference to the HTTP server that we should attach to:
// wsserver.js
/**
 * Start the websockets server, attached to this HTTP server
 */
function startWSServer(httpServer) {
    wsServer = new websocket.server({
        httpServer: httpServer
    });

    wsServer.on('request', onServerRequest);
    wsServer.on('connect', onServerConnect);
    //wsServer.on('close', onServerClose);
}
That just makes the new server, attaches it to the HTTP server, and sets
up the event handlers. onServerRequest()
will be called when a brand
new incoming connection arrives to be accepted (or rejected), and
onServerConnect()
will be called when the connection is accepted and
fully established.
You might see we're ignoring the close
event. This fires whenever a
WebSocket connection closes, but each individual connection also fires
its own close
event, so we'll handle our closing duties there,
instead.
onServerRequest()
—New Incoming Connections
Like I mentioned, we're going to accept or reject connections based on two things: (1) URL of the page the connection is coming from and (2) if the client supports one of our supported protocols. You can actually reject based on anything you'd like: time of day, phase of moon, whether or not your kid is playing hooky, whatever.
But you should reject unknown protocols simply because you won't know how to speak them.
And you (most probably) should reject connections that were served from other websites for server load and security reasons.
Firstly, let's see how we whitelist connections. The host will come to
us in the form "hostname" or "hostname:port". So let's put
together a quick function to whitelist every domain you will be serving
this from on your webserver. Since I'm testing on localhost
, which is
also a machine called goat
, which is 192.168.1.2
on my LAN, and I'm
running the server on port 3490
, I whitelist all that. If you have a
production machine that will be serving the files, e.g. example.com
,
whitelist it here, too.
There's an argument to be made that you shouldn't ship with localhost and all your test entries in the whitelist. One could say that since every computer in the world is localhost, a spoofer could serve the HTML from their own computer, and then when it connected to the WebSocket server, it would pass the whitelist test.
We avoid that in our client code by declaring the WebSocket server to be on the same host that the HTML came from. But, of course, that client code can be modified.
But a counter-argument is that the whitelist test is inherently spoofable, since an attacker can spoof the origin in the WebSockets HTTP request with custom-written code, anyway. And that preventing this kind of spoofing isn't what the whitelist test is for.
It's actually there to prevent innocent third parties from falling victim to cross-site attacks that can allow malicious parties to access the WebSocket as if they were the victim. This is known as Cross-Site WebSocket Hijacking.
// wsserver.js
/**
 * Returns true if a particular host is in the whitelist
 */
function isWhitelisted(host) {
    // This should contain every host you'll be serving the site from
    let whitelist = [
        "localhost",
        "localhost:3490",
        "goat:3490",
        "192.168.1.2:3490",
        "example.com"
    ];

    // Return true if we're in the whitelist
    return whitelist.indexOf(host) != -1;
}
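Since the Cross-Site WebSocket Hijacking defense hinges on which page opened the socket, it may be worth checking the request's origin as well as its host. Here's a hedged sketch, assuming the origin arrives as a string like "http://localhost:3490" (the request object exposes it as request.origin, which this article's code passes to accept()). The isOriginWhitelisted() name and the list contents are made up for illustration.

```javascript
// Hypothetical list: the pages allowed to open a WebSocket to this server.
const allowedOrigins = [
    "http://localhost:3490",
    "http://goat:3490",
    "http://192.168.1.2:3490",
    "http://example.com"
];

// Hypothetical helper: returns true if the origin is in the list.
function isOriginWhitelisted(origin) {
    // Non-browser clients might not send an origin at all
    if (typeof origin !== "string") { return false; }

    return allowedOrigins.indexOf(origin.toLowerCase()) !== -1;
}

console.log(isOriginWhitelisted("http://localhost:3490")); // true
console.log(isOriginWhitelisted("http://evil.example"));   // false
console.log(isOriginWhitelisted(undefined));               // false
```

Like the host check, this stops browsers on other sites from connecting on a victim's behalf; it won't stop a custom-written client that lies about its origin.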
And now let's write the rest of the handler that verifies and accepts the connection:
// wsserver.js
/**
 * Server: Handle new connection requests
 */
function onServerRequest(request) {
    if (!isWhitelisted(request.host)) {
        request.reject(403, "Forbidden");
        console.log("Websocket: denying connection from " + request.host);
        return;
    }

    // Make sure the protocol matches
    if (request.requestedProtocols[0] != 'beej-chat-protocol') {
        request.reject(400, "Unknown protocol");
        console.log("Websocket: unknown protocol");
        return;
    }

    // Ok, we're golden. Accept and specify the protocol.
    request.accept('beej-chat-protocol', request.origin);
    console.log("Websocket: accepted connection from " + request.remoteAddress);

    // at this point, onServerConnect() will be called
}
A couple quick notes:
You can see we're only looking at requestedProtocols[0] and not looping
through them all. Technically, we should loop through them all to see if
there's a match with our desired protocol (beej-chat-protocol). But in
this case, we wrote the client, and we know the client is only going to
request a single protocol, so we're only checking the first one.
Secondly, the 403
and 400
codes in the reject()
call are HTTP
client error
codes.
Use whichever ones are appropriate for your error handling.
If the handler ends up calling accept()
, then the connect
event
handler, onServerConnect()
will be called.
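For the first note above, here's a minimal sketch of what looping through all the requested protocols could look like. The chooseProtocol() helper and the SUPPORTED_PROTOCOL constant are inventions of this example; the demo code only checks the first entry.

```javascript
const SUPPORTED_PROTOCOL = "beej-chat-protocol";

// Hypothetical helper: returns the supported protocol if the client
// asked for it anywhere in its list, or null if it didn't.
function chooseProtocol(requestedProtocols) {
    for (let i = 0; i < requestedProtocols.length; i++) {
        if (requestedProtocols[i] === SUPPORTED_PROTOCOL) {
            return SUPPORTED_PROTOCOL;
        }
    }
    return null;
}

console.log(chooseProtocol(["other-protocol", "beej-chat-protocol"]));
// beej-chat-protocol
console.log(chooseProtocol(["other-protocol"])); // null
console.log(chooseProtocol([]));                 // null
```

In onServerRequest(), you could then reject the request when chooseProtocol(request.requestedProtocols) comes back null, and pass the returned string to accept() otherwise.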
onServerConnect()
—We Finally Got One!
Once the connection is accept()
ed and all set up properly, we get the
connect
event, and its handler is called. In our case, we want to add
the new connection to our connection list, and set up event handlers on
the connection for message
, error
, and close
:
// wsserver.js
/**
 * Server: Handle new connections (after being accepted in onServerRequest())
 */
function onServerConnect(connection) {
    let k = getConnectionKey(connection);

    connectionList[k] = {
        'connection': connection
    };

    connection.on('message', onMessage);
    connection.on('error', onError);
    connection.on('close', onClose);
}
As you can see, we're actually storing another object in the connection
list with a property connection
(that holds the actual connection
object). We've done this since later we're going to use the same object
to also hold the username associated with the connection.
Handling Events on the Connections
There's a new connection object for each incoming connection to the
WebSockets server, and, as you saw, we store them all in the
connectionList
.
And each one gets event handlers attached to it for message
, error
,
and close
events.
So let's tackle those connection handlers!
Connection onError()
Handler
All this guy does is print out the error that has occurred. It doesn't
close the connection. (If the error is lethal, the connection will
close, and we'll get a close
event later.)
// wsserver.js
/**
 * Connection: Handle errors
 */
function onError(error) {
    console.log("Websocket: error: " + this.remoteAddress + ": " + error);
}
Connection onClose()
Handler
The connection's close
event handler just needs to do a few things.
- Get the username associated with the closed connection.
- Delete the connection from the connection list.
- Make a chat-leave packet.
- Broadcast that packet to everyone.
// wsserver.js
/**
 * Connection: Handle close
 */
function onClose(reason, description) {
    let k = getConnectionKey(this);

    // Get the username so we can tell everyone else
    let username = connectionList[k].username;

    // Remove this connection from the list
    delete connectionList[k];

    console.log("Websocket: closed: " + this.remoteAddress +
        ": " + reason + ": " + description);

    // Tell everyone this user has left
    let response = {
        'type': 'chat-leave',
        'payload': {
            'username': username
        }
    };

    broadcast(response);
}
If you notice where we're building the response
packet, we do it so
that it's compliant with the beej-chat-protocol
. (See discussion in
Part 1 and Part
2.)
And how about that broadcast()
call? We'll get to it later, but for
now, just understand that it sends the response
to all connections in
the connectionList
.
Connection onMessage()
Handler: Where All the Action Happens
This is it! This is the handler that gets called for the message
event, when new messages arrive to the server. The server has to decide
what to do with them, and then respond (or whatever) in an appropriate
manner.
In this case, it's a chat server, so it'd better behave like one!
The beej-chat-protocol
defines what kinds of messages will arrive. We
have them organized in JSON like so:
{
"type": "packet-type-string",
"payload": {whatever payload}
}
The type is a string that tells us what kind of packet this is. The type of packet then defines what will be contained in the payload. Defined packet types are:
- chat-join for when a user joins
- chat-leave for when a user leaves
- chat-message for when a user sends a chat message
These can be sent from the client to the server, or server to the client.
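To make that concrete, here's roughly what the three packet types look like as JavaScript objects, based on the payload fields the handlers in this article read and write (username and message). The makePacket() helper is invented for this sketch and isn't part of the demo code.

```javascript
// Hypothetical helper that builds a packet in the {type, payload} shape
function makePacket(type, payload) {
    return { type: type, payload: payload };
}

const join = makePacket("chat-join", { username: "Beej" });
const msg = makePacket("chat-message", { username: "Beej", message: "Hello!" });
const leave = makePacket("chat-leave", { username: "Beej" });

// On the wire, each packet travels as a JSON string
const wire = JSON.stringify(msg);
console.log(wire);
// {"type":"chat-message","payload":{"username":"Beej","message":"Hello!"}}

// The receiver turns it back into an object and looks at the type
const received = JSON.parse(wire);
console.log(received.type); // chat-message
```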
So what happens when we get one on the server?
The first thing to do is check the type on the packet because that's going to dictate how we process the payload.
You can do that with a switch
or if
-else
, but in this case I used
the type as a property name into an object that held the handlers for
each packet type. Do whatever makes the most sense and is clearest, of
course. 🙂
// wsserver.js
/**
 * Message type handlers
 */
let messageHandler = {
    "chat-join": function (message) {
        // Handle it here
    },

    "chat-message": function (message) {
        // Handle it here
    }
};
And then we call that magical stuff from our connection's message
event handler. This handler will parse the JSON message, then look in
the messageHandler
object and find the property for this message type.
Then call it.
// wsserver.js
/**
 * Connection: Handle incoming messages
 */
function onMessage(message) {
    message = JSON.parse(message.utf8Data);

    storeUsername(this, message);

    console.log("Websocket: message: " + this.remoteAddress +
        ": " + message.type);

    // Try to find the message handler for this message type
    if (message.type in messageHandler) {
        messageHandler[message.type](message, this);
    } else {
        console.log("Websocket: unknown payload type: " +
            this.remoteAddress + ": " + message.type);
    }
}
And yes, you caught me sneaking a call to storeUsername()
in there.
This function looks for a username in the payload and then stores it in
the connection list entry for this connection. This is useful later when
a client quits, and we want to broadcast to everyone else that "Beej
left the chat".
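The listing for storeUsername() isn't shown here, so what follows is a guess at what it might look like given the description above; treat it as a sketch under that assumption, not the actual source. The connectionList and getConnectionKey() are repeated from earlier in the article so the example stands alone.

```javascript
let connectionList = {};

// Same idea as getConnectionKey() earlier in the article
function getConnectionKey(connection) {
    let socket = connection.socket; // The underlying socket
    return socket.remoteAddress + socket.remotePort;
}

// Sketch (assumed, not the original): if the packet carries a username,
// remember it in this connection's entry in the connection list.
function storeUsername(connection, message) {
    let payload = message.payload;

    if (payload && typeof payload.username === "string") {
        let k = getConnectionKey(connection);

        if (connectionList.hasOwnProperty(k)) {
            connectionList[k].username = payload.username.trim();
        }
    }
}
```

With something like this in place, onClose() can read connectionList[k].username to build the chat-leave packet.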
This is actually a totally hackish way of tracking the username, but I wanted to keep the demo simple.
A much more correct way would be to not let the user even join the chat until they'd logged in, sending a username packet at that time. Or, if they changed their username on the fly, a new username packet would be sent then, too. And the server could broadcast this to all the other clients, and we'd all know the user's new name.
But right now, the client doesn't send a username packet at all. It just bundles the username along with chat-join and chat-message packets as they arrive.
So to fix this, it would mean more complexity in the server, client, and UI. Which you should totally do as an exercise. :)
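One thing the onMessage() handler takes on faith is that every incoming message is valid JSON with a known type. Here's a defensive sketch, assuming you'd rather drop bad input than let an exception take down the server. The parsePacket() and findHandler() names are made up for this example. Note that using hasOwnProperty instead of the in operator avoids matching inherited names like "toString".

```javascript
// Hypothetical: returns the parsed packet, or null if the text isn't usable
function parsePacket(text) {
    let packet;

    try {
        packet = JSON.parse(text);
    } catch (e) {
        return null; // not JSON at all
    }

    if (packet === null || typeof packet !== "object" ||
            typeof packet.type !== "string") {
        return null; // JSON, but not in the {type, payload} shape
    }

    return packet;
}

// Hypothetical: look up a handler without walking the prototype chain
function findHandler(handlers, type) {
    if (Object.prototype.hasOwnProperty.call(handlers, type)) {
        return handlers[type];
    }
    return null;
}

const demoHandlers = {
    "chat-join": function () { return "joined"; }
};

console.log(parsePacket("this is not JSON"));         // null
console.log("toString" in demoHandlers);              // true (inherited!)
console.log(findHandler(demoHandlers, "toString"));   // null
console.log(findHandler(demoHandlers, "chat-join")()); // joined
```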
chat-join
Handler
Let's take a look at the chat-join handler in the messageHandler
object, above. This is what will do the right thing when a chat-join
message comes in from a client.
What we're going to do is simply rebroadcast the same message to all the connected clients so they can print "So-and-so joined the chat" on the screen.
We do take a slight liberty and clean up the username by trimming whitespace off either end before sending it back.
Then we call broadcast()
(which we haven't talked about yet) to send
the result to all connected clients:
// wsserver.js
// [...]
    "chat-join": function (message) {
        let response = {
            'type': 'chat-join',
            'payload': {
                'username': message.payload.username.trim()
            }
        };

        broadcast(response);
    },
// [...]
chat-message
Handler—Finally, someone is talking to us!
We're going to do the same thing again: just broadcast the incoming message to every other client, so they can put "So-and-so just said: Hello, World or whatever!" on their screens.
And, again, we're going to do some sanitizing. If the message is empty,
we don't send anything. And we clean up the whitespace on the username
just like with the chat-join
packet.
// wsserver.js
// [...]
    "chat-message": function (message) {
        let payload = message.payload;
        let text = payload.message.trim();

        // ignore empty messages
        if (text === '') { return; }

        // make a new chat message to broadcast to everyone
        let response = {
            'type': 'chat-message',
            'payload': {
                'username': payload.username.trim(),
                'message': text
            }
        };

        broadcast(response);
    }
// [...]
And that's it for handling incoming messages!
If you're keenly observant, you'll see that we're not handling incoming
chat-leave messages. But only because the client never sends them. :)
The server is the only one who generates these when the connection
closes.
broadcast()
—Sending data to all the clients
Lastly, but not leastly, we have to talk about how we're going to send these messages out to all the clients.
First a quick primer on how to send data over a connection with the
send()
method:
connection.send("Hello, World!");
Pretty self-explanatory, actually, I guess.
So we want to loop through our connection list, and send our data over to each connection.
Our data should be a JSON string (as agreed upon in
beej-chat-protocol), so we need to convert it from a JS Object to that
with the built-in JSON.stringify() call.
// wsserver.js
/**
 * Broadcast a chat message to all connected clients
 */
function broadcast(response) {
    for (let k in connectionList) {
        if (connectionList.hasOwnProperty(k)) {
            let destConnection = connectionList[k].connection;

            destConnection.send(JSON.stringify(response));
        }
    }
}
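Here's a small variation on broadcast() you might consider: it stringifies the response once instead of once per client, and it can optionally skip one connection (say, the sender). The exceptKey parameter is an invention of this sketch, and the fake connections just record what they're sent so the example runs without a network.

```javascript
// Fake connections that remember what was sent to them (for demonstration)
function makeFakeConnection() {
    return {
        sent: [],
        send: function (data) { this.sent.push(data); }
    };
}

let connectionList = {
    "::150001": { connection: makeFakeConnection(), username: "alice" },
    "::150002": { connection: makeFakeConnection(), username: "bob" }
};

// Sketch: broadcast to everyone, or to everyone except one connection key
function broadcast(response, exceptKey) {
    let data = JSON.stringify(response); // convert once, send many times

    Object.keys(connectionList).forEach(function (k) {
        if (k === exceptKey) { return; } // skip this one, if asked
        connectionList[k].connection.send(data);
    });
}

// Everyone hears about the join...
broadcast({ type: "chat-join", payload: { username: "alice" } });

// ...but alice's own message skips alice's connection
broadcast({ type: "chat-message",
    payload: { username: "alice", message: "hi" } }, "::150001");

console.log(connectionList["::150001"].connection.sent.length); // 1
console.log(connectionList["::150002"].connection.sent.length); // 2
```

Whether to echo a message back to its sender is a design choice; the demo sends to every connection in the list, which keeps the client simple since it only prints what the server tells it.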
And that's the game!
Future Directions
Of course there's more to it! There's always more to it!
You can look into sending binary data, for one.
Or think about how you'd structure data if you wanted the user to be able to join different chat rooms.
Or how you'd do things if users could chat, but could also, say, submit game moves. How would you multiplex those messages over the same WebSocket? Or would it be better to open another WebSocket?
Wrap-Up
What a slog! I like how I could summarize the server in three sentences, and then the exploration exploded into a book.
But I hope that even though there are a number of little pieces, each little piece on its own makes sense, and they (relatively) clearly come together well enough.
Keep hacking, and I'll see you next Wednesday!
License
The code attached to this article is licensed under the MIT open source license.