Elegant TCP with Elixir - Part 2 - Message Boundaries

13 Apr 2020

TCP doesn't provide application-level message boundaries (or framing). I wrote about this a long time ago. Here's a recap. If on one side of a socket you send "hello":

:gen_tcp.send(socket, "hello")

Then on the receiving side you'd have to call recv 1 to 5 times. That we sent the message with a single send/2 has no bearing on how the message will be received. I say "1 to 5" because the fewest bytes we can get from a successful recv is 1. So, in the worst case, we'd have to call recv 5 times, once for each letter of the word "hello". This works the other way too. If we write two messages:

:gen_tcp.send(socket, "over")
:gen_tcp.send(socket, "9000!")

A single call to recv could get both messages as one contiguous block of bytes. While I've used the word "message" to describe what is sent and received, you really need to think about it as a continuous stream of (non-delimited!) bytes. (When testing locally, it can appear as though send and receive are dealing with distinct messages, but this isn't how it works outside of localhost.)
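To make the "stream, not messages" point concrete, here's a minimal loopback sketch. The StreamDemo module and its run/0 function are names I made up for illustration; the sockets are opened in passive, raw, binary mode, and port 0 asks the OS for any free port.

```elixir
defmodule StreamDemo do
  # Hypothetical demo module: two sends on one side, a single recv on the other.
  def run do
    opts = [:binary, packet: :raw, active: false]
    {:ok, listener} = :gen_tcp.listen(0, opts)
    {:ok, port} = :inet.port(listener)
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, opts)
    {:ok, server} = :gen_tcp.accept(listener)

    :ok = :gen_tcp.send(client, "over")
    :ok = :gen_tcp.send(client, "9000!")

    # In raw mode, asking for 9 bytes blocks until 9 bytes have arrived
    # (or the 1 second timeout fires), however many sends produced them.
    {:ok, data} = :gen_tcp.recv(server, 9, 1_000)

    Enum.each([client, server, listener], &:gen_tcp.close/1)
    data
  end
end
```

Calling StreamDemo.run() returns "over9000!": nothing in the bytes tells the receiver where the first send ended and the second began.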

There are cases where an endless stream of bytes is what you want. But, more often than not, you'll be sending distinct messages and the receiver will want to receive distinct messages. The two common solutions are: prefixing each message with a length, or using a delimiter at the end of each message (like how HTTP/1.x headers are separated by CRLF). When using a length prefix, it's common to use a 4-byte big-endian integer. So instead of sending ['h', 'e', 'l', 'l', 'o'], we'd send [0, 0, 0, 5, 'h', 'e', 'l', 'l', 'o']. Now, this message can still be "fragmented" at the receiver, but because we know to expect 4 bytes and those 4 bytes tell us how many additional bytes to expect, we can re-package the stream on the receiving end. In fact, it's quite common for libraries to provide a function to block until N bytes are received:

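def read_message(socket) do
  # read the 4-byte big-endian length header, then exactly that many bytes
  with {:ok, <<length::big-32>>} <- :gen_tcp.recv(socket, 4) do
    :gen_tcp.recv(socket, length)
  end
end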

(<<length::big-32>> is Elixir's powerful binary pattern matching, which I've talked about before)
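The same framing can be done on plain binaries, without a socket. This is only a sketch: Framing, encode/1 and decode/2 are hypothetical names of my own, not part of :gen_tcp. decode/2 pulls every complete message out of a buffer and hands back whatever incomplete bytes are left over, which is what you'd keep around until more data arrives.

```elixir
defmodule Framing do
  # Hypothetical helper module, not part of :gen_tcp.

  # Prefix a payload with its size as a 4-byte big-endian integer.
  def encode(payload) when is_binary(payload) do
    <<byte_size(payload)::big-32, payload::binary>>
  end

  # Extract every complete message from `buffer`.
  # Returns {messages, rest}, where `rest` is an incomplete trailing frame.
  def decode(buffer, acc \\ [])

  def decode(<<length::big-32, payload::binary-size(length), rest::binary>>, acc) do
    decode(rest, [payload | acc])
  end

  # Not enough bytes for another full frame: stop and return the leftovers.
  def decode(rest, acc), do: {Enum.reverse(acc), rest}
end
```

For example, Framing.encode("hello") is <<0, 0, 0, 5, "hello">>, and decoding the first 6 bytes of that returns no messages and the 6 bytes back as the remainder.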

In part 1 we learnt about the active flag. Now we'll look at the packet flag. This flag supports a number of different values. By default, it's set to :raw and the socket behaves like it would in any other socket library. However, another option is to set it to 4. In this mode, any time you send, the library automatically prepends a 4-byte length header. And every time a message is received, in active or passive mode, you'll only get complete, properly framed messages (with the length header removed). In other words, with packet: 4 set on a socket, if we send a message:

:gen_tcp.send(socket, "over 9000")

Then we're guaranteed to get this exact message on the receiving side (shown here in active mode, but this works with explicit calls to :gen_tcp.recv also):

def handle_info({:tcp, _socket, data}, state) do
  # data == "over 9000" (assuming the socket was opened with :binary)
  {:noreply, state}
end
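Here's the passive-mode equivalent as a loopback sketch, with packet: 4 on both ends. Packet4Demo and run/0 are hypothetical names; the accepted socket inherits its options from the listening socket.

```elixir
defmodule Packet4Demo do
  # Hypothetical demo module: packet: 4 on both the sender and the receiver.
  def run do
    opts = [:binary, packet: 4, active: false]
    {:ok, listener} = :gen_tcp.listen(0, opts)
    {:ok, port} = :inet.port(listener)
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, opts)
    {:ok, server} = :gen_tcp.accept(listener)

    :ok = :gen_tcp.send(client, "over")
    :ok = :gen_tcp.send(client, "9000!")

    # Outside of raw mode the length argument is 0; each recv returns
    # exactly one message, with the 4-byte header already stripped.
    {:ok, first} = :gen_tcp.recv(server, 0, 1_000)
    {:ok, second} = :gen_tcp.recv(server, 0, 1_000)

    Enum.each([client, server, listener], &:gen_tcp.close/1)
    {first, second}
  end
end
```

Packet4Demo.run() returns {"over", "9000!"}: two sends, two distinct messages, no matter how the bytes were split up in transit.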

I hope it's obvious that there's no Elixir-to-Elixir magic going on here. If you set packet: 4 on the sending side and the other side doesn't set this flag (or is written in a different language), then they'll receive those extra 4 bytes and need to handle them manually. Similarly, if the sending side manually prepends a 4-byte length prefix, regardless of the language/runtime, it'll still work with the receiving Elixir code so long as packet: 4 is set.
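Both directions of that mismatch can be sketched over loopback. InteropDemo and its function names are made up for this example; the "other language" is simulated by an Elixir socket in :raw mode, which does no framing of its own.

```elixir
defmodule InteropDemo do
  # Hypothetical demo module. Returns a connected {sender, receiver} pair
  # where each side uses its own packet mode.
  defp pair(sender_packet, receiver_packet) do
    listen_opts = [:binary, packet: receiver_packet, active: false]
    connect_opts = [:binary, packet: sender_packet, active: false]
    {:ok, listener} = :gen_tcp.listen(0, listen_opts)
    {:ok, port} = :inet.port(listener)
    {:ok, sender} = :gen_tcp.connect({127, 0, 0, 1}, port, connect_opts)
    {:ok, receiver} = :gen_tcp.accept(listener)
    :gen_tcp.close(listener)
    {sender, receiver}
  end

  # The sender frames with packet: 4, the receiver is raw and sees the header.
  def framed_sender_raw_receiver do
    {sender, receiver} = pair(4, :raw)
    :ok = :gen_tcp.send(sender, "over 9000")
    # 4 header bytes + 9 payload bytes
    {:ok, data} = :gen_tcp.recv(receiver, 13, 1_000)
    Enum.each([sender, receiver], &:gen_tcp.close/1)
    data
  end

  # The sender builds the prefix by hand, the receiver relies on packet: 4.
  def manual_sender_framed_receiver do
    {sender, receiver} = pair(:raw, 4)
    :ok = :gen_tcp.send(sender, <<9::big-32, "over 9000">>)
    {:ok, data} = :gen_tcp.recv(receiver, 0, 1_000)
    Enum.each([sender, receiver], &:gen_tcp.close/1)
    data
  end
end
```

The first function returns <<0, 0, 0, 9, "over 9000">>, header included; the second returns just "over 9000".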

The packet flag takes other values, including 0 (same as :raw), 1 (1-byte length prefix) and 2 (2-byte length prefix). All other values, like :line, only work on the receiving end (:line will split incoming data on the newline character, but it won't add a newline on the sending side). There are more advanced values like :fcgi, :httph and :http.
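A quick sketch of :line on the receiving side, again over loopback and with hypothetical names (LineDemo, run/0). The sender is a raw socket that adds the newlines itself, since :line won't do it for us.

```elixir
defmodule LineDemo do
  # Hypothetical demo module: raw sender, packet: :line receiver.
  def run do
    {:ok, listener} = :gen_tcp.listen(0, [:binary, packet: :line, active: false])
    {:ok, port} = :inet.port(listener)
    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, packet: :raw, active: false])
    {:ok, server} = :gen_tcp.accept(listener)

    # A single send carrying two newline-terminated lines.
    :ok = :gen_tcp.send(client, "over\n9000!\n")

    # Each recv returns one line; the trailing newline is kept in the data.
    {:ok, first} = :gen_tcp.recv(server, 0, 1_000)
    {:ok, second} = :gen_tcp.recv(server, 0, 1_000)

    Enum.each([client, server, listener], &:gen_tcp.close/1)
    {first, second}
  end
end
```

LineDemo.run() returns {"over\n", "9000!\n"}. Note that, unlike the length-prefixed modes, the delimiter isn't stripped, so you'd typically trim it yourself.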

Obviously, you won't always be able to leverage this feature. But prefixing messages with 4 or 2 bytes, or line-delimiting messages, is common enough for this feature to be useful.