Using Process Dictionaries in Elixir

Jan 07, 2022

In Elixir, all code runs inside a process. And a key aspect of processes is that they have their own isolated state. In most cases when we're talking about a process's state, we're talking about a GenServer and the explicit state we set up in init and then maintain through the various calls to handle_call, handle_cast and handle_info. However, processes have another stateful mechanism: process dictionaries.

As the name implies, process dictionaries are dictionaries where, in typical Elixir fashion, the keys and values can be any term (any type of value). The Process.get/1, Process.put/2 and Process.delete/1 functions are used to manipulate the current process's dictionary:

> IO.inspect Process.get(:power)
nil

> IO.inspect Process.put(:power, 9000)
nil

> IO.inspect Process.get(:power)
9000

> spawn fn -> IO.inspect Process.get(:power) end
nil

The two processes, the original and the one we spawn, don't share a process dictionary. For completeness, there are also Process.get/0 and Process.get/2: the first returns every key-value pair, and the second returns the value for a key, or a given default if the key isn't set.
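As a small sketch of those remaining functions, the snippet below reuses the :power key from above; the :speed key and the variable names are only there for illustration:

```elixir
Process.put(:power, 9000)

# get/2 only falls back to the default when the key is absent.
power = Process.get(:power, 0)       # 9000
speed = Process.get(:speed, :slow)   # :slow

# get/0 returns every entry as a list of {key, value} tuples. The list
# may also hold entries put there by other code running in this process.
entries = Process.get()
has_power = {:power, 9000} in entries   # true

# delete/1 removes the key and returns the value it held (nil if none).
deleted = Process.delete(:power)     # 9000
after_delete = Process.get(:power)   # nil
```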

If you've worked with stateful GenServers, you probably have a pretty intuitive feel for how awkward the process dictionary is. To some degree, it's akin to global variables, which is why you'll find a lot of people saying you shouldn't use it. While that might be good general advice, it can be useful in specific cases.

For example, there are two cases where I make use of it.

Sensitive Data / Secrets

The first, and simplest, is to store sensitive data. Normal process state can be exposed via various mechanisms, such as calls to :sys.get_state/1 and in crash logs. Consider this minimalist GenServer:

defmodule MyProcess do
  use GenServer

  def init(_), do: {:ok, %{secret: "over 9000", other: 1}}
end

There are various ways to get the state:

> {:ok, pid} = GenServer.start(MyProcess, [])
> :sys.get_state(pid)
%{other: 1, secret: "over 9000"}

> GenServer.stop(pid, :crash)
00:00:01.234 [error] GenServer #PID<0.219.0> terminating
** (stop) :crash
Last message: []
State: %{other: 1, secret: "over 9000"}

Getting the state inside of the crash log, which can then be ingested into your log system, is probably the main thing you want to avoid. If we use the process dictionary and flag the process as sensitive, we can better protect our data:

def init(_) do
  Process.put(:secret, "over 9000")
  Process.flag(:sensitive, true)
  {:ok, %{other: 1}}
end

For this to work, you must call Process.flag(:sensitive, true). This will make the process dictionary unavailable via calls to Process.info or in crash dumps. Unfortunately, it also disables other information and features (tracing, stacks, pending messages). A common solution is to use a dedicated process for the sensitive information, which can be flagged as sensitive. This isolates the impact of the sensitive flag to this specific "secrets" process.
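A minimal sketch of such a dedicated process could look like the following. The SecretsSketch module, its fetch/1 function and the loader argument are all hypothetical names, not part of any library. The secret is loaded inside init by a function passed in, on the assumption that you'd rather not hand the raw value to start_link as an argument:

```elixir
defmodule SecretsSketch do
  use GenServer

  # `loader` is a zero-arity function that returns the secret,
  # e.g. fn -> System.fetch_env!("MY_SECRET") end (hypothetical variable name).
  def start_link(loader) when is_function(loader, 0) do
    GenServer.start_link(__MODULE__, loader)
  end

  # Other processes ask for the secret instead of holding it in their own state.
  def fetch(pid), do: GenServer.call(pid, :fetch)

  @impl true
  def init(loader) do
    # Flag first, so the dictionary is already protected when the secret lands in it.
    Process.flag(:sensitive, true)
    Process.put(:secret, loader.())

    # The regular GenServer state holds nothing sensitive.
    {:ok, nil}
  end

  @impl true
  def handle_call(:fetch, _from, state) do
    {:reply, Process.get(:secret), state}
  end
end

{:ok, pid} = SecretsSketch.start_link(fn -> "over 9000" end)
secret = SecretsSketch.fetch(pid)

# With the sensitive flag set, the dictionary should come back empty here.
info = Process.info(pid, :dictionary)
```

Only this one process loses tracing and stack information; the processes that call fetch/1 keep their normal debugging features. Keep in mind that the reply still carries the secret into the caller, so the caller shouldn't store it in its own state.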

Mixin State Management

In some cases, you might be using macros to implement shared functionality. For RabbitMQ, for instance, we re-use a "base" Consumer module across multiple consumers. If we wanted to track hits across various resources and periodically dump that to a database, we'd do something like:

defmodule MyApp.Consumers.Hits do
  use MyApp.Consumer

  def routes() do
    ["v1.someapp.hit"]
  end

  def setup() do
    Process.send_after(self(), :dump, 10_000)

    # this is our process state, we'll store the hits here, in memory, and
    # use the above timer to periodically persist it to the database

    %{}
  end

  def handle("v1.someapp.hit", message, hits) do
    Map.update(hits, message.resource, 1, fn existing -> existing + 1 end)
  end

  def handle_info(:dump, hits) do
    Process.send_after(self(), :dump, 10_000)

    # TODO: save our hits

    # reset our hits now that we've saved it
    {:noreply, %{}}
  end
end

MyApp.Consumer does a bunch of heavy lifting and, importantly, requires its own state (e.g. the RabbitMQ channel). Here's part of the init which MyApp.Consumer generates:

def init(opts) do
  conn = Keyword.fetch!(opts, :conn)
  {:ok, channel} = AMQP.Channel.open(conn)

  routes = routes()
  # TODO bind to routes and start our consumer

  state = setup()
  {:ok, state}
end

How do we add channel to the state? We could do this:

def init(opts) do
  ...
  state = setup()
  {:ok, %{consumer_state: state, channel: channel}}
end

But this requires awareness in all of our consumers. Our handle_info/2 above needs to be rewritten:

def handle_info(:dump, %{consumer_state: hits} = state) do
  Process.send_after(self(), :dump, 10_000)

  # TODO: save our hits

  # reset our state now that we've saved it
  {:noreply, %{state | consumer_state: %{}}}
end

This is easy to mess up and adds extra nesting to everything (e.g. pattern matching). There are other techniques you could try, but none are as simple and robust as using the process dictionary:

def init(opts) do
  conn = Keyword.fetch!(opts, :conn)
  {:ok, channel} = AMQP.Channel.open(conn)

  routes = routes()
  # TODO bind to routes and start our consumer

  Process.put(:channel, channel)
  state = setup()
  {:ok, state}
end

Now, the only "state" is the consumer state, and whenever the MyApp.Consumer code needs the channel, it can call Process.get(:channel).
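To make the shape of this concrete, here is a stripped-down sketch of how a "base" module might wire it up. ConsumerSketch and HitsSketch are hypothetical stand-ins, not the real MyApp.Consumer: the "channel" is just whatever term is passed in the options, so the example runs without RabbitMQ, and the {:hit, resource} cast and :channel call exist only to exercise it:

```elixir
defmodule ConsumerSketch do
  defmacro __using__(_opts) do
    quote do
      use GenServer

      @impl true
      def init(opts) do
        # The base module keeps its own data in the process dictionary...
        Process.put(:channel, Keyword.fetch!(opts, :channel))

        # ...so the GenServer state is exactly what the consumer's setup/0 returns.
        {:ok, setup()}
      end

      # Base-module code reads the channel without touching the consumer state.
      defp channel(), do: Process.get(:channel)

      @impl true
      def handle_call(:channel, _from, state), do: {:reply, channel(), state}
    end
  end
end

defmodule HitsSketch do
  use ConsumerSketch

  def setup(), do: %{}

  # The consumer works with its own state directly, with no extra nesting.
  @impl true
  def handle_cast({:hit, resource}, hits) do
    {:noreply, Map.update(hits, resource, 1, fn existing -> existing + 1 end)}
  end
end

{:ok, pid} = GenServer.start_link(HitsSketch, channel: :fake_channel)
GenServer.cast(pid, {:hit, "home"})
GenServer.cast(pid, {:hit, "home"})

hits = :sys.get_state(pid)
channel = GenServer.call(pid, :channel)
```

The consumer module never sees the channel in its state, and the base module never needs to know what shape the consumer's state takes.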

Conclusion

Process dictionaries shouldn't be the default option you use for statefulness. But they shouldn't be avoided at all costs. There are some problems and patterns that they're well suited for and there's no reason not to use them in those cases. The concept is simple and it's just a few functions. It's the type of thing that's worth knowing and keeping around in your head.