Globally distributed Elixir over Tailscale

I’ve been playing with deploying an Elixir Phoenix application to different clouds recently and started looking into simple solutions to cluster the nodes together.

I dabbled with Docker overlay networks to make nodes on different machines visible to each other on the same private network, and used libcluster's DNSPoll strategy to auto-discover them via DNS, which worked great!

But what if you want to cluster your machines across multiple different clouds, or even anywhere in the world on any device?

Going Global

Tailscale is a VPN service that makes the devices and applications you own accessible anywhere in the world, securely and effortlessly. Follow the instructions to download Tailscale for your system and start the tailscaled daemon.

When bringing up Tailscale on your node you need to specify a hostname, which we'll use as the service name for the cluster, so make sure it is the same on all servers. This assumes you're running one instance of the service per host. A later post will cover Docker containers, which allow multiple instances of the service on a single host.

For example, if your app is called hello then we’d bring up tailscale like this:

tailscale up --authkey=${TAILSCALE_AUTHKEY} --hostname=hello

Note: you can get your Auth key from your Tailscale settings. Choose a Reusable, Ephemeral key if you have any automation in place.

If successful, you should see your device in the Tailscale Dashboard.

Tailscale Device

Note: you might see your machine name as hello-1, hello-2 etc. if you have multiple services running with the same hostname, which is fine.

Now, if we repeat the same instructions on a second device (this could be your development machine), we'll end up with two devices connected to the same tailnet on Tailscale. Let's assume the first device has the Tailscale IP address 100.1.1.1 and the second gets 100.1.1.2. As long as both nodes are started with the same cookie, we can now connect two Elixir nodes over the tailnet like this:

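# On the first device (Tailscale IP 100.1.1.1)
iex --name hello@100.1.1.1 --cookie ${SECURE_COOKIE} -S mix

# On the second device (Tailscale IP 100.1.1.2)
iex --name hello@100.1.1.2 --cookie ${SECURE_COOKIE} -S mix
iex(hello@100.1.1.2)> Node.list()
[]

iex(hello@100.1.1.2)> Node.connect(:"hello@100.1.1.1")
true

iex(hello@100.1.1.2)> Node.list()
[:"hello@100.1.1.1"]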

This is great! We can now connect our Elixir nodes together from anywhere over the secure Tailscale network. But doing this by hand would be tedious in production deployments, so let's find a way to automate it.
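One piece of that manual work is finding each machine's Tailscale IP so you can build the --name value. As a small illustration, the sketch below shells out to `tailscale ip -4` and builds the node name from its output. The module and function names (NodeNameSketch, from_cli_output/2, detect/1) are made up for this post, and detect/1 assumes the tailscale CLI is on the PATH with tailscaled running and authenticated.

```elixir
defmodule NodeNameSketch do
  # Builds a long node name such as "hello@100.1.1.1" from the raw output
  # of `tailscale ip -4`, which prints the IPv4 address followed by a newline.
  def from_cli_output(appname, output) do
    appname <> "@" <> String.trim(output)
  end

  # Shells out to the tailscale CLI. Raises a MatchError if the command
  # exits with a non-zero status (for example when tailscaled is not running).
  def detect(appname) do
    {output, 0} = System.cmd("tailscale", ["ip", "-4"])
    from_cli_output(appname, output)
  end
end

# Pure part only, so this runs without Tailscale installed.
IO.puts(NodeNameSketch.from_cli_output("hello", "100.1.1.1\n"))
# prints: hello@100.1.1.1
```

The resulting string is what you would pass to iex --name on that machine.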

Automatic discovery

The community go-to for Elixir clustering is libcluster. It ships with a number of strategies out of the box and provides a flexible framework for adding your own.

I couldn't find an existing library that handles service discovery and node connection over Tailscale, so I wrote libcluster_tailscale. It provides a libcluster strategy that uses the Tailscale API to look up devices with a matching hostname and then automatically connects their nodes together.
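To make the idea concrete, here is a rough sketch of the matching step. This is not the library's actual code. It assumes the device list has already been fetched from the Tailscale API and decoded into maps with "hostname" and "addresses" keys (my understanding of the devices response shape, so treat it as an assumption). The module name TailscaleSketch and the function node_names/3 are hypothetical.

```elixir
defmodule TailscaleSketch do
  @moduledoc """
  Illustrative only: turns a decoded device list into Elixir node names.
  """

  # devices:  list of maps, assumed shape
  #           %{"hostname" => String.t(), "addresses" => [String.t()]}
  # hostname: the value passed to `tailscale up --hostname`
  # appname:  the part before the @ in `--name`
  def node_names(devices, hostname, appname) do
    for %{"hostname" => ^hostname, "addresses" => addresses} <- devices,
        ip = Enum.find(addresses, &ipv4?/1) do
      # Creating atoms dynamically is acceptable here because the set of
      # devices in a tailnet is small and bounded.
      :"#{appname}@#{ip}"
    end
  end

  # True when the string parses as an IPv4 address.
  defp ipv4?(address) do
    match?({:ok, {_, _, _, _}}, :inet.parse_address(String.to_charlist(address)))
  end
end

devices = [
  %{"hostname" => "hello", "addresses" => ["100.1.1.2", "fd7a:115c:a1e0::2"]},
  %{"hostname" => "other", "addresses" => ["100.1.1.9"]}
]

IO.inspect(TailscaleSketch.node_names(devices, "hello", "hello"))
# prints: [:"hello@100.1.1.2"]
```

Each name in the resulting list is something a strategy could hand to Node.connect/1, which is the same call we made by hand earlier.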

Using the hello example from earlier, we would provide the following configuration for libcluster:

config :libcluster,
  debug: true,
  topologies: [
    tailscale: [
      strategy: Cluster.Strategy.Tailscale,
      config: [
        authkey: "tskey-api-xxx-yyy",
        tailnet: "example.com",
        hostname: "hello",
        appname: "hello"
      ]
    ]
  ]
  • authkey is your Tailscale API key, which you can generate in your Tailscale settings. Note that this is an API key, not the auth key passed to tailscale up.
  • tailnet is the unique name of your tailnet, which you can find in your Tailscale settings (it is listed under Organization).
  • hostname is the name you gave Tailscale with --hostname when bringing it up on your device. It acts as a service name that identifies all the nodes belonging to a specific service. In production you might also include the service version in the hostname so that only nodes running the same version connect to each other, for example when deploying a new version alongside an existing one.
  • appname is the name part of the Elixir node name you provided with --name, e.g. hello in hello@100.1.1.1.

Note: the hostname and appname are the same in this scenario, but they would likely be different in reality.
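For the configuration to have any effect, libcluster also needs to be started as part of the application's supervision tree. A minimal sketch, following the pattern from the libcluster README, might look like this. The module names Hello.Application, Hello.ClusterSupervisor and Hello.Supervisor are placeholders for this example, and it assumes libcluster and libcluster_tailscale are listed as dependencies in mix.exs.

```elixir
defmodule Hello.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Read the topologies defined under `config :libcluster`.
    topologies = Application.get_env(:libcluster, :topologies, [])

    children = [
      # Listed first so clustering starts before the rest of the app.
      {Cluster.Supervisor, [topologies, [name: Hello.ClusterSupervisor]]}
      # ... the rest of the app's children (Repo, Endpoint, etc.)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: Hello.Supervisor)
  end
end
```

In a real deployment you would probably also want to read the API key from an environment variable rather than hard-coding it in the config file.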

Now when we start our Elixir node, because we enabled debug we’ll see output like this:

[info] [libcluster:tailscale] connected to :"hello@100.1.1.2"
[info] [libcluster:tailscale] connected to :"hello@100.1.1.3"

This now allows us to deploy our Elixir application anywhere in the world, on any cloud or bare metal server, and have the nodes automatically discover and connect to each other over Tailscale.

I’m going to cover a full example Phoenix application deployed using Docker and this Tailscale setup in my next blog post.

Let me know if you have any comments on Twitter or HN.