Using NixOS on an OpenStack Public Cloud
Tags: programming.nixos, programming.openstack, programming.terraform
Here are some notes on using NixOS in an OpenStack public cloud.
Recall, NixOS is an operating system which uses the Nix package manager to manage its system configuration.
Because NixOS configuration is declarative, it lends itself well to building cloud VM images.
OpenStack is a standard cloud computing platform. It offers services broadly similar to AWS’ EC2, S3, etc.
One option for deploying NixOS configurations to a cloud VM is to run a NixOS VM, and then switch that VM to the configuration you want. – If the cloud provider doesn’t have a NixOS VM image to run, you’ll have to build your own image.
General Approach
The most ‘challenging’ part of this is wanting to run a NixOS VM, but the cloud provider not having a public NixOS image.
It’s possible to build your own image in a format the cloud provider accepts.
The nix-community’s nixos-generators is a good place to start for this.
Many popular image formats are supported.
In the case of OpenStack images, there’s a generator specifically for that.
I guess for other cloud providers, some customisation may be required; I’d dig through the
code within nixos’ maintainers/scripts/ and modules/virtualisation/ to get an idea of what was done for the ones which are supported.
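As a quick, flake-less sketch (assuming flakes are enabled, and that a ./configuration.nix describing the system exists; see the nixos-generators README for the details), an image can be built with something like:
# a minimal sketch: run nixos-generate from the nixos-generators flake,
# targeting the OpenStack image format
nix run github:nix-community/nixos-generators -- \
  --format openstack \
  --configuration ./configuration.nix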
Using Terraform to Launch a VM on an OpenStack Public Cloud
I was doing this from a rather weak MacBook Air (Intel). I figured it’d be easier to build Linux images using Linux; so the first thing to do is launch a Linux VM in the cloud.
My experience with using OpenStack on public clouds is that the networking may be handled slightly differently from one public cloud to another.
(Relatedly, I used to hope that “Terraform works with different clouds” would translate to “plenty of Terraform code can be reused, making it easy to support a multi-cloud deployment; e.g. have the same service in both AWS and GCP”. Maybe for OpenStack public clouds this is largely true; but it’s unlikely that Terraform code for networking resources can be reused).
With AWS, I would think that an example of a simple Terraform task is “launch a VM with a publicly accessible IP”.
From what I’ve tried, it’s slightly trickier with OpenStack public clouds.
I had some spare credits with Cleura to use, and they offer a public cloud with an OpenStack API.
Managing OpenStack resources from outside the cloud console requires an OpenStack user. I found it convenient to download the RC file for the user, and use direnv to load those credentials.
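For example, the .envrc can simply source the downloaded RC file (the file name below is a placeholder for whatever the cloud console provides):
# .envrc -- direnv evaluates this with bash, so sourcing the RC file works;
# the RC file exports the usual OS_AUTH_URL / OS_USERNAME / OS_PASSWORD /
# OS_PROJECT_NAME / OS_REGION_NAME variables.
# (If the RC file prompts interactively for a password, it may need adjusting.)
source ./openstack-rc.sh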
Here’s the code listing for a main.tf to achieve this (with some notes below):
terraform {
  required_providers {
    openstack = {
      source  = "terraform-provider-openstack/openstack"
      version = "~> 1.49.0"
    }
  }
}

provider "openstack" {}

# Variables

variable "default_user_name" {
  description = "the name of the default user"
  type        = string
  default     = "debian"
}

variable "ssh_public_key" {
  description = "the SSH public key used to access the VM"
  type        = string
  default     = "ssh-ed25519 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
}

variable "allow_ssh_access_cidr" {
  description = "the CIDR to allow SSH access to. Defaults to 0.0.0.0/0 (unrestricted)"
  type        = string
  default     = "0.0.0.0/0"
}

variable "flavor" {
  description = "the compute flavor to use"
  type        = string
  default     = "1C-2GB-20GB"
}

variable "image_name" {
  description = "the name of the image the VM uses"
  type        = string
  default     = "Debian 11 Bullseye x86_64"
}

variable "instance_name" {
  description = "the name of the VM"
  type        = string
  default     = "debian"
}

# OpenStack Server flavor & image

data "openstack_compute_flavor_v2" "self" {
  name = var.flavor
}

data "openstack_images_image_v2" "self" {
  name        = var.image_name
  most_recent = true
}

# Networking

data "openstack_networking_network_v2" "ext" {
  name = "ext-net"
}

resource "openstack_networking_network_v2" "self" {
  name           = "terraform_vm_network"
  admin_state_up = "true"
}

resource "openstack_networking_subnet_v2" "self" {
  name       = "terraform_vm_subnet"
  network_id = openstack_networking_network_v2.self.id
  cidr       = "192.168.199.0/24"
  ip_version = 4
}

resource "openstack_networking_router_v2" "self" {
  name                = "terraform_vm_router"
  external_network_id = data.openstack_networking_network_v2.ext.id
}

resource "openstack_networking_router_interface_v2" "self" {
  router_id = openstack_networking_router_v2.self.id
  subnet_id = openstack_networking_subnet_v2.self.id
}

resource "openstack_networking_floatingip_v2" "self" {
  pool = data.openstack_networking_network_v2.ext.name
}

# Security

resource "openstack_compute_secgroup_v2" "allow_ssh" {
  name        = "allow_ssh"
  description = "allow SSH from the given CIDR"

  rule {
    from_port   = 22
    to_port     = 22
    ip_protocol = "tcp"
    cidr        = var.allow_ssh_access_cidr
  }
}

resource "openstack_compute_keypair_v2" "self" {
  name       = "terraform_keypair"
  public_key = var.ssh_public_key
}

# Instance

resource "openstack_compute_instance_v2" "self" {
  name            = var.instance_name
  flavor_id       = data.openstack_compute_flavor_v2.self.id
  key_pair        = openstack_compute_keypair_v2.self.name
  security_groups = [openstack_compute_secgroup_v2.allow_ssh.name]

  user_data = <<-USER
    #cloud-config
    system_info:
      default_user:
        name: ${var.default_user_name}
    chpasswd: { expire: false }
    ssh_pwauth: false
    package_upgrade: true
    manage_etc_hosts: localhost
    runcmd:
      - "curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install --no-confirm"
  USER

  block_device {
    uuid                  = data.openstack_images_image_v2.self.id
    source_type           = "image"
    volume_size           = 20 # GBs
    boot_index            = 0
    destination_type      = "volume"
    delete_on_termination = true
  }
}

resource "openstack_compute_floatingip_associate_v2" "self" {
  floating_ip = openstack_networking_floatingip_v2.self.address
  instance_id = openstack_compute_instance_v2.self.id
}

# Outputs

output "public_ipv4" {
  value = openstack_networking_floatingip_v2.self.address
}
Again, I found the networking details to be quite complicated.
The details were found by observing what resources were created when creating a VM with a public IP in the console.
Cleura uses the public network ext-net.
I’m not sure of the exact details, but: I create a private network with a subnet, and to get a public IP (i.e. a floating IP in ext-net) to route to the VM, I create a router for ext-net and a router interface which routes that to the private subnet. That public IP then gets associated with the VM.
Another part that can be annoying with OpenStack is specifying the VM resources (CPU/memory/storage).
It’s more flexible than AWS’ “c5.small/medium/large”.
One thing I found annoying is that it’s not quite so freeform as "xC-yGB-zGB" for arbitrary x, y, z; I had to list the flavors to find one.
(The impression I got was that the console lets you choose the number of CPUs/memory/disk, so the flavor is created on demand for that).
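For example, the available flavors can be listed with the openstack client (the names will differ from one cloud to another):
# list the flavors the cloud offers, to find a usable name
openstack flavor list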
Idiosyncrasies of OpenStack clouds aside, I think the other details are relatively straightforward.
The name "Debian 11 Bullseye x86_64"
comes from running openstack images list
.
(An easy way to get the openstack
client is to run nix shell nixpkgs#openstackclient
).
Looking at the user data passed to the VM…
#cloud-config
system_info:
  default_user:
    name: ${var.default_user_name}
chpasswd: { expire: false }
ssh_pwauth: false
package_upgrade: true
manage_etc_hosts: localhost
runcmd:
  - "curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install --no-confirm"
…this uses cloud-init, which declares some things we want set up in the VM.
The curl ... | sh installs Nix using the Determinate Systems installer.
This means the VM will have nix available shortly after the VM launches. (Running the command cloud-init status on the launched VM shows whether cloud-init has finished).
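When scripting against a freshly launched VM, it can be convenient to block until cloud-init is done; a sketch:
# wait for cloud-init (and hence the nix installer) to finish,
# then check that nix is available
cloud-init status --wait
nix --version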
Running the Terraform file involves (as is usual):
terraform apply
(and removing these resources with terraform destroy).
The public IP is an output, which allows SSH’ing into the VM with a command like:
ssh debian@(terraform output -json public_ipv4 | jq -r)
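(That uses fish’s command-substitution syntax; a bash/zsh equivalent would be something like:)
# bash/zsh equivalent of the fish command substitution above
ssh debian@"$(terraform output -json public_ipv4 | jq -r)"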
Building the NixOS Image for OpenStack
This is the flake.nix file. Below are some implementation notes, and the commands for building/uploading.
{
  inputs = {
    nixos-generators = {
      url = "github:nix-community/nixos-generators";
      inputs.nixpkgs.follows = "nixpkgs";
    };
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.11";
    rgoulter.url = "github:rgoulter/nix-user-repository";
  };

  outputs = {
    self,
    nixos-generators,
    nixpkgs,
    rgoulter,
    ...
  }: {
    packages."x86_64-linux".small-openstack = nixos-generators.nixosGenerate {
      pkgs = nixpkgs.legacyPackages."x86_64-linux";
      format = "openstack";
      modules = [
        rgoulter.nixosModules.cloud-interactive
        rgoulter.nixosModules.ssh
        rgoulter.nixosModules.ssh-users
        rgoulter.nixosModules.tailscale
      ];
    };
  };
}
Recall, a flake.nix file is more or less equivalent to a package.json/Cargo.toml/etc. project file, and is a standard entry point into a Nix codebase.
Here, the OpenStack image is declared as a package, using nixos-generators and its nixosGenerate function. (The OpenStack-specific part is the format = "openstack"; attribute. As mentioned above, you can dig into the details in nixos/, which nixos-generators makes use of).
The rgoulter.nixosModules refers to the modules in my nix-user-repository. (Though, you could just inline these all as one module in this flake.nix file; a sketch of an inline module follows below).
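For instance, a minimal inline module (a hypothetical sketch, not the contents of my modules) might look like:
modules = [
  # an inline NixOS module, instead of referencing rgoulter.nixosModules.*
  ({pkgs, ...}: {
    environment.systemPackages = [pkgs.git pkgs.tmux];
    services.openssh.enable = true;
  })
];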
e.g. cloud-interactive ensures that some CLI tools I like are installed:
{
  config,
  lib,
  pkgs,
  ...
}: {
  environment.systemPackages = with pkgs; [
    direnv
    fd
    fish
    git
    helix
    jq
    ripgrep
    starship
    tmux
  ];

  nix = {
    extraOptions = ''
      experimental-features = nix-command flakes
    '';
  };

  security.sudo.wheelNeedsPassword = false;
}
and ssh-users declares the user I log in with:
{
  config,
  lib,
  pkgs,
  ...
}: {
  users.users.rgoulter = {
    isNormalUser = true;
    extraGroups = [
      "wheel"
    ];
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOmQ9/u9qV9Vvy2pbcPtGiAmIrhXdi/vY6IesJ5RYpS4"
    ];
  };
}
(This doesn’t necessarily need to be hard coded into the image; but, for this use case, doing it this way is simple).
With this flake.nix, on a Linux computer with nix installed (and the flakes and nix-command features enabled, which the Determinate Systems installer does by default), the image is built by running a command like:
nix build .#small-openstack
(A quick & dirty approach is to just copy the flake.nix
file to the Linux VM we’re running. But, it’d also be possible to have a nixosGenerate
package in a flake.nix
in a repository else & refer to it using the appropriate flake URI).
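e.g. that would look something like (with a hypothetical repository URI):
# build the image from a flake hosted in another repository;
# the flake URI here is a placeholder
nix build "github:someuser/some-flake-repo#small-openstack"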
The resulting file is linked to by ./result/nixos.qcow2.
Uploading the Image
With the openstack client (use nix shell nixpkgs#openstackclient to make it available to the shell), and appropriate OpenStack credentials (e.g. copying the OpenStack user RC file over and source’ing it), the built image can be uploaded with:
openstack image create \
--private \
--disk-format qcow2 \
--container-format bare \
--file ./result/nixos.qcow2 \
my-nixos
which creates the OpenStack image my-nixos.
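The uploaded image can then be inspected with e.g.:
# check the image's status, size, and visibility
openstack image show my-nixos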
Launching a VM with Our Private NixOS Image
This is straightforward, by re-using the code from “Using Terraform to Launch a VM”.
Simply change the name of the image in the data.openstack_images_image_v2 block to "my-nixos" (e.g. change the value of the Terraform variable image_name). – The NixOS image ignores the user data.
Switching Configuration After Launch
There are use cases where it makes sense to ‘switch’ the NixOS configuration after the VM has launched.
There are plenty of options for doing this.
I liked using a command like:
ssh -A "rgoulter@${IP}" -- " \
sudo sh -c 'mkdir -p ~/.ssh && \
chmod 700 ~/.ssh && \
ssh-keyscan github.com >> ~/.ssh/known_hosts && \
echo switch to ${FLAKE_URI} && \
nixos-rebuild switch --flake ${FLAKE_URI}' \
"
I think another option, instead of doing the ssh-keyscan here, is to add the host key to programs.ssh.knownHosts in the NixOS configuration.
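A sketch of that (the public key value below is a placeholder, not GitHub’s actual key):
# pin github.com's host key in the NixOS configuration, instead of
# running ssh-keyscan at switch time; the key value is a placeholder
programs.ssh.knownHosts."github.com".publicKey =
  "ssh-ed25519 AAAA...placeholder...";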