Multi-AZ, remote backend, cinder-volume with OpenStack-Ansible


This article describes a common pattern we've been using at Osones and alter way for our customers deploying OpenStack with OpenStack-Ansible.

This pattern applies to the following context:

  • Multi-site (let's consider two) deployment, each site having its own (remote) block-storage solution (could be NetApp or similar, could be Ceph)
  • Each site will be an availability zone (AZ) in OpenStack, and in Cinder specifically
  • The control plane is spread across the two sites (typically: two controllers on one site, one controller on the other)

Cinder is the OpenStack Block Storage component. The cinder-volume process is the one interacting with the storage backend. With some drivers, such as LVM, the storage backend is local to the node where cinder-volume is running, but with drivers such as NetApp or Ceph, cinder-volume talks to a remote storage system. These two situations imply different architectures: in the first case cinder-volume runs on dedicated storage nodes; in the second case cinder-volume can perfectly well run alongside other control-plane services (API services, etc.), typically on controller nodes.
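As an illustration, here is what a cinder.conf could look like with one local (LVM) backend and one remote (Ceph RBD) backend; the section names, pool and user below are illustrative, not prescribed values:

      [DEFAULT]
      enabled_backends = lvm-local,ceph-remote

      [lvm-local]
      # storage local to the node running cinder-volume
      volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
      volume_group = cinder-volumes
      volume_backend_name = lvm-local

      [ceph-remote]
      # cinder-volume talks to a remote Ceph cluster
      volume_driver = cinder.volume.drivers.rbd.RBDDriver
      rbd_pool = volumes
      rbd_ceph_conf = /etc/ceph/ceph.conf
      rbd_user = cinder
      volume_backend_name = ceph-remote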

An important feature of Cinder is the fact that it can expose multiple volume types to the user. A volume type captures the idea of different technologies, or at least different settings or expectations (think: more or less performance, more or fewer replicas, etc.). A Cinder volume type maps to a Cinder backend as defined in a cinder-volume configuration. A single cinder-volume instance can definitely manage multiple backends, and that especially makes sense for remote backends (as defined previously).
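For instance, a volume type is tied to a given backend through the volume_backend_name property (the type and backend names here are illustrative):

      $ openstack volume type create ceph-standard
      $ openstack volume type set --property volume_backend_name=ceph-standard ceph-standard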

Now when one wants to make use of the Cinder availability zones feature, it's important to note that a single cinder-volume instance can only be dedicated to a single availability zone. In other words, you cannot have a single cinder-volume part of multiple availability zones.
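This is because the availability zone is set globally for the cinder-volume instance, in the [DEFAULT] section of its cinder.conf, hence one AZ per instance:

      [DEFAULT]
      storage_availability_zone = AZ1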

So in our multi-site context, with each site having its own storage solution (remote from Cinder's point of view) and cinder-volume running on the control plane, we'd be tempted to configure one cinder-volume with two backends. Unfortunately, due to the limitation mentioned earlier, this is not possible if we want to expose multiple availability zones. It is therefore required to have one cinder-volume per availability zone. This is in addition to having cinder-volume running on all the controller nodes (typically: three) for obvious HA reasons. So we would end up with two cinder-volume instances (one per AZ) on each controller node, six in total.
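Once such a deployment is up, listing the cinder-volume services should show six of them, three per availability zone, along the lines of (host names illustrative):

      $ openstack volume service list --service cinder-volume
      +---------------+-------------------------+------+
      | Binary        | Host                    | Zone |
      +---------------+-------------------------+------+
      | cinder-volume | controller-01@ceph-az1  | AZ1  |
      | cinder-volume | controller-01@ceph-az2  | AZ2  |
      | ...           | ...                     | ...  |
      +---------------+-------------------------+------+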

This is when OpenStack-Ansible and its default architecture come in handy. OpenStack-Ansible runs most of the OpenStack services (and some non-OpenStack ones as well) inside LXC containers. When using remote backends, it makes sense to run cinder-volume in LXC containers, on control plane nodes. Luckily, OpenStack-Ansible makes it just as easy to run one or many cinder-volume (or anything else, really) LXC containers per host (controller node), using the affinity option.

/etc/openstack_deploy/openstack_user_config.yml example to deploy two cinder-volume LXC containers per controller (IP addresses are illustrative):

      storage_hosts:
        controller-01:
          ip: 172.29.236.11
          affinity:
            cinder_volumes_container: 2
        controller-02:
          ip: 172.29.236.12
          affinity:
            cinder_volumes_container: 2
        controller-03:
          ip: 172.29.236.13
          affinity:
            cinder_volumes_container: 2

Then, thanks to the host_vars mechanism, it's also easy to push the specific availability zone configuration, as well as the backend configuration, to each cinder-volume. For example in the file /etc/openstack_deploy/host_vars/controller-01_cinder_volumes_container-fd0e1ad3.yml (named after the LXC container):

      cinder_storage_availability_zone: AZ1
      # backend configuration
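Filling in the backend part, such a host_vars file could look as follows, assuming a Ceph backend (cinder_storage_availability_zone and cinder_backends are variables of the OpenStack-Ansible os_cinder role; backend, pool and user names are illustrative):

      cinder_storage_availability_zone: AZ1
      cinder_backends:
        ceph-az1:
          volume_driver: cinder.volume.drivers.rbd.RBDDriver
          volume_backend_name: ceph-az1
          rbd_pool: volumes
          rbd_ceph_conf: /etc/ceph/ceph.conf
          rbd_user: cinder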

You end up with each controller being able to manage both storage backends in the two sites, which is quite good from a cloud infrastructure HA perspective, while correctly exposing the availability zone information to the user.
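On the user side, the availability zone can then simply be picked at volume creation time (volume type name illustrative):

      $ openstack volume create --size 10 --availability-zone AZ1 --type ceph-standard my-volume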
