Capsule on AKS


1. Foreword

This is the first article in a series on multi-tenancy topics.

Kubernetes is a powerful, flexible and scalable container orchestration platform that can be used to deploy and manage large-scale applications. However, Kubernetes can be expensive and difficult to manage, especially when multiple teams or organizations use it at the same time.

One solution is to use Capsule.

Capsule is a Kubernetes multi-tenancy framework that facilitates the management of shared Kubernetes clusters.

2. Table of Contents

3. Benefits of Capsule

  • Multi-tenancy: Capsule enables a multi-tenant architecture within a Kubernetes cluster. It lets you create logical "partitions" grouping a set of users, namespaces, and their associated rights and quotas. This is very useful for letting different populations or environments coexist within the same cluster, and avoids having to provision dedicated clusters.

  • Policies: rule-based management of resources, access, quotas, limits, etc. on clusters.

  • Declarative mode: perfect for managing everything via GitOps.

  • Lightweight: a minimalist, microservices-based architecture.

  • A CNCF project: currently at Sandbox level: https://www.cncf.io/projects/capsule/

4. How does it work?

  • A controller (Capsule Controller) will manage a Custom Resource of type Tenant.

Within each tenant, the attached users will be able to create their own namespaces and share all the resources assigned to them.

  • A policy engine (Capsule Policy Engine) will be responsible for isolating tenants from each other. Network and Security Policies, Resource Quota, Limit Ranges, RBAC, and other policies defined at the Tenant level are inherited by all tenant namespaces.

Once the controller is deployed, you can run kubectl explain tenant.spec to see all the parameters that can be set on a tenant.

This provides a mode of operation that allows users to independently manage their space without intervention from a cluster administrator.
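As an illustration of the concepts above, a minimal Tenant resource might look like this (names are hypothetical; run kubectl explain tenant.spec against your cluster for the authoritative field list):

```yaml
# Hypothetical minimal Tenant: the "oil-group" group owns the tenant,
# its members can create namespaces in it, and a quota caps how many.
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: oil
spec:
  owners:
  - name: oil-group
    kind: Group
  namespaceOptions:
    quota: 3
```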

5. Using with Azure Kubernetes Service (AKS) managed service

We will use a managed AKS service with Azure Active Directory (now Microsoft Entra ID) integration for our POC.


6. POC

6.1. Creating an AKS cluster with AAD integration

6.1.1. Creating different groups in Azure AD (Preparation)

We will use the Azure CLI for simplicity (everything can also be done with OpenTofu).

6.1.1.1. Creating the Capsule Admin group

CAPSULE_ADMIN_GROUP_ID=$(az ad group create \
  --display-name capsuleAdminGroup \
  --mail-nickname capsuleAdminGroup \
  --query id \
  --output tsv)

# Check
echo $CAPSULE_ADMIN_GROUP_ID
03c9ccac-xxx-xxx-xx-xxxxx

6.1.1.2. Creating the user who will be the "cluster admin"

💡Note: You will need to adapt the domain with your data (here @microsoftalterway.onmicrosoft.com):

CAPSULE_ADMIN_USER_NAME="capsule-admin@microsoftalterway.onmicrosoft.com"
CAPSULE_ADMIN_USER_PASSWORD="@#Temporary:Password#@"

CAPSULE_ADMIN_USER_ID=$(az ad user create \
  --display-name ${CAPSULE_ADMIN_USER_NAME} \
  --user-principal-name ${CAPSULE_ADMIN_USER_NAME} \
  --password ${CAPSULE_ADMIN_USER_PASSWORD} \
  --query id -o tsv)

az ad group member add \
  --group capsuleAdminGroup \
  --member-id $CAPSULE_ADMIN_USER_ID

6.1.2. Creating the capsuleDevGroup group

CAPSULE_DEV_GROUP_ID=$(az ad group create \
  --display-name capsuleDevGroup \
  --mail-nickname capsuleDevGroup \
  --query id \
  --output tsv)

6.1.3. Creating a user in the capsuleDevGroup group

This user will be considered the owner of the DEV tenant

CAPSULE_DEV_USER_NAME="capsule-user-dev@microsoftalterway.onmicrosoft.com"
CAPSULE_DEV_USER_PASSWORD="@#Temporary:Password#@"

CAPSULE_DEV_USER_ID=$(az ad user create \
  --display-name ${CAPSULE_DEV_USER_NAME} \
  --user-principal-name ${CAPSULE_DEV_USER_NAME} \
  --password ${CAPSULE_DEV_USER_PASSWORD} \
  --query id -o tsv)

az ad group member add \
  --group capsuleDevGroup \
  --member-id $CAPSULE_DEV_USER_ID

6.1.4. Creating the capsuleStagingGroup group

CAPSULE_STAGING_GROUP_ID=$(az ad group create \
  --display-name capsuleStagingGroup \
  --mail-nickname capsuleStagingGroup \
  --query id \
  --output tsv)

6.1.5. Creating a user in the capsuleStagingGroup group

This user will be considered the owner of the STAGING tenant

CAPSULE_STAGING_USER_NAME="capsule-user-staging@microsoftalterway.onmicrosoft.com"
CAPSULE_STAGING_USER_PASSWORD="@#Temporary:Password#@"

CAPSULE_STAGING_USER_ID=$(az ad user create \
  --display-name ${CAPSULE_STAGING_USER_NAME} \
  --user-principal-name ${CAPSULE_STAGING_USER_NAME} \
  --password ${CAPSULE_STAGING_USER_PASSWORD} \
  --query id -o tsv)

az ad group member add \
  --group capsuleStagingGroup \
  --member-id $CAPSULE_STAGING_USER_ID

6.1.6. Creating the capsuleGroup group

💡 Note: All tenant owning users and groups must be in this group!

CAPSULE_GROUP_ID=$(az ad group create \
  --display-name capsuleGroup \
  --mail-nickname capsuleGroup \
  --query id \
  --output tsv)

# Dev Group assignment
az ad group member add \
  --group capsuleGroup \
  --member-id $CAPSULE_DEV_USER_ID

# Staging Group assignment
az ad group member add \
  --group capsuleGroup \
  --member-id $CAPSULE_STAGING_USER_ID

(Azure portal screenshots of the created users, groups, and memberships)


6.1.7. Creating an AKS cluster

We will create an AKS cluster with AAD integration, RBAC enabled, and a public API endpoint.

Note that we are not trying to build an AKS cluster following best practices here: we let Azure set default values for many resources (VNet, subnet, VM size, ...).

When creating, we will specify which user groups have cluster admin privileges.

💡 Note: You will need to adapt the domain with your data (here @microsoftalterway.onmicrosoft.com):

❗️ Remember the ID ($CAPSULE_ADMIN_GROUP_ID)

# resource-group
az group create --name aw-capsule --location francecentral


# aks cluster
az aks create \
  --resource-group aw-capsule \
  --node-resource-group aw-capsule-vm \
  --name aw-capsule \
  --enable-aad \
  --enable-azure-rbac \
  --aad-admin-group-object-ids $CAPSULE_ADMIN_GROUP_ID \
  --network-plugin azure \
  --network-policy calico

💡 Note: You need to install kubelogin to authenticate to this cluster.

https://aka.ms/aks/kubelogin

6.1.8. Retrieving cluster kubeconfig for administration

# 1: If you want to put it in your ~/.kube/config file
az aks get-credentials --resource-group aw-capsule --name aw-capsule

# 2: If you want to put it in a separate file (you will then need to use the --kubeconfig flag or the KUBECONFIG variable to point to the cluster)

az aks get-credentials --resource-group aw-capsule --name aw-capsule --file ~/.kube/aw-capsule-config

For the POC I will use the second solution (2:)

export KUBECONFIG=~/.kube/aw-capsule-config

kubectl get nodes

To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code XXXXXX to authenticate.
  • Use the account capsule-admin@microsoftalterway.onmicrosoft.com and its password (@#Temporary:Password#@)

you should have in your browser window:

Azure Kubernetes Service AAD Client
You have signed in to the application Azure Kubernetes Service AAD Client on your device. You can now close this window.

And the command should work 😀

 kubectl get nodes
NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-12585846-vmss000000   Ready    agent   6m   v1.26.6
aks-nodepool1-12585846-vmss000001   Ready    agent   6m   v1.26.6
aks-nodepool1-12585846-vmss000002   Ready    agent   6m   v1.26.6

6.2. Adding subscription-level IAM roles

The following roles must be assigned to the two groups (dev and staging): "Reader" and "Azure Kubernetes Service Cluster User Role", so that their members can download a kubeconfig with az aks get-credentials.
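As a sketch (SUBSCRIPTION_ID and the group ID variables from the earlier steps are assumed to be set; the resource-group scope is an assumption, adapt it to your subscription):

```shell
# Sketch: assign the roles needed to fetch a kubeconfig for each group.
SCOPE="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/aw-capsule"

for GROUP_ID in "$CAPSULE_DEV_GROUP_ID" "$CAPSULE_STAGING_GROUP_ID"; do
  az role assignment create --assignee "$GROUP_ID" \
    --role "Reader" --scope "$SCOPE"
  az role assignment create --assignee "$GROUP_ID" \
    --role "Azure Kubernetes Service Cluster User Role" --scope "$SCOPE"
done
```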


6.3. Installing Capsule

6.3.1. Using cluster admin kubeconfig

❗️As a reminder:

az aks get-credentials --resource-group aw-capsule --name aw-capsule --file ~/.kube/aw-capsule-config
export KUBECONFIG=~/.kube/aw-capsule-config

6.3.2. Installing Capsule operator helm chart

6.3.2.1. Chart repository reference

helm repo add clastix https://clastix.github.io/charts

# If the repo was already added
helm repo update

6.3.2.2. Deploying the chart

You will need the id of the capsuleGroup group.

# Reminder: retrieve the ID if you no longer have it
CAPSULE_GROUP_ID=$(az ad group show \
  --group capsuleGroup \
  --query id \
  --output tsv)

Capsule must know the authorized groups it will work with.

We need to register the object ID of the capsuleGroup Azure AD group as a Capsule user group under CapsuleConfiguration

helm upgrade --install capsule clastix/capsule \
   -n capsule-system \
   --create-namespace \
   --set manager.options.forceTenantPrefix=true \
   --set "manager.options.capsuleUserGroups[0]=$CAPSULE_GROUP_ID"

👀 Check:

 kubectl get deploy -n capsule-system
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
capsule-controller-manager   1/1     1            1           78s
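Under the hood, these chart values are rendered into a CapsuleConfiguration resource, roughly like this (a sketch; the placeholder stands for the capsuleGroup object ID registered above):

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: CapsuleConfiguration
metadata:
  name: default
spec:
  forceTenantPrefix: true
  userGroups:
  - <capsuleGroup object ID>   # Azure AD group object ID
```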

6.3.3. Installing Capsule Proxy helm chart

Capsule Proxy is an add-on module for the Capsule Operator that solves some RBAC issues when enabling multi-tenancy in Kubernetes, as users cannot list cluster-scoped resources.

Kubernetes RBAC cannot restrict cluster-wide list operations to only the resources a user owns, since there is no API filtered by ACL. For example:

  • kubectl get namespaces fails even if the user has permissions.

How Capsule Proxy works

                +-----------+          +-----------+         +-----------+
 kubectl ------>|:443       |--------->|:9001      |-------->|:6443      |
                +-----------+          +-----------+         +-----------+
                ingress-controller     capsule-proxy         kube-apiserver
                load-balancer
                nodeport
                HostPort...

Capsule proxy can be exposed in different ways:

  • Ingress
  • NodePort Service
  • LoadBalancer Service
  • HostPort
  • HostNetwork

In our case we will use a LoadBalancer service, which on Azure gives us a public IP. We will also use a feature that creates an FQDN of the form [name].francecentral.cloudapp.azure.com, allowing us to reach the capsule-proxy service directly via a distinct URL.

❗️As a reminder, retrieve the admin group ID if you no longer have it:

CAPSULE_ADMIN_GROUP_ID=$(az ad group show \
  --group capsuleAdminGroup \
  --query id \
  --output tsv)

helm upgrade --install capsule-proxy clastix/capsule-proxy \
   -n capsule-system \
   --set service.type=LoadBalancer \
   --set service.port=443 \
   --set options.oidcUsernameClaim=unique_name \
   --set "options.ignoredUserGroups[0]=$CAPSULE_ADMIN_GROUP_ID" \
   --set "options.additionalSANs[0]=capsule-proxy.francecentral.cloudapp.azure.com" \
   --set service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"=capsule-proxy

💡 Note: You will need to adapt the SAN: capsule-proxy.francecentral.cloudapp.azure.com and the annotation

👀 Check:

 kubectl get po,secrets,svc -n capsule-system
NAME                                             READY   STATUS    RESTARTS   AGE
pod/capsule-controller-manager-c98c8fb88-7xzhm   1/1     Running   0          4m11s
pod/capsule-proxy-945bc469d-lc9jz                1/1     Running   0          2m

NAME                                         TYPE                 DATA   AGE
secret/capsule-proxy                         Opaque               3      116s
secret/capsule-tls                           Opaque               3      4m11s
secret/sh.helm.release.v1.capsule-proxy.v1   helm.sh/release.v1   1      2m
secret/sh.helm.release.v1.capsule.v1         helm.sh/release.v1   1      4m11s

NAME                                                 TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
service/capsule-controller-manager-metrics-service   ClusterIP      10.0.213.12    <none>        8080/TCP        4m11s
service/capsule-proxy                                LoadBalancer   10.0.2.95      20.199.4.73   443:30350/TCP   2m
service/capsule-proxy-metrics-service                ClusterIP      10.0.163.163   <none>        8080/TCP        2m
service/capsule-webhook-service                      ClusterIP      10.0.9.76      <none>        443/TCP         4m11s


curl -k https://capsule-proxy.francecentral.cloudapp.azure.com
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message

6.4. Creating Tenants

6.4.1. DEV

# The owner of the tenant is the capsuleDevGroup group, so retrieve its object ID
CAPSULE_DEV_GROUP_ID=$(az ad group show --group capsuleDevGroup --query id -o tsv)

echo $CAPSULE_DEV_GROUP_ID

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: dev
spec:
  owners:
  - name: ${CAPSULE_DEV_GROUP_ID}
    kind: Group
EOF
tenant.capsule.clastix.io/dev created

6.4.2. Staging

# The owner of the tenant is the capsuleStagingGroup group, so retrieve its object ID
CAPSULE_STAGING_GROUP_ID=$(az ad group show --group capsuleStagingGroup --query id -o tsv)

echo $CAPSULE_STAGING_GROUP_ID

kubectl apply -f - << EOF
apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: staging
spec:
  owners:
  - name: ${CAPSULE_STAGING_GROUP_ID}
    kind: Group
EOF
tenant.capsule.clastix.io/staging created

👀 Check:

kubectl get tenants.capsule.clastix.io

NAME      STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR   AGE
dev       Active                     0                                 1m39s
staging   Active                     0                                 5s

6.5. Using Tenants

We will now access the cluster as the capsule-user-dev user, who is attached to the dev tenant via the capsuleDevGroup group.

If you use the same user account on your local machine, I advise you to run the following commands before starting, to make sure you are logged in as the right user. This is not necessary if you log in with a different user.

az logout
kubelogin remove-tokens

6.5.1. Retrieving kubeconfig for dev users

az login
az aks get-credentials --resource-group aw-capsule --name aw-capsule --file ~/.kube/dev-capsule-config

✅ You will need to log in with capsule-user-dev

export KUBECONFIG=~/.kube/dev-capsule-config

👀 Check:

❯ az logout
❯ kubelogin remove-tokens
❯ az login
A web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize. Please continue the login in the web browser. If no web browser is available or if the web browser fails to open, use device code flow with `az login --use-device-code`.
CloudName    HomeTenantId                          IsDefault    Name                     State    TenantId
-----------  ------------------------------------  -----------  -----------------------  -------  ------------------------------------
AzureCloud   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  True         Conso Interne Alter Way  Enabled  xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx

❯ az aks get-credentials --resource-group aw-capsule --name aw-capsule --file ~/.kube/dev-capsule-config
Merged "aw-capsule" as current context in /Users/hleclerc/.kube/dev-capsule-config

6.5.2. Using the capsule-user-dev user

If you type a kubectl command, you will have to authenticate to use the Kubernetes cluster via its public API.

 export KUBECONFIG=~/.kube/dev-capsule-config
❯ kubectl get po
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code G7RCDSJW8 to authenticate.

Try a simple command

 kubectl get po
Error from server (Forbidden): pods is forbidden: User "capsule-user-dev@microsoftalterway.onmicrosoft.com" cannot list resource "pods" in API group "" in the namespace "default": User does not have access to the resource in Azure. Update role assignment to allow access.

This error is normal because in the dev tenant you do not have access to the default namespace because it does not belong to the tenant.

However, if you run the command:

 kubectl create ns dev-demo
namespace/dev-demo created

There is no error because you are in the dev tenant, you are the owner of this tenant, and you have just created a dev-demo namespace

💡 Note: Capsule forces you to prefix all namespace names with dev-
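With forceTenantPrefix=true (set when we installed the chart), the admission webhook only accepts namespace names of the form <tenant>-<suffix>. A pure-shell sketch of this naming rule, needing no cluster:

```shell
# Illustration only: mimic the forceTenantPrefix naming check locally.
tenant="dev"
for ns in dev-demo demo staging-app; do
  case "$ns" in
    "${tenant}-"*) echo "$ns: accepted" ;;
    *)             echo "$ns: rejected" ;;
  esac
done
```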

If you run the command:

 kubectl get ns
Error from server (Forbidden): namespaces is forbidden: User "capsule-user-dev@microsoftalterway.onmicrosoft.com" cannot list resource "namespaces" in API group "" at the cluster scope: User does not have access to the resource in Azure. Update role assignment to allow access.

This is why we will use capsule-proxy to list cluster-scoped objects (such as namespaces).

We will modify the kubeconfig file to change the server FQDN and point it at the proxy: capsule-proxy.francecentral.cloudapp.azure.com, or a custom domain, for example https://capsule-proxy.caas.fr:443

# Get cluster name
kubectl config get-clusters

# Modify API url
kubectl config set-cluster aw-capsule --server=https://capsule-proxy.caas.fr:443

# Get user name
kubectl config get-users

# Modify user if needed in context (clusterUser_aw-capsule_aw-capsule)
kubectl config set-context aw-capsule --cluster=aw-capsule --user=clusterUser_aw-capsule_aw-capsule

If you run a command like

 kubectl get ns

You will most likely get an error Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority.

To avoid this, use the --insecure-skip-tls-verify flag.

 kubectl get ns --insecure-skip-tls-verify
I0928 09:10:45.534157    3741 versioner.go:58] Get https://capsule-proxy.francecentral.cloudapp.azure.com:443/version?timeout=5s: x509: certificate signed by unknown authority


NAME       STATUS   AGE
dev-demo   Active   17m

And there you go! 😀😏💪

Congratulations! You have your first namespace in the dev tenant.

You only see components that belong to your tenant.

💡 Note:

I used certbot to quickly generate certificates for the capsule-proxy.caas.fr FQDN.

certbot -d capsule-proxy.caas.fr --manual --preferred-challenges dns certonly

If you want to avoid the X509 error, you will need to replace the certificate-authority-data in the kubeconfig with the value of the tls.crt you used (base64 encoded)

I took the fullchain.pem that certbot generated

Then you need to update:

  • The capsule-proxy secret in the capsule-system namespace:

❯ kubectl -n capsule-system delete secret capsule-proxy
❯ kubectl -n capsule-system create secret tls capsule-proxy --cert=./tls.crt --key=./tls.key

  • The capsule-proxy helm release:
helm upgrade --install capsule-proxy clastix/capsule-proxy \
   -n capsule-system \
   --set service.type=LoadBalancer \
   --set service.port=443 \
   --set options.oidcUsernameClaim=unique_name \
   --set "options.ignoredUserGroups[0]=$CAPSULE_ADMIN_GROUP_ID" \
   --set "options.additionalSANs[0]=capsule-proxy.caas.fr" \
   --set service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"=capsule-proxy \
   --set options.generateCertificates=false

The CA is now trusted: no more certificate errors.


7. Managing rules at the tenant level

7.1. Network policy

If network policies are defined at the tenant level, they will be propagated to all tenant namespaces.

An example:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: dev
spec:
  ingressOptions:
    hostnameCollisionScope: Disabled
  networkPolicies:
    items:
      - egress:
          - to:
              - ipBlock:
                  cidr: 0.0.0.0/0
        ingress:
          - from:
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: dev
        podSelector: {}
        policyTypes:
          - Ingress
          - Egress
  owners:
    - clusterRoles:
        - admin
        - capsule-namespace-deleter
      kind: Group
      name: 0fd5e5bb-39bb-468c-b343-dcdfb01e92d0
  resourceQuotas:
    scope: Tenant

If we apply these changes to the dev tenant, then:

  • Only pods from the tenant namespaces will be able to communicate with each other and consume other "services" internal or external to the cluster
  • No pod from outside the tenant will be able to access dev tenant pods
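To check this isolation, you could try reaching a dev workload from inside and from outside the tenant (a sketch: the my-app service and the namespace names are hypothetical, adapt them to your deployment):

```shell
# From a namespace of the dev tenant: traffic should be allowed.
kubectl -n dev-demo run probe --rm -it --image=busybox --restart=Never \
  -- wget -qO- --timeout=2 http://my-app.dev-app.svc.cluster.local

# From a namespace outside the tenant: traffic should be blocked.
kubectl -n default run probe --rm -it --image=busybox --restart=Never \
  -- wget -qO- --timeout=2 http://my-app.dev-app.svc.cluster.local
```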

We can see the automatic addition of network policies by the capsule controller.

 kubectl get netpol -A
NAMESPACE     NAME                 POD-SELECTOR             AGE
dev-app       capsule-dev-0        <none>                7m59s
dev-demo      capsule-dev-0        <none>                7m45s
kube-system   konnectivity-agent   app=konnectivity-agent   8h



8. References

  • Capsule Website: https://clastix.io/capsule/
  • Capsule Documentation: https://capsule.clastix.io/docs/
  • Capsule GitHub Repository: https://github.com/clastix/capsule
