NATS Streaming Cluster with FT Mode

NATS Streaming Cluster with FT Mode on AWS

Preparation

First, we need a Kubernetes cluster with a provider that offers a service with a ReadWriteMany filesystem available. In this short guide, we will create the cluster on AWS and then use EFS for the filesystem:

# Create a 3-node Kubernetes cluster
# (a smaller instance type such as t3.small also works)
eksctl create cluster --name stan-k8s \
  --nodes 3 \
  --node-type=t3.large \
  --region=us-east-2

# Get the credentials for your cluster
eksctl utils write-kubeconfig --name stan-k8s --region us-east-2

For FT mode to work, we will need to create an EFS volume that can be shared by more than one pod. Go into the AWS console and create one, and make sure that it is in a security group to which the Kubernetes nodes have access. For clusters created via eksctl, this will be a security group named ClusterSharedNodeSecurityGroup.
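Alternatively, the same setup can be sketched with the AWS CLI. The creation token, subnet ID, and security group ID below are placeholders you would replace with your cluster's actual values:

```shell
# Create the EFS filesystem (the creation token is an arbitrary idempotency key)
aws efs create-file-system --creation-token stan-k8s-efs --region us-east-2

# Create a mount target in each subnet used by the worker nodes,
# attached to the ClusterSharedNodeSecurityGroup of the eksctl cluster.
# subnet-XXXXXXXX and sg-XXXXXXXX are placeholders.
aws efs create-mount-target \
  --file-system-id fs-c22a24bb \
  --subnet-id subnet-XXXXXXXX \
  --security-groups sg-XXXXXXXX \
  --region us-east-2
```

Repeat the create-mount-target step for every availability zone in which the nodes run, so that each node can reach the filesystem over NFS.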

Creating the EFS provisioner

Confirm the FileSystemID of the EFS volume and its DNS name; we will use those values to create an EFS provisioner controller within the K8S cluster:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: efs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-efs-provisioner
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: efs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-efs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-efs-provisioner
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-efs-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: efs-provisioner
data:
  file.system.id: fs-c22a24bb
  aws.region: us-east-2
  provisioner.name: synadia.com/aws-efs
  dns.name: ""
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: efs-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: efs-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: efs-provisioner
    spec:
      serviceAccountName: efs-provisioner
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest
          env:
            - name: FILE_SYSTEM_ID
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: file.system.id
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: aws.region
            - name: DNS_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: dns.name

            - name: PROVISIONER_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: provisioner.name
          volumeMounts:
            - name: pv-volume
              mountPath: /efs
      volumes:
        - name: pv-volume
          nfs:
            server: fs-c22a24bb.efs.us-east-2.amazonaws.com
            path: /
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: synadia.com/aws-efs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

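Save the manifest above to a file (the name efs-provisioner.yaml is just a choice for this guide) and deploy it:

```shell
kubectl apply -f efs-provisioner.yaml
```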
Result of deploying the manifest:

serviceaccount/efs-provisioner                                        created
clusterrole.rbac.authorization.k8s.io/efs-provisioner-runner          created
clusterrolebinding.rbac.authorization.k8s.io/run-efs-provisioner      created
role.rbac.authorization.k8s.io/leader-locking-efs-provisioner         created
rolebinding.rbac.authorization.k8s.io/leader-locking-efs-provisioner  created
configmap/efs-provisioner                                             created
deployment.extensions/efs-provisioner                                 created
storageclass.storage.k8s.io/aws-efs                                   created
persistentvolumeclaim/efs                                             created

Setting up the NATS Streaming cluster

Now create a NATS Streaming cluster with FT mode enabled, running NATS in embedded mode and mounting the EFS volume:

---
apiVersion: v1
kind: Service
metadata:
  name: stan
  labels:
    app: stan
spec:
  selector:
    app: stan
  clusterIP: None
  ports:
    - name: client
      port: 4222
    - name: cluster
      port: 6222
    - name: monitor
      port: 8222
    - name: metrics
      port: 7777
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: stan-config
data:
  stan.conf: |
    http: 8222

    cluster {
      port: 6222
      routes [
        nats://stan-0.stan:6222
        nats://stan-1.stan:6222
        nats://stan-2.stan:6222
      ]
      cluster_advertise: $CLUSTER_ADVERTISE
      connect_retries: 10
    }

    streaming {
      id: test-cluster
      store: file
      dir: /data/stan/store
      ft_group_name: "test-cluster"
      file_options {
          buffer_size: 32mb
          sync_on_flush: false
          slice_max_bytes: 512mb
          parallel_recovery: 64
      }
      store_limits {
          max_channels: 10
          max_msgs: 0
          max_bytes: 256gb
          max_age: 1h
          max_subs: 128
      }  
    }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stan
  labels:
    app: stan
spec:
  selector:
    matchLabels:
      app: stan
  serviceName: stan
  replicas: 3
  template:
    metadata:
      labels:
        app: stan
    spec:
      # STAN Server
      terminationGracePeriodSeconds: 30

      containers:
        - name: stan
          image: nats-streaming:alpine

          ports:
            # In case of NATS embedded mode expose these ports
            - containerPort: 4222
              name: client
            - containerPort: 6222
              name: cluster
            - containerPort: 8222
              name: monitor
          args:
            - "-sc"
            - "/etc/stan-config/stan.conf"

          # Required to be able to define an environment variable
          # that refers to other environment variables.  This env var
          # is later used as part of the configuration file.
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CLUSTER_ADVERTISE
              value: $(POD_NAME).stan.$(POD_NAMESPACE).svc
          volumeMounts:
            - name: config-volume
              mountPath: /etc/stan-config
            - name: efs
              mountPath: /data/stan
          resources:
            requests:
              cpu: 0
          livenessProbe:
            httpGet:
              path: /
              port: 8222
            initialDelaySeconds: 10
            timeoutSeconds: 5
        - name: metrics
          image: synadia/prometheus-nats-exporter:0.5.0
          args:
            - -connz
            - -routez
            - -subz
            - -varz
            - -channelz
            - -serverz
            - http://localhost:8222
          ports:
            - containerPort: 7777
              name: metrics
      volumes:
        - name: config-volume
          configMap:
            name: stan-config
        - name: efs
          persistentVolumeClaim:
            claimName: efs

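Save the Service, ConfigMap, and StatefulSet above into a single file (assumed here to be stan-server.yaml) and deploy it:

```shell
kubectl apply -f stan-server.yaml

# Watch the pods come up (they carry the app=stan label from the manifest)
kubectl get pods -l app=stan -w
```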
Your cluster will now look something like this:

kubectl get pods
NAME                                     READY   STATUS    RESTARTS   AGE
efs-provisioner-6b7866dd4-4k5wx          1/1     Running   0          21m
stan-0                                   2/2     Running   0          6m35s
stan-1                                   2/2     Running   0          4m56s
stan-2                                   2/2     Running   0          4m42s

If everything was set up properly, one of the servers will be the active node.
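To find out which server is currently active without reading through the logs, you can query the streaming monitoring endpoint of each pod; the active server should report a state of FT_ACTIVE while the others report FT_STANDBY. A sketch, assuming the monitor port 8222 from the manifest above:

```shell
# Query the /streaming/serverz endpoint of each pod via a short-lived port-forward
for pod in stan-0 stan-1 stan-2; do
  kubectl port-forward "$pod" 8222:8222 >/dev/null &
  pf_pid=$!
  sleep 2
  echo -n "$pod: "
  curl -s http://localhost:8222/streaming/serverz | grep '"state"'
  kill "$pf_pid"
done
```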

$ kubectl logs stan-0 -c stan
[1] 2019/12/04 20:40:40.429359 [INF] STREAM: Starting nats-streaming-server[test-cluster] version 0.16.2
[1] 2019/12/04 20:40:40.429385 [INF] STREAM: ServerID: 7j3t3Ii7e2tifWqanYKwFX
[1] 2019/12/04 20:40:40.429389 [INF] STREAM: Go version: go1.11.13
[1] 2019/12/04 20:40:40.429392 [INF] STREAM: Git commit: [910d6e1]
[1] 2019/12/04 20:40:40.454212 [INF] Starting nats-server version 2.0.4
[1] 2019/12/04 20:40:40.454360 [INF] Git commit [c8ca58e]
[1] 2019/12/04 20:40:40.454522 [INF] Starting http monitor on 0.0.0.0:8222
[1] 2019/12/04 20:40:40.454830 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2019/12/04 20:40:40.454841 [INF] Server id is NB3A5RSGABLJP3WUYG6VYA36ZGE7MP5GVQIQVRG6WUYSRJA7B2NNMW57
[1] 2019/12/04 20:40:40.454844 [INF] Server is ready
[1] 2019/12/04 20:40:40.456360 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2019/12/04 20:40:40.481927 [INF] STREAM: Starting in standby mode
[1] 2019/12/04 20:40:40.488193 [ERR] Error trying to connect to route (attempt 1): dial tcp: lookup stan on 10.100.0.10:53: no such host
[1] 2019/12/04 20:40:41.489688 [INF] 192.168.52.76:40992 - rid:6 - Route connection created
[1] 2019/12/04 20:40:41.489788 [INF] 192.168.52.76:40992 - rid:6 - Router connection closed
[1] 2019/12/04 20:40:41.489695 [INF] 192.168.52.76:6222 - rid:5 - Route connection created
[1] 2019/12/04 20:40:41.489955 [INF] 192.168.52.76:6222 - rid:5 - Router connection closed
[1] 2019/12/04 20:40:41.634944 [INF] STREAM: Server is active
[1] 2019/12/04 20:40:41.634976 [INF] STREAM: Recovering the state...
[1] 2019/12/04 20:40:41.655526 [INF] STREAM: No recovered state
[1] 2019/12/04 20:40:41.671435 [INF] STREAM: Message store is FILE
[1] 2019/12/04 20:40:41.671448 [INF] STREAM: Store location: /data/stan/store
[1] 2019/12/04 20:40:41.671524 [INF] STREAM: ---------- Store Limits ----------
[1] 2019/12/04 20:40:41.671527 [INF] STREAM: Channels:                   10
[1] 2019/12/04 20:40:41.671529 [INF] STREAM: --------- Channels Limits --------
[1] 2019/12/04 20:40:41.671531 [INF] STREAM:   Subscriptions:           128
[1] 2019/12/04 20:40:41.671533 [INF] STREAM:   Messages     :     unlimited
[1] 2019/12/04 20:40:41.671535 [INF] STREAM:   Bytes        :     256.00 GB
[1] 2019/12/04 20:40:41.671537 [INF] STREAM:   Age          :        1h0m0s
[1] 2019/12/04 20:40:41.671539 [INF] STREAM:   Inactivity   :     unlimited *
[1] 2019/12/04 20:40:41.671541 [INF] STREAM: ----------------------------------
[1] 2019/12/04 20:40:41.671546 [INF] STREAM: Streaming Server is ready

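To exercise failover, delete the active pod; one of the standby servers should acquire the lock on the shared EFS store and take over the FT group. A sketch, assuming stan-0 is the active server as in the logs above:

```shell
# Delete the currently active server
kubectl delete pod stan-0

# Follow the logs of a standby: it should eventually log
# "STREAM: Server is active" once it takes over the FT group.
kubectl logs stan-1 -c stan -f
```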
NATS Streaming Cluster with FT Mode on Azure

First, we need to create a PVC (PersistentVolumeClaim). On Azure, we can use the azurefile storage class to get a volume with ReadWriteMany access:

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: stan-efs
  annotations:
    volume.beta.kubernetes.io/storage-class: "azurefile"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
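Save and apply the claim (the filename here is a choice for this guide), then confirm that it gets bound by the azurefile provisioner:

```shell
kubectl apply -f stan-efs-pvc.yaml

# The claim should eventually show STATUS "Bound"
kubectl get pvc stan-efs
```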

Next, create a NATS cluster using the NATS Helm charts:

helm repo add nats https://nats-io.github.io/k8s/helm/charts/
helm install nats nats/nats

To create an FT setup using AzureFile, you can use the following Helm chart values file:

stan:
  image: nats-streaming:alpine
  replicas: 2
  nats:
    url: nats://nats:4222
store:
  type: file
  ft:
    group: my-group
  file:
    path: /data/stan/store
  volume:
    enabled: true

    # Mount path for the volume.
    mount: /data/stan

    # FT mode requires a single shared ReadWriteMany PVC volume.
    persistentVolumeClaim:
      claimName: stan-efs

Now deploy with Helm:

helm install stan nats/stan -f ./examples/deploy-stan-ft-file.yaml

Port-forward the client port of the NATS Server to which NATS Streaming is connected, then publish a few messages:

kubectl port-forward nats-0 4222:4222 &

stan-pub -c stan foo bar.1
stan-pub -c stan foo bar.2
stan-pub -c stan foo bar.3

Subscribe to get all the messages:

stan-sub -c stan -all foo
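The stan-pub and stan-sub tools used above are the example programs from the stan.go client repository. If you do not have them, one way to install them (assuming a recent Go toolchain) is:

```shell
go install github.com/nats-io/stan.go/examples/stan-pub@latest
go install github.com/nats-io/stan.go/examples/stan-sub@latest
```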