DEV Community

Mark Zlamal

CockroachDB on OpenShift: Separate your logs from data!

CockroachDB and persistent volumes

When deployed on Kubernetes or OpenShift, CockroachDB uses persistent volumes (PVs) to store DB data, metadata, state data, user data, log files, and configuration files. These volumes are typically file-system mounts mapped to disks/SSDs where the data is physically saved in a distributed fashion. When you operate CockroachDB and run queries, data must be read or written, and these operations translate to frequent or continuous disk reads and writes.

Managing the disk: IOPS & throughput

On cloud-managed orchestrators, reading or writing data to disk (PVs) consumes IOPS and uses some of the available IO throughput. These are limiting factors that can result in bandwidth saturation or, worse, throttling by the cloud provider under heavier workloads. This condition can be identified by the combination of low CPU usage and high disk latencies, visible in the hardware dashboard metrics and charts of the CockroachDB UI console.

Divide & conquer

To overcome these limitations, CockroachDB lets you take advantage of multiple, independent PVs to separate the destinations of the cockroach runtime data. CockroachDB logging is a good candidate to move out of the critical path by giving it its own dedicated volume/storage. This helps with performance tuning since your SQL data/schemas live on their own dedicated volume. In fact, splitting the data from the logs into separate PVs is a production readiness recommendation.

Typical CockroachDB deployments

Most CockroachDB clusters implement a single PVC that is assigned to each node in a StatefulSet. Default configurations in both Helm and Operator-managed environments create this 1:1 mapping as follows:

Figure: Default PV/PVC relationship between nodes and volumes

Our planned deployment with multiple PVs

By introducing a second PV dedicated to logs, we split the workload, effectively double the IO channels, and allow each volume to be configured independently. Storage for logs can be significantly smaller than the cockroach-data PV, since logs can be rotated/truncated while your business data grows over time. This illustration highlights the logical infrastructure layout between nodes and PVs.

Figure: Multiple PV/PVCs assigned to each node
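
To make "independently configured" concrete, here is a minimal sketch of a logs claim that is sized and classed differently from the data claim. The storage class name `logs-standard` and the 5Gi size are assumptions for illustration only. They are not taken from my cluster, so substitute whatever your platform offers.

```yaml
# Sketch only: a logs claim configured independently of the datadir claim.
spec:
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: logsdir
      spec:
        accessModes:
          - ReadWriteOnce
        # Hypothetical storage class; pick one whose IOPS/throughput tier suits log traffic
        storageClassName: logs-standard
        resources:
          requests:
            # Assumed size; logs are rotated, so this can be smaller than the data volume
            storage: 5Gi
        volumeMode: Filesystem
```

The idea is that the data volume keeps whatever high-performance class it needs, while the logs volume gets its own tier and capacity.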

…to the implementation

We need to make additions to the StatefulSet template along with custom log-configuration settings to direct CockroachDB logs into the new destination PV.

The logging “secret” configuration

This resource is the one-stop shop for all your customized logging properties: log sinks (which output logs to different locations, including over the network), the logging channels mapped to each sink, the format of the log messages, any redaction flags for log messages, and the buffering and maximum-size settings.

The following log configuration is the smallest/simplest one, and we will use it as a starting point. Here we keep most defaults and adjust only the file-defaults destination path for the log files. This path is mounted on a separate PV defined in the StatefulSet template.

file-defaults:
  dir: /cockroach/cockroach-logs
sinks:
  file-groups:
    default:
      channels:
      - ALL
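
For reference, this is roughly how that configuration could be wrapped in a Secret. The secret name matches the one referenced by the StatefulSet in this post, and the key `log-config.yaml` is inferred from the `--log-config-file=/cockroach/log-config/log-config.yaml` path used later. Treat the rest (`type`, the use of `stringData`) as one reasonable way to write it, not the only one.

```yaml
# Sketch: the minimal log configuration stored as a Secret.
apiVersion: v1
kind: Secret
metadata:
  name: zlamal-cockroachdb-log-config
type: Opaque
stringData:
  # The key becomes the file name under the mount path /cockroach/log-config
  log-config.yaml: |
    file-defaults:
      dir: /cockroach/cockroach-logs
    sinks:
      file-groups:
        default:
          channels:
          - ALL
```

Using `stringData` keeps the YAML readable in the manifest; Kubernetes stores it base64-encoded under `data`.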

For a comprehensive explanation of these fragments, along with working examples, please refer to the Cockroach log configuration documentation so you can tailor the logging to your needs.
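
As an illustration of what such tailoring might look like, here is a hedged sketch that adds rotation limits and a second file group. The sizes, the group split, and the choice of channels are assumptions made for this example and not settings from my deployment. Check the field names and channel list against the log configuration documentation for your CockroachDB version.

```yaml
# Sketch only: a slightly larger log configuration with rotation limits.
file-defaults:
  dir: /cockroach/cockroach-logs
  # Rotate each log file at this size (illustrative value)
  max-file-size: 10MiB
  # Cap the total size kept per file group (illustrative value)
  max-group-size: 100MiB
  # Keep redaction markers in the output so logs can be redacted later
  redactable: true
sinks:
  file-groups:
    default:
      channels: [DEV, OPS, HEALTH, STORAGE]
    security:
      # Security-related events go to their own files, here in JSON format
      channels: [USER_ADMIN, PRIVILEGES]
      format: json
```

Rotation limits like these are what let the logs PV stay small: the retained size per group is bounded, so the volume only needs headroom for the groups you define.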

The StatefulSet template configuration

This StatefulSet fragment highlights only the added template properties that define the PVC and the mount points for both the log-config secret and the new logs folder. A complete StatefulSet example follows this fragment to show the entirety of an actual solution I deployed.

kind: StatefulSet
apiVersion: apps/v1
spec:
  volumeClaimTemplates:
    # ...
    # ...
    # Fragment 1
    # New volumeClaimTemplate to generate Log PVC & PV
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: logsdir
        labels:
          app.kubernetes.io/instance: zlamal
          app.kubernetes.io/name: cockroachdb
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeMode: Filesystem
  template:
    spec:
      containers:
        - # ...
          # ...
          volumeMounts:
            # ...
            # ...
            # Fragment 2
            # Additional mount-points for path to logs and log-config
            - name: logsdir
              mountPath: /cockroach/cockroach-logs/
            - name: log-config
              readOnly: true
              mountPath: /cockroach/log-config
          # Fragment 3
          # Addition of a new “cockroach start” parameter --log-config-file=...
          # This parameter points CRDB to the mounted log-config secret
          args:
            - shell
            - '-ecx'
            - |-
              exec /cockroach/cockroach start --log-config-file=/cockroach/log-config/log-config.yaml --join=... --advertise-host=... --certs-dir=/cockroach/cockroach-certs/ --http-port=8081 --port=26257 --cache=11% --max-sql-memory=10%
      volumes:
        - name: datadir
          persistentVolumeClaim:
            claimName: datadir
        # Fragment 4
        # Establish the logical YAML reference to the logging directory
        - name: logsdir
          persistentVolumeClaim:
            claimName: logsdir
        # Fragment 5
        # Establish logical YAML reference to the log-config secret resource
        - name: log-config
          secret:
            secretName: zlamal-cockroachdb-log-config
            defaultMode: 420
  # ...
  # ...
Note the “Fragment 1, 2, 3, 4, 5” additions to the StatefulSet

Here is the complete StatefulSet with these changes, including tags/labels specific to my cluster, as a reference example that you can copy and edit to make your own (e.g. sizes, storage classes, IOPS, tags/labels, etc.):

kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: zlamal-cockroachdb
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: zlamal
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cockroachdb
    helm.sh/chart: cockroachdb-14.0.4
spec:
  serviceName: zlamal-cockroachdb
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: datadir
        labels:
          app.kubernetes.io/instance: zlamal
          app.kubernetes.io/name: cockroachdb
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeMode: Filesystem
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: logsdir
        labels:
          app.kubernetes.io/instance: zlamal
          app.kubernetes.io/name: cockroachdb
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        volumeMode: Filesystem
  template:
    metadata:
      labels:
        app.kubernetes.io/component: cockroachdb
        app.kubernetes.io/instance: zlamal
        app.kubernetes.io/name: cockroachdb
    spec:
      restartPolicy: Always
      initContainers:
        - resources: {}
          terminationMessagePath: /dev/termination-log
          name: copy-certs
          command:
            - /bin/sh
            - '-c'
            - cp -f /certs/* /cockroach-certs/; chmod 0400 /cockroach-certs/*.key
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: certs
              mountPath: /cockroach-certs/
            - name: certs-secret
              mountPath: /certs/
          terminationMessagePolicy: File
          image: busybox
      serviceAccountName: zlamal-cockroachdb
      schedulerName: default-scheduler
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: cockroachdb
                    app.kubernetes.io/instance: zlamal
                    app.kubernetes.io/name: cockroachdb
                topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 300
      securityContext: {}
      containers:
        - resources: {}
          readinessProbe:
            httpGet:
              path: /health?ready=1
              port: http
              scheme: HTTPS
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 2
          terminationMessagePath: /dev/termination-log
          name: db
          livenessProbe:
            httpGet:
              path: /health
              port: http
              scheme: HTTPS
            initialDelaySeconds: 30
            timeoutSeconds: 1
            periodSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          env:
            - name: STATEFULSET_NAME
              value: zlamal-cockroachdb
            - name: STATEFULSET_FQDN
              value: zlamal-cockroachdb.mz-helm-v11.svc.cluster.local
            - name: COCKROACH_CHANNEL
              value: kubernetes-helm
          ports:
            - name: grpc
              containerPort: 26257
              protocol: TCP
            - name: http
              containerPort: 8081
              protocol: TCP
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: datadir
              mountPath: /cockroach/cockroach-data/
            - name: logsdir
              mountPath: /cockroach/cockroach-logs/
            - name: log-config
              readOnly: true
              mountPath: /cockroach/log-config
            - name: certs
              mountPath: /cockroach/cockroach-certs/
          terminationMessagePolicy: File
          image: 'cockroachdb/cockroach:v23.2.1'
          args:
            - shell
            - '-ecx'
            - |-
              exec /cockroach/cockroach start --log-config-file=/cockroach/log-config/log-config.yaml --join=${STATEFULSET_NAME}-0.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-1.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-2.${STATEFULSET_FQDN}:26257 --advertise-host=$(hostname).${STATEFULSET_FQDN} --certs-dir=/cockroach/cockroach-certs/ --http-port=8081 --port=26257 --cache=11% --max-sql-memory=10% 
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: cockroachdb
              app.kubernetes.io/instance: zlamal
              app.kubernetes.io/name: cockroachdb
      serviceAccount: zlamal-cockroachdb
      volumes:
        - name: datadir
          persistentVolumeClaim:
            claimName: datadir
        - name: logsdir
          persistentVolumeClaim:
            claimName: logsdir
        - name: log-config
          secret:
            secretName: zlamal-cockroachdb-log-config
            defaultMode: 420
        - name: certs
          emptyDir: {}
        - name: certs-secret
          projected:
            sources:
              - secret:
                  name: zlamal-cockroachdb-node-secret
                  items:
                    - key: ca.crt
                      path: ca.crt
                      mode: 256
                    - key: tls.crt
                      path: node.crt
                      mode: 256
                    - key: tls.key
                      path: node.key
                      mode: 256
            defaultMode: 420
      dnsPolicy: ClusterFirst
  podManagementPolicy: Parallel
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app.kubernetes.io/component: cockroachdb
      app.kubernetes.io/instance: zlamal
      app.kubernetes.io/name: cockroachdb
The logical names/mappings of the volumes are connected together

Conclusion & References

This is a versatile addition to the standard StatefulSet because the IOPS can be managed between the PVs, and the plumbing is in place for log customization. DB admins can easily make changes to the logging channels in a running environment by editing a single log-config file that is saved as a Secret object.

Cockroach Logging Overview
Cockroach log configuration
Cockroach start: logging
Production recommendations

Top comments (2)

Jim Hatcher

Mark, I'm guessing you could take a similar approach to having multiple data store devices?

Mark Zlamal

Yes indeed! Adding data stores is an ideal solution for several use cases:

  1. CRDB on high-vCPU worker nodes: Our production readiness guidelines do not recommend workers with > 32 vCPUs. If you're bound to servers with 32 or more vCPUs, an additional store lets you benefit from the extra compute/processing power by creating additional processes per store. These include splitting the GC workload, compactions, replica management, WAL, monitoring, etc. In the end you will leverage the additional CPU and experience shorter wait times on I/O operations.

  2. You can create custom stores dedicated to specialized activities, such as encryption at rest for a subset of your data. In most cases there is a performance cost to encrypting/decrypting data, and you may not want to pay it for the entirety of your data, maybe just a few tables managing PII. This is nicely written up with tangible examples in this blog: cockroachlabs.com/blog/selective-e...
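
A rough sketch of how a second store could be wired in, following the same pattern as the logs volume in the post. The claim name `datadir2`, its size, and the mount path `/cockroach/cockroach-data-2` are hypothetical names chosen for this illustration. The `...` values are left elided exactly as in the fragment in the post.

```yaml
# Sketch only: adding a second data store alongside the existing one.
kind: StatefulSet
apiVersion: apps/v1
spec:
  volumeClaimTemplates:
    # ... existing datadir and logsdir claims ...
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: datadir2            # hypothetical claim name
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi         # assumed size
        volumeMode: Filesystem
  template:
    spec:
      containers:
        - name: db
          volumeMounts:
            - name: datadir2
              mountPath: /cockroach/cockroach-data-2/   # hypothetical path
          args:
            - shell
            - '-ecx'
            - |-
              exec /cockroach/cockroach start --store=path=/cockroach/cockroach-data --store=path=/cockroach/cockroach-data-2 --log-config-file=... --join=...
      volumes:
        - name: datadir2
          persistentVolumeClaim:
            claimName: datadir2
```

Each `--store` flag names one store, so once a second store is added the original data directory is listed explicitly as well. Note that Kubernetes does not allow editing `volumeClaimTemplates` on an existing StatefulSet, so a change like this usually means recreating the StatefulSet object.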