Anybody doing something interesting to a production Cassandra cluster is generally advised, for a host of excellent reasons, to try it out in a test environment first. Here's how to make those environments effectively disposable.
The something interesting we're trying to do to our Cassandra cluster is actually two somethings: upgrading from v2 to v3, while also factoring Cassandra itself out from the group of EC2 servers that currently run Cassandra-and-also-some-other-important-stuff. We have a "pets" situation and want a "cattle" situation, per Bill Baker: pets have names and you care deeply about each one's welfare, while cattle are, not to put too fine a point on it, fungible. If we can bring new dedicated nodes into the cluster, start removing the original nodes as replication takes its course, and finally upgrade this Database of Theseus, that'll be some significant progress -- and without downtime, even! But it's going to take a lot of testing, to say nothing of managing the new nodes for real.
We already use SaltStack to monitor and manage other areas of our infrastructure besides the data pipeline, and SaltStack includes a "salt-cloud" module which can work with EC2. I'd rather have a single infra-as-code solution, so that part's all good. What isn't: the official Cassandra formula is geared more towards single-node instances or some-assembly-required clusters, and provisioning is a separate concern. I expect to be creating and destroying clusters with abandon, so I need this to be as automatic as possible.
Salt-Cloud Configuration
The first step in connecting salt-cloud is to set up a provider and a profile. On the Salt master these live in /etc/salt/cloud.providers.d and /etc/salt/cloud.profiles.d. We keep everything in source control and symlink these directories.
Our cloud stuff is hosted on AWS, so we're using the EC2 provider. That part is basically stock, but in profiles we do need to define a template for the Cassandra nodes themselves.
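For reference, the provider definition in etc/cloud.providers.d/ec2.conf looks roughly like the sketch below -- the provider name, credentials, key pair, and region are all placeholders for your own values:

my-ec2-provider:
  driver: ec2                        # older Salt releases call this key 'provider'
  id: 'YOUR_AWS_ACCESS_KEY_ID'
  key: 'YOUR_AWS_SECRET_ACCESS_KEY'
  keyname: your-keypair-name
  private_key: /etc/salt/your-keypair.pem
  ssh_username: ubuntu               # matches the Ubuntu AMI we deploy
  location: us-east-1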
etc/cloud.profiles.d/ec2.conf
cassandra_node:
  provider: [your provider from etc/cloud.providers.d/ec2.conf]
  image: ami-abc123
  ssh_interface: private_ips
  size: m5.large
  securitygroup:
    - default
    - others
cassandra-test.map
With the cassandra_node template defined in the profile configuration, we can establish the cluster layout in a map file. The filename doesn't matter; mine is cassandra-test.map. One important thing to note is that we're establishing a naming convention for our nodes: cassandra-*. Each node is also given the t2.small size, overriding the default m5.large -- we don't need all that horsepower while we're just testing! t2.micro instances, however, did prove to be too underpowered to run Cassandra.
cassandra_node:
  - cassandra-1:
      size: t2.small
      cassandra-seed: true
  - cassandra-2:
      size: t2.small
      cassandra-seed: true
  - cassandra-3:
      size: t2.small
cassandra-seed (and size, for that matter) is a grain: a fact each Salt-managed "minion" knows about itself. When Cassandra comes up in a multi-node configuration, each node looks for help joining the cluster from a list of "seed" nodes. Without seeds, nothing can join the cluster; however, only non-seeds bootstrap data from the seeds on joining, so it's not a good idea to make everything a seed. And the seed layout needs to toposort: if A has B and C for seeds, B has A and C, and C has A and B, then every node is a seed and nothing bootstraps -- effectively the same situation as no seeds. Since two instances know that they're special somehow, we can use grain matching to target them specifically.
Pillar and Mine
The Salt "pillar" is a centralized configuration database stored on the master. Minions make local copies on initialization, and their caches can be updated with salt minion-name saltutil.refresh_pillar
. Pillars can target nodes based on name, grains, or other criteria, and are commonly used to store configuration. We have a lot of configuration, and most of it will be the same for all nodes, so using pillars is a natural fit.
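As a concrete example, pushing a pillar change out and spot-checking what a node received looks like this:

sudo salt 'cassandra-*' saltutil.refresh_pillar
sudo salt 'cassandra-1' pillar.get cassandra:version
sudo salt 'cassandra-1' pillar.items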
srv/salt/pillar/top.sls
Like the top.sls for Salt itself, the Pillar top.sls defines a highstate or default state for new minions. First, we declare that the pillars we're adding apply to minions whose names match the pattern cassandra-*.
base:
  'cassandra-*':
    - system-user-ubuntu
    - mine-network-info
    - java
    - cassandra
srv/salt/pillar/system-user-ubuntu.sls
Nothing special here, just a user so we can ssh in and poke things. The private key for the user is defined in the cloud provider configuration.
system:
  user: ubuntu
  home: /home/ubuntu
srv/salt/pillar/mine-network-info.sls
The Salt "mine" is another centralized database, this one storing grain information so minions can retrieve facts about other minions from the master instead of dealing with peer-to-peer communication. Minions use a mine_functions
pillar (or salt-minion configuration, but we're sticking with the pillar) to determine whether and what to store. For Cassandra nodes, we want internal network configuration and the public DNS name, which latter each node has to get by asking AWS where it is with curl
.
mine_functions:
  network.interfaces: [eth0]
  network.ip_addrs: [eth0]
  # ask amazon's network config what we're public as
  public_dns:
    - mine_function: cmd.run
    - 'curl -s http://169.254.169.254/latest/meta-data/public-hostname'
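The mine repopulates itself on the minion's mine_interval schedule, so after changing mine_functions it's worth forcing an update and checking what actually got stored:

sudo salt 'cassandra-*' mine.update
sudo salt 'cassandra-1' mine.get 'cassandra-*' network.ip_addrs
sudo salt 'cassandra-1' mine.get 'cassandra-*' public_dns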
srv/salt/pillar/java.sls
Cassandra requires Java 8 to be installed (prospective Java 9 support became prospective Java 11 support and is due with Cassandra 4). This pillar sets up the official Java formula accordingly.
java:
  # vitals
  release: '8'
  major: '0'
  minor: '202'
  development: false
  # tarball
  prefix: /usr/share/java       # unpack here
  version_name: jdk1.8.0_202    # root directory name
  source_url: https://download.oracle.com/otn-pub/java/jdk/8u202-b08/1961070e4c9b4e26a04e7f5a083f551e/server-jre-8u202-linux-x64.tar.gz
  #source_hash: sha256=9efb1493fcf636e39c94f47bacf4f4324821df2d3aeea2dc3ea1bdc86428cb82
  source_hash: sha256=61292e9d9ef84d9702f0e30f57b208e8fbd9a272d87cd530aece4f5213c98e4e
  dl_opts: -b oraclelicense=accept-securebackup-cookie -L
srv/salt/pillar/cassandra.sls
Finally, the Cassandra pillar defines properties common to all nodes in the cluster. My upgrade plan is to bring everything up on 2.2.12, switch the central pillar definition over, and then supply the new version number to each minion by refreshing its pillar as part of the upgrade process.
cassandra:
  version: '2.2.12'
  cluster_name: 'Test Cluster'
  authenticator: 'AllowAllAuthenticator'
  endpoint_snitch: 'Ec2Snitch'
  twcs_jar:
    '2.2.12': 'TimeWindowCompactionStrategy-2.2.5.jar'
    '3.0.8': 'TimeWindowCompactionStrategy-3.0.0.jar'
The twcs_jar dictionary gets into one of the reasons I'm not using the official formula: we're using the TimeWindowCompactionStrategy. TWCS was integrated into Cassandra starting in 3.0.8 and 3.8, but it has to be compiled and installed separately for earlier versions. Pre-integration versions of TWCS also have a different package name (com.jeffjirsa instead of org.apache). 3.0.8 is the common point: it has the org.apache TWCS built in, but it's also a valid compilation target for the com.jeffjirsa TWCS. After upgrading to 3.0.8 I'll be able to ALTER TABLE to apply the org.apache version before proceeding.
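That ALTER TABLE pass, run through cqlsh, would look roughly like this -- the keyspace, table, and window settings here are placeholders, not our real schema:

# after the 3.0.8 upgrade: switch a table from the com.jeffjirsa jar
# to the built-in org.apache TWCS
cqlsh -e "ALTER TABLE my_keyspace.my_table
  WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
  };"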
With the provider, profile, map file, and pillars set up, we can now spin up a barebones cluster of Ubuntu VMs and retrieve the centrally-stored network information from the Salt mine:
sudo salt-cloud -m cassandra-test.map
sudo salt 'cassandra-1' 'mine.get' '*' 'public_dns'
We can't do much else, since we don't have anything installed on the nodes yet, but it's progress!
The Cassandra State
The state definition includes everything a Cassandra node has to have in order to be part of the cluster: the installed binaries, a cassandra group and user, a config file, a data directory, and a running SystemD unit. The definition itself is sort of an ouroboros of YAML and Jinja:
srv/salt/cassandra/defaults.yaml
First, there's a perfectly ordinary YAML file with some defaults. These could easily be in the pillar we set up above (or the pillar config could all be in this file); the principal distinction seems to be whether you want to propagate changes via saltutil.refresh_pillar, or by (re)applying the Cassandra state either directly or via highstate. This is definitely more complicated than it needs to be right now, but given that this is my first major SaltStack project, I don't yet know enough to land on one side or the other, or whether combining a defaults file with the pillar configuration will eventually be necessary.
cassandra:
  dc: dc1
  rack: rack1
srv/salt/cassandra/map.jinja
The map template loads the defaults file and merges it with the pillar, creating a server dictionary with all the Cassandra parameters we're setting.
{% import_yaml "cassandra/defaults.yaml" as default_settings %}
{% set server = salt['pillar.get']('cassandra', default=default_settings.cassandra, merge=True) %}
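One way to see the two halves of that merge: values defined in the pillar are visible to pillar.get, while values that only exist in defaults.yaml (like dc and rack) only materialize in the server dictionary at render time:

# comes from the pillar
sudo salt 'cassandra-1' pillar.get cassandra:version

# defined only in defaults.yaml, so this comes back empty --
# it's merged into the server dict when the template renders
sudo salt 'cassandra-1' pillar.get cassandra:dc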
srv/salt/cassandra/init.sls
Finally, the Cassandra state entrypoint init.sls is another Jinja template that happens to look a lot like a YAML file and renders a YAML file, which for SaltStack is good enough. Jinja is required here since values from the server dictionary, like the server version or the TWCS JAR filename, need to be interpolated at the time the state is applied.
When the Cassandra state is applied to a fresh minion:
- wget will be installed
- A CASSANDRA_VERSION environment variable will be set to the value defined in the pillar
- A user and group named cassandra will be created
- A script named install.sh will download and extract Cassandra itself, once the above three conditions are met
- A node configuration file named cassandra.yaml will be generated from a Jinja template and installed to /etc/cassandra
- If necessary, the TWCS jar will be added to the Cassandra lib directory
- The directory /var/lib/cassandra will be created and chowned to the cassandra user
- A SystemD unit for Cassandra will be installed and started once all its prerequisites are in order
{% from "cassandra/map.jinja" import server with context %}

wget:
  pkg.installed

cassandra:
  environ.setenv:
    - name: CASSANDRA_VERSION
    - value: {{ server.version }}
  cmd.script:
    - require:
      - pkg: wget
      - user: cassandra
      - environ: CASSANDRA_VERSION
    - source: salt://cassandra/files/install.sh
    - user: root
    - cwd: ~
  group.present: []
  user.present:
    - require:
      - group: cassandra
    - gid_from_name: True
    - createhome: False
  service.running:
    - enable: True
    - require:
      - file: /etc/cassandra/cassandra.yaml
      - file: /etc/systemd/system/cassandra.service
{%- if server.twcs_jar[server.version] %}
      - file: /opt/cassandra/lib/{{ server.twcs_jar[server.version] }}
{%- endif %}

# Main configuration
/etc/cassandra/cassandra.yaml:
  file.managed:
    - source: salt://cassandra/files/{{ server.version }}/cassandra.yaml
    - template: jinja
    - makedirs: True
    - user: cassandra
    - group: cassandra
    - mode: 644

# Load TWCS jar if necessary
{%- if server.twcs_jar[server.version] %}
/opt/cassandra/lib/{{ server.twcs_jar[server.version] }}:
  file.managed:
    - require:
      - user: cassandra
      - group: cassandra
    - source: salt://cassandra/files/{{ server.version }}/{{ server.twcs_jar[server.version] }}
    - user: cassandra
    - group: cassandra
    - mode: 644
{%- endif %}

# Data directory
/var/lib/cassandra:
  file.directory:
    - user: cassandra
    - group: cassandra
    - mode: 755

# SystemD unit
/etc/systemd/system/cassandra.service:
  file.managed:
    - source: salt://cassandra/files/cassandra.service
    - user: root
    - group: root
    - mode: 644
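Before wiring this into the highstate, the state can be exercised on its own -- a dry run with test=True first, then for real:

# show what would change without touching the node
sudo salt 'cassandra-1' state.apply cassandra test=True

# apply the state for real
sudo salt 'cassandra-1' state.apply cassandra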
srv/salt/cassandra/files/install.sh
This script downloads and extracts the target version of Cassandra and points the symlink /opt/cassandra to it. If the target version already exists, it just updates the symlink, since everything else is already set up.
#!/bin/bash

update_symlink() {
    rm -f /opt/cassandra
    ln -s "/opt/apache-cassandra-$CASSANDRA_VERSION" /opt/cassandra
    echo "Updated symlink"
}

# already installed?
if [ -d "/opt/apache-cassandra-$CASSANDRA_VERSION" ]; then
    echo "Cassandra $CASSANDRA_VERSION is already installed!"
    update_symlink
    exit 0
fi

# download and extract
wget "https://archive.apache.org/dist/cassandra/$CASSANDRA_VERSION/apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"
tar xf "apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"
rm "apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"

# install to /opt and link /opt/cassandra
mv "apache-cassandra-$CASSANDRA_VERSION" /opt
update_symlink

# create log directory
mkdir -p /opt/cassandra/logs

# set ownership
chown -R cassandra:cassandra "/opt/apache-cassandra-$CASSANDRA_VERSION"
chown cassandra:cassandra /opt/cassandra
It's probably possible to do most of this, at least the symlink juggling and directory management, with "pure" Salt (and the environment variable could be eliminated by rendering install.sh as a Jinja template with the server dictionary), but the script does what I want it to and it's already idempotent and centrally managed.
srv/salt/cassandra/files/cassandra.service
This is a basic SystemD unit, with some system limits customized to give Cassandra enough room to run. It starts whatever Cassandra executable it finds at /opt/cassandra, so all that's necessary to resume operations after the symlink changes during the upgrade is to restart the service.
[Unit]
Description=Apache Cassandra database server
Documentation=http://cassandra.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=forking
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/bin/cassandra -Dcassandra.config=file:///etc/cassandra/cassandra.yaml
LimitNOFILE=100000
LimitNPROC=32768
LimitMEMLOCK=infinity
LimitAS=infinity
[Install]
WantedBy=multi-user.target
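During the actual upgrade, that restart can be driven from the master one node at a time (so the cluster stays up) with something like:

# after install.sh has repointed /opt/cassandra on this node
sudo salt 'cassandra-1' service.restart cassandra
sudo salt 'cassandra-1' service.status cassandra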
srv/salt/cassandra/files/2.2.12/cassandra.yaml
The full cassandra.yaml is enormous, so I won't reproduce it here. The interesting parts are where values are being automatically interpolated by Salt. Like the Cassandra state, this is actually a Jinja template that renders a YAML file.
First, we get a list of internal IP addresses corresponding to cassandra-seed minions from the Salt mine and build a list of known_seeds.
{%- from 'cassandra/map.jinja' import server with context -%}
{% set known_seeds = [] %}
{% for minion, ip_array in salt['mine.get']('cassandra-seed:true', 'network.ip_addrs', 'grain').items() if ip_array is not sameas false and known_seeds|length < 2 %}
{% for ip in ip_array %}
{% do known_seeds.append(ip) %}
{% endfor %}
{% endfor %}
This becomes the list of seeds the node looks for when trying to join the cluster.
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "{{ known_seeds|unique|join(',') }}"
Listen and broadcast addresses are configured per node. The broadcast addresses are a little special due to our network configuration needs: each node has to get its public dns name from the Salt mine. This is perhaps a bit overcomplicated compared to a custom grain or capturing the output from running the Salt modules at render time, but it's there and it works and at this point messing with it isn't a great use of time.
listen_address: {{ grains['fqdn'] }}
broadcast_address: {{ salt['mine.get'](grains['id'], 'public_dns').items()[0][1] }}
rpc_address: {{ grains['fqdn'] }}
broadcast_rpc_address: {{ salt['mine.get'](grains['id'], 'public_dns').items()[0][1] }}
The cluster name and other central settings are interpolated from the pillar+defaults server dictionary.
cluster_name: "{{ server.cluster_name }}"
...
authenticator: "{{ server.authenticator }}"
...
endpoint_snitch: "{{ server.endpoint_snitch }}"
The changes to the Cassandra 3.0.8 configuration are identical.
srv/salt/cassandra/files/2.2.12/TimeWindowCompactionStrategy-2.2.5.jar
See this post on TheLastPickle for directions on building the TWCS jar.
Highstate
Finally, the Salt highstate needs to ensure that our cassandra-* nodes have the Java and Cassandra states applied. Since Salt-Cloud minions come configured, however, we have to ensure the default salt.minion state is excluded from our Cassandra nodes, since otherwise a highstate will blow away the cloud-specific configuration.
srv/salt/top.sls changes
base:
  'not cassandra-*':
    - match: compound
    - salt.minion
  'cassandra-*':
    - sun-java
    - sun-java.env
    - cassandra
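With the top file updated, a highstate takes the new nodes from bare Ubuntu VMs to configured cluster members:

# apply everything matched in top.sls to the Cassandra nodes
sudo salt 'cassandra-*' state.highstate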
Startup!
Set the Salt config dir to etc with -c and pass in the map file with -m:
sudo salt-cloud -c etc -m cassandra-test.map
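Once the highstate has been applied and the service is up on all three nodes, a quick way to confirm they actually found each other is to run nodetool through Salt (path per the install script above):

sudo salt 'cassandra-1' cmd.run '/opt/cassandra/bin/nodetool status' runas=cassandra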
To clean up:
sudo salt-cloud -d cassandra-1 cassandra-2 cassandra-3