Highly Available Cassandra Data Services on Mesos

If you’ve been following the blog for a while you have probably seen some of the work the team has put into making Mesos able to leverage underlying storage environments for your persistent applications. In an earlier post we shared a few examples of applications that could benefit from this, but we felt we wanted to show something even better. Enter Cassandra.

We will start the post out with some basics of Casssandra, but for those already familiar here are some interesting points we make here.  Cassandra and other distributed persistent applications can have challenges around:

  • Complexity of install
  • Data movement expense and manual processes with node failures
  • Performance limited by direct attached storage

We are going to show some really interesting things here, where Cassandra can overcome these challenges through running inside of Docker containers with Mesos while leveraging an external storage platform for its persistent data.

What is Cassandra?

In its most basic form, it’s a database. But it’s not just any database, it’s a fully distributed, no SPOF, multi-master, multi-DC, linearly scalable database. There are currently over 660 companies ranging from small to gigantic with names like eBay, Netflix, Reddit, The Weather Channel and others that have publicly declared that they are using Cassandra in production, and the community is growing every day.

Cassandra is extremely fault tolerant because of it’s built-in replication features, it’s highly performant, it’s decentralized and most importantly it’s extremely durable for companies that cannot afford to lose any data.

So why run it on Mesos?

As we explained in the earlier post, Mesos has a lot of the same qualities as Cassandra seeing as it is highly distributed, multi-master and so on. Because of this, we believe Mesos to be a great fit to run these types of applications and data services. We’ll explain two different approaches and what we envision will happen in the near future.

First off we have something called a Mesos Framework. These are mature pieces of software that plugs into Mesos to allow operators and users to deploy applications in a very effect way, examples include Hadoop, Spark, Storm, Jenkins, ElasticSearch, Marathon and of course, Cassandra. This would be the perfect solution for what we’re trying to achieve, if it wasn’t for one small issue: the data storage part isn’t really fully fledged yet. We hope that in the coming months we can see a change and start using even more robust frameworks for software like Cassandra, but right now it didn’t really work for our purposes.Screenshot 2016-01-20 21.21.58

So we went another route, running Cassandra in Docker containers, still on top of Mesos of course but not as a framework but rather as an application group. There have already been a lot of work done here, namely getting Cassandra in containers, but we had to tailor it for our purposes. We now have a Cassandra container image for clustered setups that automatically sets itself up, finds other nodes in the same cluster and joins it. We also added the functionality to use DataStax OpsCenter to manage the cluster nodes, which is nice if you want a graphical representation of your cluster.

Screenshot 2016-01-20 21.24.11

How does it work?

To run Cassandra on Mesos we’re using Marathon to initialize and control our services. Marathon uses simple JSON files as manifests where we can specify exactly how a service or application should run, the amount of resources it should have, what ports should be opened and so on. We created the following example manifest to deploy a 3-node Cassandra cluster, running in Docker containers, and using REX-Ray as the volume driver to be able to carve out proper data storage resources for the service. Each Cassandra node gets its own data volume and when setting up REX-Ray we also included the pre-emptive mount feature to make sure that we can automatically forcefully detach volumes on hosts that are dead or the services have crashed on, attach them to a new host and restart the containerized Cassandra node there with very little delay.

This is extremely important as without it you would have to manually detach and attach the underlying storage which would mean a lot more disruption to your services, and nobody likes that.

Since I’m sure you’d like to see what it looks like in action we have a video showing a demonstration of just this🙂

The full code and more examples can be found at our vagrant-mesos demo repo on GitHub.

{
  "id": "cassandra-cluster",
  "groups": [
    {
      "id": "cluster1",
      "apps": [
        {
          "id": "cassandra1",
          "container": {
            "docker": {
              "image": "jonasrosland/docker-cassandra:cluster",
              "privileged": true,
              "forcePullImage": true,
              "parameters": [
                {
                  "key": "env",
                  "value": "CASSANDRA_CLUSTERNAME=cassandra-cluster"
                },
                {
                  "key": "env",
                  "value": "CASSANDRA_SEEDS=172.31.2.11,172.31.2.12,172.31.2.13"
                },
                {
                  "key": "volume-driver",
                  "value": "rexray"
                },
                {
                  "key": "volume",
                  "value": "cassandra1:/var/lib/cassandra"
                }
              ]
            }
          },
          "cpus": 2,
          "mem": 8000,
          "instances": 1,
          "constraints": [
            [
              "hostname",
              "UNIQUE"
            ]
          ]
        },
        {
          "id": "cassandra2",
          ...
        },
        {
          "id": "cassandra3",
          ...
        }
      ]
    }
  ]
}

Cassandra itself has a page dedicated to external storage as an “anti-pattern” here.  What do you think?  If we can demonstrate value in handling operations and node failures differently, and can scale better using external storage is it an anti-pattern?  We believe the technologies can be very complimentary and are very interested in continuing work to ensure distributed persistent applications like Cassandra can properly take advantage of external storage.

Overall this has been an extremely interesting project to work on, to show off advanced capabilities in several Open Source projects put together. We are very proud to be a part of this amazing community and are looking forward to even more collaboration across project borders🙂

3 thoughts on “Highly Available Cassandra Data Services on Mesos

  1. Massimo Re Ferre' (@mreferre)

    Jonas, great article! Quick comment / question. I kind of see Cassandra as a backend service that doesn’t have a high “churn rate” like applications that you could/should approach with an immutable infrastructure pattern where Docker would be a killer. Long story short… what would be the advantage of running such a service with such different patterns (?) on say Mesos or in containers in general? Especially when there are so many stunts to make it work against natural anti-pattern, you’d think there is a HUGE to gain? What is it that you are gaining?

    Is there lecture that gets into deeper details re “why Cassandra in container?”.

    Thanks!

    Like

    Reply
    1. clintonskitson

      Massimo,

      Thank you for your feedback and question.

      Running the persistent applications with patterns that we associate with non-persistence can have a ton of value. In this case with Cassandra and Mesos, you are able to gain form the simplicity and agility of using containers for your application. If we assume Mesos is the existing infrastructure layer, how did Jonas fire up a new cluster of Cassandra nodes? A JSON file. Further than this, how is it healed? By starting a new Cassandra process in a container on an available node. This is pretty powerful and happens when separating the application from it’s data.

      The strategy for using containers to run your persistent applications in this way and providing portability is a new pattern. The evolution of data center strategies for physical -> virtualization -> physical/physical+containers/virtualization/virtualization+containers is leaving data centers silo’d. A key aspect of this is that it moves the ball forward where a homogenous application scheduling layer provided by those like Mesos can be responsible for scheduling *all* applications with their diverse requirements.

      Distributed persistent applications like Cassandra play very well with Mesos. The programmable scheduling layer is key to minimizing complexity of running these types of apps. This leads towards the fact that running it in containers is not an anti-pattern since there is a framework for Mesos (which we did not show). As a takeaway from the article, I believe the valid anti-pattern is about distributed storage algorithms in Cassandra (which typically is on DAS) aligns with those a layer below in a storage OR virtualization platform.

      Like

      Reply
      1. Massimo Re Ferre' (@mreferre)

        Thanks Clint. I am still wrapping my head around this. I definitely see the value of what you are referring to for stateless services but for stateful services I need more crunching/thinking. Perhaps the part that confuses me is that we are looking at this with different lenses. E.g. “If we assume Mesos is the existing infrastructure layer” <- that's quite a big if and an even bigger assumption😉

        Kidding aside I can see trying to do this IF Cassandra is the last thing you need to "mesosize" (TM) and so you trade off efficiency for consistency but we seem to be at the very very early stages of this revolution and I don't consider stateful services to be low hanging fruits (albeit I am sure we will get there).

        As for deploying from a JSON/YAML file, no question. I am bought on that. But to my previous point, there may be some 20 other tools that could provision that way against some 30 backends that would make it easier to run a stateful service today.

        Never mind. Love your work, sorry if I hijacked this thread😉

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s