DEV Community

Robin Moffatt
Posted on • Originally published at rmoff.net on

Should you run Kafka Connect in Distributed or Standalone mode?

Kafka Connect can be deployed in two modes: Standalone or Distributed.

I usually recommend Distributed for several reasons:

  • You can run just a single node of it if you want

  • It can scale

  • It is fault-tolerant

  • It can be run on a single node sandbox or a multi-node production environment

  • It is the same configuration method however you run it

I usually find that Standalone is appropriate when:

  • You need to guarantee locality of task execution, such as picking up a log file from a folder on a specific machine

  • You don't care about scale or fault-tolerance ;-)

  • You like re-learning how to configure something when you realise that you do care about scale or fault-tolerance X-D

My last snarky point on the list is why I recommend Distributed mode even if you're just playing around with Kafka Connect on a laptop: you learn it once, and then you're all set. If you start with Standalone, where you pass .properties configuration files to the worker at startup, and then move to Distributed, you have to re-learn how to configure connectors, this time through the REST interface.
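To make the difference concrete, here is a rough sketch of the same connector expressed both ways: as the body of a .properties file for Standalone, and as the JSON you would POST to the worker's REST interface in Distributed mode. The connector name, file path, and topic are made up for illustration, and I'm assuming a worker listening on the default REST port of 8083. The script only builds the request; it doesn't send it.

```python
import json
import urllib.request

# Hypothetical connector settings, for illustration only:
# the name, file path, and topic are made up.
CONNECTOR_NAME = "local-file-source"
CONNECTOR_CONFIG = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/app.log",
    "topic": "app-logs",
}


def to_standalone_properties(name, config):
    """Render the connector as a .properties file body (Standalone mode).

    Handles simple key=value pairs only; no escaping of special characters.
    """
    lines = [f"name={name}"]
    lines += [f"{key}={value}" for key, value in config.items()]
    return "\n".join(lines) + "\n"


def to_rest_request(name, config, base_url="http://localhost:8083"):
    """Build, but do not send, a POST /connectors request (Distributed mode).

    base_url assumes a worker on the default REST port; adjust as needed.
    """
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Standalone: this text would be saved to a file and passed at startup.
    print(to_standalone_properties(CONNECTOR_NAME, CONNECTOR_CONFIG))

    # Distributed: the same settings, wrapped in JSON for the REST API.
    request = to_rest_request(CONNECTOR_NAME, CONNECTOR_CONFIG)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To submit to a running worker: urllib.request.urlopen(request)
```

The settings themselves are identical in both cases; what changes is how they reach the worker, which is the part you end up re-learning.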



Some follow-ups to this:
