Setting Up a Confluent Kafka Docker Cluster Tutorial

1. Install Docker and Docker Compose

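This tutorial assumes Docker and Docker Compose are already installed. A quick way to confirm the prerequisites before going any further:

# Both commands should print a version; if either fails, install that tool first
docker --version
docker-compose --version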

With the prerequisites in place, download Confluent's all-in-one (KRaft) docker-compose file for Confluent Platform 7.5.3 and bring the cluster up in detached mode:

wget https://raw.githubusercontent.com/confluentinc/cp-all-in-one/7.5.3-post/cp-all-in-one-kraft/docker-compose.yml
docker-compose up -d

2. Verify Kafka Cluster

It takes a few minutes for all the components to spin up. The docker-compose output should look like this once everything has been created:

Creating broker ... done
Creating schema-registry ... done
Creating rest-proxy ... done
Creating connect ... done
Creating ksqldb-server ... done
Creating control-center ... done
Creating ksql-datagen ... done
Creating ksqldb-cli ... done
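
To confirm the containers are actually running (and not just created), you can also check their status from the directory containing the compose file:

# Every service should report an "Up" (or healthy) state
docker-compose ps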

Next, open your web browser and navigate to the Confluent Control Center UI at http://localhost:9021/.

From there you can use the UI to create topics, connectors, and more.
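
If you prefer the command line, you can create and list topics directly against the broker container. This sketch assumes the default container name "broker" and the localhost:9092 listener from the downloaded compose file:

# Create a test topic, then list all topics on the broker
docker exec broker kafka-topics --create --topic demo-topic --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
docker exec broker kafka-topics --list --bootstrap-server localhost:9092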

3. Additional Configuration

Typically you won't need all of the cluster services for local development, so it is worth disabling some of them in the yaml file. Keep an eye on each service's "depends_on" section so that you don't turn off something another service still depends on.

If all you need is topics and the control UI, the only required services are "broker" and "control-center". The "control-center" service lists several dependencies, but if you only run "broker" the UI will still work; the pages for the missing services simply report that Connect and the other services are unavailable. No need to worry about that.
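
Instead of editing the yaml file, you can also start just the services you want. As a sketch, assuming the service names from the all-in-one file, the --no-deps flag tells docker-compose not to start the services listed under depends_on:

# Start only the broker and Control Center; the remaining dependencies stay down
docker-compose up -d --no-deps broker control-center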

4. Confluent Services

Besides the broker, the all-in-one compose file starts Schema Registry (schema storage and validation for topics), REST Proxy (produce and consume over HTTP), Kafka Connect (source and sink connectors), the ksqlDB server and CLI (SQL-style stream processing), the ksql-datagen sample-data generator, and Control Center (the web UI at http://localhost:9021/). Each runs as its own container, which is why the startup output above lists them individually.

5. Why Not Use Open Source Kafka?

Simple: if you are an enterprise, you use Confluent, period. You get enterprise support for every service in the stack and the peace of mind that comes with it, with experts on call to help you with anything. I've used it, and their enterprise support is outstanding.

The Confluent Control Center allows non-developers to easily create and modify topics and connectors. Confluent connectors like the JDBC Source connector run as long-lived services that keep a watermark of the last processed timestamp and ensure you receive every new record immediately; we tested this and saw end-to-end records (SQL Server to DynamoDB) arrive in milliseconds. You can also provide a regex for the schema and tables, so one connector can easily ingest updates from multiple tables. Finally, you can view messages and schemas in the Control Center UI, which simplifies and speeds up integration testing.
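
As a rough sketch, this is what registering such a connector can look like through the Kafka Connect REST API. It assumes the connect service from the all-in-one file is listening on localhost:8083 and that the JDBC Source connector plugin is installed on the worker (for example via confluent-hub); the connection details, table, and column names are placeholders:

# Register a JDBC Source connector that watermarks on a timestamp column
curl -s -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "jdbc-source-example",
    "config": {
      "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
      "connection.url": "jdbc:sqlserver://<host>:1433;databaseName=<db>",
      "connection.user": "<user>",
      "connection.password": "<password>",
      "mode": "timestamp",
      "timestamp.column.name": "updated_at",
      "table.whitelist": "dbo.orders",
      "topic.prefix": "sqlserver-",
      "poll.interval.ms": "5000"
    }
  }'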

Related Articles

Streaming from Kafka with PySpark