This module demonstrates the following:
- The usage of the Kafka Streams DSL, including `cogroup()`, `groupBy()`, `aggregate()`, `toStream()` and `peek()`.
- Unit testing using Topology Test Driver.
In this module, records of type `<String, KafkaPerson>` are streamed from two topics named `PERSON_TOPIC` and `PERSON_TOPIC_TWO`.
The streams are processed using the Kafka Streams DSL to perform cogrouping based on the last name of each `KafkaPerson` record.
The following tasks are performed:
- Group the streams by last name using the `groupBy()` operation.
- Apply a cogroup operation to combine the records from both streams with the same last name.
- Apply an aggregator that combines each `KafkaPerson` record with the same last name into a `KafkaPersonGroup` object and aggregates the first names by last name.
- Write the resulting records to a new topic named `PERSON_COGROUP_TOPIC`.
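The aggregation logic behind these tasks can be sketched in plain Java, without the Kafka Streams API. This is only a model of the semantics, not the module's implementation: `CogroupSketch`, `Person`, and the sample names are hypothetical stand-ins, and the nested map plays the role of the `firstNameByLastName` field of `KafkaPersonGroup`. It also assumes first names are collected into a set and reads the first topic fully before the second, whereas a real topology interleaves records and emits an updated result for each incoming record.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CogroupSketch {

    /** Hypothetical stand-in for the KafkaPerson value type. */
    public static final class Person {
        final String firstName;
        final String lastName;

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    /** Aggregator step: add one person's first name to the group held for its last name. */
    static Map<String, Set<String>> aggregate(String lastName, Person person,
                                              Map<String, Set<String>> firstNameByLastName) {
        firstNameByLastName
                .computeIfAbsent(lastName, k -> new LinkedHashSet<>())
                .add(person.firstName);
        return firstNameByLastName;
    }

    /** Models groupBy(last name), cogroup and aggregate over two input topics. */
    public static Map<String, Map<String, Set<String>>> cogroup(List<Person> topicOne,
                                                                List<Person> topicTwo) {
        // Result table: last name -> group (the group maps last name -> first names).
        Map<String, Map<String, Set<String>>> table = new LinkedHashMap<>();
        for (List<Person> topic : List.of(topicOne, topicTwo)) {
            for (Person person : topic) {
                // groupBy(): re-key the record by last name.
                String key = person.lastName;
                // Initializer: start from an empty group the first time a key is seen.
                Map<String, Set<String>> group =
                        table.computeIfAbsent(key, k -> new LinkedHashMap<>());
                // Both topics share the same aggregator and the same result table.
                table.put(key, aggregate(key, person, group));
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Person> one = List.of(new Person("Ada", "Lovelace"), new Person("Alan", "Turing"));
        List<Person> two = List.of(new Person("Anne", "Lovelace"));
        cogroup(one, two).forEach((lastName, group) ->
                System.out.println(lastName + " -> {firstNameByLastName=" + group + "}"));
    }
}
```

The point of the model is that records from both topics with the same last name end up in a single group, which is what distinguishes a cogroup from aggregating each stream separately and joining the results afterwards.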
The output records will be in the following format:
{"firstNameByLastName":{"Last name 1":{"First name 1", "First name 2", "First name 3"}}}
{"firstNameByLastName":{"Last name 2":{"First name 4", "First name 5", "First name 6"}}}
{"firstNameByLastName":{"Last name 3":{"First name 7", "First name 8", "First name 9"}}}
To compile and run this demo, you will need the following:
- Java 21
- Maven
- Docker
To run the application manually, please follow the steps below:
- Start a Confluent Platform in a Docker environment.
- Produce records of type `<String, KafkaPerson>` to topics named `PERSON_TOPIC` and `PERSON_TOPIC_TWO`. You can use the producer person to do this.
- Start the Kafka Streams application.
To run the application in Docker, please use the following command:
docker-compose up -d
This command will start the following services in Docker:
- 1 Kafka broker (KRaft mode)
- 1 Schema registry
- 1 Control Center
- 1 producer Person
- 1 Kafka Streams Cogroup