DBZ-7148 New JDBC sink connector batch support blog post
mfvitale committed Dec 14, 2023
1 parent 5f79143 commit 829eafd
Showing 2 changed files with 3 additions and 9 deletions.
12 changes: 3 additions & 9 deletions _posts/2023-12-06-JDBC-sink-connector-batch-support.adoc
@@ -81,7 +81,6 @@ CREATE TABLE `aviation` (
 We planned to execute these tests:
 
 * 100K events from single table
-** Baseline without batch for (Oracle, MySQL, PostgreSQL, SQLServer)
 ** MySQL batch vs without batch
 * 100K events from three different table
 ** MySQL batch vs without batch
@@ -96,19 +95,14 @@ We planned to execute these tests:
 .{nbsp}
 image::100k-batch-no-batch.png[role=centered-image]
 
-_Figure 1_ illustrates the total execution time required to process 100,000 events from a single table, comparing different connectors without batch support and the MySQL connector with the default batch size.
+_Figure 1_ illustrates the total execution time required to process 100,000 events from a single table, comparing the MySQL connector with and without batch support.
 
 [NOTE]
 ====
 Despite the default values being set to `500` for both `batch.size` and `consumer.max.poll.records`, the observed actual size was reduced to `337` records due to payload size considerations.
 ====
 
-We can observe two things:
-
-* There are difference between different connectors due to specific database technology
-* As expected, the Debezium JDBC connector with batch support is faster
-
-For the following tests we will focus on MySQL since it was the one with highest execution time without the batch support.
+We can observe, as expected, that the Debezium JDBC connector with batch support is faster.
 
 .{nbsp}
 image::100k-3-tables.png[role=centered-image]
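As a sketch of the kind of configuration being tuned in these tests, a Debezium JDBC sink connector registered through the Kafka Connect REST API might look like the following. The connector name, topic, and connection details are illustrative placeholders, not values taken from the post; `batch.size` is the Debezium JDBC sink option discussed above, and the `consumer.override.` prefix is the standard Kafka Connect way to set a per-connector `max.poll.records`:

```json
{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
    "topics": "aviation",
    "connection.url": "jdbc:mysql://mysql:3306/inventory",
    "connection.username": "user",
    "connection.password": "password",
    "insert.mode": "upsert",
    "primary.key.mode": "record_key",
    "batch.size": "10000",
    "consumer.override.max.poll.records": "10000"
  }
}
```

Note that per-connector `consumer.override.*` settings only take effect if the worker's `connector.client.config.override.policy` permits them; alternatively, `consumer.max.poll.records` can be set globally in the worker configuration, which appears to be what the post's note about the default of `500` refers to.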
@@ -137,7 +131,7 @@ It's important to note that, for these tests, we used the `org.apache.kafka.conn
 .{nbsp}
 image::1M-different-batch-size-avro.png[role=centered-image]
 We then conducted experiments with Avro, and as depicted in _Figure 5_, the results show a significant improvement.
-As expected, processing 1 million events with `batch.size=500` is slower than with `batch.size=10000`.
+As expected, processing 1,000,000 events with `batch.size=500` is slower than with `batch.size=10000`.
 Notably, in our test configuration, the optimal value for `batch.size` is 1000, resulting in the fastest processing time.
 
 Although the results are better compared to JSON, there is still some performance degradation.
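The JSON-versus-Avro comparison in the hunk above comes down to the converter configured for the sink. A minimal sketch of the two worker/connector settings involved (the schema registry URL is a placeholder assumption; the Avro converter is the standard Confluent one, which requires a schema registry):

```
# JSON converter, as used in the earlier tests
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

# Avro converter, as used for the Figure 5 run
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081
```

Avro payloads are binary and carry only a schema ID rather than an embedded schema, which is consistent with the smaller effective payload per record and the improved results reported for the Avro runs.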