adding support for partitioned s3 source #4
Background
The S3-SQS source doesn't support reading partition columns from the S3 bucket. As a result, a dataset built from the S3-SQS source doesn't contain the partition columns, which leads to issue #2.
How this PR Handles the Problem
With the new changes, the user can specify partition columns in the schema with isPartitioned set to true in column metadata. Example:
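A minimal sketch of such a schema, written as plain Python that emits Spark's JSON schema layout so it runs without a Spark installation. The column names (value, year) and the helper field are hypothetical. Storing the flag as the string "true" is also an assumption: the description only says that isPartitioned is set to true.

```python
import json


def field(name, data_type, is_partitioned=False):
    """Build one field entry in Spark's JSON schema layout (hypothetical helper)."""
    # Assumption: the flag is stored as the string "true" under the key isPartitioned.
    metadata = {"isPartitioned": "true"} if is_partitioned else {}
    return {"name": name, "type": data_type, "nullable": True, "metadata": metadata}


schema = {
    "type": "struct",
    "fields": [
        field("value", "string"),                       # regular data column
        field("year", "integer", is_partitioned=True),  # partition column, e.g. .../year=2020/
    ],
}

# Serialized form of the schema, and the columns flagged as partition columns.
schema_json = json.dumps(schema)
partition_columns = [
    f["name"] for f in schema["fields"]
    if f["metadata"].get("isPartitioned") == "true"
]
```

In PySpark, a document of this shape can be turned into a schema object with StructType.fromJson(json.loads(schema_json)).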
Also, the user needs to specify the basePath in options if the schema contains partition columns. Specifying partitioned columns without specifying the basePath will throw an error. Example:
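The rule can be sketched as a small check in plain Python. This illustrates the described behavior and is not the connector's code: the helper check_base_path, the fileFormat option, the bucket path, and the use of ValueError with this message are all assumptions. Only the basePath option name comes from the description above.

```python
def check_base_path(partition_columns, options):
    """Reject partition columns when basePath is not set (hypothetical helper)."""
    if partition_columns and not options.get("basePath"):
        # Assumption: the real error type and message are not given in the description.
        raise ValueError(
            "basePath must be specified when the schema contains partition columns"
        )
    return options


# Hypothetical reader options; the bucket path is a placeholder.
options = {"fileFormat": "json", "basePath": "s3://my-bucket/events/"}
check_base_path(["year"], options)  # accepted: basePath is present

# Same partition column without basePath: the check raises.
try:
    check_base_path(["year"], {"fileFormat": "json"})
    missing_base_path_rejected = False
except ValueError:
    missing_base_path_rejected = True
```

A schema without partition columns passes the check whether or not basePath is set.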