-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingestion/s3): ignore depth mismatched path #12326
base: master
Are you sure you want to change the base?
feat(ingestion/s3): ignore depth mismatched path #12326
Conversation
e6d168e
to
d5a4387
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Continue to review full report in Codecov by Sentry.
|
c63a02e
to
a2690ac
Compare
a2690ac
to
98d9263
Compare
def _is_allowed_path(path_spec_: PathSpec, s3_uri: str) -> bool: | ||
allowed = path_spec_.allowed(s3_uri) | ||
if not allowed: | ||
logger.debug(f"File {s3_uri} not allowed and skipping") | ||
return allowed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this function to log when a file is ignored.
If you have a better approach, please feel free to suggest it. 🙏
98d9263
to
651892a
Compare
for key, group in group_s3_objects_by_dirname(s3_objects).items(): | ||
file_size = 0 | ||
creation_time = None | ||
modification_time = None | ||
|
||
for item in group: | ||
file_path = self.create_s3_path(item.bucket_name, item.key) | ||
if not path_spec.allowed(file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify my intention, I wrote it as qna format.
Q: Why path_spec.allowed
moved to L879?
A: Since path_spec.allowed
already implements depth comparison, I used it to determine whether each path should be skipped or not while iterating over s3_objects.
Q: Does moving this logic to L879 make any breaking changes?
A: No, because it simply skips each disallowed path even though it has been moved to L879.
651892a
to
98366cd
Compare
98366cd
to
60d83cd
Compare
60d83cd
to
c1841f6
Compare
c1841f6
to
ab015f5
Compare
ab015f5
to
23db0f8
Compare
Summary
Feature Addition:
Ignores the paths whose depth dose not match to
path_spec.include
while ingesting S3.Changes
As-is
To-be
Background
path_spec.include
.Logs
How to reproduce
You can reproduce this issue by running the code below.
Checklist