Add support for maintain_order param in joins #17698
base: branch-25.02
For the reviewer: This PR needs more work, but I'm opening it up for review so I can get some help handling a special case: full joins (where we maintain the order of the right table). Specifically, the case where the test fails is when there are unmatched keys in the left dataframe. Any advice on how to handle this?
Example:
The dataframes differ at column "a".
The a=2 entry is unmatched in the right dataframe, so it should be appended to the end of the result, not included with the other matches.
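To make the expectation described above concrete, here is a minimal pure-Python sketch of a full join that keeps the right table's row order and appends unmatched left rows at the end. The function name `full_join_right_order` and the sample data are hypothetical; this is only an illustration of the ordering I had in mind, not the Polars or cudf-polars implementation, and it does not claim to match what either engine guarantees.

```python
def full_join_right_order(left, right):
    """Full join of two lists of (key, value) rows.

    Output rows are (left_key, left_value, right_key, right_value).
    Rows follow the order of `right`; left rows that match nothing
    are appended at the end with None in the right-hand columns.
    This is an illustrative assumption about the desired ordering.
    """
    out = []
    matched = set()  # indices of left rows that found a partner
    for rk, rv in right:
        hits = [i for i, (lk, _) in enumerate(left) if lk == rk]
        for i in hits:
            matched.add(i)
            out.append((left[i][0], left[i][1], rk, rv))
        if not hits:
            # Right row with no match: left columns are null.
            out.append((None, None, rk, rv))
    for i, (lk, lv) in enumerate(left):
        if i not in matched:
            # Left row with no match: appended after all right-ordered rows.
            out.append((lk, lv, None, None))
    return out


left = [(1, "x"), (2, "y"), (3, "z")]
right = [(3, "p"), (1, "q"), (4, "r")]
rows = full_join_right_order(left, right)
for row in rows:
    print(row)
```

In this sketch the key 2 exists only on the left, so its row comes last, after every row driven by the right table.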
I think this is expected. Because those last two rows have the same sort key in the right table column, there's no disambiguator to decide which order the left column result comes in.
e.g. (no GPU engine involved):
Notice how the last two rows flip around between the two runs.
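The tie argument can also be shown without any dataframe library. The sketch below is a hypothetical illustration: each output row is reduced to a `(left_label, right_position)` pair, and `respects_right_order` (an invented helper, not a Polars API) checks only that right positions never decrease. When two rows share a right position, more than one ordering passes the check.

```python
from itertools import permutations


def respects_right_order(rows):
    """True if the rows are non-decreasing in their right-table position."""
    keys = [right_pos for _, right_pos in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# The last two rows share the same right-side sort key (2),
# so nothing in the key decides which of them comes first.
rows = [("a0", 0), ("a1", 1), ("a2", 2), ("a3", 2)]

valid = [list(p) for p in permutations(rows) if respects_right_order(p)]
for ordering in valid:
    print(ordering)
```

Both surviving orderings differ only in the last two rows, which is the kind of flip seen between the two runs.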