GH-43926: [C++] Compute: RowEncoder eliminates offsets when all columns are fixed-sized #43931
base: main
What does performance look like?
CI needs a fix.
Some preliminary comments, mostly nits.
@@ -132,6 +147,8 @@ struct ARROW_EXPORT DictionaryKeyEncoder : FixedWidthKeyEncoder {
  Result<std::shared_ptr<ArrayData>> Decode(uint8_t** encoded_bytes, int32_t length,
                                            MemoryPool* pool) override;

  // Uses `GetEncoderInfo` in `FixedWidthKeyEncoder`
What's this line for?
Co-authored-by: Rossi Sun <[email protected]>
@@ -83,6 +83,13 @@ struct ARROW_EXPORT KeyEncoder {
  static bool IsNull(const uint8_t* encoded_bytes) {
    return encoded_bytes[0] == kNullByte;
  }

  struct EncoderInfo {
I would suggest using a more descriptive name - `XxInfo` generally doesn't give much information about `Xx`. Maybe something like `EncodeColumnMeta`?
I'll be away on a trip from 9.14 to 9.21 and will focus on List Join in the meantime, so I'm marking this as WIP for now.
With:
After:
Before:
Rationale for this change
Do not use the `offsets_` vector when all columns in RowEncoder are fixed-width.
This might enlarge the memory footprint of RowEncoder by 8 bytes.
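To illustrate the idea, here is a minimal sketch of how row positions can be derived from a constant stride instead of a stored offsets vector. It is not the code in this PR: `RowLocator` and `FixedRowWidth` are hypothetical names, and the sketch assumes each fixed-width column is encoded as one null-marker byte followed by its value bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: encoded width of one row when every column is
// fixed-width. Assumes each column contributes 1 null-marker byte plus
// its value bytes.
inline int32_t FixedRowWidth(const std::vector<int32_t>& column_byte_widths) {
  int32_t width = 0;
  for (int32_t w : column_byte_widths) {
    width += 1 + w;
  }
  return width;
}

// Hypothetical sketch, not Arrow's RowEncoder: locates encoded rows either
// through an offsets vector (variable-width rows) or through a constant
// stride (all columns fixed-width).
class RowLocator {
 public:
  // Variable-width rows: `offsets` holds num_rows + 1 entries.
  explicit RowLocator(std::vector<int32_t> offsets)
      : offsets_(std::move(offsets)), fixed_row_width_(-1), num_rows_(0) {
    assert(!offsets_.empty());
    num_rows_ = static_cast<int32_t>(offsets_.size()) - 1;
  }

  // Fixed-width rows: only the stride and the row count are stored.
  RowLocator(int32_t fixed_row_width, int32_t num_rows)
      : fixed_row_width_(fixed_row_width), num_rows_(num_rows) {
    assert(fixed_row_width >= 0 && num_rows >= 0);
  }

  bool is_fixed_width() const { return fixed_row_width_ >= 0; }
  int32_t num_rows() const { return num_rows_; }

  // Byte position at which row `i` starts in the encoded buffer.
  int32_t row_offset(int32_t i) const {
    assert(i >= 0 && i <= num_rows_);
    return is_fixed_width() ? i * fixed_row_width_ : offsets_[i];
  }

  // Encoded length of row `i` in bytes.
  int32_t row_length(int32_t i) const {
    assert(i >= 0 && i < num_rows_);
    return is_fixed_width() ? fixed_row_width_ : offsets_[i + 1] - offsets_[i];
  }

  // Bytes spent on per-row offsets; zero in the fixed-width case.
  std::size_t offsets_bytes() const { return offsets_.size() * sizeof(int32_t); }

 private:
  std::vector<int32_t> offsets_;
  int32_t fixed_row_width_;
  int32_t num_rows_;
};
```

In this sketch the fixed-width path trades a per-row `int32_t` for two scalar members, which is the kind of small constant overhead the note above refers to.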
What changes are included in this PR?
Do not use the `offsets_` vector when all columns in RowEncoder are fixed-width.
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
No.