GH-43902: [Java] Support for Long memory addresses #43903

Merged
merged 19 commits into from
Sep 4, 2024


vibhatha
Collaborator

@vibhatha vibhatha commented Sep 2, 2024

Rationale for this change

The usage of Long instead of Integer must be encouraged for memory sizing, indexing and addresses.

What changes are included in this PR?

This PR refactors the usage of Integer into Long, along with related utility refactors.

Are these changes tested?

Existing test cases.

Are there any user-facing changes?

Yes, certain API calls may be subject to change.

Member

@lidavidm lidavidm left a comment

I think this is the right idea but we should be adding the long version as an overload in most cases (and if it's in return position we need to add a separate method), then we need to deprecate the int variants
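
The overload-and-deprecate pattern suggested here might look roughly like the following. This is only a minimal sketch: `SketchReservation` and its method names are hypothetical stand-ins, not Arrow's actual classes. Java cannot overload on return type alone, which is why the `long` getter needs its own name.

```java
// Hypothetical class for illustration only; not part of Arrow's API.
final class SketchReservation {
  private long reserved;

  /** New long overload: the preferred entry point for sizes. */
  boolean add(long nBytes) {
    if (nBytes < 0) {
      throw new IllegalArgumentException("nBytes must be non-negative");
    }
    // addExact throws ArithmeticException instead of wrapping on overflow.
    reserved = Math.addExact(reserved, nBytes);
    return true;
  }

  /** Old int variant, kept for compatibility; delegates to the long overload. */
  @Deprecated
  boolean add(int nBytes) {
    return add((long) nBytes);
  }

  /** Old int getter: fails loudly rather than silently truncating. */
  @Deprecated
  int getSize() {
    return Math.toIntExact(reserved);
  }

  /** A return type cannot be overloaded, so the long version gets a new name. */
  long getSizeLong() {
    return reserved;
  }
}
```

Existing callers of the `int` variants keep compiling (with a deprecation warning), while new code can pass and read sizes above `Integer.MAX_VALUE`.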

@vibhatha
Collaborator Author

vibhatha commented Sep 3, 2024

I think this is the right idea but we should be adding the long version as an overload in most cases (and if it's in return position we need to add a separate method), then we need to deprecate the int variants

Right. I will update. Thank you!

@vibhatha vibhatha marked this pull request as ready for review September 3, 2024 04:48
@vibhatha
Collaborator Author

vibhatha commented Sep 3, 2024

@lidavidm I updated. Sorry, I missed a few things earlier.

@vibhatha vibhatha requested a review from lidavidm September 3, 2024 11:07
@lidavidm
Member

lidavidm commented Sep 4, 2024

Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[27,15] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[27,15] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[27,15] Unsafe is internal proprietary API and may be removed in a future release
Warning:  The following options were not recognized by any processor: '[skipDefs, atfDoNotCache]'
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[27,15] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/BaseAllocator.java:[995,19] [removal] reserve(int) in AllocationReservation has been deprecated and marked for removal
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/BaseAllocator.java:[894,19] [removal] add(int) in AllocationReservation has been deprecated and marked for removal
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[36,23] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[36,23] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[36,23] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[36,23] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[64,46] Unsafe is internal proprietary API and may be removed in a future release
Warning:  /build/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/MemoryUtil.java:[77,16] Unsafe is internal proprietary API and may be removed in a future release

Can we ensure we fix the warnings? (Also can we enable warnings-as-errors?)

@vibhatha
Collaborator Author

vibhatha commented Sep 4, 2024

Can we ensure we fix the warnings? (Also can we enable warnings-as-errors?)

Right, we discussed this earlier as well. We can create a ticket to follow up on this. In this context, though, I think we need to organize the warning -> error migration, as the Unsafe components probably need to wait. For the rest, I can enable warnings-as-errors iteratively, module by module.

I created a parent issue, apache/arrow-java#59. I will update each module as feasible and create new issues for cases like Unsafe, to be fixed when we are supporting minimal JDKs in the future.

@lidavidm
Member

lidavidm commented Sep 4, 2024

At least here, let's fix any warnings about deprecation of things we deprecated in this PR
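
One detail worth keeping in mind when clearing such warnings: with both `int` and `long` overloads present, a call site that passes an `int` argument still resolves to the deprecated `int` variant, so the argument has to be widened explicitly. A minimal sketch, using a hypothetical `SketchAllocator` rather than Arrow's real `AllocationReservation`:

```java
// Hypothetical type for illustration; lastCall just records which overload ran.
final class SketchAllocator {
  String lastCall = "";

  @Deprecated(forRemoval = true)
  boolean reserve(int nBytes) {
    lastCall = "reserve(int)";
    return true;
  }

  boolean reserve(long nBytes) {
    lastCall = "reserve(long)";
    return true;
  }
}

final class CallSiteDemo {
  public static void main(String[] args) {
    SketchAllocator allocator = new SketchAllocator();
    int size = 1024;

    // An int argument selects the deprecated overload; javac reports a [removal] warning here.
    allocator.reserve(size);
    System.out.println(allocator.lastCall); // prints "reserve(int)"

    // Widening the argument selects the long overload, and the warning goes away.
    allocator.reserve((long) size);
    System.out.println(allocator.lastCall); // prints "reserve(long)"
  }
}
```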

@vibhatha
Collaborator Author

vibhatha commented Sep 4, 2024

@lidavidm I fixed the warnings, and they no longer appear locally, but let's check via the CI logs as well.

@lidavidm lidavidm merged commit b2e0668 into apache:main Sep 4, 2024
14 of 15 checks passed
@lidavidm lidavidm removed the awaiting merge Awaiting merge label Sep 4, 2024
@laurentgo
Collaborator

Sorry for not having commented earlier, but I wonder how much these changes increase the overall memory footprint. When many buffers are created, this has a non-negligible impact, and I wonder if we should adapt the fields based on the actual need for allocations larger than 2 GB (4 GB if we consider it unsigned).

@lidavidm
Member

lidavidm commented Sep 4, 2024

Hmm, is there a use case where you'd expect many tens of thousands of buffers allocated but memory is constrained (measured in megabytes, I suppose)?


After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit b2e0668.

There were 2 benchmark results with an error:

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them.

@laurentgo
Collaborator

Hmm, is there a use case where you'd expect many tens of thousands of buffers allocated but memory is constrained (measured in megabytes, I suppose)?

The query engine I work with runs multiple fragments concurrently, and we regularly see hundreds of thousands of buffers allocated and live at once. We try to keep the heap small (4g/8g) and use as much direct memory as possible, so the footprint of Arrow buffers (along with related objects) has a large impact for us.

@lidavidm
Member

lidavidm commented Sep 5, 2024

But even for 1 million buffers, 4 bytes of overhead per buffer is 4 megabytes (or we can say 8 bytes of overhead: 8 megabytes).
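
The estimate above is simple arithmetic; a small sketch makes it explicit. The class and method names are made up for illustration, and the figures count only the widened field, ignoring JVM object alignment and padding.

```java
final class FootprintEstimate {
  /** Extra heap bytes when each buffer object grows by extraBytesPerBuffer. */
  static long extraBytes(long bufferCount, long extraBytesPerBuffer) {
    return Math.multiplyExact(bufferCount, extraBytesPerBuffer);
  }

  public static void main(String[] args) {
    long buffers = 1_000_000L;
    System.out.println(extraBytes(buffers, 4) + " bytes"); // 4000000 bytes, about 4 MB
    System.out.println(extraBytes(buffers, 8) + " bytes"); // 8000000 bytes, about 8 MB
  }
}
```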

I suppose we could continue storing int but require all interfaces to use long...
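
A rough sketch of that last idea: keep a compact `int` field internally while accepting and returning `long` at the API boundary. `CompactLength` is a hypothetical class, not how `ArrowBuf` is actually implemented, and this approach would still cap values at `Integer.MAX_VALUE`.

```java
// Hypothetical illustration: long-typed interface over int-sized storage.
final class CompactLength {
  private int length; // stored in 4 bytes

  void setLength(long newLength) {
    if (newLength < 0) {
      throw new IllegalArgumentException("length must be non-negative");
    }
    // toIntExact throws ArithmeticException if the value does not fit in an int.
    this.length = Math.toIntExact(newLength);
  }

  long getLength() {
    return length; // widened to long on the way out
  }
}
```

Callers would see only `long` signatures, so the storage could later be widened without another API change.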

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Sep 6, 2024
### Rationale for this change

The usage of `Long` instead of `Integer` must be encouraged for memory sizing, indexing and addresses.

### What changes are included in this PR?

This PR refactors the usage of `Integer` into `Long`, along with related utility refactors.

### Are these changes tested?

Existing test cases. 

### Are there any user-facing changes?

Yes, certain API calls may be subject to change.
* GitHub Issue: apache#43902

Authored-by: Vibhatha Lakmal Abeykoon <[email protected]>
Signed-off-by: David Li <[email protected]>
khwilson pushed a commit to khwilson/arrow that referenced this pull request Sep 14, 2024