[GPU] Updated GPU cache size retrieval and refined closest_pow_of_2 #28059

Merged
merged 1 commit into from
Dec 17, 2024

Conversation

arshadlab
Contributor

@arshadlab arshadlab commented Dec 13, 2024

Details:
The existing method for calculating cache size was static and needed continuous updates to the SKU table, which had already been missed for the latest SKUs, e.g. DG2.
This update introduces a new member variable, max_global_cache_size, to store the GPU's global cache size, obtained via the OpenCL property CL_DEVICE_GLOBAL_MEM_CACHE_SIZE. The existing hard-coded cache calculations are removed. Additionally, the closest_pow_of_2 function now returns the nearest power of 2, favoring the upper value if the input is within 30% of the range for the upper bound. These changes improve memory management and make better use of GPU resources in bottleneck situations.
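
The rounding rule described above can be read in more than one way. The sketch below is one possible interpretation and not the code merged in this PR: it assumes "within 30% of the range for the upper bound" means the input lies in the top 30% of the interval between the two surrounding powers of 2. The name `closest_pow_of_2_sketch`, the integer types, and the handling of 0 and 1 are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical sketch, not the plugin's implementation.
// Finds the powers of 2 that bracket x (lower <= x < upper) and returns
// upper only when x sits in the top 30% of the [lower, upper) interval;
// otherwise it returns lower.
std::uint64_t closest_pow_of_2_sketch(std::uint32_t x) {
    if (x <= 1) {
        return 1;  // assumption: inputs 0 and 1 map to 2^0
    }
    std::uint64_t lower = 1;
    while (lower <= x / 2) {
        lower *= 2;  // ends at the largest power of 2 that is <= x
    }
    const std::uint64_t upper = lower * 2;
    const std::uint64_t range = upper - lower;
    // Integer form of (x - lower) > 0.7 * range, avoiding floating point.
    const std::uint64_t offset = static_cast<std::uint64_t>(x) - lower;
    return (offset * 10 > range * 7) ? upper : lower;
}
```

Under this reading, an input of 27 stays at 16 while 28 is rounded up to 32, since the threshold between 16 and 32 falls at 16 + 0.7 × 16 = 27.2.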

Tickets:
CVS-159076

@arshadlab arshadlab requested review from a team as code owners December 13, 2024 10:07
@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Dec 13, 2024
@sys-openvino-ci sys-openvino-ci added the ExternalIntelPR External contributor from Intel label Dec 13, 2024
Signed-off-by: Arshad Mehmood <[email protected]>
@txlim96

txlim96 commented Dec 16, 2024

Verified with ADLS+A310E
auto batching.txt
BS32.txt

@p-durandin
Contributor

build_jenkins

@vladimir-paramuzov vladimir-paramuzov added this to the 2025.0 milestone Dec 17, 2024
@vladimir-paramuzov vladimir-paramuzov added this pull request to the merge queue Dec 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 17, 2024
@p-durandin p-durandin added this pull request to the merge queue Dec 17, 2024
Merged via the queue into openvinotoolkit:master with commit b0a8c14 Dec 17, 2024
160 checks passed
11happy pushed a commit to 11happy/openvino that referenced this pull request Dec 23, 2024
…penvinotoolkit#28059)

Labels
category: GPU OpenVINO GPU plugin ExternalIntelPR External contributor from Intel
5 participants