
Mechanism to limit global memory usage #213

Open · mkitti opened this issue Jan 16, 2025 · 2 comments

mkitti commented Jan 16, 2025

Is there a method to limit global memory usage?

Say I want to do something simple such as copying one array to another. Implemented naively as follows, this can consume a lot of memory if the arrays are large:

import tensorstore as ts

input_arr = ts.open({ ... }).result()
output_arr = ts.open({ ... }).result()
output_arr.write(input_arr).result()

I usually end up manually looping over the arrays chunk by chunk to force the copy to happen in parts. I'm aware of the Context API and concurrency limits, but they do not seem to effectively constrain operations such as the one above.
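For context, the manual loop I fall back to looks roughly like this (a pure-Python sketch with lists standing in for the TensorStore arrays; the chunk size is arbitrary):

```python
def copy_in_chunks(src, dst, chunk=64):
    # Copy in fixed-size slabs along the first axis so that only one
    # slab is in flight at a time. With TensorStore, each slab write
    # would instead be dst[start:stop].write(src[start:stop]).result().
    n = len(src)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        dst[start:stop] = src[start:stop]

src = list(range(1000))
dst = [None] * len(src)
copy_in_chunks(src, dst, chunk=128)
```

This bounds memory use at roughly one chunk per iteration, but it serializes the copy and gives up the pipelining TensorStore would otherwise do.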

Is there some way to limit the total memory used to avoid the Out Of Memory killer?

mkitti commented Jan 17, 2025

Attempting to answer my own question, it seems that applying resource limits to the input array is particularly useful.

input_arr = ts.open({
    "driver": ...,
    "context": {
        "cache_pool": {
            "total_bytes_limit": 100_000_000_000
        },
        "data_copy_concurrency": {
            "limit": 16
        }
    }
}).result()

laramiel (Collaborator) commented

There is no such semaphore built in at the moment, though I'm working towards some internal updates that will allow such a feature in the future. You can do part of this with concurrency limits, but they will not provide any guarantees.

For an example of how you might do this yourself (in C++), look here:

Basically, that benchmark uses a counter to track the expected in-flight bytes, starts as many full tensorstore reads as will fit into the limit, and uses each future's completion to trigger the next read.
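As I understand it, that pattern can be sketched in Python like this (hypothetical names; plain threads and a byte counter, not the actual benchmark code):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class InFlightLimiter:
    """Track estimated in-flight bytes; block new work past the limit."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.in_flight = 0
        self.cond = threading.Condition()

    def acquire(self, nbytes):
        # Block until admitting nbytes more would stay within the limit.
        with self.cond:
            while self.in_flight + nbytes > self.limit:
                self.cond.wait()
            self.in_flight += nbytes

    def release(self, nbytes):
        with self.cond:
            self.in_flight -= nbytes
            self.cond.notify_all()

def read_all(chunk_sizes, read_fn, limit_bytes, workers=4):
    """Run read_fn(i) for each chunk, admitting only as many concurrent
    reads as fit inside limit_bytes of estimated in-flight data."""
    limiter = InFlightLimiter(limit_bytes)
    results = [None] * len(chunk_sizes)

    def task(i, nbytes):
        try:
            results[i] = read_fn(i)
        finally:
            # Completion frees budget, letting the next read start.
            limiter.release(nbytes)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i, nbytes in enumerate(chunk_sizes):
            limiter.acquire(nbytes)  # blocks until budget is available
            pool.submit(task, i, nbytes)
    return results
```

With a real store, `read_fn` would issue the TensorStore read and resolve its future; here it is just a placeholder callable.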

Orbax also limits memory use when loading in a very similar way. They use a python context manager with a byte_limiter type to act as a semaphore. See, for example: https://github.com/google/orbax/blob/22695625af8a7c22140c6c5270253e1d0aa7aa64/checkpoint/orbax/checkpoint/_src/serialization/serialization.py#L435
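The context-manager flavor of the same idea might look like this (a sketch only, not Orbax's actual byte_limiter API):

```python
import threading
from contextlib import contextmanager

class ByteLimiter:
    """Semaphore-like guard over a byte budget."""

    def __init__(self, limit_bytes):
        self._limit = limit_bytes
        self._used = 0
        self._cond = threading.Condition()

    @contextmanager
    def reserve(self, nbytes):
        # Wait for budget, hold it for the duration of the with-block,
        # then return it and wake any waiters.
        with self._cond:
            while self._used + nbytes > self._limit:
                self._cond.wait()
            self._used += nbytes
        try:
            yield
        finally:
            with self._cond:
                self._used -= nbytes
                self._cond.notify_all()

limiter = ByteLimiter(100)
with limiter.reserve(60):
    # e.g. deserialize one shard while 60 bytes of budget are held
    pass
```

Each load wraps its work in `limiter.reserve(estimated_bytes)`, so concurrent loads collectively stay under the budget.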
