-
-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JobQueueCluster
with local worker(s)
#596
Comments
This is an interesting question. I'm not sure that having the Howerver I agree manually creating a local worker right now isn't the best user experience. I had a quick go and it was something like this: import asyncio
# Using LocalCluster to sake of demo but could be any FooCluster
from dask.distributed import LocalCluster, Nanny, Client
from distributed.comm import get_address_host_port
async def main():
async with LocalCluster() as cluster:
async with Nanny(*get_address_host_port(cluster.scheduler_address), silence_logs=True) as nanny:
async with Client(cluster) as client:
print(cluster)
if __name__ == "__main__":
loop = asyncio.get_event_loop()
loop.run_until_complete(main()) It would be much nicer if there was some non-asyncio way of starting a worker by simply passing it the cluster object, the same way we can create a client from a cluster today. Something like this: from dask.distributed import LocalCluster, Nanny, Client
cluster = LocalCluster()
local_worker = Nanny(cluster)
client = Client(cluster) |
Hi @fnattino, thanks for raising this. This is a problem we already identified but with no good solutions yet. There is some discussion about it in some issues, (here for example) and a solution (yet to be implemented) is proposed in #419. Basically, the idea I had is to be able to create a But I also like the proposal of @jacobtomlinson! @fnattino if you'd like to give a try implementing one of these solutions, this would be very much appreciated :). |
@jacobtomlinson @guillaumeeb thank you both for your quick replies and for pointing out other relevant issues! I will try to experiment a bit with your suggestions and, should I get to something useful, I will keep you posted. Thanks again! |
Thanks a lot for this fantastic library - it is really awesome!
I'd love to hear your opinion on the following use case. I have access to a SLURM cluster where I am not allowed (or I am at least discouraged) to run tasks such as a Jupyter server or a Dask scheduler on the login node, and the minimal partition size that I can request via a batch job is 32 cores.
I can start the Jupyter server and instantiate a
SLURMCluster
in a batch job, then scale up the cluster by adding resources via SLURM. However, the initial batch job where Jupyter and the Dask scheduler are running still occupies 32 cores - which is a bit wasteful. Right now I could "manually" create an additional local worker and connect it to the scheduler to fill the remaining allocated resources. But maybe it could be useful to have the option to add a local worker when instantiating aJobQueueCluster
?The text was updated successfully, but these errors were encountered: