[WIP, RFC] Add sharedmem #86

Draft · wants to merge 1 commit into base: main
Conversation

jakirkham
Member

@jakirkham commented Mar 15, 2023

This is some initial work on adding shared-memory spilling to Zict for use by Distributed. Note that it is incomplete.

Opening this in response to comment ( #72 (comment) )

cc @crusaderky

@jakirkham changed the title [RFC] Add sharedmem → [WIP, RFC] Add sharedmem Mar 15, 2023
@jakirkham
Member Author

cc @martindurant (who may also be interested)

@crusaderky
Collaborator

Hi,

  • It looks like the code is still mostly copy-pasted from LMDB: there are leftover references to self.db, and the actual self.shmm is not used anywhere.

  • Given the incomplete code, it's not clear what the design is for exporting a buffer from an SMDB running in one process to an SMDB running in another process.
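For reference, the bare stdlib mechanism for a cross-process export, where the producer ships only the segment name, looks roughly like this (a sketch of the primitives, not the design of this PR):

```python
# Sketch: attach to a segment from another process by name alone.
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory

def consumer(name: str) -> None:
    # Attach to an existing segment created by the producer
    seg = SharedMemory(name, create=False)
    assert bytes(seg.buf[:5]) == b"hello"
    seg.close()

if __name__ == "__main__":
    seg = SharedMemory(create=True, size=16)
    seg.buf[:5] = b"hello"
    p = Process(target=consumer, args=(seg.name,))
    p.start()
    p.join()
    seg.close()
    seg.unlink()  # without this, the segment leaks in /dev/shm
```

The open question above is which side owns the segment's lifetime once its name has been handed to another process.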

  • multiprocessing.managers.SharedMemoryManager baffles me. In particular, I can't understand how to share memory created after forking, because its SharedMemoryManager.SharedMemory method doesn't accept name and create parameters, unlike the top-level SharedMemory, so all memory is created anew with a random name. Of course you can call

import multiprocessing.managers
import multiprocessing.shared_memory

smm = multiprocessing.managers.SharedMemoryManager()
smm.start()
mem1 = smm.SharedMemory(100)
mem2 = multiprocessing.shared_memory.SharedMemory(mem1.name, create=False)  # Same buffer

but that would defeat the purpose of tracking who's using the memory.
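For contrast, the tracking does work as long as every consumer stays inside the manager's lifecycle: on shutdown() (or context exit) the manager unlinks every block it created, which is exactly what side-channel attaching bypasses.

```python
from multiprocessing.managers import SharedMemoryManager
from multiprocessing.shared_memory import SharedMemory

with SharedMemoryManager() as smm:
    mem = smm.SharedMemory(100)
    mem.buf[:5] = b"hello"
    name = mem.name

# After the context exits, the manager has unlinked the block,
# so attaching by name now fails:
try:
    SharedMemory(name, create=False)
except FileNotFoundError:
    print("segment cleaned up")
```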

  • Lifecycle management is a big worry. Dask workers crash. A lot. It is imperative that there be a robust way to clean up shared memory after a worker dies, without having to reboot the whole host. The POSIX API shm_open, which multiprocessing.shared_memory is a thin wrapper around on non-Windows systems, offers nothing of the sort: if your process dies, the memory stays there, unless you are tracking a reference count of the shared memory buffers somewhere else. On paper, this is what SharedMemoryManager should address - except that it doesn't:
from multiprocessing.managers import SharedMemoryManager
import psutil

smm = SharedMemoryManager()
smm.start()
mem = smm.SharedMemory(100)
# File created on Linux: /dev/shm/psm_27305d79 (file name is random)
psutil.Process().kill()
# The file remains there
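One possible workaround (a hypothetical sketch, not part of this PR) is to persist segment names outside the process, so that a supervisor or the next worker start-up can unlink whatever a crashed worker left behind. The registry path and helper names below are made up for illustration:

```python
import json
import os
from multiprocessing.shared_memory import SharedMemory

# Hypothetical registry location; a real deployment would use a
# per-worker path and handle concurrent writers.
REGISTRY = "/tmp/zict_shm_registry.json"

def register(names, registry=REGISTRY):
    """Record segment names so they survive a process crash."""
    with open(registry, "w") as f:
        json.dump(list(names), f)

def cleanup_leaked(registry=REGISTRY):
    """Unlink segments recorded by a previous (possibly crashed) run."""
    if not os.path.exists(registry):
        return
    with open(registry) as f:
        names = json.load(f)
    for name in names:
        try:
            seg = SharedMemory(name, create=False)
            seg.close()
            seg.unlink()  # removes /dev/shm/<name>
        except FileNotFoundError:
            pass  # already cleaned up
    os.remove(registry)
```

This only shifts the problem to keeping the registry accurate, which is the reference-counting bookkeeping described above.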
  • It's important to note that, for this to actually be useful in Dask, you will need to tamper with your OS setup on both Linux and macOS: the default allotment for shared memory is half the physical memory on Linux and only a few MiB on macOS. This is a problem shared with File-based shared memory model #80.
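On Linux, where POSIX shared memory is backed by the tmpfs mounted at /dev/shm, the current allotment can be inspected directly (Linux-only sketch):

```python
import shutil

# /dev/shm is a tmpfs that defaults to half of physical RAM on Linux;
# every SharedMemory segment counts against this limit.
total, used, free = shutil.disk_usage("/dev/shm")
print(f"/dev/shm: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
```

Raising the limit means remounting the tmpfs with a larger size, i.e. a host-level configuration change rather than something the library can do.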
