Data Library's "Import from User Directory" interface's display is very slow for full directories #19209

Open
vladvisan opened this issue Nov 26, 2024 · 3 comments

@vladvisan

Description of the problem

(Tested on 23.2.2.dev0)

I'm referring to this interface:

Before showing anything to the user, the page loads the whole folder structure recursively (it doesn't unfold it automatically, but it does load it).
In our case (a large sub-directory structure with many files, AND network-mounted), this leads to:

  • the whole Galaxy instance being unresponsive for a few minutes, for all users
  • eventually aborting because we reach "MAX_WALK_DIRS" (10K by default)
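
To illustrate why the abort arrives so late, here is a minimal sketch of a recursive walk with a directory cap. It is not Galaxy's actual `safe_walk`; `bounded_walk` and `WalkLimitExceeded` are hypothetical names, and the default of 10,000 only mirrors the "10K by default" mentioned above.

```python
import os


class WalkLimitExceeded(Exception):
    """Raised when the walk visits more directories than allowed."""


def bounded_walk(root, max_dirs=10_000):
    """Yield (dirpath, dirnames, filenames) like os.walk, aborting past max_dirs.

    Hypothetical helper for illustration only; not Galaxy's safe_walk.
    """
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        if visited > max_dirs:
            # Everything listed so far is discarded by the caller at this point.
            raise WalkLimitExceeded(
                f"more than {max_dirs} directories under {root!r}"
            )
        yield dirpath, dirnames, filenames
```

In a sketch like this, the time spent before the abort grows with the cap, and on a network mount each visited directory costs at least one round trip.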

Relevant issues

Workaround
As a workaround, I did 2 things:

  1. Lower "MAX_WALK_DIRS" drastically, so that the abort at least happens sooner and other users are less impacted
    • Maybe it would also stop if the user closed their tab; I haven't tested that. Either way, we can't assume they will do that.
  2. Switch to the "Upload" interface, which, as you can see, only loads one depth level at a time, which is a lot more efficient:
    • image
    • image
    • image
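
For comparison, loading one depth level at a time can be sketched as below. This is an assumption about the general technique, not the Upload interface's actual code; `list_one_level` is a hypothetical name.

```python
import os


def list_one_level(path):
    """Return the immediate children of path, without descending into them."""
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            entries.append(
                {
                    "name": entry.name,
                    # follow_symlinks=False avoids resolving links to directories
                    "is_dir": entry.is_dir(follow_symlinks=False),
                }
            )
    # Directories first, then files, each group sorted by name.
    return sorted(entries, key=lambda e: (not e["is_dir"], e["name"]))
```

The cost of one call depends only on the number of direct children, not on how deep or how full the sub-folders are.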

Remaining problems after the workaround

  • Even the Upload interface still takes a long time for folders that directly contain a lot of files (as opposed to files in sub-folders)
  • The Upload interface does not support linking files instead of copying
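
One possible way to soften the first problem is to return a large flat folder in pages rather than all at once. The sketch below is only an illustration of that idea; `list_page`, `offset`, and `limit` are hypothetical names, not an existing Galaxy API.

```python
import os
from itertools import islice


def list_page(path, offset=0, limit=100):
    """Return up to `limit` entry names from `path`, skipping the first `offset`.

    Entries come back in the arbitrary order that os.scandir yields them.
    """
    with os.scandir(path) as it:
        return [entry.name for entry in islice(it, offset, offset + limit)]
```

Skipped entries are still read from the directory, but no sorting or per-file metadata lookup is done for them. The order is not guaranteed to stay the same if the folder changes between calls, so a real implementation would need to handle that.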

Potential solutions

  • re-use the Upload tool's interface in the Data Library's "Import from User Directory" interface?
    • This would at least narrow the timeout problem from "numerous files recursively" (much more common) to "numerous files at one level"
  • modify the "recursive" option here? https://github.com/galaxyproject/galaxy/blob/release_23.2/lib/galaxy/managers/remote_files.py#L90
    • which calls -> lib/galaxy/files/sources/__init__.py / list -> lib/galaxy/files/sources/posix.py / _list -> lib/galaxy/util/path/__init__.py / safe_walk
    • NB: the file has been modified since release 23.2, which I am referencing because it's the version I tested
  • start with a call to the OS to get the number of files (potentially recursively)
    • to be able to abort pre-emptively instead of trying for nothing
    • to be able to display a progress bar
  • cap the amount of RAM per user, so that one user's request doesn't affect other users
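
The "count first" idea could look roughly like the sketch below: count entries with a budget and stop as soon as it is exceeded. `count_entries` and `budget` are hypothetical names, and nothing here is taken from Galaxy's code.

```python
import os


def count_entries(root, budget=10_000):
    """Count files and directories under root, stopping once budget is exceeded.

    Returns (count, complete); complete is False if the scan was cut short.
    """
    count = 0
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            for entry in it:
                count += 1
                if count > budget:
                    return count, False
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return count, True
```

A complete count could serve as the denominator for a progress bar, while an incomplete one signals that the full listing should not be attempted. Note that on a network mount the count itself still has to visit every directory up to the budget, so it is not free.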
@itisAliRH
Member

  • the whole Galaxy instance being unresponsive for a few minutes, for all users

Most performance issues have been fixed in #18638.

@vladvisan
Author

Thank you all for taking a look at this issue.

I'm currently updating my Galaxy instance so I can test #18638 and #19132

Will update this week or next with the results (upgrade taking a while because I'm fully migrating to Ansible at the same time)

@vladvisan
Author

vladvisan commented Jan 15, 2025

Sorry for the delay: holidays, plus a time-consuming update since it was my first time doing one on an instance

So far I've been able to test #18638

  • It no longer freezes the whole instance, nor the current window, and you can easily exit the screen
  • It also feels a lot more modern
  • Big improvement, thank you Alireza!
  • I would say a potential improvement would be to load the contents of a folder only as it's opened, like for the upload tool, to reduce the delay itself and not just reduce the impact of the delay on the application
    • Since in our case at least we have a lot of depth/files in our folders (+ potential impact from network delay, but the upload tool works basically instantly given the same config)
  • I also sometimes experience some delay when selecting files (it can take a few seconds for a file to appear "checked" and for the Import button to update): https://github.com/user-attachments/assets/f50e4d37-33b3-4c20-aa54-af7d38c21a99
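
To make the "load a folder only as it's opened" suggestion concrete, here is a small model of the idea. The real interface is a web client, so this only shows the shape of the logic; `LazyFolder` and `expand` are hypothetical names.

```python
import os


class LazyFolder:
    """Tree node whose contents are read from disk only on first expand()."""

    def __init__(self, path):
        self.path = path
        self._folders = None  # None means "not loaded yet"
        self._files = None

    @property
    def loaded(self):
        return self._folders is not None

    def expand(self):
        """Return (sub-folders, file names), reading the directory at most once."""
        if self._folders is None:
            folders, files = [], []
            with os.scandir(self.path) as it:
                for entry in it:
                    if entry.is_dir(follow_symlinks=False):
                        folders.append(LazyFolder(entry.path))
                    else:
                        files.append(entry.name)
            self._folders = sorted(folders, key=lambda f: f.path)
            self._files = sorted(files)
        return self._folders, self._files
```

With this shape, opening the top-level folder costs a single directory read, and sub-folders that are never opened are never read.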
