Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding limits on file uploads #475

Merged
merged 5 commits into from
Oct 28, 2023
Merged

Adding limits on file uploads #475

merged 5 commits into from
Oct 28, 2023

Conversation

ml-evs
Copy link
Member

@ml-evs ml-evs commented Oct 24, 2023

Closes #471 by adding file upload limits:

  • In the Uppy UI (100 GB in total and 10,000 files per upload)
    • Both values configurable via env variables
  • In the API:
    • At the level of Flask MAX_CONTENT_SIZE -- now this value is configurable to prevent massive files filling /tmp disks that could lead to instability on some systems -- think we are unaffected on the current deployment but will try to verify later
    • At the level of remaining disk space in the configured mount point, e.g., whether there is enough space in CONFIG.FILE_DIRECTORY to actually store the thing. This is done with a pymongo lock that prevents new files being uploaded until previous files have been correctly stored in the fs and database.

Also this PR makes the files directory on app startup, if not present. Previously the routes themselves were making it on first upload, potentially with all required subdirectories. This can be a bit dangerous if the files directory is a dynamic mount point, as the wrong disk may silently be used if not previously mounted. We should probably make this more robust in the future, with better specification of e.g. remote data sources too for the files directory (S3 or other blob storage). May be worth abstracting this away with something like Maggma as a "store" controller but equally this may overcomplicate things for little gain for us.

Additionally, in the future/now if easily done, it would be nice to add better feedback on failures in the UI, i.e., overriding the default uppy error message with better messages directly from our API. This might be straightforward to do but tricky to test.

@cypress
Copy link

cypress bot commented Oct 24, 2023

Passing run #479 ↗︎

0 44 0 0 Flakiness 0

Details:

Merge 88d13bf into 789a452...
Project: datalab Commit: 6fc7917103 ℹ️
Status: Passed Duration: 05:44 💡
Started: Oct 28, 2023 1:42 PM Ended: Oct 28, 2023 1:48 PM

Review all test suite changes for PR #475 ↗︎

@ml-evs ml-evs marked this pull request as ready for review October 24, 2023 18:02
@ml-evs ml-evs requested a review from jdbocarsly as a code owner October 24, 2023 18:02
@ml-evs ml-evs force-pushed the ml-evs/file_upload_limits branch from 36dc8a9 to 1fee301 Compare October 24, 2023 18:09
@jdbocarsly
Copy link
Member

Looks good to me, but I would set the max files and filesize much lower, at least for a demo folder. I can't think of a case we currently need to support 10,000 files uploaded at once-- and I imagine the server would not be happy.

jdbocarsly
jdbocarsly previously approved these changes Oct 24, 2023
Copy link
Member

@jdbocarsly jdbocarsly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ml-evs ml-evs force-pushed the ml-evs/file_upload_limits branch from 1fee301 to 8ceae6d Compare October 28, 2023 12:13
@ml-evs
Copy link
Member Author

ml-evs commented Oct 28, 2023

Looks good to me, but I would set the max files and filesize much lower, at least for a demo folder. I can't think of a case we currently need to support 10,000 files uploaded at once-- and I imagine the server would not be happy.

If the 10,000 were tracked as one "file object" in the database then I think it would be fine -- doesn't seem unreasonable once we move to uploading folders of in situ data, or even bruker folders, but take your point for the demo server

@ml-evs ml-evs merged commit 4ba8d13 into main Oct 28, 2023
4 checks passed
@ml-evs ml-evs deleted the ml-evs/file_upload_limits branch October 28, 2023 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a file upload size limit
2 participants