Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replaced ZipInputStream in CreateNewDataFilesCommand #199

Open
wants to merge 20 commits into
base: v6.2-DANS-DataStation
Choose a base branch
from

Conversation

jo-pol
Copy link

@jo-pol jo-pol commented Aug 29, 2024

Why we need this PR:

Zips created by downloading multiple files from an ownCloud service throws java.util.zip.ZipException: only DEFLATED entries can have EXT descriptor when using ZipInputStream to iterate over the entries. This causes the zipfiles to be ingested as is. With ZipFile we don't encounter the exception.

What this PR does:

Replaces ZipInputStream with ZipFile.

  • in CreateNewDataFilesCommand
  • in ShapefileHandler: abandoned ShapefileHandler constructor with FileInputStream to allow the use of ZipFile

Additional

  • for the new iteration method over zip entries in CreateNewDataFilesCommand: extracted methods filteredZipEntries, openZipFile, getShortName and isNotFakeFile
  • introduced some unit tests for CreateNewDataFilesCommand or should I call it an integragtion test for the changed classes.
  • to allow (or at least simplify) the new unit test, FileUtil.determineFileType catches Bag exceptions The method will now return application/zip rather than throw. Without catching I would need complex mocking to get a BagitFileHandler via CDI just to test the rest.

Which issue(s) this PR closes:

Closes #

Special notes for your reviewer:

Reduce differnces by a compare ignoring white space:
https://github.com/DANS-KNAW/dataverse/pull/199/files?w=1

Suggestions on how to test this:

Upload the zips in src/test/resources/own-cloud-downloads/

  • Before deploy: you will get the message Failed to unzip the file. Saving the file as is.
  • After deploy: the the zipfiles are split into individuals files, the directory structure is obeyed and sets of shape files are rezipped.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

no

Is there a release notes update needed for this change?:

Zips downloaded from an ownCloud service are no longer uploaded as is but unzipped.

Additional documentation:
Screenshots of result before and after deploy

image
image

@coveralls
Copy link

coveralls commented Aug 29, 2024

Pull Request Test Coverage Report for Build #81

Details

  • 69 of 96 (71.88%) changed or added relevant lines in 4 files are covered.
  • 3 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.3%) to 20.943%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/main/java/edu/harvard/iq/dataverse/util/FileUtil.java 3 6 50.0%
src/main/java/edu/harvard/iq/dataverse/util/ShapefileHandler.java 14 20 70.0%
src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceShapefileHelper.java 4 12 33.33%
src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateNewDataFilesCommand.java 48 58 82.76%
Files with Coverage Reduction New Missed Lines %
src/main/java/edu/harvard/iq/dataverse/util/FileUtil.java 1 38.34%
src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateNewDataFilesCommand.java 1 54.58%
src/main/java/edu/harvard/iq/dataverse/util/ShapefileHandler.java 1 67.6%
Totals Coverage Status
Change from base Build #60: 0.3%
Covered Lines: 17492
Relevant Lines: 83522

💛 - Coveralls

@jo-pol jo-pol marked this pull request as draft September 2, 2024 08:22
@jo-pol jo-pol marked this pull request as ready for review September 17, 2024 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants