-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the class ZipArchiveOperations
, which implements archive operations on zip-files
#407
Add the class ZipArchiveOperations
, which implements archive operations on zip-files
#407
Conversation
177fccb
to
b258777
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #407 +/- ##
==========================================
- Coverage 92.45% 92.07% -0.38%
==========================================
Files 128 124 -4
Lines 9641 9128 -513
Branches 1045 0 -1045
==========================================
- Hits 8914 8405 -509
- Misses 703 723 +20
+ Partials 24 0 -24
☔ View full report in Codecov by Sentry. |
Modelled after the `ZipArchiveOperations` tests in datalad#407
Modelled after the `ZipArchiveOperations` tests in datalad#407
- More sensible `__repr__` for `ArchiveOperations` - Document methods - More flexible handling of archive item names. `open()` and all other such methods can now accept `item.name` from `__iter__` results directly. Based on the assumption that all TAR archives use member names in POSIX notation. - Add unit tests with comprehensive coverage. Modelled after the `ZipArchiveOperations` tests in datalad#407 but with more context manager use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR. It motivated me to implement #415 which addresses a number of insufficiencies that became apparent while reviewing this PR. Please have a look re documentation and flexibility re archive item identifiers. I think it would make sense to align zip and tar support in this regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we sorted out how this is supposed to be in various other PRs. I think mostly the item identifier flexibility needs to be aligned (allow for PurePosixPath
) still.
Here is the plan:
|
f52ffc6
to
c179406
Compare
The zip file format introduces another naming-related quirk. Directory names in zip-files end with "/". All driectory-items (and file-items) that are yielded by |
FMPOV this is ready to go |
This commit adds `ZipArchiveOperations`, a class that implements `ArchiveOperations` for zip files. It makes use of `datalad_next.iter_collections.zipfile.iter_zip` to implement iteration. The commit comes with tests. The test code includes a pytest-fixture called `sample_zip`, which creates a sample zip-file with known content, much like the `sample_tar_xz`-fixture.
This commit fixes a bug in the tests where Unix path-separators were always used to identify directory entries in a zip-archive. That is wrong on Windows, where Windows-path separators have to be used.
This commit ensures that all zip-file names that are used as parameters in methods of ZipArchiveOperations use '/' in names of zip-archive items. That is due to the fact that zip enforces the use '/' in the names of its items (zip-APPNOTE.TXT 6.3.3 from September 1st, 2012). Until PR datalad#409 is merged this will ensure that the proper file names are used to check for items in zip-archives. The commit can be reverted once PR datalad#409 is merged.
This commit removes duplicated code for the sample_zip-fixture. It makes the sample_zip- fixture avaiable through conftest.py and re-uses it on the ZipArchiveOperations tests.
This commit uses PurePosixPath to properly read the path of a zip-archive on Windows in test_open()
This commit renames the file `datalad_next/archive_operations/tests/test_zipfile_archive_operations.py` to `datalad_next/archive_operations/tests/test_zipfile.py`. This is better readable and in sync with the name of the tests for `datalad_next/archive_operations/tarfile.py`
This commit adapts the ZipArchiveOperations code to the current type of `ZipfileItem.name`, i.e. to `PurePosixPath`. The commit also adapts the tests for ZipArchiveOperations to the updated types, and aligns the tests to the test code for TarArchiveOperations. This is in preparation for a test code consolidation
9bf90b6
to
5fbb2a0
Compare
Looks like this PR is not longer up-to-speed with the other developments that happened since it was posted.
|
Started working on an update |
Replaced by #578 |
This PR adds
ZipArchiveOperations
, a class that implementsArchiveOperations
for zip files. It makes use ofdatalad_next.iter_collections.zipfile.iter_zip
to implement iteration.The PR includes tests. The test code comes with a new pytest-fixture called
sample_zip
, which creates a sample zip-file with known content, much like thesample_tar_xz
-fixture.Remark: while working on this PR, I came across a type-annotation issue #406. I will create a PR to solve #406 later