Update TarArchiveOperations and equip with tests #415

Merged: 3 commits, Jun 12, 2023

Conversation

@mih (Member) commented Jun 10, 2023

  • More sensible __repr__ for ArchiveOperations
  • Document methods
  • More flexible handling of archive item names: open() and all other
    such methods now accept item.name from __iter__ results directly.
    This relies on the assumption that all TAR archives use member
    names in POSIX notation.
  • Add unit tests with comprehensive coverage, modelled after the
    ZipArchiveOperations tests in #407 (Add the class
    ZipArchiveOperations, which implements archive operations on
    zip-files), but with more context manager use.
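
To illustrate the item-name handling described in the list above, here is a minimal sketch, not the actual datalad-next implementation. It assumes that item names arrive either as strings or as pure path objects, and that converting them with PurePosixPath is enough to obtain a TAR member name. The helpers member_name(), build_demo_archive() and read_member() are hypothetical names introduced only for this example.

```python
import io
import tarfile
from pathlib import PurePosixPath


def member_name(item):
    """Return a TAR member name in POSIX notation for a str or pure path.

    Hypothetical helper: PurePosixPath also drops a trailing slash.
    """
    return PurePosixPath(item).as_posix()


def build_demo_archive():
    """Build a tiny in-memory TAR archive with one directory and one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        dirinfo = tarfile.TarInfo("test-archive")
        dirinfo.type = tarfile.DIRTYPE
        tf.addfile(dirinfo)
        payload = b"123\n"
        fileinfo = tarfile.TarInfo("test-archive/123.txt")
        fileinfo.size = len(payload)
        tf.addfile(fileinfo, io.BytesIO(payload))
    buf.seek(0)
    return buf


def read_member(archive, item):
    """Read the content of a regular file member, given a str or path."""
    with tarfile.open(fileobj=archive, mode="r") as tf:
        with tf.extractfile(member_name(item)) as fp:
            return fp.read()
```

With such a helper, a caller could pass a path object obtained from iteration straight back into a lookup, for example read_member(build_demo_archive(), PurePosixPath("test-archive/123.txt")).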
Stuff from the development history of this PR

Interesting test failure pattern:

FAILED ../archive_operations/tests/test_tarfile.py::test_tararchive_iterator - AssertionError: assert 'test-archive/' in <datalad_next.archive_operations.tarfile.TarArchiveOperations object at 0x108f29fd0>

It only happens with Python 3.8, not with Python 3.9 or later. This is likely due to a change in the tarfile module and needs investigation; the documentation does not mention such a change.

This also points out the need for a custom __repr__ implementation.

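Update: I think the trailing-slash handling is not needed. getnames() reports the directory without a trailing slash on all tested versions (3.7 to 3.11), getmember('test-archive') works everywhere, and only getmember('test-archive/') differs, failing with a KeyError on 3.7 and 3.8: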

appveyor@appveyor-vm:~$ /home/appveyor/venv3.7/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getnames())"
['test-archive', 'test-archive/123.txt', 'test-archive/123_hard.txt', 'test-archive/subdir', 'test-archive/subdir/onetwothree_again.txt', 'test-archive/onetwothree.txt']
appveyor@appveyor-vm:~$ /home/appveyor/venv3.8/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getnames())"
['test-archive', 'test-archive/123.txt', 'test-archive/123_hard.txt', 'test-archive/subdir', 'test-archive/subdir/onetwothree_again.txt', 'test-archive/onetwothree.txt']
appveyor@appveyor-vm:~$ /home/appveyor/venv3.9/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getnames())"
['test-archive', 'test-archive/123.txt', 'test-archive/123_hard.txt', 'test-archive/subdir', 'test-archive/subdir/onetwothree_again.txt', 'test-archive/onetwothree.txt']
appveyor@appveyor-vm:~$ /home/appveyor/venv3.10/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getnames())"
['test-archive', 'test-archive/123.txt', 'test-archive/123_hard.txt', 'test-archive/subdir', 'test-archive/subdir/onetwothree_again.txt', 'test-archive/onetwothree.txt']
appveyor@appveyor-vm:~$ /home/appveyor/venv3.11/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getnames())"
['test-archive', 'test-archive/123.txt', 'test-archive/123_hard.txt', 'test-archive/subdir', 'test-archive/subdir/onetwothree_again.txt', 'test-archive/onetwothree.txt']
appveyor@appveyor-vm:~$ /home/appveyor/venv3.7/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive'))"
<TarInfo 'test-archive' at 0x7f82cbfa0bb0>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.8/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive'))"
<TarInfo 'test-archive' at 0x7f380a2bf7c0>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.9/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive'))"
<TarInfo 'test-archive' at 0x7fe0ec693340>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.10/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive'))"
<TarInfo 'test-archive' at 0x7f221e870100>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.11/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive'))"
<TarInfo 'test-archive' at 0x7fde6ee504c0>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.7/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive/'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/appveyor/.localpython3.7.16/lib/python3.7/tarfile.py", line 1754, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'test-archive/' not found"
appveyor@appveyor-vm:~$ /home/appveyor/venv3.8/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive/'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/appveyor/.localpython3.8.16/lib/python3.8/tarfile.py", line 1782, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'test-archive/' not found"
appveyor@appveyor-vm:~$ /home/appveyor/venv3.9/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive/'))"
<TarInfo 'test-archive' at 0x7fa7c6813340>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.10/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive/'))"
<TarInfo 'test-archive' at 0x7fb3e8568100>
appveyor@appveyor-vm:~$ /home/appveyor/venv3.11/bin/python -c "import tarfile as tf; print(tf.open('test_archive.tar.xz').getmember('test-archive/'))"
<TarInfo 'test-archive' at 0x7fc59346c4c0>
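
Based on the transcript above, one way to get the same lookup result on every tested Python version is to strip a trailing slash before calling getmember(). The following is only a sketch under that assumption; has_member() and demo_archive() are hypothetical helpers for this example and are not part of tarfile or of this PR.

```python
import io
import tarfile


def has_member(tf, name):
    """Return True if `name` is in the archive, ignoring a trailing slash.

    Stripping the slash avoids relying on how a given Python version
    treats names such as 'test-archive/' in getmember().
    """
    try:
        tf.getmember(name.rstrip("/"))
    except KeyError:
        return False
    return True


def demo_archive():
    """Return an open TarFile holding a single directory member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        dirinfo = tarfile.TarInfo("test-archive")
        dirinfo.type = tarfile.DIRTYPE
        tf.addfile(dirinfo)
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r")


with demo_archive() as tf:
    print(has_member(tf, "test-archive"))
    print(has_member(tf, "test-archive/"))
    print(has_member(tf, "missing"))
```

A membership test written this way does not depend on the version-specific getmember() behavior shown in the transcript.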

mih added 3 commits June 11, 2023 19:54
@mih changed the title from "Archivedoc" to "Update TarArchiveOperations and equip with tests" Jun 11, 2023
@mih marked this pull request as ready for review June 11, 2023 18:01
@mih merged commit 59a6317 into datalad:main Jun 12, 2023
@mih deleted the archivedoc branch June 12, 2023 06:32