tarfile: TarFile.offset attribute is not updated with the remainder when closing a TarFile #129255

Open
emontnemery opened this issue Jan 24, 2025 · 0 comments
Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

emontnemery commented Jan 24, 2025

Bug report

Bug description:

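The TarFile.offset attribute is not updated when close() pads the archive to a full record: the NUL bytes written to fill the remainder of the final RECORDSIZE block are not added to the offset, so after closing, tar.offset is smaller than the number of bytes actually written.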

import io
import pathlib
import tarfile

tar = tarfile.open("test.tar", "w")
t = tarfile.TarInfo("foo")
t.size = 123
tar.addfile(t, io.BytesIO(b"a" * t.size))
tar.close()
# Fails: tar.offset is 2048, although 10240 bytes have been written.
assert len(pathlib.Path("test.tar").read_bytes()) == tar.offset
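
For reference, the 2048 and 10240 figures in the snippet can be derived from the tar block layout. The sketch below is only an illustration and is not code from tarfile; `expected_sizes` is a hypothetical helper, and it assumes a single member whose header fits in one 512-byte block (no extended pax or GNU headers).

```python
import tarfile

BLOCKSIZE = tarfile.BLOCKSIZE    # 512 bytes per tar block
RECORDSIZE = tarfile.RECORDSIZE  # 20 blocks = 10240 bytes per record


def expected_sizes(payload_size):
    """Return (offset_after_close, bytes_written) for a one-member archive.

    Hypothetical helper for illustration; assumes a single 512-byte header.
    """
    # Member data is padded up to a whole number of blocks.
    blocks, rem = divmod(payload_size, BLOCKSIZE)
    data = (blocks + (1 if rem else 0)) * BLOCKSIZE

    # Header block plus padded data: what addfile() accounts for.
    offset = BLOCKSIZE + data

    # close() writes two NUL blocks as the end-of-archive marker
    # and does add them to the offset.
    offset += 2 * BLOCKSIZE

    # close() then pads to a full record; this padding is the part
    # that is currently not added to the offset.
    rem = offset % RECORDSIZE
    written = offset + (RECORDSIZE - rem if rem else 0)
    return offset, written


print(expected_sizes(123))
```

With a 123-byte payload this gives 512 + 512 + 1024 = 2048 for the offset, and 2048 + 8192 = 10240 for the bytes written.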

If this is intentional, a test case asserting the current behavior should be added.
If it is not intentional, the snippet below includes a suggested change that makes the offset match the number of bytes written, along with a test case asserting it.
In either case, I'd be happy to submit a PR.

diff --git a/Lib/tarfile.py b/Lib/tarfile.py
index a0fab46b24e..feafb88d2d3 100644
--- a/Lib/tarfile.py
+++ b/Lib/tarfile.py
@@ -2027,6 +2027,7 @@ def close(self):
                 blocks, remainder = divmod(self.offset, RECORDSIZE)
                 if remainder > 0:
                     self.fileobj.write(NUL * (RECORDSIZE - remainder))
+                    self.offset += (RECORDSIZE - remainder)
         finally:
             if not self._extfileobj:
                 self.fileobj.close()
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index 2549b6b35ad..4d1a2b2171b 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -1333,6 +1333,18 @@ def test_eof_marker(self):
         with self.open(tmpname, "rb") as fobj:
             self.assertEqual(len(fobj.read()), tarfile.RECORDSIZE * 2)

+    def test_offset_on_close(self):
+        # Check the offset after calling close matches the total number of
+        # bytes written.
+        tar = tarfile.open(tmpname, self.mode)
+        t = tarfile.TarInfo("foo")
+        tar.addfile(t)
+        tar.close()
+
+        with self.open(tmpname, "rb") as fobj:
+            self.assertEqual(len(fobj.read()), tar.offset)
+

 class WriteTest(WriteTestBase, unittest.TestCase):
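
The same comparison can be made without touching the filesystem, which may be handy when checking a given interpreter. This is a minimal sketch rather than part of the proposed patch: `offset_vs_written` is a hypothetical helper name, and it relies on tarfile leaving a caller-supplied file object open after close().

```python
import io
import tarfile


def offset_vs_written(payload=b"a" * 123):
    """Return (tar.offset, bytes actually written) for an in-memory archive.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w")
    info = tarfile.TarInfo("foo")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
    tar.close()
    # buf was passed in by the caller, so close() does not close it
    # and its contents can still be inspected.
    return tar.offset, len(buf.getvalue())


offset, written = offset_vs_written()
print(f"tar.offset={offset}, bytes written={written}")
```

On an interpreter without the change proposed above, this should return (2048, 10240) for the default payload; with the change applied, both values should be 10240.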

CPython versions tested on:

3.14, 3.13

Operating systems tested on:

Linux

@emontnemery emontnemery added the type-bug An unexpected behavior, bug, or error label Jan 24, 2025
@emontnemery emontnemery changed the title from "tarfile: The offset does not match the number of bytes written after closing a tarfile which has been written to" to "tarfile: TarFile.offset attribute is not updated with the remainder when closing a TarFile" Jan 24, 2025
@picnixz picnixz added the stdlib Python modules in the Lib dir label Jan 24, 2025