-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-43299: [Release][Packaging] Only include pyarrow folder when finding packages on setuptools #43325
Conversation
@github-actions crossbow submit -g python wheel |
|
|
@github-actions crossbow submit -g python -g wheel |
This comment was marked as outdated.
This comment was marked as outdated.
There seemed to be a minor issue:
|
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
Revision: 46d1afc Submitted crossbow builds: ursacomputing/crossbow @ actions-db1794c125
|
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
Revision: d954d75 Submitted crossbow builds: ursacomputing/crossbow @ actions-922921124d
|
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
Revision: 2fa434f Submitted crossbow builds: ursacomputing/crossbow @ actions-0cd7a8ae05
|
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
Revision: 204a27b Submitted crossbow builds: ursacomputing/crossbow @ actions-5d877a45d2
|
I am unsure on why an empty directory with
|
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
Revision: a1d73a2 Submitted crossbow builds: ursacomputing/crossbow @ actions-ebc75ddd3d
|
python/pyproject.toml
Outdated
@@ -73,7 +73,9 @@ zip-safe=false | |||
include-package-data=true | |||
|
|||
[tool.setuptools.packages.find] | |||
where = ["."] | |||
include = ["pyarrow"] | |||
exclude = ["pyarrow/tests", "pyarrow."] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem to have any effect, @jorisvandenbossche @pitrou any idea?
The other stray directories are filtered from the wheel now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the syntax is pyarrow.tests
. What does pyarrow.
refer to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a folder included in the wheel pyarrow.
which is empty and I was trying to remove it. I'll try with pyarrow.tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't remove tests either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about removing the tests, they are comparatively small and can serve to test PyArrow on a third-party machine.
$ du -hs venv-3.10/lib/python3.10/site-packages/pyarrow
187M venv-3.10/lib/python3.10/site-packages/pyarrow
$ du -hs venv-3.10/lib/python3.10/site-packages/pyarrow/tests/
4,0M venv-3.10/lib/python3.10/site-packages/pyarrow/tests/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about cpp and header files, should we do something like pandas here:
https://github.com/pandas-dev/pandas/blob/main/pyproject.toml#L131
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the files under pyarrow/src/arrow/python
are included on the wheel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll stress that the problem we're trying to solve is that installing a PyArrow wheel creates top-level directories outside of the pyarrow
source tree. This is the urgency. Cleaning up the contents of pyarrow
is a separate task, much less critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess those are negligible and we still require to distribute them on the source distitrbution:
pyarrow/src $ du -hs .
728K .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll stress that the problem we're trying to solve is that installing a PyArrow wheel creates top-level directories outside of the
pyarrow
source tree.
ok, then, this is solved by this PR, are you ok with the current changes @pitrou ?
was something wrong with #43281 from 3 days ago? |
Sorry, I didn't see that issue, it would be good to mark the second one as a duplicate. I also think that as the issue was introduced with a wrongly configured pyproject.toml it might be worth also fixing it there |
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
@github-actions crossbow submit wheel-manylinux-2-28-cp310-amd64 |
@github-actions crossbow submit wheel-windows-cp39-amd64 |
Revision: 915d70f Submitted crossbow builds: ursacomputing/crossbow @ actions-5b1cb7ff99
|
Revision: 915d70f Submitted crossbow builds: ursacomputing/crossbow @ actions-aca9360964
|
@github-actions crossbow submit wheel-windows-cp39-amd64 |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit -g wheel |
This comment was marked as outdated.
This comment was marked as outdated.
@jorisvandenbossche the check is working on macOS, manylinux and Windows wheels tests now. If you can review again. Thanks! |
I think I've found the source of the stray |
@github-actions crossbow submit wheelcp312 |
Revision: f87612a Submitted crossbow builds: ursacomputing/crossbow @ actions-9544e3c33d |
@github-actions crossbow submit -g wheel |
Revision: f87612a Submitted crossbow builds: ursacomputing/crossbow @ actions-a951b5bfb8 |
CI failures are unrelated. @jorisvandenbossche Do you want to take a last look? |
Nice catch! |
Thanks @raulcd |
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit f545b90. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them. |
… finding packages on setuptools (apache#43325) ### Rationale for this change Currently we include everything when building wheels, see: ``` $ pip install pyarrow Collecting pyarrow Downloading pyarrow-17.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (39.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.9/39.9 MB 33.8 MB/s eta 0:00:00 Collecting numpy>=1.16.6 Using cached numpy-2.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB) Installing collected packages: numpy, pyarrow Successfully installed numpy-2.0.0 pyarrow-17.0.0 (test-env) $ ls test-env/lib/python3.10/site-packages/ benchmarks/ distutils-precedence.pth numpy-2.0.0.dist-info/ pip-22.0.2.dist-info/ pyarrow-17.0.0.dist-info/ setuptools-59.6.0.dist-info/ cmake_modules/ examples/ numpy.libs/ pkg_resources/ scripts/ _distutils_hack/ numpy/ pip/ pyarrow/ setuptools/ ``` ### What changes are included in this PR? Use `include` as seen here: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#finding-simple-packages ### Are these changes tested? Will check via the build wheel on CI ### Are there any user-facing changes? No and yes :) We will remove unnecessary files * GitHub Issue: apache#43299 Lead-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Joris Van den Bossche <[email protected]>
… finding packages on setuptools (apache#43325) ### Rationale for this change Currently we include everything when building wheels, see: ``` $ pip install pyarrow Collecting pyarrow Downloading pyarrow-17.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (39.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.9/39.9 MB 33.8 MB/s eta 0:00:00 Collecting numpy>=1.16.6 Using cached numpy-2.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB) Installing collected packages: numpy, pyarrow Successfully installed numpy-2.0.0 pyarrow-17.0.0 (test-env) $ ls test-env/lib/python3.10/site-packages/ benchmarks/ distutils-precedence.pth numpy-2.0.0.dist-info/ pip-22.0.2.dist-info/ pyarrow-17.0.0.dist-info/ setuptools-59.6.0.dist-info/ cmake_modules/ examples/ numpy.libs/ pkg_resources/ scripts/ _distutils_hack/ numpy/ pip/ pyarrow/ setuptools/ ``` ### What changes are included in this PR? Use `include` as seen here: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#finding-simple-packages ### Are these changes tested? Will check via the build wheel on CI ### Are there any user-facing changes? No and yes :) We will remove unnecessary files * GitHub Issue: apache#43299 Lead-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Joris Van den Bossche <[email protected]>
Rationale for this change
Currently we include everything when building wheels, see:
What changes are included in this PR?
Use
include
as seen here: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#finding-simple-packagesAre these changes tested?
Will check via the build wheel on CI
Are there any user-facing changes?
No and yes :)
We will remove unnecessary files