gh-100239: specialize bitwise logical binary ops on ints #128927
base: main
Conversation
Python/specialize.c
Outdated
    return (is_nonnegative_compactlong(lhs) && is_nonnegative_compactlong(rhs));
}

#define NONNEGATIVE_LONGS_ACTION(NAME, OP) \
Why restrict to nonnegative longs here? If we do restrict, then we can replace the calls to _PyLong_CompactValue with direct access to op->long_value.ob_digit[0].
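For illustration, a minimal sketch of what that direct access could look like, assuming the is_nonnegative_compactlong() guard from the diff above has already passed; the helper name is hypothetical, not part of the PR:

/* Hypothetical helper (not in the PR): once is_nonnegative_compactlong()
 * has passed for an operand, the sign handling done by
 * _PyLong_CompactValue() is unnecessary and the value is just the single
 * low digit. */
static inline Py_ssize_t
nonnegative_compactlong_value(PyLongObject *op)
{
    return (Py_ssize_t)op->long_value.ob_digit[0];
}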
Because negative ints need more work at runtime and I don't think they're common with bitwise logical ops.
The extra work is already done by _PyLong_CompactValue, or am I missing something? The output of lhs OP rhs might not be a compact int, but there are no guards for the output type.
> The extra work is already done by _PyLong_CompactValue, or am I missing something?

No, I think you're right. Good point.

> The output of lhs OP rhs might not be a compact int, but there are no guards for the output type.

For bitwise logical operators we should expect the results to have the same size as the inputs.
Misc/NEWS.d/next/Core_and_Builtins/2025-01-16-22-54-12.gh-issue-100239.7_HpBU.rst
Outdated
…e-100239.7_HpBU.rst Co-authored-by: Pieter Eendebak <[email protected]>
The stats are weird with this PR, lots of misses: https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20250116-3.14.0a4%2B-c57fe46/bm-20250116-azure-x86_64-iritkatriel-binaryops-3.14.0a4%2B-c57fe46-pystats-vs-base.md
I think "non-negative" should be dropped from the news entry now?
Misc/NEWS.d/next/Core_and_Builtins/2025-01-16-22-54-12.gh-issue-100239.7_HpBU.rst
Outdated
Python/specialize.c
Outdated
{ \
    Py_ssize_t rhs_val = _PyLong_CompactValue((PyLongObject *)rhs); \
    Py_ssize_t lhs_val = _PyLong_CompactValue((PyLongObject *)lhs); \
    return PyLong_FromLong(lhs_val OP rhs_val); \
Regarding the conversion warnings on Windows: they stem from long being 32-bit on Windows x86_64, whereas Py_ssize_t is 64-bit there. @markshannon foresightedly constructed _PyLong_CompactValue() to return 64 bits, so that digits can grow above 30 bits in the future. Maybe use PyLong_FromSsize_t() here?

P.S.: The only disadvantage is that PyLong_FromSsize_t() misses the fast path using _PyLong_FromMedium() that PyLong_FromLong() (and also PyLong_FromLongLong()) has. Is this worth an issue? Maybe then implement PyLong_FromSsize_t(), PyLong_FromLong() and PyLong_FromLongLong() using a macro similar to PYLONG_FROM_UINT to get rid of the repetitive code?

BTW: PYLONG_FROM_UINT is missing the fast path for medium values, too.
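For reference, a minimal sketch of the action macro from the diff above with that change applied; the macro and wrapper names here are illustrative, not the actual identifiers in the PR:

/* Illustrative sketch: same shape as the macro shown in the diff above,
 * but returning via PyLong_FromSsize_t() so the 64-bit Py_ssize_t result
 * is not narrowed to a 32-bit long on Windows. */
#define COMPACTLONGS_ACTION(NAME, OP) \
    static PyObject * \
    NAME(PyObject *lhs, PyObject *rhs) \
    { \
        Py_ssize_t rhs_val = _PyLong_CompactValue((PyLongObject *)rhs); \
        Py_ssize_t lhs_val = _PyLong_CompactValue((PyLongObject *)lhs); \
        return PyLong_FromSsize_t(lhs_val OP rhs_val); \
    }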
Thanks, I've updated. Feel free to create an issue about the fast path.
Done: #129149.
This is my first issue, hopefully I didn't mess up :)
And maybe in the title of this pull request, too?
It looks like we are seeing a high proportion of specialization failures for non-compact ints with the & operator. I don't know why that would be. My guess is that some of the benchmarks are using ints as bit vectors and using more than one digit.
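As a rough illustration of how such values end up as misses, here is a sketch of the kind of compact-int guard involved, using only checks that exist in CPython's internal headers; the function name is hypothetical, not the one used in the PR:

/* Illustrative guard sketch: an int wider than one internal digit
 * (30 bits on typical 64-bit builds), e.g. a large bit-vector value,
 * fails _PyLong_IsCompact(), so the specialized path is not attempted
 * and the stats record a miss. */
static int
both_compact_ints(PyObject *lhs, PyObject *rhs)
{
    return (PyLong_CheckExact(lhs) && _PyLong_IsCompact((PyLongObject *)lhs)
            && PyLong_CheckExact(rhs) && _PyLong_IsCompact((PyLongObject *)rhs));
}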
@@ -2556,6 +2609,7 @@ binary_op_extended_specialization(PyObject *lhs, PyObject *rhs, int oparg,

    LOOKUP_SPEC(compactlong_float_specs, oparg);
    LOOKUP_SPEC(float_compactlong_specs, oparg);
    LOOKUP_SPEC(compactlongs_specs, oparg);
Why three tables, rather than one?
A single table would need to have a list of (guard, action) pairs for each OP. So it's a table of tables. Same thing basically.
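A rough sketch of the per-table entry shape being described, purely illustrative; the struct and field names are not the actual definitions in specialize.c:

/* Illustrative only: each table maps an oparg to a (guard, action) pair,
 * so keeping one table per operand-type combination (int/float, float/int,
 * int/int) avoids nesting a list of pairs under every oparg. */
typedef struct {
    int oparg;                                   /* e.g. NB_AND, NB_OR, NB_XOR */
    int (*guard)(PyObject *lhs, PyObject *rhs);  /* cheap type/shape check */
    PyObject *(*action)(PyObject *lhs, PyObject *rhs);  /* specialized op */
} spec_entry;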
Just a wild guess: could those be from enums, which derive from int? Especially flag enums?
I don't think so. We check for int with …
I suspect the misses might be … Other Python constructions that would cause misses (but are not in pyperformance afaics) are …
(Would be nice if the report was rendered so that it's easy to see which benchmark contributed to a stat.)
This adds specialisations for bitwise |, &, ^ on non-negative ints.
I'm not adding more in the same PR so we can more easily bisect in the future if we need to.