Optimize __str_base10() #147
Please describe how you measure the execution time for your proposed changes and obtain these results.
| Range           | Old    | New    |
|-----------------|--------|--------|
| [0, 10)         | 0.473s | 0.293s |
| ...             | ...    | ...    |
| [0, 100000000)  | 2.463s | 2.122s |
f111a2d to a247a6f
I have updated the commit message to make the process of how I tested and obtained the experimental data clearer. For each test range, values within the range were used as inputs, and __str_base10() was called 100,000,000 times with these inputs uniformly distributed. The execution time was measured using the time command. The test code is as follows:
@visitorckw, I apologize for not reading your commit messages first. I now understand that the timing data is collected by your test code and
Replace the loop-based method for converting integers to base-10 with a more efficient approach using bitwise operations. The new method simulates division and modulus operations by 10 without using multiplication, division, or modulus instructions, leading to improved performance. This optimization reduces the number of branches compared to the original loop-based approach, resulting in fewer conditional checks and a more streamlined execution path.

Experimental results demonstrate significant performance improvements with the new method. For each test range, values within the range were used as inputs, and __str_base10() was called 10,000,000 times with these inputs distributed uniformly. Execution time was measured using the time command. The results show the following reductions in execution time:

| Range           | Old    | New    |
|-----------------|--------|--------|
| [0, 10)         | 0.473s | 0.293s |
| [0, 100)        | 0.619s | 0.434s |
| [0, 1000)       | 0.818s | 0.646s |
| [0, 10000)      | 1.715s | 0.902s |
| [0, 100000)     | 2.166s | 1.169s |
| [0, 1000000)    | 2.239s | 1.453s |
| [0, 10000000)   | 2.359s | 1.773s |
| [0, 100000000)  | 2.463s | 2.122s |

Link: http://web.archive.org/web/20180517023231/http://www.hackersdelight.org/divcMore.pdf
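For readers unfamiliar with the technique, here is a minimal sketch of shift-and-add division by 10 in the style of the Hacker's Delight material linked in the commit message. It is not the code from this PR: the names `divu10`, `modu10`, and `check_divu10` are chosen for illustration, and the use of `uint32_t` is an assumption. The idea is to approximate n * 0.8 with a series of shifts and adds, shift right by 3 to get an estimate of n / 10 that may be one too small, and then correct it using the remainder.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient n / 10 using only shifts, adds, a subtract and one compare.
 * Illustrative sketch; not the implementation merged in this PR. */
static uint32_t divu10(uint32_t n)
{
    uint32_t q = (n >> 1) + (n >> 2); /* q = n * 0.75 (truncated) */
    q += q >> 4;                      /* q is now close to n * 0.8 */
    q += q >> 8;
    q += q >> 16;
    q >>= 3;                          /* estimate of n / 10, may be 1 too small */

    /* r = n - q * 10, with q * 10 computed as ((q * 4) + q) * 2 */
    uint32_t r = n - (((q << 2) + q) << 1);

    /* If the estimate was too small, the remainder exceeds 9. */
    return q + (r > 9);
}

/* Remainder n % 10, derived from the corrected quotient. */
static uint32_t modu10(uint32_t n)
{
    uint32_t q = divu10(n);
    return n - (((q << 2) + q) << 1);
}

/* Compare against the built-in operators for every n in [0, limit). */
static void check_divu10(uint32_t limit)
{
    for (uint32_t n = 0; n < limit; n++) {
        assert(divu10(n) == n / 10);
        assert(modu10(n) == n % 10);
    }
}
```

Calling `check_divu10(100000)` from a test program exercises the correction step many times. Whether this beats a hardware divide depends on the target, which is presumably why the commit message reports measured timings.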
Modify the handling of negative numbers in __str_base10(). Previously, adding a negative sign required searching for the first '0' in the result and replacing it with a '-'. The updated approach writes the negative sign directly to pb[i] if needed, simplifying the implementation.
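The sign handling described above can be sketched as follows. This is a hypothetical helper, not the PR's `__str_base10()`: the name `str_base10_sketch`, its signature, and the returned start index are assumptions made for illustration, and it uses plain `/` and `%` to stay short. Digits are written from the end of the buffer toward the front, so once the last digit is placed the index `i` already points just past where the sign belongs, and no search is needed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not the PR's code: writes val as a NUL-terminated
 * decimal string into the tail of pb[0..len-1] and returns the index at
 * which the text starts. len must be large enough for the value
 * (12 bytes suffices for a 32-bit int). */
static int str_base10_sketch(char *pb, int len, int val)
{
    int i = len - 1;
    int neg = val < 0;

    /* Negate in unsigned arithmetic so the most negative int is safe. */
    unsigned int u = neg ? 0u - (unsigned int) val : (unsigned int) val;

    pb[i] = '\0';
    do {
        pb[--i] = (char) ('0' + u % 10); /* least significant digit first */
        u /= 10;
    } while (u != 0);

    /* i now indexes the first digit, so the sign goes right before it. */
    if (neg)
        pb[--i] = '-';

    return i;
}
```

For example, with a 12-byte buffer `buf`, calling `str_base10_sketch(buf, 12, -42)` returns an index `i` such that `buf + i` is the string `"-42"`.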
a247a6f to 85f67c5
FWIW, I wrote a unit test to compare the results of the new function with the old one to verify correctness. Within the range [0, INT_MAX], the results of both functions are identical. Testing code:
Thank @visitorckw for contributing!
The base-10 conversion method has been optimized by replacing the previous loop-based approach with a more efficient technique using bitwise operations. This new method simulates division and modulus operations by 10, avoiding multiplication, division, or modulus instructions, and results in notable performance gains. Experimental results show a considerable reduction in execution time across various ranges, with the new method consistently outperforming the old one. Additionally, the handling of negative numbers has been simplified.