Optimization + Threaded #19

Merged
merged 8 commits into from
Jul 6, 2021

niyonx
Contributor

@niyonx niyonx commented Jun 28, 2021

Optimization Changes

  • Thread-safe multithreading
  • Expectedrows estimation for parsing
  • Python best practices
  • In-kernel Pytables search
  • Code changes

Owner

@yohanchatelain yohanchatelain left a comment

I added a few small comments, but it's great work overall, Nigel. Congrats! 🎉

@@ -130,9 +135,9 @@ def merge_dict(self, args):
     def _merge(self, values, attr, do_not_check=False):
         attrs = None
         if isinstance(attr, str):
-            attrs = [value[attr] for value in values]
+            attrs = [*map(lambda value: value[attr], values)]
Owner

This rewrite is not faster. Here are my tests:

(10:41) ~:$ python3 -m timeit -n 1000000 -r 5  -v "values = [ {'a':i.upper()} for i in 'abcde' ]; attr = 'a'; attrs = [value[attr] for value in values]"
raw times: 866 msec, 881 msec, 863 msec, 934 msec, 881 msec

1000000 loops, best of 5: 863 nsec per loop
(10:41) ~:$ python3 -m timeit -n 1000000 -r 5  -v "values = [ {'a':i.upper()} for i in 'abcde' ]; attr = 'a'; attrs = [*map(lambda value : value[attr], values)]"
raw times: 1.09 sec, 1.1 sec, 1.09 sec, 1.09 sec, 1.08 sec

Where did you find this optimization?

Contributor Author

In general, map is faster than a for loop. You're right though: a list comprehension is faster than map when map needs a lambda. I will revert the map with lambda back to a list comprehension!
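
For anyone who wants to rerun the comparison from a script rather than the shell, here is a minimal sketch using the standard `timeit` module. The function names are made up for illustration, and the `operator.itemgetter` variant is not part of this PR; it is included only to show what `map` looks like without a lambda. Absolute timings will vary with the machine and Python version, so no expected numbers are given.

```python
import timeit
from operator import itemgetter

# Same toy data as in the shell test above.
values = [{"a": c.upper()} for c in "abcde"]
attr = "a"


def with_comprehension():
    # Original code: plain list comprehension.
    return [value[attr] for value in values]


def with_map_lambda():
    # Proposed rewrite: map with a lambda, unpacked into a list.
    return [*map(lambda value: value[attr], values)]


def with_map_itemgetter():
    # Illustrative extra variant (assumption, not from the PR):
    # map with a C-implemented callable instead of a lambda.
    return [*map(itemgetter(attr), values)]


# Best-of-3 timing for each variant; the loop count is kept small so it runs quickly.
timings = {
    fn.__name__: min(timeit.repeat(fn, number=100_000, repeat=3))
    for fn in (with_comprehension, with_map_lambda, with_map_itemgetter)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per 100000 calls")
```

All three variants build the same list, so the choice is purely about speed and readability.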

Comment on lines 560 to 561
print("STARTING")
start = time.time()
Owner

Add a global variable like enable_timer to turn the timer on or off.
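
A minimal sketch of what such a switch could look like, assuming a module-level flag named `enable_timer` as suggested above. The `timed` helper and its `enabled` parameter are hypothetical names chosen for this example; they are not part of pytracer.

```python
import time
from contextlib import contextmanager

# Hypothetical module-level switch, named after the suggestion above.
enable_timer = False


@contextmanager
def timed(label, enabled=None):
    """Print the elapsed time of the enclosed block when timing is on.

    `enabled` overrides the global flag when given (hypothetical parameter).
    """
    if enabled is None:
        enabled = enable_timer
    if not enabled:
        # Timing is off: run the block without any output.
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label} done in {elapsed:.3f} s")


# With the flag off (the default), this prints nothing.
with timed("visualize"):
    total = sum(range(1000))
```

Wrapping the timed region in a context manager keeps the start and end of the measurement together, so the flag only has to be checked in one place.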

Comment on lines 558 to 559
# pr = cProfile.Profile()
# pr.enable()
Owner

Remove the comments.

Comment on lines 594 to 601
# pr.disable()
# pr.print_stats(sort="cumtime")
# pr.dump_stats("output.prof")
#
# stream = open('output.txt', 'w')
# stats = pstats.Stats('output.prof', stream=stream)
# stats.sort_stats('cumtime')
# stats.print_stats()
Owner

Remove the comments.
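
If the profiling is still useful occasionally, one option is to keep it behind a flag instead of in commented-out code. The sketch below is only a suggestion: `enable_profiler` and `profiled` are hypothetical names, and it uses the same `cProfile` and `pstats` calls as the commented lines, minus the intermediate `output.prof` file.

```python
import cProfile
import pstats
from contextlib import contextmanager

# Hypothetical module-level switch; off by default.
enable_profiler = False


@contextmanager
def profiled(enabled=None, stream=None, sort="cumtime"):
    """Profile the enclosed block and print stats when profiling is on.

    `stream` is any writable text file object; None means sys.stdout.
    """
    if enabled is None:
        enabled = enable_profiler
    if not enabled:
        # Profiling is off: run the block untouched.
        yield None
        return
    pr = cProfile.Profile()
    pr.enable()
    try:
        yield pr
    finally:
        pr.disable()
        stats = pstats.Stats(pr, stream=stream)
        stats.sort_stats(sort)
        stats.print_stats()


# With the flag off (the default), this adds no profiling overhead or output.
with profiled():
    squares = [i * i for i in range(100)]
```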

Comment on lines 591 to 592
end = time.time()
print(f"DONE in time: {end - start}")
Owner

Add a global variable like enable_timer to turn the timer on or off.

Comment on lines 426 to 429
# e1 = time.perf_counter()
# print("pgc.data.filter x,y", e1-b1)

b2 = time.perf_counter()
# b2 = time.perf_counter()
Owner

Remove comments.

Comment on lines 448 to 451
# e2 = time.perf_counter()
# print("extra_value", e2-b2)

b3 = time.perf_counter()
# b3 = time.perf_counter()
Owner

Remove comments

Comment on lines 464 to 468
# e3 = time.perf_counter()
# print("scattergl", b3-e3)

e = time.perf_counter()
print("get_scatter_timeline", e-b)
# e = time.perf_counter()
# print("get_scatter_timeline", e-b)
Owner

Remove comments.

Comment on lines 494 to 497
# e = time.perf_counter()
# print("add_scatter", e-b)

# @cache.memoize(timeout=TIMEOUT)
Owner

Remove comments

pytracer/gui/callbacks.py (outdated, resolved)
@niyonx niyonx requested a review from yohanchatelain June 29, 2021 16:14
@niyonx
Contributor Author

niyonx commented Jun 29, 2021

#17 #18 can be linked as well

@niyonx
Contributor Author

niyonx commented Jul 5, 2021

#17 #18 can be linked as well

@yohanchatelain is it good to be merged now?

@yohanchatelain yohanchatelain merged commit 1575c94 into yohanchatelain:master Jul 6, 2021
Development

Successfully merging this pull request may close these issues.

Segmentation Fault Multithreading Race conditions Threaded option in visualize command
2 participants