-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce transient memory allocations in integrate/pw_lin. #2208
Reduce transient memory allocations in integrate/pw_lin. #2208
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall! Just some minor comment for future maintainability.
case 7: { paint(rc.cables, std::get<init_reversal_potential>(what), rc.region); break; } | ||
case 8: { paint(rc.cables, std::get<density>(what), rc.region); break; } | ||
case 9: { paint(rc.cables, std::get<voltage_process>(what), rc.region); break; } | ||
case 10: { paint(rc.cables, std::get<scaled_mechanism<density>>(what), rc.region); break; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For new paintable types, adding it here this might get overlooked. std::visit
would offer a way to throw a compile time error for such cases, but from the comments elsewhere, I'm assuming this was changed from std::visit
for performance reasons here as well. Maybe at least add a default case that throws an error at runtime for unmatched types.
case 0: { place(lc.places, std::get<i_clamp>(what), label); break; } | ||
case 1: { place(lc.places, std::get<threshold_detector>(what), label); break; } | ||
case 2: { place(lc.places, std::get<synapse>(what), label); break; } | ||
case 3: { place(lc.places, std::get<junction>(what), label); break; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here as with the paintings case matching.
@@ -512,7 +522,7 @@ struct pw_elements<void> { | |||
auto lower_bound() const { return bounds().first; } | |||
auto upper_bound() const { return bounds().second; } | |||
|
|||
auto extent(size_type i) const { return extents()[i]; } | |||
std::pair<double, double> extent(size_type i) const { return extents()[i]; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is auto
not enough here?
double lower_bound() const { return pw_->extent(*c_).first; } | ||
double upper_bound() const { return pw_->extent(*c_).second; } | ||
|
||
const X& value() const { return pw_->cvalue(*c_); }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so the const_iterator
is the culprit? i.e. here the copying is going on when dereferencing the iterator it->lower_bound()
, correct? If so, wouldn't having a const
specialization for pw_element_proxy<const X>
help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that reference
is not, in fact, a reference:
using reference = pw_element<X>;
reference operator[](difference_type j) const { return (*pw_)[j+*c_]; }
reference operator*() const { return (*pw_)[*c_]; }
pointer operator->() const { return pointer{(*pw_)[*c_]}; }
thus, []
and *
will construct a full pw_element<X>
. This is usually not offensive.
BUT, in some cases X=vector<double>
, which is obviously an issue. I couldn't get a
working instance where reference = pw_element<X>&
or what we'd actually want here
reference = pw_element<X&>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's why I proposed to using reference = pw_element_proxy<const X>
and specialize
template<typename X>
struct pw_element_proxy<const X> { ... };
@@ -240,16 +259,21 @@ struct pw_elements { | |||
|
|||
using value_type = pw_element<X>; | |||
using pointer = const pointer_proxy<pw_element<X>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using pointer = const pointer_proxy<pw_element<X>>; | |
using pointer = const pointer_proxy<pw_element_proxy<const X>>; |
double lower_bound() const { return pw_->extent(*c_).first; } | ||
double upper_bound() const { return pw_->extent(*c_).second; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be needed anymore, as it is "cheap enough" now to dereference the iterator: it->lower_bound()
double lower_bound() const { return pw_->extent(*c_).first; } | ||
double upper_bound() const { return pw_->extent(*c_).second; } | ||
|
||
const X& value() const { return pw_->cvalue(*c_); }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same: shouldn't be needed anymore, as it is "cheap enough" now to dereference the iterator
double lower_bound() const { return pw_->extent(*c_).first; } | ||
double upper_bound() const { return pw_->extent(*c_).second; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see ^^
double a_right = ai.upper_bound(); | ||
double b_right = bi.upper_bound(); | ||
double right = std::min(a_right, b_right); | ||
return f(left, right, ai.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can do this again now I believe
double a_right = ai.upper_bound(); | |
double b_right = bi.upper_bound(); | |
double right = std::min(a_right, b_right); | |
return f(left, right, ai.value()); | |
double a_right = ai->upper_bound(); | |
double b_right = bi->upper_bound(); | |
double right = std::min(a_right, b_right); | |
return f(left, right, *ai); |
As discussed in #2208, use a specialized proxy to avoid some allocations.
Another break-out of #2005
embed_pwlin
integration, save a bit of allocation.Overall saves ~10% when creating 2048 complex cells of depth
10 in
busyring
.NOTE
pw_zip_*
andpw_over_cable
still allocate tons of transientsas they use
pw_elements<pw_elements<double>>
ie a vector of vectors.Due to the way this is constructed, dereferencing iterators on these types
returns their contained value not const value&, thus copying the inner
vectors each time. Overall -- despite the frustration and the complexity -- the
remainder doesn't seem worth addressing. For now.