-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extended source response #223
Extended source response #223
Conversation
I'm closing and reopening this PR to kick codecov. There was a recent change in GitHub that broke it, but I think it's fixed now. |
Codecov ReportAttention: Patch coverage is
|
Thanks for the changes @hiyoneda. I haven't gone through all of the code, but I noticed that in ExtendedSourceModel you are tracking _contents, _axes, and _units manually. You can instead inherit from Histogram, as PointSourceModel does, and that way you'll get all features, which is also less error-prone. |
I needed to do it on purpose. If histpy.Histogrom is used, the overflow/underflow bins are automatically created. For the gamma-ray line case (when the number of energy bins is one), the loaded matrix size gets 9x larger than the necessary size. (actually, the extended model fitting notebook could not be performed on Saurabh's environment due to this issue). So the response components are manually loaded, and the overflow/underflow bins are ignored in this update. |
Another future concern is when the response is very large (~O(100-1000) GB), for example, nside = 64. If these features can be implemented in Histogram, I agree that we should use it as a base class. But if not, I am worrying that it rather reduces flexibility in the future. |
I see. Yeah, the underflow/overflow bins are an issue in that case, and there's no easy way to change that in histpy. It's certainly possible, but since the assumption of those bins permeates the code, it will take a while to implement. Partial reading was part of the reason for using HDF5, and FullDetectorResponse only loads one direction at a time. Indeed it might not be enough. On a tangent note, if we move to an unbinned analysis, we can load from the HDF5 only the nearest bins for each event, one event at a time ¯_(ツ)_/¯ |
Personally, I want histpy to keep the underflow/overflow bins, because I like the concept of histpy very much, which is useful for manipulating histograms in data analysis. It is also useful that histograms have the underflow/overflow bins in many cases. I think that the fundamental issue here would be that the response is a matrix or tensor rather than a histogram.
Actually, I was wondering if we could keep the instance of h5py instead of its contents in a future update. FullDetectorResponse will be a good reference for how to update ExtendedSourceResponse in the future.
It sounds interesting. Considering that it seems always useful to keep the HDF5 in the response classes, should any response class have three attributes, the HDF5 data, axes, unit? |
…response/function.py
@hiyoneda I didn't want to give away all the slicing, projecting, plotting, etc. capabilities that have shown useful for debugging and concise coding, so I spent some time adding the capability to histpy to disable under/overflow tracking (axis by axis). I haven't merged it yet, but you can checked it out here: https://gitlab.com/burstcube/histpy/-/merge_requests/1 . Please let me know what you think! |
@israelmcmc Thanks for making the change on histpy. I tried the new code from the |
I modified the class as a child class of Histogram. Now it can save the memory space.
|
Thanks, @hiyoneda:
|
@hirokiyoneda I made a new histpy release (1.1.4) that includes the optional under/overflow |
opened #250 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have a few general questions. Everything looks good to me, I will run TS map and spectral fitting notebook to see if everything goes through.
unit = unit, | ||
track_overflow = track_overflow) | ||
|
||
del hist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After del hist
, the data referenced by hist
will still remain in the RAM (or it will cleaned when the python does the reference count)
Using gc.collect()
will remove the data right away. Since the response is quiet large, so might help to remove from RAM right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you reconstruct the Histogram
instead of using the opened one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the code corresponding to the first comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you reconstruct the Histogram instead of using the opened one?
This is because the opened one is an instance of histpy.Histogram, not ExtendedSourceResponse. So, I needed to instantiate ExtendedSourceResponse at a new line. @israelmcmc Is there any nice way to directly open a hdf5 file as an instance of ExtendedSourceResponse?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hiyoneda I think that you can do cls.open(filename, name)
instead of super().open(filename, name)
. The output would be an ExtendedSourceResponse
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. In principle this should work, but there's an issue with histpy that prevents it at the moment. I think this can be merged as is, and we can come back to it one I fix the issue in histpy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the code corresponding to the first comment.
Thanks @hirokiyoneda !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. In principle this should work, but there's an issue with histpy that prevents it at the moment. I think this can be merged as is, and we can come back to it one I fix the issue in histpy
Sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the corresponding issue, for the record https://gitlab.com/burstcube/histpy/-/issues/23
I ran through the spectral and TS map fitting notebooks without issues. They return the same fitted values. I will merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge.
I created the first version of the extended source response related to #216 .
get_integrated_spectral_model
) was moved from threeml/custom_functions.py to response/functions.py.get_integrated_extended_model
was added in threeml/custom_functions.py, which calculates integrated flux over the sky from a given astromodel.ExtendedSourceExtendedSourceResponse
. Currently, the functionalities are limited. I assume that we will add more functions in the futureExtendedSourceResponse
.Now, the updates are minimal. Any feedback is very welcome.
@saurabhmittal23 Please use this PR and test your extended model fitting. I am happy if you could check the 511 keV notebook first and see if the results are the same as before.