Paganinsweep #521
Conversation
While working on this I thought that maybe we could resolve the limits on the sweep preview defined by the GPU memory by using the CPU Paganin method instead? In that case we can be less conservative about the size of the preview we take, even if the whole projection image should be used for a very wide kernel. Also, doing things like this doesn't feel very elegant, and the main question is how to proceed. Worth having a chat about it @yousefmoazzam?
Yeah sure, I'm happy to discuss things at some point. For now, I'd like to try and enumerate the different points raised for the sake of clarity; I think I counted three (but do correct me if there's more/fewer):
Thanks @yousefmoazzam, you've summarised the issues well. So I'd say we either go full CPU or full GPU, no hybrid modes, as a hybrid will essentially result in the same GPU OOM issue at some point. Of course, full CPU will make things slightly inconvenient for users, as they would need to build a TomoPy analogue; a possible solution (if we decided to go that way) would be to create that analogue for them. I know that it doesn't sound ideal, but it is one possible way to avoid potentially frequent situations where large blocks won't fit on the GPU, resulting in constant warning messages to users.

If we decide to proceed with the GPU implementation instead, I think we need to deal with the fact that the largest block defines the needed size for the sweep run. I suggest that only the accepted blocks (below the upper limit) are taken, and that the list of parameters is modified accordingly. For instance, if alpha = [0.1, 0.2, 0.3] is given but the block sizes are acceptable only for the values [0.1, 0.2], I'd discard 0.3 in that instance and proceed with the run for 2 parameters, rather than completely abort the whole run. So, as you can see, more hacks basically to make the GPU stuff work...
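For illustration only, the discard-and-proceed idea might look something like the sketch below; `estimate_block_size_bytes` and `gpu_memory_limit` are hypothetical names standing in for whatever memory estimator is actually available:

```python
# Hypothetical sketch: drop sweep values whose block would not fit GPU memory
# and continue with the rest, rather than aborting the whole run.
def filter_sweep_values(alphas, estimate_block_size_bytes, gpu_memory_limit):
    accepted = []
    for alpha in alphas:
        if estimate_block_size_bytes(alpha) <= gpu_memory_limit:
            accepted.append(alpha)
        else:
            print(f"Warning: discarding alpha={alpha}, block too large for GPU")
    return accepted

# e.g. alphas = [0.1, 0.2, 0.3] may be reduced to [0.1, 0.2]
```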
OK, so as a conclusion of our discussion @yousefmoazzam, I'll do the following:
Sounds good, worth a shot to see how it goes 👍 It's worth pointing out, since we didn't note it in the discussion earlier: with this approach we've addressed points 1 and 2 (in the points listed above in my attempted summary), but point 3 is still unaddressed. Dunno if you'd like more discussion before going ahead and trying this approach, or if we just deal with point 3 later; it's up to you.
Actually, to some degree point 3 is addressed by relying on a memory-hungry method (Paganin) to decide the maximum size for a block. I'm hoping that methods in the pre/post Paganin pipeline require less memory than Paganin itself. We will see if this is the case when we implement our hack around memory estimation.
I think I need some help fixing this test please @yousefmoazzam. The reason why this test fails is that the … I've tested this approach for 20, 40, 80 GB datasets (different numbers of angles) and also different GPU cards, mainly P100 and V100. I haven't got an OOM error so far. This factor essentially defines the number of slices for the Paganin method not to break (or any other method in the pipeline, I hope). I think the second most memory-hungry one is FBP; so far it doesn't break. But with Fourier recon we might want to reconsider and increase it.
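As a rough illustration of the safety-factor idea (all names here are hypothetical, not the project's actual estimator):

```python
# Illustrative only: derive how many slices a sweep block can hold from the
# free GPU memory, scaled down by a conservative safety factor so that
# memory-hungry methods like Paganin don't OOM.
def max_slices(gpu_free_bytes: int, bytes_per_slice: int, safety_factor: float = 0.5) -> int:
    usable = gpu_free_bytes * safety_factor
    return max(1, int(usable // bytes_per_slice))
```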
And a follow-up, a slightly different approach to the one I suggested earlier: I do not kill the run even if the kernel width is larger than the block that fits the memory. I just take the largest block in this case and proceed with the run. The users still get a result that is smoothed, and will probably discard it themselves, but we're safer here from questions about why the runs are so frequently terminated.
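A minimal sketch of that fallback, assuming the kernel width has already been converted into a slice count:

```python
# If the Paganin kernel calls for more slices than fit in memory, take the
# largest block that does fit and continue; the result is just smoother
# than ideal rather than the run being aborted.
def choose_block_slices(kernel_width_slices: int, max_fitting_slices: int) -> int:
    return min(kernel_width_slices, max_fitting_slices)
```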
I saw that the PR is now marked as ready for review, so I'm happy to go over it at some point soon 🙂 For now, I'll try to answer this:
Yep, the issue seems to be that:
I would say that the best way to resolve this is to not use the private attribute, and find another way to get access to the raw data's global shape. This would involve more changes and some thinking of course, but I think it's the safer way to do things - I'm not sure there are many places where reliance on private members of an object is advocated. If accessing the private attribute absolutely must be the way to do this, then I agree that the test will need to be changed to accommodate the assumption the code is making about the existence of the private attribute. Before, the loader wrapper + loader mocks could be used because the sweep runner was not making any assumption about private members of the source object.
An example of a potential way to solve this without doing the private attribute access would be to modify the data source interface. If it makes sense for any data source to have the information about the raw data's global shape (which should be checked if it's reasonable or not), then one could:
```python
@property
def raw_shape(self) -> Tuple[int, int, int]:
    return self._data.shape
```
This change would be minimal, and would avoid doing any private member access. This is just an example to illustrate that there are ways to go about this without the private member access, and to provide some sort of substance to my advice to "find another way to get access to the raw data global shape".
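To make the suggestion concrete, here's a sketch of what the interface change could look like, assuming a `Protocol`-based interface; the `DataSource` and `InMemoryDataSource` names are made up for the example:

```python
from typing import Protocol, Tuple

import numpy as np


class DataSource(Protocol):
    @property
    def raw_shape(self) -> Tuple[int, int, int]:
        """Global shape of the raw data."""
        ...


class InMemoryDataSource:
    """Toy implementation; the real data sources will differ."""

    def __init__(self, data: np.ndarray) -> None:
        self._data = data

    @property
    def raw_shape(self) -> Tuple[int, int, int]:
        # Touching `_data` here is fine: the private member is accessed
        # inside its own class, not reached into from the sweep runner.
        return self._data.shape
```

The sweep runner would then read `source.raw_shape` instead of reaching into `source._data`.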
A decent start, thanks! I've made some comments on a few things; the public vs. private stuff and the lack of tests for the new feature are, I think, the main points (in addition to the concern about the private member access of source._data mentioned outside of this review).
Thanks, I've done some changes as you suggested.
This is a similar issue to the previous one, but now with the preview config.
Also similar to before, a potential solution would be to modify the protocol in question. If it would make sense to have any "loader wrapper" provide the associated preview config (which does sound reasonable to me), then it may make sense to:
```python
@property
def preview(self) -> PreviewConfig:
    # assuming the wrapper stores the config under a private name,
    # so the property doesn't recurse into itself
    return self._preview
```
Side note: given that the loader wrapper would provide the preview if the above is done, and that the loader itself would also be able to provide this info, I'm becoming more wary of whether the existence of the loader wrapper makes much sense (and whether the wrapper should instead just go away, and loaders should be direct implementors of the protocol).
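A sketch of the analogous change for the preview, with made-up class names (`LoaderWrapper`, `StandardLoaderWrapper`) and assuming `PreviewConfig` exists as in the discussion:

```python
from typing import Protocol


class LoaderWrapper(Protocol):
    """Hypothetical protocol name; the real interface may differ."""

    @property
    def preview(self) -> "PreviewConfig":
        ...


class StandardLoaderWrapper:
    def __init__(self, preview: "PreviewConfig") -> None:
        # Stored under a private name so the property doesn't recurse.
        self._preview = preview

    @property
    def preview(self) -> "PreviewConfig":
        return self._preview
```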
Thanks Yousef, this gets a bit more in-depth, but I will give it a go. Also, the tests will be failing even if the preview stuff is sorted, because of this …
Sorry, I'm not sure I understand: where is the need for …?
Yes, it's the tests for the sweep runs; they all build around using DataSetBlock rather than the loader, where …
Ah ok, thanks for the link, I see that the tests create a DataSetBlock. I'm still not sure where there's a need for … The function/purpose of the … then the … So, there's something I'm still not understanding, I think: could you point me to where exactly you think the … is needed?
OK, so a couple of things here:
The preview needs to be passed to the autospeccing of the … (see lines 51 to 64 in fc38600). The …
Yeah, that makes sense: by default the mock loader wrapper won't have an implementation of the preview property. In the past I've seen issues with trying to use properties with autospeccing …
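One way around that in the tests, sketched under the assumption that a full autospec isn't strictly needed, is to assign the preview as a plain attribute on a `MagicMock`:

```python
from unittest.mock import MagicMock


def make_mock_loader_wrapper(preview_config):
    # A plain MagicMock sidesteps the property-vs-autospec issue entirely:
    # `preview` is just assigned as an ordinary attribute on the mock, so
    # the sweep runner sees the config it expects.
    wrapper = MagicMock()
    wrapper.preview = preview_config
    return wrapper
```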
@yousefmoazzam sorry, even with the changes you suggested I still cannot get the correct …
OK, currently all tests pass for me except two …
Nice, thanks for the updates!
I think we're close, but there are a few things I'd like to be addressed before merging. The main one is the stuff around the assertion on the private ParamSweepRunner._vertical_slices_preview, and to double-check whether you also see an incorrectly modified preview, as mentioned in the comments below.
Much appreciated for sticking it out, it was a bit of a journey! Things look good to me, thanks again 🙂
One last thing that came to my mind: this condition catches both the TomoPy and Savu implementations but then proceeds to work with TomoPy parameters, so most likely the Savu implementation will fail here. As the Savu implementation pads the data in the method itself, I'm thinking of just letting it work as a normal method by taking 5 (or whatever the default is) slices. So I guess I'll change this condition to look for …
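For instance, the narrowed condition could key on the method's module path; the attribute name below is an assumption about how wrapped methods are described, not the project's actual API:

```python
def is_tomopy_paganin(method) -> bool:
    # Assumed attribute: match on the module path so that only the TomoPy
    # implementation gets the wide-kernel preview handling, while the Savu
    # implementation (which pads data itself) runs as a normal method.
    return "tomopy" in method.module_path
```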
Fixes #424