
Paganinsweep #521

Merged
merged 18 commits into main from paganinsweep
Nov 18, 2024

Conversation

dkazanc
Collaborator

@dkazanc dkazanc commented Nov 4, 2024

Fixes #424

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have made corresponding changes to the documentation

@dkazanc
Collaborator Author

dkazanc commented Nov 4, 2024

While working on this I thought that maybe we could resolve the limits on the sweep preview, which are defined by the GPU memory, by using the CPU Paganin method instead?

In this case, we can be less conservative about the size of the preview we take, even if the whole projection image has to be used for a very wide kernel. Also, doing things like this doesn't feel very elegant, and the main question is how vertical_slices_preview_max can be estimated across different GPU devices. To what memory size should we link that value? It just feels a bit fragile at the moment. We could, potentially, swap the backend from httomolibgpu to httomolib for that method? However, the run can still fail in some other GPU function if we decide to take the previewed block size too large.

Worth having a chat about it @yousefmoazzam ?

@yousefmoazzam
Collaborator

Yeah sure, I'm happy to discuss things at some point.

For now, I'd like to try and enumerate the different points raised for the sake of clarity. I think I counted three (but do correct me if there are more or fewer):

  1. the idea of using the CPU Paganin filter in sweep pipelines that involve Paganin, to try and reduce the threshold on the maximum number of vertical slices
  2. if we keep the GPU Paganin filter in sweep pipelines: the question of how the calculation of the maximum vertical slices can be made to work with multiple GPUs (i.e., how to make the maximum vertical slices threshold generic across GPU models)
  3. regardless of whether the Paganin filter is CPU or GPU: the concern that the modifications to the preview to accommodate the Paganin filter kernel size could cause GPU OOM errors for other GPU methods in the sweep pipeline

@dkazanc
Collaborator Author

dkazanc commented Nov 5, 2024

thanks @yousefmoazzam, you've summarised the issues well. So I'd say we should either go full CPU or full GPU, with no hybrid modes, as a hybrid will essentially run into the same GPU OOM issue at some point.

Of course, the full-CPU route will make things slightly inconvenient for users, as they would need to build a TomoPy analogue. A possible solution (if we decided to go that way) is to create httomolib CPU duplicates of the GPU methods in the httomolibgpu library, so that when the user runs a sweep pipeline with Paganin involved, we modify the module paths from httomolibgpu to httomolib, leaving everything else intact. Some httomolib methods could import and reuse TomoPy methods, if needed.

I know that it doesn't sound ideal, but it is one possible way to avoid potentially frequent situations where large blocks won't fit on the GPU, resulting in constant warning messages to users.

If we still decide to proceed with the GPU implementation, I think we need to deal with the situation where the largest block defines the needed size for the sweep run. I suggest that only the accepted blocks (below the upper limit) are taken, and the list of parameters is modified accordingly. For instance, alpha = [0.1, 0.2, 0.3] is given, but the block sizes are acceptable only for the [0.1, 0.2] values. I'd discard 0.3 in that instance and proceed with the run for 2 parameters, rather than completely abort the whole run. So as you can see, basically more hacks to make the GPU stuff work...
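To make the discarding idea concrete, here is a rough sketch; `estimate_block_slices` and `filter_sweep_values` are made-up stand-ins for whatever estimator ends up being used, not httomo functions:

```python
# Hypothetical sketch: drop sweep values whose estimated block size
# exceeds what the GPU can hold, instead of aborting the whole run.
# `estimate_block_slices` is a made-up stand-in for a real estimator.

def estimate_block_slices(alpha: float) -> int:
    # wider Paganin kernels (larger alpha) need more vertical slices
    return int(10 + 100 * alpha)

def filter_sweep_values(alphas: list, max_slices: int) -> list:
    kept = [a for a in alphas if estimate_block_slices(a) <= max_slices]
    dropped = [a for a in alphas if a not in kept]
    if dropped:
        print(f"Warning: discarding sweep values {dropped}; "
              f"their block sizes exceed the GPU limit")
    return kept

# alpha = [0.1, 0.2, 0.3] given, but only [0.1, 0.2] fit the device
print(filter_sweep_values([0.1, 0.2, 0.3], max_slices=30))
```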

@dkazanc
Collaborator Author

dkazanc commented Nov 7, 2024

OK, so as a conclusion of our discussion @yousefmoazzam , I'll do the following.

  • Get a simple memory estimator that linearly projects the number of slices that fit on the GPU for the Paganin method, based on the available GPU memory (e.g., 4 GB, 6 GB, 12 GB, 16 GB, 32 GB)
  • Do not kill the run if the set of parameters leads to block sizes larger than the device can fit; threshold those blocks at the maximum allowed block size and continue with the run
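A minimal sketch of what such a linear estimator could look like; the per-slice byte count and the safety factor below are made-up illustration numbers, not httomo's actual estimator:

```python
# Illustrative linear memory estimator: project the number of vertical
# slices that fit on the GPU by scaling with the available device memory.

def max_vertical_slices(gpu_memory_bytes: int,
                        bytes_per_slice: int,
                        safety_factor: float = 0.8) -> int:
    """Linearly project how many slices fit in GPU memory."""
    usable = gpu_memory_bytes * safety_factor
    return max(1, int(usable // bytes_per_slice))

# e.g. a (1801, 2160) float32 slice with room for FFT intermediates
# (the factor of 3 is a rough, made-up overhead)
bytes_per_slice = 1801 * 2160 * 4 * 3

for gb in (4, 6, 12, 16, 32):
    slices = max_vertical_slices(gb * 1024**3, bytes_per_slice)
    print(f"{gb} GB -> {slices} slices")
```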

@yousefmoazzam
Collaborator

Sounds good, worth a shot to see how it goes 👍

It's worth pointing out since we didn't note it in the discussion earlier: with this approach we've addressed points 1 and 2 (in the points listed above in my attempted summary), but point 3 is still unaddressed. Dunno if you'd like more discussion before going ahead and trying this approach, or we just deal with point 3 later, it's up to you.

@dkazanc
Collaborator Author

dkazanc commented Nov 8, 2024

Actually, to some degree point 3 is addressed by relying on a memory-hungry method (Paganin) to decide the maximum size for a block. I'm hoping that the methods before and after Paganin in the pipeline require less memory than Paganin itself. We will see if this is the case when we implement our hack around memory estimation.

@dkazanc
Collaborator Author

dkazanc commented Nov 8, 2024

I think I need some help fixing this test please @yousefmoazzam. The reason why this test fails is that the source._data object here is not accessible (mocked?) in the test. I'm not sure if there is any other way to obtain the unpreviewed shape of the raw data, which is what source._data.shape[1] does, or maybe the test itself needs to be modified?

I've tested this approach for 20, 40 and 80 GB datasets (different numbers of angles) and also different GPU cards, mainly P100 and V100. I didn't get an OOM error so far. This factor essentially defines the number of slices for the Paganin method not to break (or any other method in the pipeline, I hope). I think the second most memory-hungry one is FBP; so far it doesn't break. But with Fourier recon we might want to reconsider and increase the factor. Thanks.

@dkazanc dkazanc marked this pull request as ready for review November 8, 2024 14:19
@dkazanc
Collaborator Author

dkazanc commented Nov 8, 2024

And a follow-up: a slightly different approach to the one I suggested earlier. I do not kill the run even if the kernel width is larger than the block that fits in memory; I just take the largest block in this case and proceed with the run. The users still get a result that is smoothed and will probably discard it themselves, but we're safer here from questions about why the runs are so frequently terminated.

@yousefmoazzam
Collaborator

I saw that the PR is now marked as ready for review, so I'm happy to go over it at some point soon 🙂

For now, I'll try to answer this:

I think I need some help fixing this test please @yousefmoazzam. The reason why this test fails is that the source._data object here is not accessible (mocked?) in the test. I'm not sure if there is any other way to obtain the unpreviewed shape of the raw data, which is what source._data.shape[1] does, or maybe the test itself needs to be modified?

Yep, the issue seems to be that:

  • the test is using a mock loader wrapper + mock loader (to make things simpler and make the test faster, where we don't have to create a real loader that loads an actual file from the filesystem)
  • the mock loader has type DataSetSource (as expected; that's the protocol for a data source):

```python
def mock_make_data_source(padding) -> DataSetSource:
    ret = mocker.create_autospec(
        DataSetSource,
        global_shape=block.global_shape,
        dtype=block.data.dtype,
        chunk_shape=block.chunk_shape,
        chunk_index=block.chunk_index,
        slicing_dim=1 if interface.pattern == Pattern.sinogram else 0,
        aux_data=block.aux_data,
    )
    slicing_dim: Literal[0, 1, 2] = (
        1 if interface.pattern == Pattern.sinogram else 0
    )
    mocker.patch.object(
        ret,
        "read_block",
        side_effect=lambda start, length: DataSetBlock(
            data=block.data[start : start + length, :, :],
            aux_data=block.aux_data,
            global_shape=block.global_shape,
            chunk_shape=block.chunk_shape,
            slicing_dim=slicing_dim,
            block_start=start,
            chunk_start=block.chunk_index[slicing_dim],
        ),
    )
    return ret

mocker.patch.object(
    interface,
    "make_data_source",
    side_effect=mock_make_data_source,
)
```
  • the DataSetSource type says nothing about having the private attribute ._data (as expected; protocols shouldn't enforce what private attributes are needed for an implementation):

```python
class DataSetSource(Protocol):
    """MPI-aware source for full datasets, where each process handles a *chunk*, and
    the data can be read in *blocks*, sliced in the given slicing dimension"""

    @property
    def dtype(self) -> np.dtype: ...  # pragma: no cover

    @property
    def global_shape(self) -> Tuple[int, int, int]:
        """Global data shape across all processes that we eventually have to read."""
        ...  # pragma: no cover

    @property
    def chunk_shape(self) -> Tuple[int, int, int]:
        """Returns the shape of a chunk, i.e. the data processed in the current
        MPI process (whether it fits in memory or not)"""
        ...  # pragma: no cover

    @property
    def global_index(self) -> Tuple[int, int, int]:
        """Returns the start index of the chunk within the global data array"""
        ...  # pragma: no cover

    @property
    def slicing_dim(self) -> Literal[0, 1, 2]:
        """Slicing dimension - 0, 1, or 2"""
        ...  # pragma: no cover

    @property
    def aux_data(self) -> AuxiliaryData:
        """Auxiliary data"""
        ...  # pragma: no cover

    def read_block(self, start: int, length: int) -> DataSetBlock:
        """Reads a block from the dataset, starting at `start` of length `length`,
        in the current slicing dimension. Note that `start` is chunk-based,
        i.e. mean different things in different processes."""
        ...  # pragma: no cover

    def finalize(self):
        """Method intended to be called after reading all blocks is done,
        to give implementations a chance to close files, free memory, etc."""
        ...  # pragma: no cover
```
  • the updated code in the sweep runner is assuming the existence of the private attribute ._data on the data source (which is questionable):

```python
preview_new_start_stop = _preview_modifier(self, source._data.shape[1])
```

I would say that the best way to resolve this is to not use the private attribute, and to find another way to get access to the raw data's global shape. This would involve more changes and some thinking, of course, but I think it's the safer way to do things - I'm not sure there are many places where reliance on the private members of an object is advocated.

If accessing the private attribute absolutely must be the way to do this, then I agree that the test will need to be changed to accommodate the assumption the code is making about the existence of the private attribute ._data. How it should be changed I don't yet know; that will also require some thinking.

Before, the loader wrapper + loader mocks could be used because the sweep runner was not making any assumption about private members of a DataSetSource, so we didn't need to provide a "real" loader. This in turn made writing the test simpler, since we could use a mock loader instead of a real one. But now we need a real loader (because only a real loader has the private attribute ._data), and the test needs to be rethought a bit.

@yousefmoazzam
Copy link
Collaborator

yousefmoazzam commented Nov 8, 2024

An example of a potential way to solve this without doing the private attribute access would be to modify DataSetSource, and its implementors.

If it makes sense for any data source to have the information about the raw data's global shape (whether that is reasonable should be checked), then one could:

  • add a raw_shape property to DataSetSource (or some other name that implies "the shape of the raw data")
  • for all implementors of DataSetSource, add an implementation of that property
  • for example, in the case of StandardTomoLoader, it might do something like this:
```python
@property
def raw_shape(self) -> Tuple[int, int, int]:
    return self._data.shape
```
  • then, the sweep runner could do source.raw_shape instead of source._data.shape, and would not be assuming the existence of any private attribute on the loader

This change would be minimal, and would avoid doing any private member access.

This is just an example to illustrate that there are ways to go about this without the private member access, and to provide some sort of substance to my advice to "find another way to get access to the raw data global shape".

Collaborator

@yousefmoazzam yousefmoazzam left a comment


A decent start, thanks! I've made some comments on a few things; the public vs. private stuff and the lack of tests for the new feature are, I think, the main points (in addition to the concern about the private member access of source._data mentioned outside of this review).

  • httomo/sweep_runner/paganin_kernel.py
  • httomo/sweep_runner/param_sweep_runner.py (several comment threads)
  • tests/sweep_runner/test_param_sweep_runner.py
@dkazanc
Collaborator Author

dkazanc commented Nov 11, 2024

This is just an example to illustrate that there are ways to go about this without the private member access, and to provide some sort of substance to my advice to "find another way to get access to the raw data global shape".

Thanks, I've made some changes as you suggested. test_execute_modifies_block still fails, but now because PreviewConfig is not available in the loader, and this is something we need before it gets reassigned. Should make_test_loader take the preview into account somehow?

@yousefmoazzam
Collaborator

yousefmoazzam commented Nov 11, 2024

Thanks, I've made some changes as you suggested. test_execute_modifies_block still fails, but now because PreviewConfig is not available in the loader, and this is something we need before it gets reassigned. Should make_test_loader take the preview into account somehow?

This is a similar issue to the previous one, but now with StandardLoaderWrapper (and not involving private member access, but instead involving implementation-specific assumptions):

  • self._pipeline.loader is an object which implements LoaderInterface:

```python
@property
def loader(self) -> LoaderInterface:
    return self._loader
```
  • LoaderInterface doesn't say anything about having a .preview attribute (or getter method):

```python
class LoaderInterface(Protocol):
    """Interface to a loader object"""

    # Patterns the loader supports
    pattern: Pattern = Pattern.all
    # purely informational, for use by the logger
    method_name: str
    package_name: str = "httomo"

    def make_data_source(self, padding: Tuple[int, int]) -> DataSetSource:
        """Create a dataset source that can produce padded blocks of data from the file.
        This will be called after the patterns and sections have been determined,
        just before the execution of the first section starts."""
        ...  # pragma: no cover

    @property
    def detector_x(self) -> int:
        """detector x-dimension of the loaded data"""
        ...  # pragma: no cover

    @property
    def detector_y(self) -> int:
        """detector y-dimension of the loaded data"""
        ...  # pragma: no cover

    @property
    def angles_total(self) -> int:
        """angles dimension of the loaded data"""
        ...  # pragma: no cover
```
  • StandardLoaderWrapper is an implementor of LoaderInterface, and it happens to have a .preview attribute (i.e., it's implementation-specific to this loader wrapper; the .preview attribute is not a general part of what LoaderInterface provides):

```python
class StandardLoaderWrapper(LoaderInterface):
    """
    Wrapper around `StandardTomoLoader` to provide its functionality as a data source to the
    runner, while also giving the runner an implementor of `LoaderInterface`.
    """

    def __init__(
        self,
        comm: MPI.Comm,
        # parameters that should be adjustable from YAML
        in_file: Path,
        data_path: str,
        image_key_path: Optional[str],
        darks: DarksFlatsFileConfig,
        flats: DarksFlatsFileConfig,
        angles: AnglesConfig,
        preview: PreviewConfig,
    ):
        self.pattern = Pattern.projection
        self.method_name = "standard_tomo"
        self.package_name = "httomo"
        self._detector_x: int = 0
        self._detector_y: int = 0
        self._angles_total: int = 0
        self.comm = comm
        self.in_file = in_file
        self.data_path = data_path
        self.image_key_path = image_key_path
        self.darks = darks
        self.flats = flats
        self.angles = angles
        self.preview = preview
```
  • so, the sweep runner is assuming things about the loader wrapper which it shouldn't assume (based on the current definition of LoaderInterface)

Also similar to before, a potential solution would be to modify the protocol in question (LoaderInterface) and its implementor (StandardLoaderWrapper).

If it would make sense to have any "loader wrapper" provide the associated preview config (which does sound reasonable to me), then it may make sense to:

  • add a preview getter method to LoaderInterface
  • add an implementation of it in StandardLoaderWrapper, something like:
```python
@property
def preview(self) -> PreviewConfig:
    # the backing attribute would need a different name (e.g. self._preview
    # set in __init__), otherwise this property would recurse into itself
    return self._preview
```
  • modify make_test_loader() to take in a preview: PreviewConfig parameter and pass it in to mocker.create_autospec() when defining a mock implementor of LoaderInterface
  • in the test that's failing, update the call to make_test_loader() to pass in some preview config object (for the sake of consistency, making the preview config reflect what the GLOBAL_SHAPE/PREVIEWED_SLICES_SHAPE values are would make the most sense)

Side note: given that the loader wrapper would provide the preview if the above is done, and that the loader itself would also be able to provide this info, I'm becoming more wary of whether the existence of the loader wrapper makes much sense (and whether the wrapper should instead just go away, with loaders being direct implementors of LoaderInterface). Tagging #504, as this is a relevant piece of info for that.
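The create_autospec part of the suggestion can be sketched like this; `FakeLoaderInterface` and the tuple-based preview config are illustrative stand-ins for httomo's real `LoaderInterface` and `PreviewConfig`:

```python
# Sketch: pass the preview to `create_autospec` so the mock exposes it.
from unittest import mock
from typing import Protocol, Tuple

class FakeLoaderInterface(Protocol):
    pattern: str = "all"

    @property
    def preview(self) -> Tuple: ...  # pragma: no cover

preview_cfg = ((0, 180), (0, 128), (0, 160))  # made-up preview config
loader = mock.create_autospec(
    FakeLoaderInterface,
    instance=True,
    pattern="projection",
    preview=preview_cfg,  # without this kwarg, .preview is a bare Mock
)
print(loader.preview == preview_cfg)
```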

@dkazanc
Collaborator Author

dkazanc commented Nov 11, 2024

Thanks Yousef, this gets a bit more in-depth, but I will give it a go. Also, the tests will still fail even if the preview stuff is sorted, because of this: source.raw_shape is not defined for DataSetBlock. Should it be though?

@yousefmoazzam
Collaborator

Thanks Yousef, this gets a bit more in-depth, but I will give it a go. Also, the tests will still fail even if the preview stuff is sorted, because of this: source.raw_shape is not defined for DataSetBlock. Should it be though?

Sorry, I'm not sure I understand: where is the need for DataSetBlock to have the .raw_shape property? The source variable in that function is only ever a value of the type returned by self._pipeline.loader.make_data_source(), and the type that function returns is DataSetSource. Could you point me to where DataSetBlock comes into the picture here?

@dkazanc
Collaborator Author

dkazanc commented Nov 11, 2024

Yes, it's the tests for the sweep runs; they are all built around using DataSetBlock rather than the loader, where source.raw_shape would otherwise be available.

@yousefmoazzam
Collaborator

Ah ok, thanks for the link. I see that the tests create a DataSetBlock for the block splitter to return (which gets the block from the mock loader), so that's one piece of the puzzle in understanding this a bit better on my end.

I'm still not sure where there's a need for DataSetBlock to have .raw_shape though? The .raw_shape attribute is for data sources, and a DataSetBlock isn't a data source, it's something that is produced by a data source.

The function/purpose of the DataSetBlock created in the tests is that it's a block that is produced by the block splitter (which as I mentioned before, the splitter gets from the mock loader); the mock loader will already be configured to have the .raw_shape attribute from the changes made earlier in the PR, so source.raw_shape will work in the sweep runner now. And when the splitter is asked for a block here:

```python
dataset_block = splitter[0]
```

then the DataSetBlock created in the test is returned by the block splitter. After that, I can't yet see where the DataSetBlock has any need for a .raw_shape attribute.

So, there's something I'm still not understanding I think: could you point me to where exactly you think the DataSetBlock would need .raw_shape?

@dkazanc
Collaborator Author

dkazanc commented Nov 11, 2024

OK, so a couple of things here:

  1. I've introduced the preview into the Loader as suggested. In the new test tests_preview_modifier_paganin I'm passing the preview config into make_test_loader and then create_autospec. The resulting preview in the loader is still a mocked object, but I was hoping it would be what I actually passed in. Could it be that create_autospec doesn't support preview classes being passed in, as I see that the other variables are just normal types?
  2. Secondly, on raw_shape again. In the test above, source.raw_shape is also a mocked object, but I need the actual shape of the raw data in order to get something meaningful from the updated preview in the test. Maybe I should mock it somehow, as apparently it shouldn't be in DataSetBlock, as you pointed out earlier?

@yousefmoazzam
Collaborator

OK, so a couple of things here:

  1. I've introduced the preview into the Loader as suggested. In the new test tests_preview_modifier_paganin I'm passing the preview config into make_test_loader and then create_autospec. The resulting preview in the loader is still a mocked object, but I was hoping it would be what I actually passed in. Could it be that create_autospec doesn't support preview classes being passed in, as I see that the other variables are just normal types?

The preview needs to be passed to the autospeccing of the LoaderInterface, currently it's not:

```python
def make_test_loader(
    mocker: MockerFixture,
    preview: Optional[PreviewConfig] = None,
    block: Optional[DataSetBlock] = None,
    pattern: Pattern = Pattern.all,
    method_name="testloader",
) -> LoaderInterface:
    interface: LoaderInterface = mocker.create_autospec(
        LoaderInterface,
        instance=True,
        pattern=pattern,
        method_name=method_name,
        reslice=False,
    )
```

The self._pipeline.loader.preview is essentially doing StandardLoaderWrapper.preview / LoaderInterface.preview, and because create_autospec() for LoaderInterface hasn't been given the preview, the mock loader wrapper doesn't know about the preview, so it won't have the preview value that was passed into make_test_loader().

  2. Secondly, on raw_shape again. In the test above, source.raw_shape is also a mocked object, but I need the actual shape of the raw data in order to get something meaningful from the updated preview in the test. Maybe I should mock it somehow, as apparently it shouldn't be in DataSetBlock, as you pointed out earlier?

Yeah that makes sense, by default the mock loader wrapper won't have an implementation of the raw_shape getter. Patching .raw_shape to return the global shape defined in the test would be a reasonable approach here I think.

In the past I've seen issues with trying to use mocker.patch.object() (which is how most things are patched in httomo tests - see make_mock_repo() in testing_utils.py for an example) for patching getter properties. Using PropertyMock seems to be the preferred way to mock getter properties. But of course, if you find a way to use mocker.patch.object() to patch the .raw_shape getter, feel free to do it that way.
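A minimal sketch of the PropertyMock approach; `FakeSource` and the shape values are made up for illustration:

```python
# Sketch: mock a getter property via PropertyMock; properties live on
# the class, so the patch targets the type rather than the instance.
from unittest import mock

class FakeSource:
    @property
    def raw_shape(self):
        # a real implementation would read this from the file
        return (0, 0, 0)

source = FakeSource()
with mock.patch.object(
    type(source), "raw_shape", new_callable=mock.PropertyMock
) as mock_shape:
    mock_shape.return_value = (180, 128, 160)
    print(source.raw_shape)  # the mocked shape, (180, 128, 160)
print(source.raw_shape)  # the patch is undone on exit: (0, 0, 0)
```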

@dkazanc
Collaborator Author

dkazanc commented Nov 12, 2024

@yousefmoazzam sorry, even with the changes you suggested I still cannot get the correct raw_shape from the loader. Can you have a look please?

@dkazanc
Collaborator Author

dkazanc commented Nov 13, 2024

OK, currently all tests pass for me except two, test_insert_image_save_after_sweep and test_insert_image_save_after_sweep2, but that is expected as PR #523 should fix them. The new test tests_preview_modifier_paganin tests the added functionality for calculating the new preview size for sweep runs based on the kernel size of the smoothing filter. Thanks for the help @yousefmoazzam

Collaborator

@yousefmoazzam yousefmoazzam left a comment


Nice, thanks for the updates!

I think we're close, but there are a few things I'd like to be addressed before merging. The main one is the stuff around the assertion on the private ParamSweepRunner._vertical_slices_preview, and double-checking whether you also see an incorrectly modified preview, as mentioned in the comments below.

  • tests/sweep_runner/test_param_sweep_runner.py (several comment threads)
  • httomo/sweep_runner/param_sweep_runner.py
Collaborator

@yousefmoazzam yousefmoazzam left a comment


Much appreciated for sticking it out, it was a bit of a journey! Things look good to me, thanks again 🙂

@dkazanc
Collaborator Author

dkazanc commented Nov 18, 2024

One last thing that came to my mind: this condition catches both the TomoPy and Savu implementations but then proceeds to work with TomoPy parameters, so most likely the Savu implementation will fail here. As the Savu implementation pads the data in the method itself, I'm thinking of just letting it work as a normal method by taking 5 (or whatever the default is) slices. So I guess I'll change this condition to look for paganin_filter_tomopy specifically.

@dkazanc dkazanc merged commit 114bc25 into main Nov 18, 2024
2 of 3 checks passed
@dkazanc dkazanc deleted the paganinsweep branch November 18, 2024 10:18

Successfully merging this pull request may close these issues.

Sweep issue for smaller previews