Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive Clone for more ExecutionPlans #13202

Closed
alamb opened this issue Oct 31, 2024 · 6 comments
Closed

Derive Clone for more ExecutionPlans #13202

alamb opened this issue Oct 31, 2024 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

@alamb
Copy link
Contributor

alamb commented Oct 31, 2024

Is your feature request related to a problem or challenge?

In our code / optimizers we have a lot of code that looks like this when we are
trying to rewrite a SortExec node by making a copy and changing only some of the
fields. For example:

     new_plan = Ok(Arc::new(
                SortExec::new(sort_exec.expr().to_vec(), Arc::clone(sort_exec.input()))
                    .with_fetch(sort_exec.fetch())
                    .with_preserve_partitioning(true),
            ));

This is problematic and we have had bugs in the past when SortExec got a new field (like fetch)

We would like to simply be able to make a copy and modify appropriately:

     new_plan = Ok(Arc::new(sort_exec.clone().with_preserve_partitioning(true)));

Many ExecutionPlans like ParquetExec already implement Clone so it is nice to
make the rest consistent

Describe the solution you'd like

I would like to derive Clone for other plans

Describe alternatives you've considered

No response

Additional context

No response

@alamb alamb added the enhancement New feature or request label Oct 31, 2024
@alamb alamb self-assigned this Oct 31, 2024
@findepi
Copy link
Member

findepi commented Oct 31, 2024

This is problematic and we have had bugs in the past when SortExec got a new field (like fetch)

We would like to simply be able to make a copy and modify appropriately:

     new_plan = Ok(Arc::new(sort_exec.clone().with_preserve_partitioning(true)));

IMO the right answer in all cases is full destructuring and direct creation, without any shortcuts (like .. or builders).
Why?

When a plan node gains new field, it's impossible to judge apriori about all callsites using this plan node. Does the logic still hold with this new field being added? Does it need to change?
Each callsite needs to be revisited (unless this is printing for debug or whatever).
And the only known way to force this to happen is to not have shortcuts like .. / builders or cloning.

@alamb
Copy link
Contributor Author

alamb commented Nov 1, 2024

IMO the right answer in all cases is full destructuring and direct creation, without any shortcuts (like .. or builders).

I basically agree with this assesment -- and that is in my mind what a Clone able ExecutionPlan permits

Given the fact that we almost always have Arc<ExecutionPlan>, often the only way to get a full destructing would be with an owned copy

@findepi
Copy link
Member

findepi commented Nov 1, 2024

I thought it's an option to destruct taking fields by reference (like in match)

@alamb
Copy link
Contributor Author

alamb commented Nov 1, 2024

I thought it's an option to destruct taking fields by reference (like in match)

That is true but it won't help modifying the field 🤔

@findepi
Copy link
Member

findepi commented Nov 1, 2024

Oh see your point. Destruct by reference, compute new value for a field and then you want to construct new instance. You need shallow clone of all other fields. 👍

@alamb
Copy link
Contributor Author

alamb commented Nov 4, 2024

Closed in #13203

@alamb alamb closed this as completed Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants