How to evaluate function's args one by one / implement short circuit evaluation #8959
Replies: 4 comments
-
cc @alamb @jackwener |
Beta Was this translation helpful? Give feedback.
-
In my opinion, the overall solution is reasonable. But I'm not sure about details like the design of the interface. |
Beta Was this translation helpful? Give feedback.
-
I agree this is a very reasonable usecase to try and support One challenge is likely to be that Maybe we can take advantage of having a trait for defining scalar functions now, For example, perhaps we could provide a version of invoke like Something like this perhaps 🤔 pub trait ScalarUDFImpl: Debug + Send + Sync {
// existing API in terms of columnar value
fn invoke(&self, args: &[ColumnarValue]) -> Result<ColumnarValue> {
// calls new API invoke_short_circuit
}
/// Evaluate the function without first evaluating arguments.
/// if the function needs values of its arguments, it calls some
/// method on `ArgEvaluator`
fn invoke_short_circuit(&self, args: &dyn ArgEvaluator) {
...
}
} |
Beta Was this translation helpful? Give feedback.
-
I think we can use SQL annotations to implement it, like this: -- set short circuited=true |
Beta Was this translation helpful? Give feedback.
-
When working on #8927, I find when evaluating a scalar function, we first evaluate all args and then execute the actual function; see below
https://github.com/apache/arrow-datafusion/blob/827668ab3d4659329ff30e6672f398c934b803be/datafusion/physical-expr/src/scalar_function.rs#L152-L161
But for a function like
coalesce,
it is a short-circuit function; we should evaluate its args one by one(like we do forCASE
), otherwise, the below query will get the error:To do this, I have two ideas
First
change the
ScalarFunctionImplementation
Second
treat
coalesce
as a specific case in https://github.com/apache/arrow-datafusion/blob/827668ab3d4659329ff30e6672f398c934b803be/datafusion/physical-expr/src/scalar_function.rs#L152-L156and implement
coalesce
like belowThird
add a function to identify if the function is a short-circuit function, and create the function with different args for non-short-circuit and short-circuit functions(
create_physical_expr
), then add two functions toScalarUDFImpl
that make use can create themself short-circuit the function. just likeBeta Was this translation helpful? Give feedback.
All reactions