Describe the bug
UnwrapCastInComparison always assumes that the cast can succeed. When the cast cannot succeed, the rule still rewrites the comparison, and the optimized query returns an incorrect result.
To Reproduce
Run the following queries in the CLI (compiled from the latest main: f775791):
DataFusion CLI v44.0.0
> with t as (select 1000000 as a) select try_cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| true           |
+----------------+
1 row(s) fetched.
Elapsed 0.008 seconds.
> with t as (select 1000000 as a) select cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| true           |
+----------------+
1 row(s) fetched.
Elapsed 0.007 seconds.
Expected behavior
With optimizations disabled, the same queries produce different results, and those results are the correct ones: try_cast yields NULL and cast raises an error.
> set datafusion.optimizer.max_passes=0;
0 row(s) fetched.
Elapsed 0.003 seconds.
> with t as (select 1000000 as a) select try_cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| NULL           |
+----------------+
1 row(s) fetched.
Elapsed 0.006 seconds.
> with t as (select 1000000 as a) select cast(a as smallint) > 1 from t;
Arrow error: Cast error: Can't cast value 1000000 to type Int16
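The difference between the three behaviors can be modeled in a few lines of plain Rust. This is only an illustrative sketch using the standard library, not DataFusion code. The function names are hypothetical, and Option and Result stand in for SQL NULL and the Arrow cast error.

```rust
/// Models `try_cast(a AS SMALLINT) > lit`: the result is NULL (None)
/// when `a` does not fit in a 16-bit integer.
fn try_cast_then_compare(a: i64, lit: i16) -> Option<bool> {
    i16::try_from(a).ok().map(|v| v > lit)
}

/// Models `cast(a AS SMALLINT) > lit`: the query fails with an error
/// when `a` does not fit in a 16-bit integer.
fn cast_then_compare(a: i64, lit: i16) -> Result<bool, String> {
    i16::try_from(a)
        .map(|v| v > lit)
        .map_err(|_| format!("Can't cast value {} to type Int16", a))
}

/// Models the rewritten predicate `a > Int64(lit)`, where the cast has
/// been removed and the literal widened instead. It can never fail.
fn unwrapped_compare(a: i64, lit: i16) -> bool {
    a > i64::from(lit)
}
```

For a = 1000000 and lit = 1, the first function returns None and the second returns an error, while the rewritten form returns true. That is the mismatch shown in the CLI output above. For values that fit in Int16, all three agree.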
Additional context
I don't think this is a very urgent bug because both Spark and DuckDB have similar issues.
I have analyzed the code in unwrap_cast_in_comparison.rs. I think adding a conditional check to the OptimizerRule implementation of UnwrapCastInComparison should fix this.
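One possible shape for such a check, as a rough sketch: only unwrap the cast when every value of the column's type fits in the cast's target type, so the cast can never fail. This is an assumption about what the condition could look like, not the actual DataFusion change. IntType and can_unwrap_cast are hypothetical names, and the sketch covers signed integers only.

```rust
/// Hypothetical stand-in for the signed integer data types involved.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IntType {
    Int8,
    Int16,
    Int32,
    Int64,
}

impl IntType {
    /// Inclusive (min, max) range of values representable in this type.
    fn range(self) -> (i64, i64) {
        match self {
            IntType::Int8 => (i64::from(i8::MIN), i64::from(i8::MAX)),
            IntType::Int16 => (i64::from(i16::MIN), i64::from(i16::MAX)),
            IntType::Int32 => (i64::from(i32::MIN), i64::from(i32::MAX)),
            IntType::Int64 => (i64::MIN, i64::MAX),
        }
    }
}

/// Hypothetical guard: returns true only if casting from `column_type`
/// to `cast_target` succeeds for every possible value, so removing the
/// cast cannot hide a NULL (try_cast) or an error (cast).
fn can_unwrap_cast(column_type: IntType, cast_target: IntType) -> bool {
    let (from_min, from_max) = column_type.range();
    let (to_min, to_max) = cast_target.range();
    to_min <= from_min && from_max <= to_max
}
```

With this guard, the query in the report (an Int64 column cast to Int16) would be left alone, while a widening cast such as Int16 to Int64 could still be unwrapped. The check is deliberately conservative: it also skips narrowing casts whose actual values happen to fit.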