UnwrapCastInComparison produces incorrect results #14303

Open
jonahgao opened this issue Jan 26, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@jonahgao
Member

jonahgao commented Jan 26, 2025

Describe the bug

UnwrapCastInComparison always assumes that the cast can succeed. When the cast cannot succeed, the rule still removes it, and the rewritten comparison returns an incorrect result.
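
To make the mismatch concrete, here is a minimal sketch in plain Rust (no DataFusion APIs) that models the three forms of the comparison for a 64-bit value that does not fit in a 16-bit integer. The function names are hypothetical and only illustrate the semantics. They are not taken from unwrap_cast_in_comparison.rs.

```rust
use std::convert::TryFrom;

/// Models `try_cast(a AS SMALLINT) > lit`: a failed cast yields NULL (`None`),
/// and a comparison against NULL is NULL as well.
fn try_cast_then_compare(a: i64, lit: i16) -> Option<bool> {
    i16::try_from(a).ok().map(|v| v > lit)
}

/// Models `cast(a AS SMALLINT) > lit`: a failed cast is an error.
fn cast_then_compare(a: i64, lit: i16) -> Result<bool, String> {
    i16::try_from(a)
        .map(|v| v > lit)
        .map_err(|_| format!("Can't cast value {} to type Int16", a))
}

/// Models the rewritten form `a > Int64(lit)`, where the cast has been
/// removed and the literal has been widened instead.
fn unwrapped_compare(a: i64, lit: i16) -> bool {
    a > i64::from(lit)
}

fn main() {
    let a = 1_000_000_i64; // does not fit in i16

    // The two cast forms report the overflow (NULL or an error) ...
    println!("try_cast: {:?}", try_cast_then_compare(a, 1));
    println!("cast:     {:?}", cast_then_compare(a, 1));
    // ... while the unwrapped form silently evaluates to true.
    println!("unwrapped: {}", unwrapped_compare(a, 1));
}
```

For values that do fit in the target type, all three forms agree. The rewrite changes the result only when the cast would have failed.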

To Reproduce

Run the following queries in the CLI (compiled from the latest main: f775791):

DataFusion CLI v44.0.0
> with t as (select 1000000 as a) select try_cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| true           |
+----------------+
1 row(s) fetched.
Elapsed 0.008 seconds.

> with t as (select 1000000 as a) select cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| true           |
+----------------+
1 row(s) fetched.
Elapsed 0.007 seconds.

Expected behavior

With optimizations disabled, the same queries produce different results, and these are the correct ones: try_cast returns NULL and cast fails with an error.

> set datafusion.optimizer.max_passes=0;
0 row(s) fetched.
Elapsed 0.003 seconds.

> with t as (select 1000000 as a) select try_cast(a as smallint) > 1 from t;
+----------------+
| t.a > Int64(1) |
+----------------+
| NULL           |
+----------------+
1 row(s) fetched.
Elapsed 0.006 seconds.

> with t as (select 1000000 as a) select cast(a as smallint) > 1 from t;
Arrow error: Cast error: Can't cast value 1000000 to type Int16

Additional context

I don't think this is a very urgent bug because both Spark and DuckDB have similar issues.

@jonahgao jonahgao added the bug Something isn't working label Jan 26, 2025
@Spaarsh
Contributor

Spaarsh commented Jan 26, 2025

I have analyzed the code in unwrap_cast_in_comparison.rs. I think adding a conditional statement to the OptimizerRule implementation of UnwrapCastInComparison should fix this.
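
One possible shape for such a condition, sketched in plain Rust under stated assumptions: the `IntType` enum and the `cast_is_infallible` and `may_unwrap_cast` functions are hypothetical stand-ins, not DataFusion types or APIs. The idea is to unwrap the cast only when every value of the column's type is representable in the cast target, so the cast can never fail. This is a conservative guess at a guard, not necessarily the condition the actual fix would use, and it would turn the optimization off for all narrowing casts.

```rust
/// Hypothetical, simplified stand-in for the integer types involved.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IntType {
    Int16,
    Int32,
    Int64,
}

impl IntType {
    /// Inclusive value range of the type, widened to i64.
    fn range(self) -> (i64, i64) {
        match self {
            IntType::Int16 => (i64::from(i16::MIN), i64::from(i16::MAX)),
            IntType::Int32 => (i64::from(i32::MIN), i64::from(i32::MAX)),
            IntType::Int64 => (i64::MIN, i64::MAX),
        }
    }
}

/// True when every value of `from` is representable in `to`,
/// so a cast from `from` to `to` can never fail.
fn cast_is_infallible(from: IntType, to: IntType) -> bool {
    let (from_min, from_max) = from.range();
    let (to_min, to_max) = to.range();
    to_min <= from_min && from_max <= to_max
}

/// Hypothetical guard: only allow `cast(col AS cast_type) op literal`
/// to be rewritten to `col op literal` when the cast cannot fail.
fn may_unwrap_cast(column_type: IntType, cast_type: IntType) -> bool {
    cast_is_infallible(column_type, cast_type)
}

fn main() {
    // The case from this issue: a BIGINT column cast to SMALLINT.
    println!("{}", may_unwrap_cast(IntType::Int64, IntType::Int16));
    // A widening cast is always safe to unwrap under this guard.
    println!("{}", may_unwrap_cast(IntType::Int16, IntType::Int64));
}
```

Under this guard the queries from the report would keep their cast, so try_cast would still return NULL and cast would still raise the error.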

@Spaarsh
Contributor

Spaarsh commented Jan 26, 2025

take
