Support DictionaryArray in OVER clause #13153
I tested this manually with datafusion-cli. Happy to add a test if needed but maybe it's okay as is.
cc @alamb
It looks like a new & cool capability. Is it possible to add a test?
More like existing functionality that wasn't enabled. Any suggestions as to where I should add said test?
I tried to produce an error / write some examples in dictionary.slt and I couldn't figure out how to trigger an issue. Basically my feeble mind can't figure out if the newly added code can ever be executed. Here is what I tried:
(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ git diff
diff --git a/datafusion/sqllogictest/test_files/dictionary.slt b/datafusion/sqllogictest/test_files/dictionary.slt
index 176331f57..e802ddfe6 100644
--- a/datafusion/sqllogictest/test_files/dictionary.slt
+++ b/datafusion/sqllogictest/test_files/dictionary.slt
@@ -320,6 +320,41 @@ ORDER BY
2023-12-20T01:30:00 1000 f1 32.0
2023-12-20T01:30:00 1000 f2 foo
+# Window Functions
+query TTTT
+SELECT "tag_id", "type",
+ lead("type") OVER (PARTITION BY "tag_id" ORDER BY "time") as "next_type",
+ lag("type") OVER (PARTITION BY "tag_id" ORDER BY "time") as "last_type"
+FROM m2;
+----
+1000 active active NULL
+1000 active active active
+1000 active active active
+1000 active active active
+1000 active active active
+1000 active passive active
+1000 passive passive active
+1000 passive passive passive
+1000 passive passive passive
+1000 passive NULL passive
+
+query TTTT
+SELECT "tag_id", "type",
+ lead("type") OVER (PARTITION BY "tag_id" ORDER BY "time" RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as "next_type",
+ lag("type") OVER (PARTITION BY "tag_id" ORDER BY "time") as "last_type"
+FROM m2;
+----
+1000 active active NULL
+1000 active active active
+1000 active active active
+1000 active active active
+1000 active active active
+1000 active passive active
+1000 passive passive active
+1000 passive passive passive
+1000 passive passive passive
+1000 passive NULL passive
+
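For readers checking the expected rows in the first query above, here is a minimal Python sketch of what `lead` and `lag` produce for a single partition. It is not DataFusion's implementation. The `(keys, values)` pair is a simplified stand-in for a dictionary-encoded column, and the row order (six `active` rows followed by four `passive` rows for tag 1000) is an assumption read off the expected output in the diff, not taken from the table definition.

```python
from typing import List, Optional, Sequence

def decode(keys: Sequence[Optional[int]], values: Sequence[str]) -> List[Optional[str]]:
    # Simplified stand-in for a dictionary-encoded column:
    # each key indexes into the list of distinct values.
    return [None if k is None else values[k] for k in keys]

def lead(col: Sequence[Optional[str]], offset: int = 1) -> List[Optional[str]]:
    # Value `offset` rows after the current row, NULL (None) past the end.
    n = len(col)
    return [col[i + offset] if i + offset < n else None for i in range(n)]

def lag(col: Sequence[Optional[str]], offset: int = 1) -> List[Optional[str]]:
    # Value `offset` rows before the current row, NULL (None) before the start.
    return [col[i - offset] if i - offset >= 0 else None for i in range(len(col))]

# Assumed contents of the "type" column for tag_id 1000, already ordered by "time".
values = ["active", "passive"]
keys = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

type_col = decode(keys, values)
next_type = lead(type_col)  # corresponds to the "next_type" column
last_type = lag(type_col)   # corresponds to the "last_type" column

for row in zip(type_col, next_type, last_type):
    print(*("NULL" if v is None else v for v in row))
```

Under those assumptions the printed columns line up with the `type`, `next_type` and `last_type` columns of the first expected result set.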
I am concerned that the lack of testing here would mean we could accidentally break the behavior without knowing it in some future refactor.
Oh, now I see there is an example in #13151 (comment). Maybe we can just add that example to dictionary.slt and we'll be good?
I have no experience with these
Guess we'll find out soon 😆
@alamb looks like that worked 🚀
Looks good to me -- thank you @adriangb
Fixes #13151