Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime objects in row_filter expressions are not casted and raise an error #1456

Closed
jayceslesar opened this issue Dec 20, 2024 · 7 comments · Fixed by #1542
Closed

datetime objects in row_filter expressions are not casted and raise an error #1456

jayceslesar opened this issue Dec 20, 2024 · 7 comments · Fixed by #1542

Comments

@jayceslesar
Copy link
Contributor

Feature Request / Improvement

in the following row_filter in a scan() call,

row_filter=And(
            GreaterThanOrEqual("timestamp_received", start_time),
            LessThan("timestamp_received", end_time),
        ),

the start_time and end_time variables are datetime objects. When running a scan with that row_filter I end up with the error of

TypeError: Invalid literal value: datetime.datetime(2024, 11, 13, 11, 35, 28, 730000, tzinfo=datetime.timezone.utc)

Instead, the expressions should correctly cast the datetime object instead of expecting an isoformat string, which is the current behavior

@kevinjqliu
Copy link
Contributor

Thanks for reporting this issue!
I believe this is the relevant code

def literal(value: L) -> Literal[L]:
"""
Construct an Iceberg Literal based on Python primitive data type.
Args:
value (Python primitive type): the value to be associated with literal.
Example:
from pyiceberg.expressions.literals import literal.
>>> literal(123)
LongLiteral(123)
"""
if isinstance(value, float):
return DoubleLiteral(value) # type: ignore
elif isinstance(value, bool):
return BooleanLiteral(value)
elif isinstance(value, int):
return LongLiteral(value)
elif isinstance(value, str):
return StringLiteral(value)
elif isinstance(value, UUID):
return UUIDLiteral(value.bytes) # type: ignore
elif isinstance(value, bytes):
return BinaryLiteral(value)
elif isinstance(value, Decimal):
return DecimalLiteral(value)
else:
raise TypeError(f"Invalid literal value: {repr(value)}")

Looks like we don't take into account python's datetime object.

cc @Fokko do you have context around this?

@jayceslesar
Copy link
Contributor Author

Any thoughts here? there are a few ways to implement this from what I can tell, the easiest would be to just add a clause to the literal function that returns TimestampLiteral(datetime_to_micros(value)) if value is a datetime

@kunnapatt
Copy link
Contributor

@jayceslesar
Thanks I got the same problem, but when I followed your comment, It worked.

@kevinjqliu
Copy link
Contributor

@jayceslesar i think that makes sense. Are you interested to contribute this?

@jayceslesar
Copy link
Contributor Author

@jayceslesar i think that makes sense. Are you interested to contribute this?

Happy to contribute, was just wondering if there was a slightly prettier way but if that is acceptable than sounds good to me

@kevinjqliu
Copy link
Contributor

I think TimestampLiteral(datetime_to_micros(value)) is fine.

We use TimestampLiteral(timestamp_to_micros(self.value)) to convert string to timestamp

@to.register(TimestampType)
def _(self, _: TimestampType) -> Literal[int]:
return TimestampLiteral(timestamp_to_micros(self.value))
@to.register(TimestamptzType)
def _(self, _: TimestamptzType) -> Literal[int]:
return TimestampLiteral(timestamptz_to_micros(self.value))

cc @Fokko wdyt?

@jayceslesar
Copy link
Contributor Author

#1542

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@kevinjqliu @kunnapatt @jayceslesar and others