You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I think we still need to fall back because we need to raise the same AttributeError that pandas would raise if it didn’t exist.
As a user, I consider falling back to pandas unnecessarily to be significantly worse than raising a slightly inconsistent AttributeError message - especially when using hasattr which is when the AttributeError would be caught anyway.
If exception raising needs to be perfectly consistent, perhaps there's a way to still raise it without needing to fallback to pandas? e.g.
def__getattr(self, attr):
ifhasattr(pandas.DataFrame, attr):
# do the fallback, raise if necessaryelse:
msg=f"{attr} is present in neither cuDF nor pandas"raiseAttributeError(msg)
The text was updated successfully, but these errors were encountered:
Just wanted to comment that this is an interesting class of optimization that I think would be nice for cudf.pandas - Can cudf.pandas not fall back if it's an exception that would raise in pandas too? Like this hasattr check, other cases might be:
Calling methods that don't exist in either library.
Invalid arguments
Known invalid operations between two types (e.g. binops between strings and lists)
Discovered in Narwhals
If I have a function like:
and run it with
cudf.pandas
, then this is enough to cause a fallback to pandas - even though cudf would have been perfectly capable of executing it!This isn't farfetched - you can find examples of libraries doing hasattr checks all over the place, e.g. this one in pymc
https://github.com/pymc-devs/pymc/blob/ae4a6292aa48d5722dba6ab422e1a3d895ca7bf7/pymc/data.py#L232-L248
Scikit-learn:
https://github.com/scikit-learn/scikit-learn/blob/c9aeb15f8f1c7c54ed4ef27c871f7167e2ce3077/sklearn/feature_selection/_base.py#L129-L131
Plotly:
https://github.com/plotly/plotly.py/blob/231edaa61f8590f715134d42ea0bc2f858dd713e/packages/python/plotly/plotly/express/_imshow.py#L303-L312
I asked about this on Slack, and got the response
As a user, I consider falling back to pandas unnecessarily to be significantly worse than raising a slightly inconsistent AttributeError message - especially when using
hasattr
which is when theAttributeError
would be caught anyway.If exception raising needs to be perfectly consistent, perhaps there's a way to still raise it without needing to fallback to pandas? e.g.
The text was updated successfully, but these errors were encountered: