Enh: extend narwhalify (#328)
* WIP enh: extend narwhalify

* validate single backend

* rm narwhalify method

* pyproject rollback

* feedback adjusted
FBruzzesi authored Jun 27, 2024
1 parent 351681e commit 7889e9d
Showing 12 changed files with 384 additions and 411 deletions.
1 change: 0 additions & 1 deletion docs/api-reference/narwhals.md
@@ -20,7 +20,6 @@ Here are the top-level functions available in Narwhals.
- mean
- min
- narwhalify
-- narwhalify_method
- sum
- sum_horizontal
- show_versions
16 changes: 8 additions & 8 deletions docs/basics/column.md
@@ -127,9 +127,9 @@ of using expressions, we'll extract a `Series`.
```python exec="1" source="above" session="ex2"
import narwhals as nw

-def my_func(df):
-    df_s = nw.from_native(df, eager_only=True)
-    return df_s['a'].mean()
+@nw.narwhalify
+def my_func(df_any):
+    return df_any['a'].mean()
```

=== "pandas"
@@ -148,8 +148,8 @@ def my_func(df):
print(my_func(df))
```

-Note that, this time, we couldn't use `@nw.narwhalify`, as the final step in
-our function wasn't `nw.to_native`, so we had to explicitly use `nw.from_native`
-as the first step. In general, we recommend using the decorator where possible,
-as it looks a lot cleaner, and only using `nw.from_native` / `nw.to_native` explicitly
-when you need them.
+Note that, even though the output of our function is not a dataframe nor a series, we can
+still use `narwhalify`.
+
+In general, we recommend using the decorator where possible, as it looks a lot cleaner,
+and only using `nw.from_native` / `nw.to_native` explicitly when you need them.
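
To illustrate the idea behind this change, here is a minimal, standard-library-only sketch of a decorator that can be applied bare or with keyword arguments, works on both functions and methods, and passes non-dataframe return values through unchanged. It is not Narwhals' implementation: `Wrapped`, `from_native_sketch`, `to_native_sketch` and `narwhalify_sketch` are hypothetical stand-ins, and plain lists play the role of native dataframes.

```python
from functools import wraps
from typing import Any, Callable, Optional


class Wrapped:
    """Hypothetical stand-in for a Narwhals DataFrame/Series wrapper."""

    def __init__(self, native: list, eager_only: bool) -> None:
        self.native = native
        self.eager_only = eager_only

    def mean(self) -> float:
        return sum(self.native) / len(self.native)

    def double(self) -> "Wrapped":
        return Wrapped([2 * x for x in self.native], self.eager_only)


def from_native_sketch(obj: Any, *, eager_only: bool = False) -> Any:
    # Wrap lists (our pretend "native" objects); pass everything else through,
    # which is what lets `self` and plain strings survive untouched.
    return Wrapped(obj, eager_only) if isinstance(obj, list) else obj


def to_native_sketch(obj: Any) -> Any:
    # Unwrap only what we wrapped; scalars and other objects are returned as-is.
    return obj.native if isinstance(obj, Wrapped) else obj


def narwhalify_sketch(func: Optional[Callable] = None, *, eager_only: bool = False):
    def decorator(f: Callable) -> Callable:
        @wraps(f)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            new_args = [from_native_sketch(a, eager_only=eager_only) for a in args]
            new_kwargs = {
                k: from_native_sketch(v, eager_only=eager_only)
                for k, v in kwargs.items()
            }
            return to_native_sketch(f(*new_args, **new_kwargs))

        return wrapper

    # Bare use (@narwhalify_sketch) receives the function directly;
    # parameterized use (@narwhalify_sketch(eager_only=True)) receives None.
    return decorator if func is None else decorator(func)


@narwhalify_sketch
def mean_of(values):
    return values.mean()  # returns a float, which is passed through


@narwhalify_sketch(eager_only=True)
def uses_eager(values):
    return values.eager_only  # the keyword reached the wrapping step


class Scaler:
    @narwhalify_sketch
    def transform(self, values):
        return values.double()  # returns a Wrapped, which is unwrapped
```

Because every argument goes through the same wrapping step and unknown objects pass through, `self` needs no special handling, which is one plausible reason a separate method decorator becomes unnecessary.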
15 changes: 7 additions & 8 deletions docs/basics/complete_example.md
@@ -17,14 +17,13 @@ stored them in attributes `self.means` and `self.std_devs`.
## Transform method

We're going to take in a dataframe, and return a dataframe of the same type.
-Therefore, we use `@nw.narwhalify_method` (the counterpart to `@nw.narwhalify` which is
-meant to be used for methods):
+Therefore, we use `@nw.narwhalify`:

```python
import narwhals as nw

 class StandardScaler:
-    @nw.narwhalify_method
+    @nw.narwhalify
     def transform(self, df):
         return df.with_columns(
             (nw.col(col) - self._means[col]) / self._std_devs[col]
@@ -45,16 +44,15 @@ To be able to get `Series` out of our `DataFrame`, we'll pass `eager_only=True`
This is because Polars doesn't have a concept of lazy `Series`, and so Narwhals
doesn't either.

-Note how here, we're not returning a dataframe to the user - we just take a dataframe in, and
-store some internal state. Therefore, we use `nw.from_native` explicitly, as opposed to using the
-utility `@nw.narwhalify_method` decorator.
+We can specify that in the `@nw.narwhalify` decorator by setting `eager_only=True`, and
+the argument will be propagated to `nw.from_native`.

```python
import narwhals as nw

 class StandardScaler:
+    @nw.narwhalify(eager_only=True)
     def fit(self, df_any):
-        df = nw.from_native(df_any, eager_only=True)
         self._means = {col: df[col].mean() for col in df.columns}
         self._std_devs = {col: df[col].std() for col in df.columns}
```
@@ -66,12 +64,13 @@ Here is our dataframe-agnostic standard scaler:
import narwhals as nw

 class StandardScaler:
+    @nw.narwhalify(eager_only=True)
     def fit(self, df_any):
-        df = nw.from_native(df_any, eager_only=True)
         self._means = {col: df[col].mean() for col in df.columns}
         self._std_devs = {col: df[col].std() for col in df.columns}

-    @nw.narwhalify_method
+    @nw.narwhalify
     def transform(self, df):
         return df.with_columns(
             (nw.col(col) - self._means[col]) / self._std_devs[col]
44 changes: 41 additions & 3 deletions docs/basics/dataframe.md
@@ -113,9 +113,8 @@ Let's try it out:

## Example 3: horizontal sum

-Expressions can be free-standing functions which accept other
-expressions as inputs. For example, we can compute a horizontal
-sum using `nw.sum_horizontal`.
+Expressions can be free-standing functions which accept other expressions as inputs.
+For example, we can compute a horizontal sum using `nw.sum_horizontal`.

Make a Python file with the following content:
```python exec="1" source="above" session="df_ex3"
@@ -150,3 +149,42 @@ Let's try it out:
df = pl.LazyFrame({'a': [1, 1, 2], 'b': [4, 5, 6]})
print(func(df).collect())
```

+## Example 4: multiple inputs
+
+`nw.narwhalify` can be used to decorate functions that take multiple inputs as well and
+return a non dataframe/series-like object.
+
+For example, let's compute how many rows are left in a dataframe after filtering it based
+on a series.
+
+Make a Python file with the following content:
+```python exec="1" source="above" session="df_ex4"
+import narwhals as nw
+
+@nw.narwhalify(eager_only=True)
+def func(df: nw.DataFrame, s: nw.Series, col_name: str):
+    return df.filter(nw.col(col_name).is_in(s)).shape[0]
+```
+
+We require `eager_only=True` here because lazyframe doesn't support `.shape`.
+
+Let's try it out:
+
+=== "pandas"
+```python exec="true" source="material-block" result="python" session="df_ex4"
+import pandas as pd
+
+df = pd.DataFrame({'a': [1, 1, 2, 2, 3], 'b': [4, 5, 6, 7, 8]})
+s = pd.Series([1, 3])
+print(func(df, s.to_numpy(), 'a'))
+```
+
+=== "Polars (eager)"
+```python exec="true" source="material-block" result="python" session="df_ex4"
+import polars as pl
+
+df = pl.DataFrame({'a': [1, 1, 2, 2, 3], 'b': [4, 5, 6, 7, 8]})
+s = pl.Series([1, 3])
+print(func(df, s.to_numpy(), 'a'))
+```
2 changes: 0 additions & 2 deletions narwhals/__init__.py
@@ -32,7 +32,6 @@
from narwhals.translate import from_native
from narwhals.translate import get_native_namespace
from narwhals.translate import narwhalify
-from narwhals.translate import narwhalify_method
from narwhals.translate import to_native
from narwhals.utils import maybe_align_index
from narwhals.utils import maybe_convert_dtypes
@@ -78,6 +77,5 @@
     "Datetime",
     "Date",
     "narwhalify",
-    "narwhalify_method",
     "show_versions",
 ]
6 changes: 3 additions & 3 deletions narwhals/_pandas_like/dataframe.py
@@ -15,9 +15,9 @@
from narwhals._pandas_like.utils import translate_dtype
from narwhals._pandas_like.utils import validate_dataframe_comparand
from narwhals._pandas_like.utils import validate_indices
-from narwhals.translate import get_cudf
-from narwhals.translate import get_modin
-from narwhals.translate import get_pandas
+from narwhals.dependencies import get_cudf
+from narwhals.dependencies import get_modin
+from narwhals.dependencies import get_pandas
from narwhals.utils import flatten

if TYPE_CHECKING:
12 changes: 12 additions & 0 deletions narwhals/_pandas_like/series.py
@@ -11,6 +11,8 @@
from narwhals._pandas_like.utils import to_datetime
from narwhals._pandas_like.utils import translate_dtype
from narwhals._pandas_like.utils import validate_column_comparand
+from narwhals.dependencies import get_cudf
+from narwhals.dependencies import get_modin
from narwhals.dependencies import get_pandas
from narwhals.utils import parse_version

@@ -98,6 +100,16 @@ def __narwhals_namespace__(self) -> PandasNamespace:

         return PandasNamespace(self._implementation)

+    def __native_namespace__(self) -> Any:
+        if self._implementation == "pandas":
+            return get_pandas()
+        if self._implementation == "modin":  # pragma: no cover
+            return get_modin()
+        if self._implementation == "cudf":  # pragma: no cover
+            return get_cudf()
+        msg = f"Expected pandas/modin/cudf, got: {type(self._implementation)}"  # pragma: no cover
+        raise AssertionError(msg)
+
     def __narwhals_series__(self) -> Self:
         return self
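
The `__native_namespace__` method added above dispatches on an implementation label. The following is a rough, standard-library-only sketch of that pattern, written as a lookup table instead of a chain of `if` statements. `get_module`, `SeriesSketch` and the `_MODULES` mapping are hypothetical names introduced for illustration; they are not the helpers in `narwhals.dependencies`, whose behaviour is not shown in this diff. Unlike the code above, the sketch reports the offending value itself in the error message.

```python
import sys
from typing import Any


def get_module(name: str) -> Any:
    # Hypothetical helper: return the module only if it has already been
    # imported, so checking for a backend never triggers a heavy import.
    return sys.modules.get(name)


class SeriesSketch:
    # Hypothetical mapping from implementation label to module name.
    _MODULES = {"pandas": "pandas", "modin": "modin.pandas", "cudf": "cudf"}

    def __init__(self, implementation: str) -> None:
        self._implementation = implementation

    def __native_namespace__(self) -> Any:
        try:
            module_name = self._MODULES[self._implementation]
        except KeyError:
            msg = f"Expected pandas/modin/cudf, got: {self._implementation!r}"
            raise AssertionError(msg) from None
        return get_module(module_name)
```

A table keeps the set of supported backends in one place, at the cost of losing the per-branch `# pragma: no cover` markers used in the real code.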

6 changes: 3 additions & 3 deletions narwhals/dataframe.py
@@ -10,11 +10,11 @@
from typing import overload

from narwhals._pandas_like.dataframe import PandasDataFrame
+from narwhals.dependencies import get_cudf
+from narwhals.dependencies import get_modin
+from narwhals.dependencies import get_pandas
from narwhals.dependencies import get_polars
from narwhals.dtypes import to_narwhals_dtype
-from narwhals.translate import get_cudf
-from narwhals.translate import get_modin
-from narwhals.translate import get_pandas
from narwhals.utils import parse_version
from narwhals.utils import validate_same_library
