PanicException after simple filter operation with LazyFrame on big dataset #20894
Labels
bug
Something isn't working
needs triage
Awaiting prioritization by a maintainer
python
Related to Python Polars
Checks
Reproducible example
I couldn't reproduce the bug with randomly generated data, so please find the data I used here (8% of the original dataset size):
https://drive.proton.me/urls/XQWVS448J4#1yibTP9fy9ny
You have to keep the folder structure intact, because the data is read with hive partitioning.
Log output
Issue description
I have a big dataset that I am trying to filter for a list of importers so I can process the data further. This works in eager mode but throws an exception in lazy mode: "PanicException: The column lengths in the DataFrame are not equal." The issue does not persist if I downsample the data even more.
Expected behavior
Eager mode and lazy mode should return the exact same results and not fail.
Installed versions