feat(Dataframe): add classmethod get_default_schema to generate default schema #1525

scaliseraoul · 2025-01-16T14:10:08Z

refactor pai.create to accept name, description, columns, removes cache from datasetloader

Tests added and passed if fixing a bug or adding a new feature.
All code checks passed.

Important

Add get_default_schema to DataFrame, refactor create to accept additional parameters, and remove caching from DatasetLoader.

Behavior:
- Add get_default_schema class method to DataFrame to generate default schema.
- Refactor create function in __init__.py to accept name, description, and columns parameters.
- Remove caching functionality from DatasetLoader in loader.py.
Tests:
- Update tests in test_loader.py, test_pandasai_init.py, and other test files to cover new create function parameters and get_default_schema method.
- Remove tests related to caching in test_loader.py.
Misc:
- Remove Destination import from base.py and other files where it is no longer used.

^{This description was created by}^{for 5e38b30. It will automatically update as commits are pushed.}

…lt schema, refactor pai.create to accept name, description, columns, removes cache from datasetloader

gventuri

All great, just a minor comment!

gventuri · 2025-01-16T14:14:06Z

tests/unit_tests/dataframe/test_loader.py

-            assert loader._is_cache_valid("dummy_path") is False
-
-    def test_read_cache_parquet(self, sample_schema):
+    def test_read_csv_or_parquet_parquet(self, sample_schema):


Should be def test_read_csv_or_parquet?

ellipsis-dev

👍 Looks good to me! Reviewed everything up to 5e38b30 in 1 minute and 19 seconds

More details

Looked at 823 lines of code in 7 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. pandasai/__init__.py:46

Draft comment:
Add a note in the docstring to specify that each dictionary in columns must contain at least the 'name' key, and optionally 'type' and 'description'.
Reason this comment was not posted:
Confidence changes required: 50%
The PR refactors the create function to accept additional parameters like name, description, and columns. However, the columns parameter is expected to be a list of dictionaries with specific keys. If the dictionaries do not have the required keys, it will raise an error when trying to create Column objects. This should be documented in the function's docstring to inform users of the expected structure.

2. pandasai/data_loader/loader.py:43

Draft comment:
Caching functionality has been removed. Ensure this change is intentional and consider the performance implications of loading datasets without caching.
Reason this comment was not posted:
Confidence changes required: 50%
The PR removes caching functionality from the DatasetLoader class. This is a significant change that could impact performance if datasets are frequently loaded from local sources. It's important to ensure that this change is intentional and that the performance implications are acceptable.

3. pandasai/__init__.py:91

Draft comment:
Ensure that the schema has default values for name and description to avoid potential AttributeError if they are accessed without being set.
Reason this comment was not posted:
Confidence changes required: 50%
The PR modifies the create function to update the schema's name and description based on the provided parameters. However, it does not handle the case where the schema might not have a name or description initially, which could lead to unexpected behavior if these attributes are accessed later.

Workflow ID: wflow_U2xCzZ0U4TdDvt7y

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

feat(Dataframe): add classmethod get_default_schema to generate defau…

02f5cf8

…lt schema, refactor pai.create to accept name, description, columns, removes cache from datasetloader

gventuri reviewed Jan 16, 2025

View reviewed changes

fix: windows test formatting issue

5e38b30

scaliseraoul marked this pull request as ready for review January 16, 2025 15:05

dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 16, 2025

ellipsis-dev bot reviewed Jan 16, 2025

View reviewed changes

gventuri merged commit c5a2a13 into Sinaptik-AI:release/v3 Jan 16, 2025
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(Dataframe): add classmethod get_default_schema to generate default schema #1525

feat(Dataframe): add classmethod get_default_schema to generate default schema #1525

scaliseraoul commented Jan 16, 2025 •

edited by ellipsis-dev bot

Loading

gventuri left a comment

gventuri Jan 16, 2025

ellipsis-dev bot left a comment

feat(Dataframe): add classmethod get_default_schema to generate default schema #1525

feat(Dataframe): add classmethod get_default_schema to generate default schema #1525

Conversation

scaliseraoul commented Jan 16, 2025 • edited by ellipsis-dev bot Loading

gventuri left a comment

Choose a reason for hiding this comment

gventuri Jan 16, 2025

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

scaliseraoul commented Jan 16, 2025 •

edited by ellipsis-dev bot

Loading