feat(Dataframe): add classmethod get_default_schema to generate default schema #1525
…lt schema, refactor pai.create to accept name, description, columns, removes cache from datasetloader
All great, just a minor comment!
        assert loader._is_cache_valid("dummy_path") is False

    def test_read_cache_parquet(self, sample_schema):
    def test_read_csv_or_parquet_parquet(self, sample_schema):
Should be `def test_read_csv_or_parquet`?
👍 Looks good to me! Reviewed everything up to 5e38b30 in 1 minute and 19 seconds
More details
- Looked at 823 lines of code in 7 files
- Skipped 0 files when reviewing.
- Skipped posting 3 drafted comments based on config settings.
1. pandasai/__init__.py:46
- Draft comment:
Add a note in the docstring to specify that each dictionary in `columns` must contain at least the 'name' key, and optionally 'type' and 'description'.
- Reason this comment was not posted:
Confidence changes required: 50%
The PR refactors the `create` function to accept additional parameters like `name`, `description`, and `columns`. However, the `columns` parameter is expected to be a list of dictionaries with specific keys. If the dictionaries do not have the required keys, it will raise an error when trying to create `Column` objects. This should be documented in the function's docstring to inform users of the expected structure.
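A minimal sketch of the structure this draft comment describes. The `Column` dataclass and the `build_columns` helper here are hypothetical stand-ins written for illustration, not code from the PR; the only details taken from the comment are the required 'name' key and the optional 'type' and 'description' keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    """Hypothetical stand-in for the project's Column object."""
    name: str
    type: Optional[str] = None
    description: Optional[str] = None


def build_columns(columns):
    """Hypothetical helper: turn a list of dicts into Column objects.

    Each dict must contain 'name'; 'type' and 'description' are optional.
    """
    built = []
    for index, spec in enumerate(columns):
        if "name" not in spec:
            # Fail early with a clear message instead of a bare KeyError.
            raise ValueError(f"columns[{index}] is missing the required 'name' key")
        built.append(
            Column(
                name=spec["name"],
                type=spec.get("type"),
                description=spec.get("description"),
            )
        )
    return built
```

A docstring along these lines, plus an explicit check, would make the expected shape of each dictionary visible to callers before any `Column` object is built.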
2. pandasai/data_loader/loader.py:43
- Draft comment:
Caching functionality has been removed. Ensure this change is intentional and consider the performance implications of loading datasets without caching.
- Reason this comment was not posted:
Confidence changes required: 50%
The PR removes caching functionality from the `DatasetLoader` class. This is a significant change that could impact performance if datasets are frequently loaded from local sources. It's important to ensure that this change is intentional and that the performance implications are acceptable.
3. pandasai/__init__.py:91
- Draft comment:
Ensure that the schema has default values for `name` and `description` to avoid potential `AttributeError` if they are accessed without being set.
- Reason this comment was not posted:
Confidence changes required: 50%
The PR modifies the `create` function to update the schema's name and description based on the provided parameters. However, it does not handle the case where the schema might not have a name or description initially, which could lead to unexpected behavior if these attributes are accessed later.
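One way to address this concern, sketched under assumptions: the `Schema` dataclass and the `apply_overrides` function below are hypothetical and do not come from the PR. The idea is simply that both fields always exist with a default of `None`, and are only overwritten when the caller supplies a value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Schema:
    """Hypothetical schema object; the real class may look different."""
    name: Optional[str] = None
    description: Optional[str] = None


def apply_overrides(schema, name=None, description=None):
    """Hypothetical helper: overwrite only the fields the caller supplied."""
    if name is not None:
        schema.name = name
    if description is not None:
        schema.description = description
    return schema
```

With defaults in place, reading `schema.name` or `schema.description` later returns `None` rather than raising `AttributeError`.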
Workflow ID: wflow_U2xCzZ0U4TdDvt7y
refactor pai.create to accept name, description, columns, removes cache from datasetloader
Important
Add `get_default_schema` to `DataFrame`, refactor `create` to accept additional parameters, and remove caching from `DatasetLoader`.
- Add `get_default_schema` class method to `DataFrame` to generate default schema.
- Refactor `create` function in `__init__.py` to accept `name`, `description`, and `columns` parameters.
- Remove caching from `DatasetLoader` in `loader.py`.
- Update `test_loader.py`, `test_pandasai_init.py`, and other test files to cover new `create` function parameters and `get_default_schema` method.
- … `test_loader.py`.
- Remove `Destination` import from `base.py` and other files where it is no longer used.

This description was created by Ellipsis for 5e38b30. It will automatically update as commits are pushed.