-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature(View): enabling view in SemanticLayerSchema #1532
Conversation
…into release/v3 * 'release/v3' of https://github.com/Sinaptik-AI/pandas-ai: fix(VirtualDataframe): fixing virtual dataframe name conflict (Sinaptik-AI#1531)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to cf142de in 2 minutes and 14 seconds
More details
- Looked at
338
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
8
drafted comments based on config settings.
1. tests/unit_tests/dataframe/test_semantic_layer_schema.py:113
- Draft comment:
Theview
field should be a boolean value, not a string. UseTrue
instead of'true'
. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
While technically the comment is correct that True would be more proper than "true", the code is working as intended since the validation layer handles the type conversion. The test passes with the current implementation. Making this change would be a minor style improvement but doesn't affect functionality.
The comment points out a real style issue. Using proper boolean types is generally better practice than string representations of booleans.
However, since the validation layer already handles this gracefully and the tests pass, this is a very minor style issue that doesn't impact functionality or readability significantly.
Delete the comment. While technically correct, it's a minor style suggestion that doesn't materially improve the code, and the current implementation works correctly.
2. pandasai/data_loader/semantic_layer_schema.py:93
- Draft comment:
The error message here can be improved for clarity. Consider using:
raise ValueError("A schema can only define either 'table' or 'view', not both.")
- Reason this comment was not posted:
Marked as duplicate.
3. pandasai/data_loader/semantic_layer_schema.py:95
- Draft comment:
The error message here can be improved for clarity. Consider using:
raise ValueError("A schema must define either 'table' or 'view'.")
- Reason this comment was not posted:
Marked as duplicate.
4. pandasai/data_loader/semantic_layer_schema.py:156
- Draft comment:
The error message here can be improved for clarity. Consider using:
raise ValueError("Column names must be unique. Duplicate column names detected.")
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The original error message is already clear and grammatically correct. The suggested change is extremely minor - changing "names found" to "column names detected". This doesn't make the message meaningfully clearer or fix any actual problems. The suggestion feels like personal preference rather than a meaningful improvement.
Perhaps in some contexts, being explicit about "column names" rather than just "names" could prevent ambiguity. The word "detected" may be more precise than "found".
While those points are valid, the original message is already sufficiently clear in context - we're clearly talking about column names from the surrounding code, and "found" vs "detected" is purely stylistic.
This comment should be removed as it suggests an extremely minor rewording that doesn't meaningfully improve clarity or fix any actual issues with the error message.
5. pandasai/data_loader/semantic_layer_schema.py:175
- Draft comment:
The error message here can be improved for clarity. Consider using:
"All column names in a view must follow the format '[table].[column]'."
- Reason this comment was not posted:
Marked as duplicate.
6. pandasai/data_loader/semantic_layer_schema.py:183
- Draft comment:
The error message here can be improved for clarity. Consider using:
"All 'from' and 'to' parameters in relations must follow the format '[table].[column]'."
- Reason this comment was not posted:
Marked as duplicate.
7. pandasai/data_loader/semantic_layer_schema.py:191
- Draft comment:
The error message here can be improved for clarity. Consider using:
f"No relations defined for the following tables: {uncovered_tables}."
- Reason this comment was not posted:
Marked as duplicate.
8. pandasai/data_loader/semantic_layer_schema.py:195
- Draft comment:
The error message here can be improved for clarity. Consider using:
"All column names in a table must not include a table prefix."
- Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_GNrarDy3W8RIJwod
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
@@ -120,6 +140,61 @@ class SemanticLayerSchema(BaseModel): | |||
None, description="Frequency of dataset updates." | |||
) | |||
|
|||
@model_validator(mode="after") | |||
def check_columns_relations(self): | |||
column_re_check = r"^[a-zA-Z_]+\.[a-zA-Z_]+$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex ^[a-zA-Z_]+\.[a-zA-Z_]+$
does not account for numeric characters in table or column names. Consider updating it to ^[a-zA-Z0-9_]+\.[a-zA-Z0-9_]+$
to allow numeric characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ellipsis-dev can you commit the changes?
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
…ing view in SemanticLayerSchema);
…ing view in SemanticLayerSchema);
docs/v3/semantic-layer.mdx
Outdated
3. **Relationships for Views**: | ||
- Each table referenced in `columns` must have at least one relationship defined in `relations`. | ||
- Relationships must specify `from` and `to` attributes in the `[table].[column]` format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well documented 💯
…icLayerSchema (#1534) * fix(VirtualDataframe): fixing virtual dataframe name conflict * feature(View): enabling view in SemanticLayerSchema * Update pandasai/data_loader/semantic_layer_schema.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * address comments left by @scaliseraoul on #1532 (feature(View): enabling view in SemanticLayerSchema); --------- Co-authored-by: Raoul <[email protected]> Co-authored-by: Raoul Scalise <[email protected]> Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Important
Adds view support to
SemanticLayerSchema
with new constraints, validation logic, and documentation updates.SemanticLayerSchema
with constraints for column and relation formats.table
andview
inSource
.Relation
class to define relationships in views.Source
class to includeview
attribute.check_columns_relations()
inSemanticLayerSchema
for view-specific validation.validate_type_and_fields()
inSource
for view constraints.semantic-layer.mdx
with view configuration and constraints.test_semantic_layer_schema.py
.This description was created by for cf142de. It will automatically update as commits are pushed.