feat(Dataframe): pull method to fetch dataset from remote server #1446
❌ Changes requested. Reviewed everything up to eebfcb5 in 1 minute and 34 seconds
- Looked at 192 lines of code in 5 files
- Skipped 0 files when reviewing.
- Skipped posting 2 drafted comments based on config settings.
1. pandasai/dataframe/base.py:262
- Draft comment: Consider using the get_pandaai_session() function to obtain a session and make the request instead of using requests.get directly. This ensures consistent request handling across the codebase.
- Reason this comment was not posted: Marked as duplicate.
2. pandasai/dataframe/base.py:258
- Draft comment: Check if api_key and api_url are None before using them, and raise a PandasAIApiKeyError if they are not set. This prevents potential TypeError when constructing headers or making requests.
- Reason this comment was not posted: Marked as duplicate.
Workflow ID: wflow_Bi96VOywUjApszj1
Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
pandasai/__init__.py
Outdated
        api_url = os.environ.get("PANDAAI_API_URL", None)
        headers = {"accept": "application/json", "x-authorization": f"Bearer {api_key}"}

        file_data = requests.get(
Consider using the get_pandaai_session() function to obtain a session and make the request instead of using requests.get directly. This ensures consistent request handling across the codebase.
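The point of the suggestion above is that authentication headers get configured in one place and every later request reuses them. The body of get_pandaai_session() is not shown in this thread, so the sketch below is only a standard-library analogue of that idea: make_opener is a hypothetical name, and urllib's opener stands in for a requests session.

```python
import urllib.request


def make_opener(api_key):
    """Build an opener that attaches the same headers to every request.

    Hypothetical helper: it mirrors the header layout from the diff,
    not the real get_pandaai_session() implementation.
    """
    opener = urllib.request.build_opener()
    # addheaders replaces the default header list for this opener.
    opener.addheaders = [
        ("accept", "application/json"),
        ("x-authorization", f"Bearer {api_key}"),
    ]
    return opener
```

Callers would then use opener.open(url) instead of assembling headers at each call site, which is the consistency the reviewer is asking for.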
@@ -74,6 +81,21 @@
        DataFrame: A new PandasAI DataFrame instance with loaded data.
    """
    global _dataset_loader
    dataset_full_path = os.path.join(find_project_root(), "datasets", dataset_path)
    if not os.path.exists(dataset_full_path):
        api_key = os.environ.get("PANDAAI_API_KEY", None)
Check if api_key and api_url are None before using them, and raise a PandasAIApiKeyError if they are not set. This prevents potential TypeError when constructing headers or making requests.
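A minimal sketch of the check requested above, assuming a local stand-in for PandasAIApiKeyError and a hypothetical helper name read_remote_config. One detail worth noting: with an unset key, the f-string in the diff would not fail by itself but would silently produce the header value "Bearer None", so validating up front also avoids sending a meaningless credential.

```python
import os


class PandasAIApiKeyError(Exception):
    """Local stand-in for the exception named in the review (assumption)."""


def read_remote_config(env=None):
    """Return (api_url, api_key), raising if either is missing or empty."""
    env = os.environ if env is None else env
    api_url = env.get("PANDAAI_API_URL")
    api_key = env.get("PANDAAI_API_KEY")
    # `not value` rejects both None (unset) and "" (set but empty).
    if not api_url or not api_key:
        raise PandasAIApiKeyError(
            "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment "
            "to pull dataset from the remote server"
        )
    return api_url, api_key
```

Passing the environment as a parameter is a design choice for this sketch only; it keeps the check testable without mutating os.environ.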
❌ Changes requested. Incremental review on 5fd17ac in 1 minute and 2 seconds
- Looked at 212 lines of code in 5 files
- Skipped 0 files when reviewing.
- Skipped posting 5 drafted comments based on config settings.
1. pandasai/dataframe/base.py:261
- Draft comment: The error message in PandasAIApiKeyError is misleading. It mentions pushing datasets, but the context is about pulling datasets. Consider changing it to "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server."
- Reason this comment was not posted: Marked as duplicate.
2. pandasai/helpers/request.py:35
- Draft comment: The error message in PandasAIApiKeyError is misleading. It mentions pushing datasets, but the context is about pulling datasets. Consider changing it to "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server."
- Reason this comment was not posted: Marked as duplicate.
3. pandasai/__init__.py:91
- Draft comment: "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server"
- Reason this comment was not posted: Marked as duplicate.
4. pandasai/dataframe/base.py:262
- Draft comment: "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server"
- Reason this comment was not posted: Marked as duplicate.
5. pandasai/helpers/request.py:36
- Draft comment: "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server"
- Reason this comment was not posted: Marked as duplicate.
Workflow ID: wflow_vULjCRdxqgSJJMf9
pandasai/__init__.py
Outdated
        api_url = os.environ.get("PANDAAI_API_URL", None)
        if not api_url or not api_key:
            raise PandasAIApiKeyError(
                "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to push dataset to the remote server"
The error message in PandasAIApiKeyError is misleading. It mentions pushing datasets, but the context is about pulling datasets. Consider changing it to "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server."
❌ Changes requested. Incremental review on ab21002 in 48 seconds
- Looked at 41 lines of code in 3 files
- Skipped 0 files when reviewing.
- Skipped posting 2 drafted comments based on config settings.
1. pandasai/dataframe/base.py:261
- Draft comment: "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to pull dataset from the remote server"
- Reason this comment was not posted: Marked as duplicate.
2. pandasai/helpers/request.py:106
- Draft comment: "Set PANDAAI_API_URL and PANDAAI_API_KEY in environment to push/pull dataset from the remote server"
- Reason this comment was not posted: Comment looked like it was already resolved.
Workflow ID: wflow_kbXHVA7JgpmqAh5q
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Important
Adds remote dataset fetching with a new pull method, updates data loading, and introduces DatasetNotFound exception.
- Adds pull method in DataFrame class in base.py to fetch datasets from a remote server.
- Updates load function in __init__.py to fetch datasets from a remote server if not found locally.
- Adds DatasetNotFound exception in exceptions.py for handling missing datasets.
- Adds get_pandaai_session function in request.py to create a session with API key and URL.
- Updates _read_cache in loader.py to include path in DataFrame initialization for cache reading.
- Updates bamboo_vectorstore.py to handle API responses correctly.
This description was created by Ellipsis for ab21002. It will automatically update as commits are pushed.
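The local-first lookup described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: resolve_dataset and the fetch_remote callback are hypothetical names, DatasetNotFound is a local stand-in, and the only detail taken from the diff is the "datasets" directory under the project root.

```python
import os


class DatasetNotFound(Exception):
    """Local stand-in for the exception the summary mentions (assumption)."""


def resolve_dataset(project_root, dataset_path, fetch_remote):
    """Return the local dataset directory, pulling it first if absent.

    fetch_remote is a hypothetical callable taking (dataset_path, target_dir)
    and returning True once it has written the dataset into target_dir.
    """
    full_path = os.path.join(project_root, "datasets", dataset_path)
    if not os.path.exists(full_path):
        # Only contact the remote server when nothing exists locally.
        if not fetch_remote(dataset_path, full_path):
            raise DatasetNotFound(
                f"Dataset '{dataset_path}' not found locally or remotely"
            )
    return full_path
```

Injecting the fetch step as a callback keeps the lookup logic independent of how the download is performed, so it can be exercised without network access.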