Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a range option to Worksheet.get_notes [Issue #1482] #1487

Merged
merged 14 commits into from
Oct 7, 2024

Conversation

muddi900
Copy link
Contributor

Fixes #1482

@alifeee
Copy link
Collaborator

alifeee commented Jun 25, 2024

hi! thanks for the change! your code was all good, but the test cassette was breaking things. This is common, and what I usually do is completely delete the JSON test cassette and re-create it, effectively running:

rm tests\cassettes\WorksheetTest.test_get_notes.json
GS_CREDS_FILENAME="tests/creds.json"
GS_RECORD_MODE="all"
pytest -k "test_get_notes" -v -s tests/ 
# check it worked
GS_RECORD_MODE="none"
pytest -k "test_get_notes" -v -s tests/ 

I have reformatted the test a little so it is easier to read for me, and changed range to grid_range, as range is a python built-in keyword that we should not change.

Also, in testing, I noticed that get_notes did not work with worksheets that were not the primary sheet, so I have fixed this (see 16429af and 983dcb5 above) too.

again thanks for the change! I will let @lavigne958 read this and merge when happy with it :)

@nbwzx
Copy link
Contributor

nbwzx commented Jun 29, 2024

Also, in testing, I noticed that get_notes did not work with worksheets that were not the primary sheet, so I have fixed this (see 16429af and 983dcb5 above) too.

Wait is that wrong?

For example, there are three worksheets (Sheet1, Sheet2, Sheet3).

res["sheets"][self.index]["data"][0].get("rowData", [{}]) returns the notes of current worksheet.
self.spreadsheet.worksheets()[0].get_notes() get the notes of Sheet1
self.spreadsheet.worksheets()[1].get_notes() get the notes of Sheet2
self.spreadsheet.worksheets()[2].get_notes() get the notes of Sheet3

If you change self.index to 0, you can only get the notes of the primary sheet (Sheet1),
self.spreadsheet.worksheets()[0].get_notes() get the notes of Sheet1
self.spreadsheet.worksheets()[1].get_notes() get the notes of Sheet1
self.spreadsheet.worksheets()[2].get_notes() get the notes of Sheet1
how can you get the notes of other worksheets (Sheet2, Sheet3)?

@alifeee
Copy link
Collaborator

alifeee commented Jun 29, 2024

Wait is that wrong?

As it turns out... yes and no! ;]

When you provide a range, i.e., here

https://github.com/muddi900/gspread/blob/d4a57746816ad097d56969cfab4c9c12a483dff6/gspread/worksheet.py#L2657-L2658

Then the returned results has only one sheet. Printing res, we get

> get_notes()
# res = 
{'sheets':
  [
    {'data': [{}]},
    {'data': [{'rowData': [{'values': [{'note': 'the first time'}]}, {}, {'values': [{}, {'note': 'two sheets'}]}]}]}
  ]
}
> get_notes("A2:C3")
# res = 
{'sheets':
  [
    {'data': [{'rowData': [{}, {'values': [{}, {'note': 'two sheets'}]}]}]}
  ]
}

So in the initial case, we do need to index it with worksheet.index, i.e, 1

In the second case, we need to use 0. I had spotted the latter case, which is why I made my change. However, I didn't make a very good test to be able to spot the reason it had changed.

I have made the test clearer (split it out) and 'fixed' it by always providing worksheet.title, so we can only ever expect one sheet in the response.

Thanks for spotting @nbwzx ! :)

@alifeee alifeee changed the title Added a range option to Worksheet.get_range [Issue #1482] Added a range option to Worksheet.get_notes [Issue #1482] Jun 29, 2024
@alifeee alifeee requested a review from lavigne958 June 29, 2024 15:16
@alifeee alifeee added this to the 6.2.0 milestone Jun 29, 2024
@alifeee
Copy link
Collaborator

alifeee commented Jun 29, 2024

sorry for jumping on your PR @muddi900. wouldn't have happened without you ;] so thanks for proposing the change!

will let @lavigne958 review and with an approval I will merge, and this can be included in 6.2.0 :)

@muddi900
Copy link
Contributor Author

No need to apologize. Thank you for the help.

@nbwzx
Copy link
Contributor

nbwzx commented Jul 1, 2024

@alifeee
I tried your code on one of my sheets. The code get_notes() raises an error: gspread.exceptions.APIError: APIError: [400]: Range (Introduction!UF3) exceeds grid limits. Max rows: 966, max columns: 27.
But the code get_notes() runs well before this PR, even on a very big sheet.
I think the modifications about get_notes when grid_range=None reduces the limitation significantly.
So in my opinion, When grid_range is not given, using the previous code is a better way.

@muddi900
Copy link
Contributor Author

muddi900 commented Jul 1, 2024

Do you have a public big sheet I can test this on?

@nbwzx
Copy link
Contributor

nbwzx commented Jul 1, 2024

Do you have a public big sheet I can test this on?

The one I mentioned is

https://docs.google.com/spreadsheets/d/158F-jyu8ld8kbdD4I_Lqy4fsX5vvp_MQOq7Ofg8NdSI/

on Sheet name UF3.

@muddi900
Copy link
Contributor Author

muddi900 commented Jul 3, 2024

I am getting a different error when using the latest error.

APIError                                  Traceback (most recent call last)
Cell In[10], [line 1](vscode-notebook-cell:?execution_count=10&line=1)
----> [1](vscode-notebook-cell:?execution_count=10&line=1) notes = ws.get_notes()

File ~/src/gspread/gspread/worksheet.py:2665, in Worksheet.get_notes(self, default_empty_value, grid_range)
   [2614](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2614) """Returns a list of lists containing all notes in the sheet or range.
   [2615](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2615) 
   [2616](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2616) .. note::
   (...)
   [2654](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2654)     ]
   [2655](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2655) """
   [2656](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2656) params: ParamsType = {
   [2657](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2657)     "fields": "sheets.data.rowData.values.note",
   [2658](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2658)     "ranges": (
   (...)
   [2662](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2662)     ),
   [2663](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2663) }
-> [2665](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2665) res = self.client.spreadsheets_get(self.spreadsheet_id, params)
   [2667](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2667) # access 0th sheet because we specified a sheet with params["ranges"] above
   [2668](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/worksheet.py:2668) data = res["sheets"][0]["data"][0].get("rowData", [{}])

File ~/src/gspread/gspread/http_client.py:274, in HTTPClient.spreadsheets_get(self, id, params)
    [272](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:272) """A method stub that directly calls `spreadsheets.get <[https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/get>`_](https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/get%3E%60_)."""
    [273](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:273) url = SPREADSHEET_URL % id
--> [274](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:274) r = self.request("get", url, params=params)
    [275](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:275) return r.json()

File ~/src/gspread/gspread/http_client.py:128, in HTTPClient.request(self, method, endpoint, params, data, json, files, headers)
    [126](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:126)     return response
    [127](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:127) else:
--> [128](https://file+.vscode-resource.vscode-cdn.net/Users/mudassirchapra/src/gspread/~/src/gspread/gspread/http_client.py:128)     raise APIError(response)

APIError: APIError: [400]: Range (Introduction!UF3) exceeds grid limits. Max rows: 966, max columns: 27

Since the method was on the Worksheet instance, there was no way to get the wrong index on self.index. I think the original way would be better.

right now, the request params passthrough "{first_worksheet_name}!{self.title}" as the range, which causes the error.

gspread/worksheet.py Outdated Show resolved Hide resolved
@lavigne958
Copy link
Collaborator

Thank you everyone for the hard work and the all the edge case testing ! we managed to catch a very particular use case that we would not see otherwise 💪

I tried it and it works fine, it is well explained that the resulting matrix is not square.
it's quite complicated for a novice to use the example with fill_gaps I believe, should we simply offer an option to fill the gaps for the end user ? 🤔

something like:

  • option name: auto_fill_gaps
  • option type: bool
  • option default value: False (in order to prevent breaking change_

that requires us to do some computation based on the resulting matrix, meaning on a big big sheet we need to iterate over all rows to find the longest row before iterating again on all rows to expand them.

I'm fine with it, what do you guys think about it ?

@nbwzx
Copy link
Contributor

nbwzx commented Jul 3, 2024

I agree. I think it is not convenient for users to copy the relevant code from the documentation.

@muddi900
Copy link
Contributor Author

muddi900 commented Jul 4, 2024

@alifeee Sorry for the git reset shenanigans. I missed @lavigne958 comment, and tried to redo everything.

I have incorporated the changes suggested in the comment.

@alifeee
Copy link
Collaborator

alifeee commented Jul 5, 2024

@alifeee Sorry for the git reset shenanigans. I missed @lavigne958 comment, and tried to redo everything.

no worries :) I have refreshed the test cassette (by deleting tests/cassettes/WorksheetTest.test_get_notes.json and re-running the test with GS_RECORD_MODE=all)

Also, the git reset has removed the extra test that I made for 2nd sheet notes. Please could you add this back? You can see it in here:

muddi900/gspread@add/notes-range...burnash:gspread:add/notes-range-tests

Copy link
Collaborator

@lavigne958 lavigne958 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes are good.
The CI is failing due to outdated/missing cassette.
Please remove the recorded cassette: tests/cassettes/WorksheetTest.test_get_notes.json and record the cassettes again.

You can use the value new_episodes to only record the missing cassette.

.github/CONTRIBUTING.md Outdated Show resolved Hide resolved
.github/CONTRIBUTING.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@lavigne958 lavigne958 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your tests are still failing, please remove the file tests/cassettes/WorksheetTest.test_get_notes.json and run the test with the following command: GS_RECORD_MODE='new_episodes' CREDS_FILENAME=<path to credentials> tox -e py

@alifeee
Copy link
Collaborator

alifeee commented Jul 7, 2024

the test test_get_notes_2nd_sheet is still missing, please see #1487 (comment) for context :)

tests/worksheet_test.py Outdated Show resolved Hide resolved
@muddi900
Copy link
Contributor Author

So should autofill be a different issue. Because that would be a rewrite of the method.

@lavigne958
Copy link
Collaborator

So should autofill be a different issue. Because that would be a rewrite of the method.

I am sorry I don't understand your comment, could you please elaborate ?

@muddi900
Copy link
Contributor Author

Thank you everyone for the hard work and the all the edge case testing ! we managed to catch a very particular use case that we would not see otherwise 💪

I tried it and it works fine, it is well explained that the resulting matrix is not square. it's quite complicated for a novice to use the example with fill_gaps I believe, should we simply offer an option to fill the gaps for the end user ? 🤔

something like:

* option name: `auto_fill_gaps`

* option type: bool

* option default value: `False` (in order to prevent breaking change_

that requires us to do some computation based on the resulting matrix, meaning on a big big sheet we need to iterate over all rows to find the longest row before iterating again on all rows to expand them.

I'm fine with it, what do you guys think about it ?

I was referring to this

@lavigne958
Copy link
Collaborator

I see, thank you.
we can make it 2 different issues, though we can't make a release between the time each issue is merged, we want to provide the feature once and not provide the feature then provide a different way to use it.

@lavigne958
Copy link
Collaborator

I come back to the point about fill_gaps, the current feature requires the user to call fill_gaps. so for now we can't change the the output, we can merge this as is, once the cassettes have been updated and the no-op line has been removed.

If you don't have time or struggle with the cassettes @muddi900, just let me know and I'll cherry-pick you commits, you'll be mentioned as contributor still and we'll merge these changes soon.

@lavigne958
Copy link
Collaborator

Thanks @muddi900 for this contribution !

I fixed the missing cassettes and the small conflict with master branch, this is a go for mege.

@lavigne958 lavigne958 merged commit 1960c29 into burnash:master Oct 7, 2024
5 checks passed
@twosheds twosheds mentioned this pull request Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

get_notes from a range
4 participants