Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert query_region to cone search #2477

Merged
merged 3 commits into from
Sep 11, 2022
Merged

Conversation

weaverba137
Copy link
Member

This PR closes #585.

query_region() now uses query_crossid() internally with slightly different parameters. Effectively, a circular/cone region around each coordinate is searched instead of a rectangle in previous versions.

In addition, this PR adds lots of updates to tests, both online and offline, and cleans up the documentation of several of the methods.

Other comments:

  • query_region() and query_crossid() accept astropy.coordinates.Angle for search radius as well as string or Quantity; internally, Angle is used to validate the search radius input and convert the angle to arcmin.
  • The maximum search radius imposed by SDSS is 3 arcmin. This is now enforced by the code.
  • Added DR17 to offline tests; it had already been enabled for online tests.

fixed up tests

restore queries to pre-branch format

all tests now passing

allow Angle inputs
@codecov
Copy link

codecov bot commented Aug 2, 2022

Codecov Report

Merging #2477 (db38571) into main (b1fcfff) will increase coverage by 0.10%.
The diff coverage is 97.59%.

@@            Coverage Diff             @@
##             main    #2477      +/-   ##
==========================================
+ Coverage   62.92%   63.03%   +0.10%     
==========================================
  Files         133      133              
  Lines       17302    17301       -1     
==========================================
+ Hits        10888    10905      +17     
+ Misses       6414     6396      -18     
Impacted Files Coverage Δ
astroquery/sdss/core.py 92.06% <97.59%> (+6.15%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@bsipocz bsipocz added the sdss label Aug 2, 2022
@keflavich
Copy link
Contributor

This is a pretty big PR that looks good at a glance. @bsipocz, will you want to do a detailed review? I'm ready to approve this, but I haven't checked every line of the tests.

@bsipocz
Copy link
Member

bsipocz commented Aug 2, 2022

Yes, I definitely want to review this with an ETA of maybe next week as this one is already been rather full.

@keflavich keflavich requested a review from bsipocz August 2, 2022 19:35
Copy link
Member

@eerovaher eerovaher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few comments, but I wont claim to have done a thorough review.

astroquery/sdss/core.py Show resolved Hide resolved
astroquery/sdss/core.py Outdated Show resolved Hide resolved
astroquery/sdss/core.py Show resolved Hide resolved
astroquery/sdss/core.py Show resolved Hide resolved
astroquery/sdss/core.py Outdated Show resolved Hide resolved
astroquery/sdss/core.py Outdated Show resolved Hide resolved
astroquery/sdss/core.py Show resolved Hide resolved
astroquery/sdss/core.py Show resolved Hide resolved
astroquery/sdss/core.py Outdated Show resolved Hide resolved
astroquery/sdss/tests/test_sdss.py Outdated Show resolved Hide resolved
@weaverba137
Copy link
Member Author

In regards keyword-only arguments: I have converted the entire module except for the get_images and get_spectra functions, which have a fairly complicated call signature. I think that is sufficient for the current PR, given that the original target of the PR was to convert to cone-search, not specifically to change to keyword-only arguments.

@bsipocz bsipocz added this to the v0.4.7 milestone Sep 11, 2022
Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it look good, bar one comment about the removal of get_query_payload, which I would like to be back, and to e.g. return the SQL string.
(It proved to be rather useful in many cases in the past.)

Anyway, I don't think it's worth dragging this largeish PR with that point further and will open an issue for a follow-up instead.

As the other review pointed out, there are a few places where a little bit of cleanup could be done, but I agree that that one is definitely beyond the scope here.

@@ -484,7 +520,7 @@ class = 'galaxy' \

def get_spectra_async(self, coordinates=None, radius=2. * u.arcsec,
matches=None, plate=None, fiberID=None, mjd=None,
timeout=TIMEOUT, get_query_payload=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't remove the get_query_payload functionality. I see that it is sadly not present consistently in all methods, but we should rather fix (can be done in a follow-up PR) it than to remove.

@bsipocz bsipocz merged commit 5c1eefc into astropy:main Sep 11, 2022
@bsipocz
Copy link
Member

bsipocz commented Sep 11, 2022

Thanks @weaverba137!

@bsipocz
Copy link
Member

bsipocz commented Sep 11, 2022

After merging this PR, I've gone through and triaged the sdss labelled issues, and happy to say it has fixed a bunch of them. There are still some long-reported bugs, but after the triage, we have half the number of open issues than before!

🎉

@pllim
Copy link
Member

pllim commented Sep 11, 2022

Thanks!

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

Just out of curiosity, is it still possible to use astroquery to make larger radius queries similar to the box version previously used by astroquery.query_region?

You know how the saying goes with old code and user workflows…, but we had built an automated pipeline around large box size crossmatches 😅.

the limit of 3 arcmin is way too small and needs a hefty workaround based on SDSS fields..

we can currently pin the old astroquery version and patch in future fixes that don’t touch this code but it’s not ideal.

As a note from a user, it would have been nice to get a depreciation warning for an API breaking change like this, one that lasted a little while to give us time to adjust. It seems to me the switchover happened rapidly, and we have to do workarounds such as building the old functionality from the git history ourselves and patching in future fixes, or switch to our own SQL queries/casjobs.

One way to enable backwards compatibility is to rename the old version to something like query_rectangle_region or to add a parameter such as rectangular: boolean = False, that preserves the breakage while the deprecation warning is active.

Anyway, hope this can be considered.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

@emirkmo - you can write any custom SQL query, so I would suggest looking into using query_sql() for larger sized queries?

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

Thanks for the advice. It is actually easy to solve as you say, but moving to a custom SQL query is exactly what I was hoping to avoid by using astroquery. I made an issue to track what I guess is actually a feature request to re-enable the rectangular search, as it has a tangible benefit, and the old implementation in astroquery from a few months ago still just works.

@pllim
Copy link
Member

pllim commented Feb 8, 2023

I am also interested in this request because the 3 arcmin limit seems pretty arbitrary and I am not sure how to explain to users why suddenly we have this limit downstream.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

cc @weaverba137

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

Frankly, we can still roll back that API change (it hasn't landed in a tagged release), but I would need a bit more feedback (doing the proper conesearch was raised as a request a few times in the past...)

cc @keflavich

@keflavich
Copy link
Contributor

I've been listening in on this, and I'd like to hear @weaverba137 's take. I think rolling back the API change makes sense; probably best would be to allow either rectangular or circular searches, which is what we do with some other modules (vizier, I think? I'd have to dig more).

Generally I agree that, for something as basic as a rectangle or circle query, we want to provide the simplified user interface and avoid having users write SQL.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

OK, but here the underlying difference is huge. E.g. for the rectangular one we just wrote a brute force +/- ra/dec box, while the radius search is using a different server-side service.

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

Frankly, we can still roll back that API change (it hasn't landed in a tagged release), but I would need a bit more feedback (doing the proper conesearch was raised as a request a few times in the past...)

cc @keflavich

I think the changes are actually very nice, especially with more proper reporting of errors and test failures. Run query also works as expected now given the "radius" argument. I would just appreciate having access to the rectangular search as well. I will leave any decisions up to the maintainers of course.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

I would see value in not having API change for the old setup, but then we run into the core of the previous issues, e.g. that query_region is not actually a cone search, but a box while a cone search is available. Adding a kwarg to fall back to the previous behaviour is still an API change, so while semantically would work, I'm looking for other options, too, maybe the we need a new function name for the current cone search-based implementation. (but I don't have any good ideas how to call it.)

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

query_cone_region

query_rectangle_region

Is what I would look for in the docs (and first looked for).

Where either one or the other just calls query_region under the hood. That depends on whether backwards compatibility is more important or whether the function arguments making sense is more important, for query_region.

The docstring for either can point to each other and mention the different endpoints used and the radius limitation.

Thanks for the follow-up!

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

Yeah, let's wait for Ben's take and then we'll figure out something that brings in the least amount of API change.

@keflavich
Copy link
Contributor

If we do stick with the API change, we should include a deprecation warning that points users to the right way to do things.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

I'm starting to incline towards adding a kwarg to query_region that defaults to rectangular. Quite ugly, but that would be seamless, and no API change for all the previous use cases (I totally missed this server-side limitation on the query size, so the change doesn't feel like a simple backend refactoring any more to make the code more efficient and to fix a bug)

@weaverba137
Copy link
Member Author

Before I reply fully, I'm not entirely sure this is the thread we should continue the discussion on. For one, it's harder to find in the GitHub system since it's closed. Another open issue or PR might be easier for everyone to see.

@emirkmo
Copy link

emirkmo commented Feb 8, 2023

Before I reply fully, I'm not entirely sure this is the thread we should continue the discussion on. For one, it's harder to find in the GitHub system since it's closed. Another open issue or PR might be easier for everyone to see.

I opened #2659

@weaverba137
Copy link
Member Author

Unless anyone objects, I am happy to move over to #2659.

@bsipocz
Copy link
Member

bsipocz commented Feb 8, 2023

Yes, please move this convo to the issue linked above!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make SDSS query region a circle
6 participants