Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor integration tests to remove random collection sampling #749

Open
wants to merge 32 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
413b086
Extract duplicated function
mfisher87 Jul 6, 2024
0ffc1b4
Add popular collection script proof of concept
mfisher87 Jul 6, 2024
8123e3a
Use union type instead of union operator
mfisher87 Jul 6, 2024
7e4c0f9
Remove logic which accepts up to 10% integration test failure
mfisher87 Jul 23, 2024
ef7bc0a
Enable import of sampling test utility function
mfisher87 Jul 23, 2024
a371dc2
Adjust test logging/docstrings for consistent & correct language
mfisher87 Jul 23, 2024
668c18e
Update generate script to fail if paging
mfisher87 Aug 6, 2024
1b09388
Add helper function to sample from collection list file
mfisher87 Aug 6, 2024
022613d
Fix granule sampling bug that can result in dupes
mfisher87 Aug 6, 2024
cc1c932
Update test parameter schema (WIP)
mfisher87 Aug 6, 2024
b1d39ea
Make granules hashable, fix granule sampling logic
mfisher87 Aug 20, 2024
e1f635b
Remove random collection sampling from test module
mfisher87 Nov 26, 2024
3f13536
loop through all providers while generating collection lists
danielfromearth Aug 20, 2024
fae144a
add collection text files for all currently listed providers
danielfromearth Aug 20, 2024
3a09c11
Re-enable temporarily commented test parameters
mfisher87 Aug 21, 2024
e5aef6b
Update uv pre-commit hook
mfisher87 Nov 26, 2024
6e0ca0b
Extract test parameter type
mfisher87 Nov 26, 2024
85c4bb9
Run integration tests on the main branch
mfisher87 Nov 26, 2024
b26e7f3
Skip failing Kerchunk tests
mfisher87 Nov 26, 2024
bbe4878
Re-add LPDAAC on-prem open tests
mfisher87 Nov 26, 2024
1b9de98
Remove random collection sampling from all tests
mfisher87 Nov 26, 2024
ebfad63
Fix typechecker error
mfisher87 Nov 26, 2024
5b40af6
Fix incorrect annotation
mfisher87 Nov 26, 2024
b063347
HACK: Ignore unexpected mypy error
mfisher87 Nov 26, 2024
1d3df25
Update uv lock file
mfisher87 Nov 26, 2024
670b0f1
Cleanup unused import
mfisher87 Nov 26, 2024
ffc1fec
Add documentation on integration tests
mfisher87 Nov 27, 2024
bf8b3d6
Merge branch 'main' into integration-tests-refactor
mfisher87 Dec 4, 2024
45b12c4
Merge branch 'main' into integration-tests-refactor
mfisher87 Jan 18, 2025
c28b8fa
Unmark kerchunk tests as broken
mfisher87 Jan 18, 2025
836258e
Use pathlib method to write text to file
mfisher87 Jan 22, 2025
2000420
Be explicit about what data we're caching
mfisher87 Jan 22, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions docs/contributing/integration-tests.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Integration tests

## Testing most popular datasets

Some integration tests operate on the most popular collections for each provider in CMR.
Those collection IDs are cached as static data in `tests/integration/popular_collections/`
to give our test suite more stability. The list of most popular collections can be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome!

updated by running a script in the same directory.

Sometimes, we find collections with unexpected conditions, like 0 granules, and
therefore "comment" those collections from the list by prefixing the line with a `#`.

!!! note

Let's consider a CSV format for this data; we may want to, for example, allow
skipping collections with a EULA by representing that as a column.
3 changes: 3 additions & 0 deletions earthaccess/results.py
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,9 @@ def _repr_html_(self) -> str:
granule_html_repr = _repr_granule_html(self)
return granule_html_repr

def __hash__(self) -> int: # type: ignore[override]
return hash(self["meta"]["concept-id"])
Copy link
Collaborator Author

@mfisher87 mfisher87 Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betolink @chuckwondo This seems reasonable to me, but please validate me :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it for like 5 minutes, this is obviously a bad idea. This class is subclassing dict. We'd need to implement like a frozendict.


def get_s3_credentials_endpoint(self) -> Union[str, None]:
for link in self["umm"]["RelatedUrls"]:
if "/s3credentials" in link["URL"]:
Expand Down
12 changes: 7 additions & 5 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -60,13 +60,15 @@ nav:
- "What is earthaccess?": "index.md"
- "Quick start": "quick-start.md"
- "Work with us":
- "contributing/index.md"
- "Development": "contributing/development.md"
- "Releasing": "contributing/releasing.md"
- "Our meet-ups": "contributing/our-meet-ups.md"
- "contributing/index.md" # << Link target of the parent node
- "Development Guide": "contributing/development.md"
- "Releasing Guide": "contributing/releasing.md"
- "Maintainers Guide": "contributing/maintainers-guide.md"
- "Code of Conduct": "contributing/code-of-conduct.md"
- "Contributing naming convention": "contributing/naming-convention.md"
- "Meet-ups": "contributing/our-meet-ups.md"
- "Topics":
- "Naming conventions": "contributing/naming-convention.md"
- "Integration tests": "contributing/integration-tests.md"
- "Resources": "resources.md"
- USER GUIDE:
- "user_guide/index.md"
Expand Down
5 changes: 0 additions & 5 deletions noxfile.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,11 +57,6 @@ def integration_tests(session: nox.Session) -> None:
EARTHDATA_USERNAME=os.environ["EARTHDATA_USERNAME"],
EARTHDATA_PASSWORD=os.environ["EARTHDATA_PASSWORD"],
),
external=True,
# NOTE: integration test are permitted to pass if the failure rate was less than a hardcoded threshold.
# PyTest will return 99 if there were some failures, but less than the threshold. For more details, see:
# `pytest_sessionfinish` in tests/integration/conftest.py
success_codes=[0, 99],
)


Expand Down
Empty file added tests/integration/__init__.py
Empty file.
40 changes: 0 additions & 40 deletions tests/integration/conftest.py
Original file line number Diff line number Diff line change
@@ -1,49 +1,9 @@
import os
import pathlib
from warnings import warn

import earthaccess
import pytest

ACCEPTABLE_FAILURE_RATE = 10


@pytest.hookimpl()
def pytest_sessionfinish(session, exitstatus):
"""Return exit code 99 if up to N% of tests have failed.

N = ACCEPTABLE_FAILURE_RATE

99 was chosen arbitrarily to avoid conflict with current and future pytest error
codes (https://docs.pytest.org/en/stable/reference/exit-codes.html), and avoid
other exit codes with special meanings
(https://tldp.org/LDP/abs/html/exitcodes.html).

IMPORTANT: This is calculated against every test collected in the session, so the
ratio will change depending on which tests are executed! E.g. executing integration
tests and unit tests at the same time allows more tests to fail than executing
integration tests alone.

NOTE: The return exit code can be customized with the `EARTHACCESS_ALLOWABLE_FAILURE_STATUS_CODE`
environment variable.
"""
if exitstatus != pytest.ExitCode.TESTS_FAILED:
# Exit status 1 in PyTest indicates "Tests were collected and run but some of
# the tests failed". In all other cases, for example "an internal error happened
# while executing the tests", or "test execution interrupted by the user", we
# want to defer to original pytest behavior.
return

failure_rate = (100.0 * session.testsfailed) / session.testscollected
if failure_rate <= ACCEPTABLE_FAILURE_RATE:
status_code = os.environ.get("EARTHACCESS_ALLOWABLE_FAILURE_STATUS_CODE", 99)
warn(
f"\nWARNING: The integration test suite has returned {status_code} because the "
"failure rate was less than a hardcoded threshold. For more details see:\n"
"`pytest_sessionfinish` in tests/integration/conftest.py."
)
session.exitstatus = status_code


@pytest.fixture
def mock_missing_netrc(tmp_path: pathlib.Path, monkeypatch: pytest.MonkeyPatch):
Expand Down
17 changes: 17 additions & 0 deletions tests/integration/param.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
from typing import TypedDict


class TestParam(TypedDict):
provider_name: str

# How many of the top collections we will test, e.g. top 3 collections
n_for_top_collections: int

# How many granules we will query
granules_count: int

# How many granules we will randomly select from the query
granules_sample_size: int

# The maximum allowed granule size; if larger we'll try to find another one
granules_max_size_mb: int
100 changes: 100 additions & 0 deletions tests/integration/popular_collections/ASF.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
C1214470488-ASF
C1327985661-ASF
C1214470533-ASF
C1206487504-ASF
C1214353986-ASF
C1327985645-ASF
C1595422627-ASF
C1206485527-ASF
C2011599335-ASF
C1206485320-ASF
C1206485940-ASF
C1206487217-ASF
C1214470496-ASF
C1214470576-ASF
C1206156901-ASF
C1214470532-ASF
C1214472977-ASF
C1808440897-ASF
C1214472994-ASF
C1327985660-ASF
C1266376001-ASF
C1206936391-ASF
C1327985617-ASF
C1327985741-ASF
C1214472336-ASF
C1327985579-ASF
C1243122884-ASF
C1327985740-ASF
C1214470732-ASF
C1327985650-ASF
C1206897141-ASF
C1208794942-ASF
C1243124139-ASF
C1327985619-ASF
C1327985646-ASF
C1208662092-ASF
C1214473170-ASF
C1327985739-ASF
C1327985578-ASF
C1206500826-ASF
C1214470561-ASF
C1209373626-ASF
C1207933168-ASF
C1211627521-ASF
C1210197768-ASF
C1214471521-ASF
C1214336717-ASF
C1214336045-ASF
C1214470682-ASF
C1206500991-ASF
C1213921661-ASF
C1327985571-ASF
C2859376221-ASF
C1327985647-ASF
C1214337770-ASF
C1214343609-ASF
C1243124754-ASF
C1214353859-ASF
C1214353754-ASF
C1213928843-ASF
C1214353593-ASF
C1214336154-ASF
C1214354031-ASF
C1661710588-ASF
C1214408428-ASF
C1214419355-ASF
C1661710600-ASF
C1214354144-ASF
C1214354235-ASF
C1214336554-ASF
C1214335430-ASF
C1213925022-ASF
C1213927035-ASF
C1213927939-ASF
C1214335471-ASF
C179001730-ASF
C1213928209-ASF
C1214335903-ASF
C1213926419-ASF
C1661710597-ASF
C1661710603-ASF
C1661710604-ASF
C1243215430-ASF
C1213926777-ASF
C1243168866-ASF
C1214474435-ASF
C1214473367-ASF
C1661710583-ASF
C1661710590-ASF
C1213921626-ASF
C1214473171-ASF
C1214473839-ASF
C1243162394-ASF
C1214473624-ASF
C1243228612-ASF
C1243255360-ASF
C1243140611-ASF
C2795135668-ASF
C2803501758-ASF
C2777443834-ASF
100 changes: 100 additions & 0 deletions tests/integration/popular_collections/GES_DISC.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
C1243477369-GES_DISC
C1692982070-GES_DISC
C2069247359-GES_DISC
C2033151148-GES_DISC
C1569839798-GES_DISC
C2033167496-GES_DISC
C2069246977-GES_DISC
C1235316220-GES_DISC
C1239966757-GES_DISC
C1701805630-GES_DISC
C1442068519-GES_DISC
C1235316219-GES_DISC
C1701805611-GES_DISC
C1701828243-GES_DISC
C1598621095-GES_DISC
C1235316222-GES_DISC
C1235316223-GES_DISC
C1952167462-GES_DISC
C1675477037-GES_DISC
C1266136111-GES_DISC
C1239966818-GES_DISC
C1266136037-GES_DISC
C1281704371-GES_DISC
C1701805652-GES_DISC
C1238517289-GES_DISC
C1266136071-GES_DISC
C1943072252-GES_DISC
C1598621094-GES_DISC
C1266136062-GES_DISC
C2045794784-GES_DISC
C2248663267-GES_DISC
C2623694314-GES_DISC
C1442068516-GES_DISC
C2556149060-GES_DISC
C2467863601-GES_DISC
C2467880659-GES_DISC
C1239966810-GES_DISC
C1238517344-GES_DISC
C1237114212-GES_DISC
C1239966829-GES_DISC
C2011289787-GES_DISC
C1701805672-GES_DISC
C1442068433-GES_DISC
C1266136096-GES_DISC
C1404080675-GES_DISC
C1282032615-GES_DISC
C1342986035-GES_DISC
C1276812899-GES_DISC
C1276812941-GES_DISC
C2756757848-GES_DISC
C1276812926-GES_DISC
C1266136112-GES_DISC
C1266136100-GES_DISC
C2515837343-GES_DISC
C1239966791-GES_DISC
C2569847612-GES_DISC
C1266136072-GES_DISC
C2756347598-GES_DISC
C1488311935-GES_DISC
C1235316199-GES_DISC
C2723754847-GES_DISC
C1729925175-GES_DISC
C1729925806-GES_DISC
C1276812863-GES_DISC
C1235316218-GES_DISC
C2237419562-GES_DISC
C1276812830-GES_DISC
C1276812901-GES_DISC
C1266136114-GES_DISC
C1933574580-GES_DISC
C1243477366-GES_DISC
C1282032565-GES_DISC
C1239966755-GES_DISC
C1943072156-GES_DISC
C1598621098-GES_DISC
C1243477380-GES_DISC
C1282060545-GES_DISC
C1386443916-GES_DISC
C1700900796-GES_DISC
C1266136097-GES_DISC
C1266136113-GES_DISC
C2042565519-GES_DISC
C1266136070-GES_DISC
C1251101497-GES_DISC
C1633993908-GES_DISC
C1239966842-GES_DISC
C1243477371-GES_DISC
C1239966837-GES_DISC
C1251101764-GES_DISC
C1238517305-GES_DISC
C1282060546-GES_DISC
C1223720291-GES_DISC
C1701805601-GES_DISC
C2042566037-GES_DISC
C1701805657-GES_DISC
C1729925154-GES_DISC
C1276812900-GES_DISC
C1239966827-GES_DISC
C1276812893-GES_DISC
C1239536905-GES_DISC
Loading
Loading