* Integration test build matrix is now parameterized
  - This should allow each independent test to control its passing thresholds
  - This will allow us to more flexibly control passing states over time as the project improves
* Disables fail-fast flag for testing matrix
* Sets a lower passing threshold for bioconda tests to start
* Truncates logs when scripts run from a GitHub Workflow
  - From experience, GitHub seems to be struggling to maintain all the logging information we are dumping to the console.
  - To mitigate this, we suppress dumping all the failed recipe file names to the console. This should dramatically shorten the amount of text being buffered.
  - This is unfortunate for the time being, but it should also make it easier to navigate the output of `convert` and `rattler-bulk-build`
* Experimental timeout mechanism
* Improves timeout mechanism
  - Tracking timeouts now uses the `subshell.run()` timeout parameter instead of attempting to use some UNIX signal solution from StackOverflow
* Reduced timeout
* Removes `ExitCode` enum
  - Exit codes are now stored as ints; there is no way to predict what rattler-build may return to us
  - Tweaks to minimum test passing metrics
* Adds timeout, disables bioconda_03 and 04
  - I can't figure out why these integration tests cause so many issues for the GitHub runner, so it'll be a story for another time
* Fixes disabling tests
* Fixes minor typo
* Starts work on conda-forge integration test

  Work is based on the branch used for #15
  - Adds integration test case for conda-forge
  - Adds new `scripts` directory for developer helper scripts
  - Adds `randomly_select_recipes.py` utility that allows developers to randomly select `n` recipes from a GitHub organization hosting publicly accessible feedstock repositories
* Fixes issue with parsing raw bytes from the GET request
* Bumps CI minimum scores
* Test data now pulls from `conda-recipe-manager-test-data`
  - Integration tests now pull data from the test data repo using the sparse option in the checkout action.
* Fixes typos
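The timeout change described in the commit message can be illustrated with a minimal sketch. It assumes the `run()` call in question behaves like Python's standard `subprocess.run`, whose `timeout` parameter raises `subprocess.TimeoutExpired` after killing the child process. The helper name `run_with_timeout` and the use of `None` as a timeout marker are hypothetical and not taken from the project.

```python
import subprocess
import sys
from typing import Optional


def run_with_timeout(cmd: list[str], timeout_s: float) -> Optional[int]:
    """
    Run `cmd` and return its exit code as a plain int.
    Returns None if the command did not finish within `timeout_s` seconds.
    """
    try:
        # `timeout` is enforced by subprocess itself; no UNIX signal handling is needed.
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return None
    # Exit codes are kept as raw ints because an external tool may return any value.
    return result.returncode


# Usage: a fast command returns its exit code; a slow one is cut off.
quick = run_with_timeout([sys.executable, "-c", "raise SystemExit(3)"], 30.0)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
```

Returning the raw integer, rather than mapping it onto an enum, matches the message's point that the return values of an external build tool cannot be predicted in advance.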
1 parent e22a369, commit dc5b370
Showing 9 changed files with 226 additions and 35 deletions.
@@ -21,4 +21,6 @@ dependencies:
  - conda-build
  - jsonschema
  - types-jsonschema
  - requests
  - types-requests
  - pre-commit
@@ -37,6 +37,7 @@ dependencies = [
    "jinja2",
    "pyyaml",
    "jsonschema",
    "requests",
]

[project.optional-dependencies]
@@ -28,6 +28,7 @@ requirements:
    - jinja2
    - pyyaml
    - jsonschema
    - requests

test:
  imports:
@@ -0,0 +1,28 @@
# Scripts

## Overview

This directory contains 1-off development scripts related to this project.

They should not be packaged to be run by a user/consumer of this project.

# randomly_select_recipes.py

Given a list of feedstock repositories owned by a GitHub organization, randomly select `NUM_RECIPES` number of recipe
files to dump into `OUT_DIR`

## Dependencies
- `requests`
- Some manual work with `gh` to produce the input file

## Usage:
```sh
./randomly_select_recipes.py [-e EXCLUDE_FILE] FILE NUM_RECIPES OUT_DIR
```
Where `-e EXCLUDE_FILE` is a list of repository names (1 line per repo name) to ignore when randomly selecting
recipes from the other list. This is useful for generating multiple sets of non-overlapping repository files.

For `conda-forge`, the input file used by this script was generated with:
```sh
gh repo list conda-forge -L 20000 > conda-forge-list.out
```
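To make the expected input format concrete, here is a small sketch of how a `gh repo list` dump could be read. It assumes, as the script in this commit does, that each line is tab-separated with the `org/repo` identifier in the first column. The sample rows, the `example-org` name, and the helper `parse_repo_list` are made up for illustration.

```python
import csv
import io


def parse_repo_list(text: str) -> set[str]:
    """Collect the first tab-separated column (`org/repo`) from each non-empty line."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    # csv.reader yields an empty list for blank lines, so those rows are skipped.
    return {row[0] for row in reader if row}


# Hypothetical sample data; real output comes from `gh repo list <org>`.
SAMPLE = (
    "example-org/alpha-feedstock\tAlpha recipe\tpublic\t2024-01-01T00:00:00Z\n"
    "\n"
    "example-org/beta-feedstock\t\tpublic\t2024-01-02T00:00:00Z\n"
)

repos = parse_repo_list(SAMPLE)
print(sorted(repos))
```

Only the first column matters to the selection step, so the remaining columns can change without affecting this sketch.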
@@ -0,0 +1,108 @@
#!/usr/bin/env python3
"""
File: randomly_select_recipes.py
Description: Helper script to randomly select and acquire recipe files from a GitHub org.
"""
import argparse
import csv
import multiprocessing as mp
import random
from pathlib import Path
from typing import Final, cast

import requests

# GET request timeout, in seconds
HTTP_GET_TIMEOUT: Final[float] = 15


def fetch_repo(org_repo: str, out_dir: Path) -> str:
    """
    Fetch a feedstock repo's recipe file and dump it to a corresponding location on disk.
    :param org_repo: String containing `org/repo`, which is what `gh repo list` returns
    :param out_dir: Path to the directory where files should be saved to
    :returns: The repository identifier, if successfully pulled and saved. Otherwise returns an empty string
    """
    url_options: Final[list[str]] = [
        f"https://raw.githubusercontent.com/{org_repo}/main/recipe/meta.yaml",
        f"https://raw.githubusercontent.com/{org_repo}/master/recipe/meta.yaml",
    ]

    slash_idx: Final[int] = org_repo.find("/")
    if slash_idx < 0:
        return ""
    repo: Final[str] = org_repo[slash_idx + 1 :]
    file_path: Final[Path] = out_dir / f"{repo}/recipe/meta.yaml"

    for url in url_options:
        try:
            response = requests.get(url, timeout=HTTP_GET_TIMEOUT)
            if response.status_code == 200:
                file_path.parent.mkdir(exist_ok=True, parents=True)
                file_path.write_text(response.text)
                return org_repo
        except requests.exceptions.RequestException:  # type: ignore[misc]
            continue
    return ""


def main() -> None:
    """
    Main execution point of the script
    """
    parser = argparse.ArgumentParser(
        description="Randomly pulls n number of recipe files from a list of repos from a GitHub organization"
    )
    parser.add_argument("--exclude", "-e", default="", type=str, help="File containing a list of repos to exclude")
    parser.add_argument(
        "file", type=Path, help="File containing the output of `gh repo list <org>`"  # type: ignore[misc]
    )
    parser.add_argument("num_recipes", type=int, help="Target number of recipes to select")
    parser.add_argument("out_dir", type=Path, help="Directory to place fetched recipe files in.")  # type: ignore[misc]
    args = parser.parse_args()

    # Keep the type checker happy
    exclude: Final[bool] = cast(bool, args.exclude)
    gh_list_file: Final[Path] = cast(Path, args.file)
    num_recipes: Final[int] = cast(int, args.num_recipes)
    out_dir: Final[Path] = cast(Path, args.out_dir)

    # Parse excluded repos
    # TODO: This list probably comes from `ls` and won't have the prefixed org name
    excluded_repos: set[str] = set()
    if exclude:
        with open(exclude, encoding="utf-8") as fd:
            for line in fd:
                excluded_repos.add(line.strip())

    # Parse the GitHub repo list
    all_repos: set[str] = set()
    with open(gh_list_file, encoding="utf-8") as fd:
        reader = csv.reader(fd, delimiter="\t", quotechar='"')
        for row in reader:
            if not row:
                continue
            all_repos.add(row[0])

    # Randomly select N valid repos
    allowed_repos: Final[set[str]] = all_repos - excluded_repos
    picked_repos: Final[set[str]] = (
        allowed_repos if num_recipes >= len(allowed_repos) else set(random.sample(sorted(allowed_repos), num_recipes))
    )

    print(f"Selected {len(picked_repos)} out of {num_recipes} requested repos...")
    print("Fetching...")

    # This method could be refined. But to be lazy and avoid authentication issues and extra dependencies, we make an
    # attempt to pull the raw files based on an assumed location.
    with mp.Pool(mp.cpu_count()) as pool:
        results = pool.starmap(fetch_repo, [(repo, out_dir) for repo in picked_repos])  # type: ignore[misc]

    unique_results: Final[set[str]] = set(results)
    if "" in unique_results:
        unique_results.remove("")
    print(f"Fetched {len(unique_results)} out of {len(picked_repos)} picked repositories...")


if __name__ == "__main__":
    main()
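The `TODO` in `main()` notes that an exclude list produced by `ls` would probably lack the `org/` prefix, in which case the plain set difference would exclude nothing. One possible way to handle that, sketched here as an assumption rather than the project's approach, is to compare on the bare repository name. The helpers `repo_name` and `pick_repos`, along with the optional `seed` parameter, are hypothetical.

```python
import random
from typing import Optional


def repo_name(org_repo: str) -> str:
    """Return the part after the first '/', or the whole string if there is no slash."""
    return org_repo.split("/", 1)[-1]


def pick_repos(all_repos: set[str], excluded: set[str], n: int, seed: Optional[int] = None) -> set[str]:
    """
    Randomly pick up to `n` repos, ignoring any whose bare name appears in `excluded`.
    Entries in `excluded` may be written as `repo` or `org/repo`.
    """
    excluded_names = {repo_name(entry) for entry in excluded}
    # Sort first so that a fixed seed gives a reproducible selection.
    allowed = sorted(r for r in all_repos if repo_name(r) not in excluded_names)
    if n >= len(allowed):
        return set(allowed)
    return set(random.Random(seed).sample(allowed, n))


# Usage with made-up names: "b" is excluded even though it lacks the org prefix.
picked = pick_repos({"org/a", "org/b", "org/c"}, {"b"}, 5)
```

Matching on the bare name assumes all repositories in the input belong to a single organization, which holds for a list produced by `gh repo list <org>`.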