Iter annexworktree subproc #3

@christian-monch christian-monch commented Nov 20, 2023

This PR adds an implementation of iter_annexworktree that is based on iterable_subprocesses and the ideas laid out in issue datalad#537.

This PR also includes a collection of data processors that are essentially generator wrappers.

The PR also modifies iter_gitworktree to use iterable_subprocesses instead of the datalad-core runner.

The current implementation of iter_annexworktree iterates over a dataset with 33k annex files in less than 5 seconds on my machine.

This commit adds a number of processors
that are all iterators, i.e. they read
from underlying iterators, and can
be cascaded.
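
To illustrate the idea of cascadable, iterator-based processors described above, here is a minimal sketch using only the Python standard library. It is not the code from this PR: the names `split_lines` and `decode`, and the sample input, are hypothetical and chosen only for illustration. Each processor is a generator that reads from an underlying iterable, so processors can be stacked on top of one another.

```python
from typing import Iterable, Iterator


def split_lines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Hypothetical processor: re-chunk an arbitrary byte stream into lines."""
    remainder = b""
    for chunk in chunks:
        remainder += chunk
        # Everything before the last separator is a complete line;
        # the final piece may be incomplete and is carried over.
        *lines, remainder = remainder.split(b"\n")
        yield from lines
    if remainder:
        # Emit trailing data that was not terminated by a newline.
        yield remainder


def decode(items: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Hypothetical processor: decode each item of the underlying iterator."""
    for item in items:
        yield item.decode(encoding)


# Cascade the processors: raw chunks -> lines -> decoded strings.
# The chunks stand in for output read from a subprocess (assumed input).
chunks = [b"a.txt\nb.", b"dat\nc.bin"]
result = list(decode(split_lines(chunks)))
```

Because every stage is lazy, items flow through the whole cascade one at a time, and no stage needs to buffer the complete output of the previous one.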
@christian-monch christian-monch requested a review from mih as a code owner November 20, 2023 22:31
@christian-monch christian-monch force-pushed the iter-annexworktree-subproc branch from c0bdf3a to 908d76f Compare November 21, 2023 11:59
@mih
Owner

mih commented Nov 21, 2023

Thanks! I am moving this PR to datalad#539 to get the full test suite in action.

@mih mih closed this Nov 21, 2023