Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single-point for reading container configuration #242

Merged
merged 7 commits into from
Oct 11, 2023

Conversation

mih
Copy link
Member

@mih mih commented Oct 10, 2023

This changeset established a new helper get_container_configuration() as the single method to read configuration on containers.

This closes #238

This changeset is the precondition for implementing configuration item normalization effciciently -- for example as needed for resolving #224

A number of issues have been detected while working on this changeset, which are not addressed in here, but will be the topic of future work:

This changeset implies no behavior changes. The only user-facing change is an expanded documentation.

9b95b78 is not strictly needed in this PR. It led to #241 and could be included in a future PR. However, I see no benefit it such a diff to accumulate dust.

mih added 6 commits October 10, 2023 08:25
This is a plain code move from `containers_list` into a standalone function.
Added value is

- reusability
- documentation, incl. type annotation

Also adds rendering of the utility module's documentation.

This is a first step towards resolving datalad#238
This is a common use case in commands other than `containers_list`.
Albeit not strictly necessary (all configuration could be read
and processed first, and sub-selection could happen in user code),
it is a low-complexity change. Moreover, we envision more configuration
post-processing to happen in the future
(see datalad#238). This
would change the cost assessment of loading everything upfront.
This changeset maintains the exact behavior of the command, and
merely adds documentation and replaces custom configuration
access with the new `get_container_configuration()` helper.

The documentation may communicate suboptimal command semantics,
but these were given before, and are only written down now.

Also see datalad#240
for additional aspects re a future refactoring.

Configuration writes now use the ``scope=branch`` approach. It was
introduced with datalad v0.16 and requires no dependency adjustments.
...for the purpose of a standardization of container configuration reads
on `get_container_configuration()`.
This is done by turning of the result rendering. This revealed that
the command implementations cause uncondtional result rendering
datalad#241
This is the last change for establishing `get_container_configuration()`
as the single code piece for reading container documentation.

This opens the door for implementing on-read normalization of
configuration items, such as the image path.

Closes datalad#238
@codeclimate
Copy link

codeclimate bot commented Oct 10, 2023

Code Climate has analyzed commit 8c6dba5 and detected 0 issues on this pull request.

View more on Code Climate.

@codecov
Copy link

codecov bot commented Oct 10, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (c505e52) 94.69% compared to head (8c6dba5) 94.75%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #242      +/-   ##
==========================================
+ Coverage   94.69%   94.75%   +0.05%     
==========================================
  Files          25       25              
  Lines        1093     1105      +12     
==========================================
+ Hits         1035     1047      +12     
  Misses         58       58              
Files Coverage Δ
datalad_container/containers_add.py 89.80% <100.00%> (+0.13%) ⬆️
datalad_container/containers_list.py 95.74% <100.00%> (-0.81%) ⬇️
datalad_container/containers_remove.py 97.50% <100.00%> (ø)
datalad_container/containers_run.py 84.84% <ø> (ø)
datalad_container/tests/test_containers.py 100.00% <100.00%> (ø)
datalad_container/tests/test_run.py 100.00% <100.00%> (ø)
datalad_container/utils.py 96.15% <100.00%> (+10.43%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@adswa adswa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good to me!

Possible a little less complex to understand.
@yarikoptic yarikoptic added the internal Changes only affect the internal API label Oct 11, 2023
Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment and 1 change requested to make API a bit clearer.

@@ -329,6 +330,7 @@ def __call__(name, url=None, dataset=None, call_fmt=None, image=None,
return

# store configs
cfgbasevar = "datalad.containers.{}".format(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cfgbasevar = "datalad.containers.{}".format(name)
cfgbasevar = f"datalad.containers.{name}"

Comment on lines +223 to 224
url = container_cfg.get("updateurl")
if not url:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessary for this PR but I realized that we can enter the XXI century now that 3.8 is the minimal supported version (forced at datalad core level but not yet in this extension setup.cfg), and can start using walrus assignment operator, so smth like

Suggested change
url = container_cfg.get("updateurl")
if not url:
if not (url := container_cfg.get("updateurl")):

name: str, optional
If given, the reported configuration will be limited to the container
with this exact name. In this case, only a single ``dict`` is returned,
not nested dictionaries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we make it more explicit, i.e. separate out a get_containers_configuration(ds) ? internally get_container_configuration(ds, name) can just call that get_containers_configuration(ds), I don't care. But having singular semantic in the name but getting a dict for multiple containers will eventually confuse someone like me at some point because e.g. I thought without a name specified it would return the config for the "only" container (as e.g. containers-run behaves IIRC) and raise KeyError or smth like that if multiple are found -- so it becomes ambiguous.

Copy link
Member Author

@mih mih Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify. Your response to the description in dbb2f91 (commit message) is: Let's nevertheless always read and post-process all configuration, make all filesystem checks for all containers, and then throw away all that information, but for one container.

Rather than that, you can also get that semantic cleanliness by having two near-copies of that function.

Just to be clear, I am not here to argue -- I just need this merged, and I don't care in which way. I just do not want to guess what should be done and avoid future iterations on this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have missed that there is a continue shortcut, so indeed some code is skipped... even though complexity stays O(N), there would indeed be not only a waste of cycles but also possibly errors on unrelated containers which should be avoidable. I guess a way could be to make current function _private and then expose it via two clean interfaces calling it out with or without a name.
I will leave it up to you whether to implement that or proceed with current version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice suggestion! I will change it to use a private helper and two clean interfaces.

Depending on your reviewing capabilities, I might do this in a separate follow-up PR, given that more PR are built on this, and they would all need an update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's merge

@yarikoptic yarikoptic merged commit 0648e44 into datalad:master Oct 11, 2023
@mih mih deleted the rf-cfg branch October 12, 2023 05:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal Changes only affect the internal API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement standard utility to read container config
3 participants