Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EnsureHashAlgorithm constraint (draft) #357

Closed
wants to merge 2 commits into from
Closed

Conversation

mslw
Copy link
Collaborator

@mslw mslw commented May 15, 2023

  • handle list of names (currently expects string, gets list)
  • handle letter casing
  • deal with error messaging with EnsureHashAlgorithm | EnsureListOf(EnsureHashAlgorithm) Add EnsureHashAlgorithm #346 (comment)

Addresses #346

mslw added 2 commits May 15, 2023 11:58
This adds EnsureHashAlgorithm constraint which checks whether the
given algorith name is contained in hashlib's algorithms_guaranteed
set. It derives from EnsureChoice without further customization.

Using `algorithms_guaranteed` covers algorithms that have named
constructors in hashlib, i.e. accessible through getattr(hashlib,
name) - that's the current usage in MultiHash. Future expansion could
see us move to algorithms_available - these will be recognized when
passed to (slower) hashlib.new(), but have no named constructors.

For details, see: https://docs.python.org/3.11/library/hashlib.html
The EnsureHash algorithm now calls its super.__init__ correctly.

We use EnsureHashAlgorithm() | EnsureListOf(...) in ls_file_colelction
now. This correctly handles incoming values, which are always a list
in CLI (argparse behaviour with --append) and can be str or list in
Python API. At least when they are OK.

When input is not OK, two things are suboptimal:
- an incorrect hash (str) proceeds to EnsureListOf, which tries to go
  letter by letter, failing on first
- a hash (list) with an incorrect value would first fail on
  EnsureHashAlgorithm(list), proceed to EnsureListOf, and fail on
  EnsureHashAlgorithm(str)
As a conequence, in both cases the error message would say "does not
match any of 2 alternatives" and print EnsureChoice message ("is not
one of ...") twice.
@codecov
Copy link

codecov bot commented May 16, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (ad3aa18) 92.25% compared to head (632facb) 92.25%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #357   +/-   ##
=======================================
  Coverage   92.25%   92.25%           
=======================================
  Files          95       95           
  Lines        7747     7751    +4     
=======================================
+ Hits         7147     7151    +4     
  Misses        600      600           
Impacted Files Coverage Δ
datalad_next/commands/ls_file_collection.py 100.00% <ø> (ø)
datalad_next/constraints/__init__.py 100.00% <ø> (ø)
datalad_next/constraints/basic.py 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@mslw
Copy link
Collaborator Author

mslw commented May 22, 2023

I am using ls-file-collection as the model application of the validator. Hash parameter can be a str | List(str); when coming from CLI it will always be the latter. I assumed EnsureHashAlgorithm | EnsureListOf(EnsureHashAlgorithm) would be the most intuitive constraint specification.

However, I got stupidly locked on how validation errors are reported with this constraint pattern: #346 (comment)

I would appreciate suggestions (@mih maybe you have a comment) for what would be the most logical solution:

  • assume error messaging is good enough, keep the constraint and its usage as above
  • assume error messaging needs to be improved separately, but the constraint and its usage is ok
  • create an EnsureMultiHash constraint that would deal with strings & lists by itself, and create a MultiHash object already at the stage of parameter validation; adjust all code downstream to expect MultiHash as input

@mih
Copy link
Member

mih commented Jul 27, 2023

This PR has been in draft mode for a long time. I am closing it now as part of a cleanup run. Feel free to reopen once work on it restarts. Sorry.

@mih mih closed this Jul 27, 2023
@adswa adswa mentioned this pull request Oct 20, 2023
@mslw mslw deleted the ensurehash branch January 12, 2024 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants