What happens when a key is revoked while one device in an encrypted pool is open and others are not? #636

Open
mulkieran opened this issue Jul 26, 2023 · 5 comments

@mulkieran
Member

We may have encountered this problem, and this may be a reproducer.

  1. Create a pool with encryption using key in kernel keyring.
  2. Stop the pool.
  3. Cause one device to be opened by cryptsetup.
  4. Wait until key has been revoked. How to tell?
  5. Attempt to start the pool. Hope to get an assertion failure something like:
$ stratis pool start --name spool1 --unlock-method=keyring
Execution failed: stratisd failed to perform the operation that you requested. It returned the following information via the D-Bus: ERROR: Failed to join thread: task 67 panicked.

and see messages like the following in the logs:

[INFO stratisd::engine::strat_engine::liminal::liminal] Attempt to set up pool failed, but it may be possible to set up the pool later, if the situation changes: There was an error encountered when calculating the block devices for pool with UUID c0c8bd4c-447f-447c-aabd-fc1d0dc652d2 and name spool1; IO error: No such file or directory (os error 2)
thread 'stratis-wt-25' panicked at 'assertion failed: `(left == right)`
left: `{DevUuid(4531c3ca-9e3b-4ad7-8548-e8ca0cd08c96), DevUuid(c9980f37-5fcd-444e-ae30-20b4b991ac99)}`
right: `{DevUuid(c9980f37-5fcd-444e-ae30-20b4b991ac99)}`',
src/engine/strat_engine/liminal/device_info.rs:143:5
@mulkieran
Member Author

mulkieran commented Jul 26, 2023

Note that /proc/keys doesn't indicate that the key has been revoked:

3282ef87 I--Q---     1 perm 3f010000     0     0 user      stratis-1-key-testkey: 6

but

$ sudo PYTHONPATH=./src ./bin/stratis key list
Execution failed:
stratisd failed to perform the operation that you requested. It returned the following information via the D-Bus: ERROR: IO error: Key has been revoked (os error 128). 

Here we might be more willing to believe stratis. Perhaps the fact that /proc/keys does not show the key as revoked is a symptom of the bug that keeps the key from being garbage collected.
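
To check the flags column of /proc/keys mechanically rather than by eye, a minimal Rust sketch could look like the following. It assumes the layout described in keyrings(7), where the second whitespace-separated field holds the flag letters and `R` marks a revoked key. `KeyFlags` and `parse_key_flags` are hypothetical names for illustration, not part of stratisd.

```rust
/// Flags decoded from the second column of a /proc/keys line.
/// Hypothetical helper type; field names follow the letters in keyrings(7).
#[derive(Debug, PartialEq, Eq)]
struct KeyFlags {
    instantiated: bool, // 'I'
    revoked: bool,      // 'R'
    dead: bool,         // 'D'
    in_quota: bool,     // 'Q'
    invalidated: bool,  // 'i' (lower case, distinct from 'I')
}

/// Parse the flags column of a single /proc/keys line.
/// Returns None if the line has no second column.
fn parse_key_flags(line: &str) -> Option<KeyFlags> {
    let flags = line.split_whitespace().nth(1)?;
    Some(KeyFlags {
        instantiated: flags.contains('I'),
        revoked: flags.contains('R'),
        dead: flags.contains('D'),
        in_quota: flags.contains('Q'),
        invalidated: flags.contains('i'),
    })
}
```

Applied to the /proc/keys line quoted earlier in this comment (flags `I--Q---`), this reports the key as instantiated and counted against quota, but not revoked, which is the mismatch with the stratis error.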

Things on the test system are going really poorly now; I cannot even set a new key. stratis is reporting every key as revoked, even keys that were never set.

This is a volatile problem. Now the keys are visible and are not being reported as revoked by stratis.

This may have been caused by my using keyctl to get the persistent keyring and to read its contents.

Now I can start the pool and it is fully up. What is very weird is that, e.g., stratis key list also gets the persistent keyring, via stratisd. Why was that not enough?

Since every stratis key command that I ran returned the same error, 128 ("Key has been revoked"), a solid hypothesis is that this error was returned by the get_persistent_keyring() method.

@mulkieran
Member Author

mulkieran commented Jul 26, 2023

New possible test scenario:

  1. Create a pool with encryption using key in kernel keyring.
  2. Stop the pool.
  3. Cause one device to be opened by cryptsetup.
  4. Remove the key from the keyring.
  5. Now both devices have the same key, but one cannot be opened. What happens?

Still unable to reproduce the error with this approach.

@mulkieran mulkieran self-assigned this Jul 26, 2023
@mulkieran
Member Author

Note that in the original instance of this failure the failed assertion was preceded by the following odd log entry:

[INFO stratisd::engine::strat_engine::liminal::liminal] Attempt to set up pool failed, but it may be possible to set up the pool later, if the situation changes: There was an error encountered when calculating the block devices for pool with UUID c0c8bd4c-447f-447c-aabd-fc1d0dc652d2 and name spool1; IO error: No such file or directory (os error 2)

@mulkieran
Member Author

One suggestion for getting at the error faster: in liminal.rs:setup_pool(), assert that len(bdas) == len(infos) as soon as get_blockdevs returns, on the error path.
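
A rough sketch of what such a check might look like is below. Everything here is a stand-in: `DevUuid`, `Bda`, `DevInfo`, `missing_bdas`, and `assert_maps_agree` are hypothetical and simplified, and the real stratisd types and the shape of setup_pool() will differ. The point is only that an assertion placed here can name the UUIDs that have no BDA, instead of failing later in device_info.rs.

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for stratisd's real types; all hypothetical.
type DevUuid = &'static str;
struct Bda;
struct DevInfo;

/// UUIDs that appear in `infos` but have no entry in `bdas`.
fn missing_bdas(
    bdas: &HashMap<DevUuid, Bda>,
    infos: &HashMap<DevUuid, DevInfo>,
) -> HashSet<DevUuid> {
    infos
        .keys()
        .filter(|uuid| !bdas.contains_key(*uuid))
        .copied()
        .collect()
}

/// Check intended to run right after the (assumed) get_blockdevs error
/// path, before the two maps are handed on to later code.
fn assert_maps_agree(bdas: &HashMap<DevUuid, Bda>, infos: &HashMap<DevUuid, DevInfo>) {
    assert_eq!(
        bdas.len(),
        infos.len(),
        "bdas and infos disagree; UUIDs without a BDA: {:?}",
        missing_bdas(bdas, infos)
    );
}
```

Reporting the set difference in the panic message is a design choice: a bare length comparison would say that something is missing, but not which device.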

@mulkieran
Member Author

One way a bda could go missing is if it somehow has the same UUID as a bda already in the BDAs hash map.
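
To illustrate the mechanism, here is a small Rust sketch, assuming the BDAs are collected with plain `HashMap::insert`. The function name `collect_checked` and the use of `String` for the UUID are hypothetical. `insert` returns the previous value when the key already exists, so a duplicate UUID silently replaces the earlier entry and the map ends up shorter than the input.

```rust
use std::collections::HashMap;

/// Collect (uuid, value) pairs into a map, and also return every UUID
/// that was seen more than once. Hypothetical helper for illustration.
fn collect_checked<V>(items: Vec<(String, V)>) -> (HashMap<String, V>, Vec<String>) {
    let mut map = HashMap::new();
    let mut duplicates = Vec::new();
    for (uuid, value) in items {
        // insert() returns Some(old) if the key was already present;
        // the old entry is replaced, so one input item is lost.
        if map.insert(uuid.clone(), value).is_some() {
            duplicates.push(uuid);
        }
    }
    (map, duplicates)
}
```

With three input items where two share a UUID, the resulting map has two entries, and the shared UUID is reported in the duplicates list rather than vanishing unnoticed.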
