Redis pod failing to write on mounted volume resulting in pulp-content returning 500 #1354

Open
vkukk opened this issue Sep 17, 2024 · 5 comments

vkukk commented Sep 17, 2024

Version
image: quay.io/pulp/pulp-operator:v1.0.0-beta.5
default pulp images.

Describe the bug
After enabling the cache, pulp-content fails with HTTP 500.

[2024-09-17 08:22:23 +0000] [52] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/local/lib64/python3.9/site-packages/aiohttp/web_protocol.py", line 456, in _handle_request
    resp = await request_handler(request)
  File "/usr/local/lib64/python3.9/site-packages/aiohttp/web_app.py", line 537, in _handle
    resp = await handler(request)
  File "/usr/local/lib64/python3.9/site-packages/aiohttp/web_middlewares.py", line 114, in impl
    return await handler(request)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/content/authentication.py", line 48, in authenticate
    return await handler(request)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/content/instrumentation.py", line 230, in middleware
    resp = await handler(request)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/cache/cache.py", line 346, in cached_function
    await self.auth(request, self, bk)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/content/handler.py", line 239, in auth_cached
    await cached.set(guard_key, str(guard), base_key=base_key)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/cache/cache.py", line 57, in wrapper
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pulpcore/cache/cache.py", line 265, in set
    ret = await self.redis.hset(base_key, key, value)
  File "/usr/local/lib/python3.9/site-packages/redis/asyncio/client.py", line 615, in execute_command
    return await conn.retry.call_with_retry(
  File "/usr/local/lib/python3.9/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
    return await do()
  File "/usr/local/lib/python3.9/site-packages/redis/asyncio/client.py", line 589, in _send_command_parse_response
    return await self.parse_response(conn, command_name, **options)
  File "/usr/local/lib/python3.9/site-packages/redis/asyncio/client.py", line 636, in parse_response
    response = await connection.read_response()
  File "/usr/local/lib/python3.9/site-packages/redis/asyncio/connection.py", line 570, in read_response
    raise response from None
redis.exceptions.ResponseError: MISCONF Redis is configured to save RDB snapshots, but it's currently unable to persist to disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error.
::ffff:10.2.3.17 [17/Sep/2024:08:22:23 +0000] "GET /pulp/content/mongo-6/tst/ HTTP/1.1" 500 335 "https://pulp3.hostname.tldpulp/content/mongo-6/" "Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0"

The cache pod is failing due to insufficient privileges when writing to the volume.

$ kubectl exec pod/pulp-redis-6c86f8467-nwrbz -- /bin/ls -l /|grep data
drwxr-xr-x   3 root root 4096 Sep 16 16:49 data
1:M 17 Sep 2024 10:30:00.009 * Background saving started by pid 189536
189536:C 17 Sep 2024 10:30:00.009 # Failed opening the temp RDB file temp-189536.rdb (in server root dir /data) for saving: Permission denied
1:M 17 Sep 2024 10:30:00.110 # Background saving error
1:M 17 Sep 2024 10:30:06.096 * 1 changes in 3600 seconds. Saving...
1:M 17 Sep 2024 10:30:06.097 * Background saving started by pid 189551
189551:C 17 Sep 2024 10:30:06.098 # Failed opening the temp RDB file temp-189551.rdb (in server root dir /data) for saving: Permission denied
1:M 17 Sep 2024 10:30:06.199 # Background saving error

For the Redis user (UID 999, GID 999) to save to the mounted storage, the pod must have securityContext.fsGroup set to 999.
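
To make the reasoning concrete, here is a minimal sketch of the permission check involved. It is a simplified model of POSIX directory permissions (no ACLs, no capabilities, non-root callers only), not code from Redis or the operator, and `can_write_dir` is a hypothetical helper. The second call assumes that fsGroup, once applied, changes the volume's group to 999 and adds group write, execute, and setgid bits, which depends on the volume driver supporting fsGroup.

```python
import stat


def can_write_dir(mode, owner_uid, owner_gid, uid, gids):
    """Return True if a non-root process may create files in a directory.

    Simplified POSIX rule: creating a file needs write and search (execute)
    permission on the directory. Exactly one permission class applies:
    owner first, then group, then other.
    """
    if uid == owner_uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif owner_gid in gids:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (mode & need) == need


# /data as listed above: drwxr-xr-x root root, Redis runs as 999:999.
print(can_write_dir(0o755, 0, 0, uid=999, gids={999}))     # False

# Assumed state with fsGroup 999 applied: group 999, mode 2775.
print(can_write_dir(0o2775, 0, 999, uid=999, gids={999}))  # True
```

With the directory owned by root:root and mode 755, UID 999 falls into the "other" class, which has no write bit, matching the "Permission denied" in the Redis log.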
Trying to set fsGroup by editing the Pulp CR fails, as shown in the reproduction steps.

To Reproduce
Set the Pulp CR:

  cache:
    enabled: true
    redis_storage_class: csi-cinder-high-speed
    securityContext:
      fsGroup: 999

kubectl apply -f pulp.yaml
strict decoding error: unknown field "spec.cache.securityContext"

Expected behavior
A proper securityContext is applied and Redis is able to save the RDB file.

Additional context
OVH Managed Kubernetes 1.30.2

vkukk commented Sep 17, 2024

Apparently, fsGroup should already be set, according to the Redis controller code here:

SecurityContext: podSecurityContext,

When checking actual Pod config:

$ kubectl -n pulp get pod/pulp-redis-6c86f8467-nwrbz -o json| jq -r '.spec.securityContext'
{
  "runAsGroup": 999,
  "runAsUser": 999
}
$ kubectl -n pulp get pod/pulp-redis-6c86f8467-nwrbz -o json | jq -r '.spec.containers[0].securityContext'
{
  "allowPrivilegeEscalation": false,
  "capabilities": {
    "drop": [
      "ALL"
    ]
  },
  "runAsNonRoot": true,
  "seccompProfile": {
    "type": "RuntimeDefault"
  }
}

So the fsGroup defined here

fsGroup := int64(999)
does not make it into the actual Kubernetes Deployment.
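
The same check can be scripted against the `kubectl ... -o json` output. This is only a sketch: `missing_pod_security_fields` is a hypothetical helper, and the list of required keys is my assumption about what the Redis pod needs, not something taken from the operator.

```python
import json


def missing_pod_security_fields(pod, required=("runAsUser", "runAsGroup", "fsGroup")):
    """Return the required keys absent from a Pod's .spec.securityContext."""
    ctx = pod.get("spec", {}).get("securityContext") or {}
    return [key for key in required if key not in ctx]


# Pod-level securityContext exactly as reported by jq above.
pod = json.loads(
    '{"spec": {"securityContext": {"runAsGroup": 999, "runAsUser": 999}}}'
)
print(missing_pod_security_fields(pod))  # ['fsGroup']
```

An empty list would mean the pod-level securityContext carries all three fields.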

gerrod3 commented Sep 17, 2024

We need to look into why user 999 is not allowed to write to the volume for the Redis image.

vkukk commented Nov 26, 2024

Who needs to look into this?

gerrod3 commented Nov 26, 2024

> Who needs to look into this?

This was a reminder to us devs when we were triaging the issue.

@danielbakken

The same issue is affecting database pods created by the pulp operator.
