Connecting OnCall to separate HTTPS Grafana instance #3823

Closed
MarkOWiesemann opened this issue Feb 1, 2024 · 12 comments
Labels: bug (Something isn't working), more info needed

Comments

@MarkOWiesemann

What went wrong?

What happened:

  • When setting up OnCall to connect to a separate Grafana instance, Grafana itself reports that it connects, but when I use it (opening an OnCall setting, for example) I get the error "OnCall was not able to load the current user. Try refreshing the page"

What did you expect to happen:

  • OnCall to load the settings.

How do we reproduce it?

  1. Configure Grafana OnCall Plugin
  2. Shows: "Connected to OnCall (1.3.96, OpenSource)"
  3. Select "Open Grafana OnCall"
  4. Get following output:
    image

Grafana OnCall Version

docker compose version (latest I guess)

Product Area

Other

Grafana OnCall Platform?

Docker

User's Browser?

Firefox

Anything else to add?

I filtered my logs and see errors about "name not resolved" and "sslerrors".
Examples:
celery_1 | 2024-02-01 19:12:31,372 source=engine:celery worker=ForkPoolWorker-2 task_id=cb61aaf3-6039-427e-a563-a86e7fbf4fb9 task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.helpers.client level=WARNING Error connecting to api instance HTTPConnectionPool(host='grafana', port=3000): Max retries exceeded with url: /api/access-control/users/permissions/search?actionPrefix=grafana-oncall-app (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc0b09a25d0>: Failed to establish a new connection: [Errno -2] Name does not resolve')
celery_1 | 2024-02-01 19:20:18,584 source=engine:celery worker=ForkPoolWorker-2 task_id=e524768e-a1fc-4460-ae58-cff06f9760ba task_name=apps.grafana_plugin.tasks.sync.plugin_sync_organization_async name=apps.grafana_plugin.helpers.client level=WARNING Error connecting to api instance HTTPSConnectionPool(host='grafana.local.wiesemann.dev', port=443): Max retries exceeded with url: /api/access-control/users/permissions/search?actionPrefix=grafana-oncall-app (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1002)')))
The host itself has those certificates installed.
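
The two log lines above describe different failures: the first is a name-resolution error for the host 'grafana' on port 3000, the second a certificate-verification error for 'grafana.local.wiesemann.dev'. A minimal sketch of telling them apart programmatically is shown below. The function name `classify_connection_error` is hypothetical, and the way it walks `__cause__`, `__context__`, `reason`, and `args` is an assumption about how HTTP client libraries typically wrap the underlying error, not something taken from OnCall's code.

```python
import socket
import ssl


def classify_connection_error(exc: BaseException) -> str:
    """Walk a (possibly wrapped) exception and name its root cause.

    Returns "tls-verification", "dns", or "other".
    """
    pending = [exc]
    seen: set[int] = set()
    while pending:
        current = pending.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if isinstance(current, ssl.SSLCertVerificationError):
            return "tls-verification"
        if isinstance(current, socket.gaierror):
            return "dns"
        # Assumption: wrappers keep the original error in one of these places.
        candidates = [
            current.__cause__,
            current.__context__,
            getattr(current, "reason", None),
        ]
        candidates.extend(current.args)
        pending.extend(c for c in candidates if isinstance(c, BaseException))
    return "other"
```

A "dns" result points at the hostname or Docker network configuration, while "tls-verification" points at the CA bundle used by the Python process.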

@MarkOWiesemann added the bug (Something isn't working) label Feb 1, 2024
@dark-brains

Hi, I think it is the same as my issue:
#3825

@clementduveau

Can you check whether it's similar to #3607 too, please?

@dark-brains

No it is not.

@mderynck
Contributor

mderynck commented Feb 9, 2024

You may want to exec into the OnCall engine container and use wget/curl to reach the URL. If they fail because the certificate can't be verified, you can work around that by extending the image and installing your certificate. If wget/curl work from inside the same container but OnCall does not, let us know and we'll investigate further.

@MarkOWiesemann
Author

Can you check whether it's similar to #3607 too, please?

It is quite similar, with the only difference being that neither admin nor non-admin users can access OnCall.

Non-admin users get "User with Admin permission in your organization must sign on and setup OnCall before it can be used"

Admin users get the two pop-ups mentioned and the "OnCall was not able to load the current user. Try refreshing the page" error.

Might HTTPS be the problem?

@MarkOWiesemann
Author

You may want to exec into the OnCall engine container and use wget/curl to reach the URL. If they fail because the certificate can't be verified, you can work around that by extending the image and installing your certificate. If wget/curl work from inside the same container but OnCall does not, let us know and we'll investigate further.

Wget works like a charm inside the root_engine container and I get the index page. Curl doesn't seem to be installed though.

@mderynck
Contributor

mderynck commented Feb 9, 2024

From inside that container you can also try:

python manage.py shell

# In python shell
import requests
requests.get('YOUR_URL')

This shows whether our client is using the certificate correctly.

@MarkOWiesemann
Author

Doesn't seem so:

root@OnCall:~# docker exec -it 3d79ab25071d bash
3d79ab25071d:/etc/app$ python manage.py shell
Python 3.11.4 (main, Aug 9 2023, 08:38:11) [GCC 12.2.1 20220924] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

>>> import requests
>>> requests.get('https://grafana.local.wiesemann.dev')
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
conn.connect()
File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/ssl.py", line 517, in wrap_socket
return self.sslsocket_class._create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/ssl.py", line 1075, in _create
self.do_handshake()
File "/usr/local/lib/python3.11/ssl.py", line 1346, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1002)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='grafana.local.wiesemann.dev', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1002)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/local/lib/python3.11/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 517, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='grafana.local.wiesemann.dev', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1002)')))

@mderynck
Contributor

I'm not sure of the exact state of your certificate, but it looks like Python requests cannot find the issuer or an intermediate certificate (I think curl works because it can figure this out on its own?). This appears to be how that library works and isn't particular to OnCall. Testing locally with a self-signed certificate, I see the same issue. To fix it, I had to add the certificates for the CA and any intermediate CAs to the cacert.pem bundle used by certifi in the Python environment. To do this for OnCall, you would need to extend the Dockerfile and append all your CA PEM files to /usr/local/lib/python3.11/site-packages/certifi/cacert.pem.

Before doing that, if you only have one root CA that signed this certificate, you can test it by copying that file into the container using docker cp and then running a script similar to the one above:

python manage.py shell

# In python shell
import requests
requests.get('YOUR_URL', verify='PATH_TO_PEM')
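
If the `verify=` test succeeds, the bundle change described above (appending your CA PEM files to certifi's cacert.pem) could be sketched as follows. This is an illustration under stated assumptions, not an OnCall tool: `build_ca_bundle` and the file names in the usage comment are hypothetical, and the sketch writes a new combined file rather than modifying the original in place.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(base_bundle: Path, extra_cas: list[Path], output: Path) -> Path:
    """Write base_bundle followed by each extra CA certificate to output."""
    parts = [base_bundle.read_text(encoding="utf-8").rstrip("\n")]
    for ca in extra_cas:
        pem = ca.read_text(encoding="utf-8").strip()
        # Refuse files that are clearly not PEM certificates (e.g. DER or keys).
        if PEM_HEADER not in pem:
            raise ValueError(f"{ca} does not look like a PEM certificate")
        parts.append(pem)
    output.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return output


# Hypothetical usage inside the container (file names are assumptions):
# build_ca_bundle(
#     Path("/usr/local/lib/python3.11/site-packages/certifi/cacert.pem"),
#     [Path("/tmp/my-root-ca.pem")],
#     Path("/tmp/cacert-with-my-ca.pem"),
# )
```

Keeping the default CAs in the combined file means connections to publicly signed hosts continue to verify alongside the privately signed one.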

@mderynck self-assigned this Feb 14, 2024
@MarkOWiesemann
Author

Before doing that, if you only have one root CA that signed this certificate, you can test it by copying that file into the container using docker cp and then running a script similar to the one above:

python manage.py shell

# In python shell
import requests
requests.get('YOUR_URL', verify='PATH_TO_PEM')

I tried that right away and yes, it seems to solve the problem (I got a 200 response). I will have to work out how to modify the Dockerfile for my setup at home, as it would be my first time, but I am guessing it won't be too complicated.

May I ask whether this behavior will be addressed in a release, or whether the workaround is the way to go?

@mderynck
Contributor

I haven't tested it, but it could be easier to copy /usr/local/lib/python3.11/site-packages/certifi/cacert.pem out of the container, add the certificate, and then mount the file into the container over the included one. This way you might avoid needing to build a custom image.

I don't think we have any immediate plans to handle this through the product/release. If mounting cacert.pem from outside works, we can add an entry for it in the docs on using self-signed certificates.
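
Before mounting an edited cacert.pem over the included one, it may help to confirm that the file really contains your CA. The sketch below compares certificates by their decoded bytes using only the standard library. The names `certificates_in` and `missing_from_bundle` are hypothetical, and the check only confirms presence in the file, not that the chain is otherwise valid.

```python
import re
import ssl
from pathlib import Path

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def certificates_in(pem_text: str) -> set[bytes]:
    """Return the DER encoding of every certificate block found in pem_text."""
    return {ssl.PEM_cert_to_DER_cert(block) for block in PEM_BLOCK.findall(pem_text)}


def missing_from_bundle(bundle: Path, ca_file: Path) -> int:
    """Count certificates in ca_file that are absent from bundle."""
    wanted = certificates_in(ca_file.read_text(encoding="utf-8"))
    if not wanted:
        raise ValueError(f"no PEM certificates found in {ca_file}")
    present = certificates_in(bundle.read_text(encoding="utf-8"))
    return len(wanted - present)
```

A result of 0 means every certificate from your CA file is already in the bundle; anything higher means the copy-and-append step did not take effect for that many certificates.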

@MarkOWiesemann
Author

Your suggestion was incredibly helpful and worked perfectly. Initially, I attempted to simply mount my certificate authority into the Docker container, which would have sufficed for my local Grafana instance. However, I overlooked the fact that connecting a Grafana Cloud instance was necessary to use the OnCall mobile app. Realizing this, I went back to your suggestion, since the bundle needs to include the default CAs as well. I copied the CA bundle out, made the necessary modifications, and then put it back in (mounting it so that it still works after a reboot).

As I delved into connecting a Grafana Cloud instance, I encountered another issue, which was already discussed here but it didn't help as the comments are not clear. However, I'll need to evaluate whether proceeding with this approach aligns with my goals, particularly since I'm unsure why a cloud instance is required if I only intend to use it within my local network or VPN.

In any case, I'm marking this as resolved, as my original issue has been successfully addressed. Many thanks for your invaluable assistance, @mderynck!
