Workaround for stuck loki-ingester pods #128

Merged: 1 commit into master from fix/loki-ingester-stuck, Apr 10, 2024

DebakelOrakel (Contributor) commented Apr 10, 2024

Sometimes loki-ingester pods get stuck when they are not restarted cleanly. This workaround checks for ingester pods with long startup times and cleans up checkpoint directories for non-existing indexes in the pod's `/tmp/wal` directory.
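
As a rough illustration of the two steps named above (detect a long startup, then clean up checkpoints), here is a minimal sketch. It is not the merged script: the 300-second threshold, the function names `is_stuck` and `clean_checkpoints`, and the `checkpoint.*` directory naming are assumptions made for the example, and it removes every matching checkpoint directory rather than only those for non-existing indexes.

```shell
#!/bin/sh
# Hypothetical threshold: a pod that has been starting for longer than
# this many seconds is treated as stuck. Not taken from the PR.
MAX_STARTUP_SECONDS="${MAX_STARTUP_SECONDS:-300}"

# is_stuck START_EPOCH NOW_EPOCH
# Succeeds when the elapsed startup time exceeds the threshold.
is_stuck() {
  [ $(( $2 - $1 )) -gt "$MAX_STARTUP_SECONDS" ]
}

# clean_checkpoints WAL_DIR
# Removes directories directly under WAL_DIR whose names match
# checkpoint.* (assumed naming); other entries are left untouched.
clean_checkpoints() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -name 'checkpoint.*' -exec rm -rf {} +
}
```

With these helpers, something like `is_stuck "$started" "$(date +%s)" && clean_checkpoints /tmp/wal` would run the cleanup only for a pod that has exceeded the threshold; how the start time is obtained is left open here.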

Checklist

  • The PR has a meaningful title. It will be used to auto-generate the
    changelog.
  • The PR has a meaningful description that sums up the change. It will be
    linked in the changelog.
  • PR contains a single logical change (to build a better changelog).
  • Update the documentation.
  • Categorize the PR by adding one of the labels:
    bug, enhancement, documentation, change, breaking, dependency
    as they show up in the changelog.
  • Link this PR to related issues or PRs.

DebakelOrakel added the bug label (Something isn't working) Apr 10, 2024
DebakelOrakel requested a review from a team as a code owner April 10, 2024 07:14
DebakelOrakel force-pushed the fix/loki-ingester-stuck branch from 69da2c2 to fde4ebc April 10, 2024 08:01
simu (Member) left a comment

I left inline comments that outline some of the benefits of using kube.libjsonnet instead of writing K8s manifests by hand. Take them or leave them.

I'd strongly prefer having the script in a separate file, as noted inline, and I'd also strongly prefer that the script uses `set -e -o pipefail`.
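
To illustrate why `set -e -o pipefail` matters for a script like this, the following small demonstration (not part of the PR) runs the same failing pipeline with and without `pipefail`. It assumes `bash` is available, since `pipefail` is not supported by every `sh`; the variable names are chosen for the example.

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline's exit status is that of its last command,
# so the failure of `false` is masked by `cat` and the script continues.
without="$(bash -c 'set -e; false | cat; echo continued' || true)"

# With pipefail, the pipeline fails if any command in it fails,
# and -e then aborts the script before the echo runs.
with="$(bash -c 'set -e -o pipefail; false | cat; echo continued' || true)"

echo "without pipefail: ${without:-aborted}"
echo "with pipefail: ${with:-aborted}"
```

In a cleanup script, this is the difference between stopping at the first failed step and carrying on with later commands after an earlier one silently failed.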

component/loki.libsonnet: 10 inline review threads (outdated, resolved)
bastjan (Contributor) left a comment

Looks good and pretty robust, I think.

component/loki.libsonnet: 2 inline review threads (outdated, resolved)
DebakelOrakel force-pushed the fix/loki-ingester-stuck branch 2 times, most recently from 8b9b559 to 6bb41f8 April 10, 2024 12:39
DebakelOrakel force-pushed the fix/loki-ingester-stuck branch from 6bb41f8 to 1f62f51 April 10, 2024 12:43
DebakelOrakel merged commit 87e07a1 into master Apr 10, 2024
13 checks passed
DebakelOrakel deleted the fix/loki-ingester-stuck branch April 10, 2024 12:45