feat: rule engine (do not merge) (#620)
Co-authored-by: Tal Borenstein <[email protected]>
shahargl and talboren authored Dec 18, 2023
1 parent 567d46d commit 46aa3a7
Showing 30 changed files with 4,222 additions and 1,826 deletions.
27 changes: 26 additions & 1 deletion .github/workflows/test-pr.yml
@@ -4,10 +4,26 @@ on: [push, pull_request]
env:
PYTHON_VERSION: 3.11
STORAGE_MANAGER_DIRECTORY: /tmp/storage-manager
MYSQL_ROOT_PASSWORD: keep
MYSQL_DATABASE: keep

jobs:
tests:
runs-on: ubuntu-latest
services:
mysql:
image: mysql:5.7
env:
MYSQL_ROOT_PASSWORD: ${{ env.MYSQL_ROOT_PASSWORD }}
MYSQL_DATABASE: ${{ env.MYSQL_DATABASE }}
ports:
- 3306:3306
options: >-
--health-cmd="mysqladmin ping"
--health-interval=10s
--health-timeout=5s
--health-retries=3
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -31,8 +47,17 @@ jobs:
key: pydeps-${{ hashFiles('**/poetry.lock') }}
- name: Install dependencies using poetry
run: poetry install --no-interaction --no-root

- name: Run unit tests and report coverage
run: poetry run coverage run --branch -m pytest
run: |
# Add a step to wait for MySQL to be fully up and running
until nc -z 127.0.0.1 3306; do
echo "waiting for MySQL..."
sleep 1
done
echo "MySQL is up and running!"
poetry run coverage run --branch -m pytest
- name: Convert coverage results to JSON (for CodeCov support)
run: poetry run coverage json --omit="keep/providers/*"
- name: Upload coverage reports to Codecov
1 change: 1 addition & 0 deletions docs/mint.json
@@ -29,6 +29,7 @@
"overview/introduction",
"overview/keyconcepts",
"overview/usecases",
"overview/ruleengine",
"overview/examples",
"overview/alternatives"
]
2 changes: 1 addition & 1 deletion docs/overview/keyconcepts.mdx
@@ -3,7 +3,7 @@ title: "Key concepts"
---
## Alert
An alert is an event that is triggered when something bad happens or is about to happen.
The term "alert" can sometimes be interchanged with "alarm" (in CloudWatch) or "monitor" (in Datadog).
The term "alert" can sometimes be interchanged with "alarm" (e.g. in CloudWatch) or "monitor" (e.g. in Datadog).

You can easily initiate a [Workflow](#workflow) when an alert is triggered.

27 changes: 27 additions & 0 deletions docs/overview/ruleengine.mdx
@@ -0,0 +1,27 @@
---
title: "Alert grouping"
---

The Keep Rule Engine is a versatile tool for grouping and consolidating alerts.
This guide explains the core concepts, usage, and best practices for effectively utilizing the rule engine.

<Note>Access the Rule Engine UI through the Keep platform by navigating to the Rule Builder section.</Note>

## Core Concepts
- **Rule definition**: A rule in Keep is a set of conditions that, when met, creates an alert group.
- **Alert attributes**: These are characteristics or data points of an alert, such as source, severity, or any attribute an alert might have.
- **Conditions and logic**: Rules are built by defining conditions based on alert attributes, using logical operators (like AND/OR) to combine multiple conditions.
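
The combination of attribute conditions with AND/OR described above can be sketched in TypeScript. This is only an illustration: the `Condition` type, the `matches` function, and the attribute values are hypothetical and are not taken from Keep's actual rule schema, which this page does not show.

```typescript
// Hypothetical model of an alert: a flat bag of string attributes.
type AlertAttributes = Record<string, string>;

// Hypothetical condition tree: leaf comparisons combined with AND/OR.
type Condition =
  | { kind: "equals"; attribute: string; value: string }
  | { kind: "and"; conditions: Condition[] }
  | { kind: "or"; conditions: Condition[] };

// Recursively evaluate a condition tree against a single alert.
function matches(alert: AlertAttributes, condition: Condition): boolean {
  switch (condition.kind) {
    case "equals":
      return alert[condition.attribute] === condition.value;
    case "and":
      return condition.conditions.every((c) => matches(alert, c));
    case "or":
      return condition.conditions.some((c) => matches(alert, c));
  }
}

// Example rule: severity is critical AND source is prometheus.
const criticalFromPrometheus: Condition = {
  kind: "and",
  conditions: [
    { kind: "equals", attribute: "severity", value: "critical" },
    { kind: "equals", attribute: "source", value: "prometheus" },
  ],
};
```

An alert that satisfies the rule's condition tree would then be added to the alert group that the rule creates.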

## Creating Rules
Creating a rule involves defining the conditions under which alerts should be grouped.

1. **Accessing the Rule Engine**: Navigate to the Rule Engine section in the Keep platform.
2. **Defining rule criteria**:
- **Name the rule**: Assign a descriptive name that reflects its purpose.
- **Set conditions**: Use alert attributes to create conditions. For example, a rule might specify that an alert with a severity of 'critical' and a source of 'Prometheus' should be categorized as 'High Priority'.
- **Logical grouping**: Combine conditions using logical operators to form comprehensive rules.

## Examples
- **Metric-based alerts**: Construct a rule to pinpoint alerts associated with specific metrics, such as high CPU usage on servers. This can be achieved by grouping alerts that share a common attribute, like a 'CPU usage' tag, ensuring you quickly identify and address performance issues.
- **Feature-related alerts**: Establish rules to organize alerts by specific features or services. For instance, you can group alerts based on a 'service' or 'URL' tag. This approach is particularly useful for tracking and managing alerts related to distinct functionalities or components within your application.
- **Team-based alert management**: Implement rules to categorize alerts according to team responsibilities. This might involve grouping alerts based on the systems or services a particular team oversees. Such a strategy ensures that alerts are promptly directed to the appropriate team, enhancing response times and efficiency.
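
All three examples come down to grouping alerts that share an attribute value. A minimal sketch of that idea follows; the `TaggedAlert` type and the `groupByAttribute` helper are hypothetical names used for illustration, not part of Keep's API.

```typescript
// Hypothetical alert shape: an id plus arbitrary string attributes (tags).
type TaggedAlert = { id: string; [attribute: string]: string };

// Group alerts by the value of one attribute, e.g. "service" or "team".
// Alerts that do not carry the attribute are left out of every group.
function groupByAttribute(
  alerts: TaggedAlert[],
  attribute: string
): Record<string, TaggedAlert[]> {
  const groups: Record<string, TaggedAlert[]> = {};
  for (const alert of alerts) {
    const key = alert[attribute];
    if (key === undefined) continue;
    if (!groups[key]) groups[key] = [];
    groups[key].push(alert);
  }
  return groups;
}
```

Calling `groupByAttribute(alerts, "service")` would produce one group per service; swapping the attribute name for a team or metric tag gives the other two examples.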
78 changes: 78 additions & 0 deletions examples/workflows/elastic_enrich_example.yml
@@ -0,0 +1,78 @@
# if no acknowledgement has been received (updated in index) for x (from config index) time, I want to escalate it to the next level of people
workflow:
id: elastic-enrich
description: escalate-if-needed
triggers:
# run every minute
- type: interval
value: 1m
steps:
# first, query the ack index to check if there are any alerts that have not been acknowledged
- name: query-ack-index
type: elastic
config: " {{ providers.elastic }} "
with:
index: your_ack_index
query: |
{
"query": {
"bool": {
"must": [
{
"match": {
"acknowledged": false
}
}
]
}
}
}
- name: query-config-index
type: elastic
config: " {{ providers.elastic }} "
with:
index: your_config_index
query: |
{
"query": {
"bool": {
"must": [
{
"match": {
"config": true
}
}
]
}
}
}
- name: query-people-index
type: elastic
config: " {{ providers.elastic }} "
with:
index: your_people_index
query: |
{
"query": {
"bool": {
"must": [
{
"match": {
"people": true
}
}
]
}
}
}
# now, we have the results from the ack index, config index, and people index
actions:
- name: escalate-if-needed
# if there are any alerts that have not been acknowledged
if: "{{ query-ack-index.hits.total.value }} > 0"
provider:
type: slack # or email or whatever you want
config: " {{ providers.slack }} "
with:
message: |
"An unacknowledged alert has been found: {{ query-ack-index.hits.hits }} {{ query-config-index.hits.hits }} {{ query-people-index.hits.hits }}"
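
The decision made by the `escalate-if-needed` action can be read as the following TypeScript sketch. It assumes the Elasticsearch 7.x+ response shape (`hits.total.value`, `hits.hits`); the `SearchResponse` interface and the `escalationMessage` function are hypothetical names. Like the workflow's `if` condition, it only checks whether any unacknowledged documents exist and does not apply the time threshold mentioned in the file's opening comment.

```typescript
// Assumed subset of an Elasticsearch search response.
interface SearchResponse {
  hits: { total: { value: number }; hits: unknown[] };
}

// Returns the message to send, or null when there is nothing to escalate.
function escalationMessage(
  ack: SearchResponse,
  config: SearchResponse,
  people: SearchResponse
): string | null {
  // Mirrors: if "{{ query-ack-index.hits.total.value }} > 0"
  if (ack.hits.total.value <= 0) return null;
  const parts = [ack, config, people].map((r) => JSON.stringify(r.hits.hits));
  return `An unacknowledged alert has been found: ${parts.join(" ")}`;
}
```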
6 changes: 4 additions & 2 deletions keep-ui/app/alerts/alerts.tsx
@@ -177,9 +177,11 @@ export default function Alerts({
combinedAlerts.forEach((alert) => {
let alertKey = "";
try {
alertKey = `${alert.id}-${alert.lastReceived.toISOString()}`;
alertKey = `${
alert.fingerprint
}-${alert.lastReceived.toISOString()}`;
} catch {
alertKey = alert.id;
alertKey = alert.fingerprint;
}
uniqueObjectsMap.set(alertKey, alert);
});
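
The effect of keying on `fingerprint` plus `lastReceived` can be seen in a standalone sketch. The `AlertLike` interface and `dedupeAlerts` function below are hypothetical and carry only the fields this hunk touches; the real alert model in keep-ui has more.

```typescript
// Minimal alert shape for illustration only.
interface AlertLike {
  fingerprint: string;
  lastReceived: Date;
  name: string;
}

// Build the same key as the hunk: fingerprint plus ISO timestamp,
// falling back to the fingerprint alone if the timestamp cannot be formatted.
function alertKey(alert: AlertLike): string {
  try {
    return `${alert.fingerprint}-${alert.lastReceived.toISOString()}`;
  } catch {
    return alert.fingerprint;
  }
}

// Later alerts overwrite earlier ones that share a key.
function dedupeAlerts(alerts: AlertLike[]): AlertLike[] {
  const unique = new Map<string, AlertLike>();
  alerts.forEach((alert) => unique.set(alertKey(alert), alert));
  const result: AlertLike[] = [];
  unique.forEach((alert) => result.push(alert));
  return result;
}
```

Two alerts with the same fingerprint and the same `lastReceived` collapse into one entry, while the same fingerprint with a newer `lastReceived` is kept as a separate entry.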
16 changes: 13 additions & 3 deletions keep-ui/app/command-menu.tsx
@@ -15,6 +15,10 @@ import {
KeyIcon,
BriefcaseIcon,
} from "@heroicons/react/24/outline";
import { VscDebugDisconnect } from "react-icons/vsc";
import { LuWorkflow } from "react-icons/lu";
import { AiOutlineAlert } from "react-icons/ai";
import { MdOutlineEngineering } from "react-icons/md";

import "../styles/linear.scss";

@@ -95,19 +99,25 @@ export function CMDK() {

const navigationItems = [
{
icon: <ConnectIntegrationIcon />,
icon: <VscDebugDisconnect />,
label: "Go to the providers page",
shortcut: ["p"],
navigate: "/providers",
},
{
icon: <GoToConsoleIcon />,
icon: <AiOutlineAlert />,
label: "Go to alert console",
shortcut: ["g"],
navigate: "/alerts",
},
{
icon: <BriefcaseIcon />,
icon: <MdOutlineEngineering />,
label: "Go to alert groups",
shortcut: ["g"],
navigate: "/rules",
},
{
icon: <LuWorkflow />,
label: "Go to the workflows page",
shortcut: ["wf"],
navigate: "/workflows",
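
In the hunk above, "Go to alert console" and the new "Go to alert groups" entry both use the shortcut `["g"]`. Whether that is intentional is not stated in the commit. A small check like the following sketch could flag such overlaps; `NavigationItem` and `findShortcutConflicts` are hypothetical names, and the `icon` field is omitted for brevity.

```typescript
// Hypothetical subset of a command-menu entry.
interface NavigationItem {
  label: string;
  shortcut: string[];
  navigate: string;
}

// Report every shortcut that is bound to more than one label.
function findShortcutConflicts(
  items: NavigationItem[]
): { shortcut: string; labels: string[] }[] {
  const labelsByShortcut = new Map<string, string[]>();
  for (const item of items) {
    const key = item.shortcut.join("+");
    const labels = labelsByShortcut.get(key) ?? [];
    labels.push(item.label);
    labelsByShortcut.set(key, labels);
  }
  const conflicts: { shortcut: string; labels: string[] }[] = [];
  labelsByShortcut.forEach((labels, shortcut) => {
    if (labels.length > 1) conflicts.push({ shortcut, labels });
  });
  return conflicts;
}

// The four entries as they appear after this change.
const navigationItems: NavigationItem[] = [
  { label: "Go to the providers page", shortcut: ["p"], navigate: "/providers" },
  { label: "Go to alert console", shortcut: ["g"], navigate: "/alerts" },
  { label: "Go to alert groups", shortcut: ["g"], navigate: "/rules" },
  { label: "Go to the workflows page", shortcut: ["wf"], navigate: "/workflows" },
];

const shortcutConflicts = findShortcutConflicts(navigationItems);
```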
4 changes: 4 additions & 0 deletions keep-ui/app/globals.css
@@ -7,3 +7,7 @@
display: none;
}
*/
[role="tooltip"] {
@apply w-72; /* This sets the width to 18rem. Adjust the number as needed. */
@apply break-words; /* This will wrap long words if needed. */
}
16 changes: 10 additions & 6 deletions keep-ui/app/navbar-inner.tsx
@@ -5,12 +5,15 @@ import { signOut } from "next-auth/react";
import { Fragment, useState } from "react";
import {
Bars3Icon,
BellAlertIcon,
BriefcaseIcon,
DocumentTextIcon,
PuzzlePieceIcon,
XMarkIcon,
} from "@heroicons/react/24/outline";
import { VscDebugDisconnect } from "react-icons/vsc";
import { LuWorkflow } from "react-icons/lu";
import { AiOutlineAlert } from "react-icons/ai";
import { MdOutlineEngineering } from "react-icons/md";


import Link from "next/link";
import { Icon } from "@tremor/react";
import { AuthenticationType } from "utils/authenticationType";
@@ -21,9 +24,10 @@ import { InternalConfig } from "types/internal-config";
import { NameInitialsAvatar } from "react-name-initials-avatar";

const navigation = [
{ name: "Providers", href: "/providers", icon: PuzzlePieceIcon },
{ name: "Alerts", href: "/alerts", icon: BellAlertIcon },
{ name: "Workflows", href: "/workflows", icon: BriefcaseIcon },
{ name: "Providers", href: "/providers", icon: VscDebugDisconnect },
{ name: "Alerts", href: "/alerts", icon: AiOutlineAlert },
{ name: "Alert Groups", href: "/rules", icon: MdOutlineEngineering},
{ name: "Workflows", href: "/workflows", icon: LuWorkflow }
// {
// name: "Notifications Hub",
// href: "/notifications-hub",
15 changes: 15 additions & 0 deletions keep-ui/app/rules/layout.tsx
@@ -0,0 +1,15 @@
import { Title, Subtitle } from "@tremor/react";

export default function Layout({ children }: { children: any }) {
return (
<>
<main className="p-4 md:p-10 mx-auto max-w-full">
<Title>Alert Groups</Title>
<Subtitle>
Group multiple alerts into a single alert
</Subtitle>
{children}
</main>
</>
);
}
11 changes: 11 additions & 0 deletions keep-ui/app/rules/page.tsx
@@ -0,0 +1,11 @@
import RulesPage from "./rules.client";


export default function Page() {
return <RulesPage />;
}

export const metadata = {
title: "Keep - Rules",
description: "Create Keep Rules.",
};

1 comment on commit 46aa3a7

vercel bot commented on 46aa3a7 Dec 18, 2023

Successfully deployed to the following URLs:

keep – ./

keep-eight.vercel.app
keep-keephq.vercel.app
platform.keephq.dev
keep-git-main-keephq.vercel.app
