From e9daaa100da879ab03ab914b0aa5b446dccb1a0b Mon Sep 17 00:00:00 2001
From: Documentation Bot
CLARA can utilize OpenTelemetry traces as a data source for finding components and, to some extent, component types, as well as communications between components. For that feature to work correctly, it is crucial to have instrumented applications and an OpenTelemetry Collector running in the cluster as described below.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#concept","title":"Concept","text":"\"OpenTelemetry is an Observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs.\" Traces and metrics are generated by each component individually and are forwarded to an OpenTelemetry collector, which processes the telemetry data and distributes it to a backend that uses it. CLARA can be seen as such a backend: it offers a gRPC endpoint to which the oTel-collector forwards the traces. CLARA then iterates over the traces and extracts information about components and their communications from them.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#setup","title":"Setup","text":"When using OpenTelemetry for CLARA, you first need to ensure your software components are instrumented with OpenTelemetry traces. If not, consider using OpenTelemetry auto-instrumentation.
OpenTelemetry Semantic Conventions
Because OpenTelemetry traces' attributes are not standardized, it is recommended to use tracing with the OpenTelemetry semantic conventions for CLARA. If your services do not provide them, you can try to set up the OpenTelemetry auto-instrumentation on top of your system.
Second, ensure that an OpenTelemetry collector with the matching configuration is running in your cluster. Third, when you use CLARA on a local machine and do not deploy it in the cluster, you need to forward the traces from the OpenTelemetry collector to your local machine. The open-source tool ktunnel can be used to achieve this.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#opentelemetry-collector","title":"OpenTelemetry Collector","text":"The OpenTelemetry collector is a default component provided by OpenTelemetry itself. For CLARA only traces are used, thus the minimal configuration The image can be used to deploy a container with a suitable configuration as shown below. Examples for service and deployment configurations can be found in clara/deployment/open-telemetry-collector/deployment.yml
An example ConfigMap for the oTel-collector deployment:
apiVersion: v1\nkind: ConfigMap\nmetadata:\n name: otel-collector-conf\n labels:\n app: otel-collector-conf\n component: otel-collector-conf\ndata:\n otel-collector-conf: |\n receivers:\n otlp:\n protocols:\n grpc:\n http:\n\n processors:\n\n exporters:\n otlp:\n endpoint: \"localhost:7878\"\n tls:\n insecure: true\n\n service:\n pipelines:\n traces:\n receivers: [otlp]\n processors: []\n exporters: [otlp]\n
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#ktunnel","title":"ktunnel","text":"ktunnel is an open-source tool that enables reverse port-forwarding to extract data out of Kubernetes clusters. In order to use CLARA on a local machine, a ktunnel sidecar can be attached to the OpenTelemetry collector deployment using: ktunnel inject deployment otel-collector-deployment 7878 -n <namespace>
For further information see the ktunnel docs.
OpenTelemetry Auto-instrumentation can be used to generate OpenTelemetry traces for software components in Kubernetes clusters that are not instrumented themselves. This works by applying sidecar containers to each service that is not yet instrumented; the sidecars capture the network traffic and generate traces from it. For documentation on installation please see the official docs from OpenTelemetry.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#aggregation-algorithms","title":"Aggregation Algorithms","text":"After the OpenTelemetry aggregator has finished collecting the traces, the algorithm iterates over all traces and extracts architectural information from them. An OpenTelemetry span can be of one of five kinds: Producer, Consumer, Client, Server and Internal. Internal spans are ignored. Client and Server spans, as well as Producer and Consumer spans, are analyzed separately from each other, as described below.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#client-server","title":"Client-Server","text":"Spans of Client and Server kind are analyzed the following way:
A client as well as a server span can disclose information about the sending component as well as the respective other component in the communication. Therefore, from each span two possible components are obtained, that receive all the information available.
As there is no definitive standard for the naming and the values of span-attributes, the following logic is applied to extract information from span attributes:
Producers and Consumers of message-oriented communications are specifically tagged, because there is no directly observable communication between the source and the target component. Based on the semantic conventions, however, a producer-span contains the messaging-destination which can be used to obtain the communication. Therefore, the recovery looks as follows:
As each analyzed span creates at least two component objects, those need to be merged into a consistent pattern. The merging is done the following way:
The component objects finally need to be mapped to the CLARA-wide internal component and communication representation. Component objects containing a service-name are mapped to an \"internal\" component object, components without a service-name are mapped to an \"external\" one. Communications are mapped if a matching component via service-name or hostname for source and target can be found.
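The mapping rule described above can be illustrated with a minimal Python sketch. The `SpanComponent` class, its fields and the helper functions are hypothetical names chosen for illustration; they mirror the description only and are not CLARA's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class SpanComponent:
    # Hypothetical intermediate object built from span attributes.
    service_name: Optional[str] = None
    hostname: Optional[str] = None


def classify(component: SpanComponent) -> str:
    """Return "internal" if a service name is known, otherwise "external"."""
    return "internal" if component.service_name else "external"


def find_component(components: Iterable[SpanComponent],
                   key: SpanComponent) -> Optional[SpanComponent]:
    """Look up a component by service name first, then by hostname."""
    candidates = list(components)
    for candidate in candidates:
        if key.service_name and candidate.service_name == key.service_name:
            return candidate
    for candidate in candidates:
        if key.hostname and candidate.hostname == key.hostname:
            return candidate
    return None


def map_communication(components: Iterable[SpanComponent],
                      source: SpanComponent,
                      target: SpanComponent
                      ) -> Optional[Tuple[SpanComponent, SpanComponent]]:
    """Map a communication only if both ends resolve to a known component."""
    candidates = list(components)
    src = find_component(candidates, source)
    dst = find_component(candidates, target)
    if src is None or dst is None:
        return None
    return (src, dst)
```

In this sketch a communication whose source or target cannot be resolved is dropped, which is one possible reading of "mapped if a matching component can be found".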
"},{"location":"aggregation/platforms/kubernetes/syft/","title":"SBOM","text":"CLARA can utilize anchore/syft to create SBOM files in SPDX format from the recovered components. This is done to extract the dependencies and external libraries of the recovered architecture.
"},{"location":"aggregation/platforms/kubernetes/syft/#concept","title":"Concept","text":"In order to get the library information of each component, CLARA passes the recovered image and version tag to syft. The syft binary then fetches the image from docker-hub, analyzes its contents and creates the SPDX files. Lastly, the SPDX files for the components are read by CLARA and each library and its version are added to the respective component.
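As a rough illustration of the last step, the following Python sketch reads library names and versions from an SPDX document. It assumes the SPDX file is in the JSON serialization (for instance produced with something like `syft <image> -o spdx-json`), where packages are listed under `packages` with `name` and `versionInfo` fields. Whether CLARA uses the JSON or another SPDX serialization is not stated here, and `libraries_from_spdx` is a hypothetical helper name.

```python
import json
from typing import List, Tuple


def libraries_from_spdx(spdx_json: str) -> List[Tuple[str, str]]:
    """Extract (name, version) pairs from an SPDX JSON document."""
    document = json.loads(spdx_json)
    libraries = []
    for package in document.get("packages", []):
        name = package.get("name")
        if not name:
            # Skip entries without a usable name.
            continue
        # versionInfo is optional in SPDX; fall back to an empty string.
        libraries.append((name, package.get("versionInfo", "")))
    return libraries
```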
"},{"location":"aggregation/platforms/kubernetes/syft/#setup","title":"Setup","text":"Install the binary from anchore/syft for your respective OS: macOS:
brew install syft\n
All OS: curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin \n
For configuration options please see the configurations page."},{"location":"concept/","title":"Concept","text":"CLARA itself is a data pipeline that collects information about an application deployed in a Kubernetes cluster from different data sources (i.e. the Kubernetes API, the Kubernetes internal DNS server and OpenTelemetry traces), merges them, filters them and exports them to visually display the architecture of the examined application.
"},{"location":"concept/#datapipeline","title":"Datapipeline","text":"The datapipeline of CLARA consists of the four main steps:
The typical configuration for CLARA is in the YAML format. Below, each available option is explained. All options with a default value are optional.
Sensitive Information & Environment Variables
Sensitive information, like usernames and passwords, does not belong in configuration files! For that reason, each configuration option can be specified by an environment variable. Instead of the actual value, the BASH-like syntax ${NAME_OF_THE_ENV_VARIABLE}
can be used to specify the environment variable.
Interpolation and specifying default values is possible as well, just have a look here! There you can also find other ways to set up the configuration to be effective, like referencing and substituting other parts of the configuration to make it more DRY.
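To make the `${NAME_OF_THE_ENV_VARIABLE}` syntax concrete, here is a minimal Python sketch of how such placeholders could be resolved. It only illustrates the idea and is not how CLARA's configuration loader is implemented; `resolve_placeholders` is a hypothetical name, and default values and nested interpolation are deliberately left out.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a typical environment variable identifier.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value: str,
                         env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${NAME} in value with the matching environment variable."""
    variables = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"environment variable {name} is not set")
        return variables[name]

    return PLACEHOLDER.sub(replace, value)
```

Failing loudly on an unset variable is a design choice of this sketch; it avoids silently sending an empty client secret to a backend.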
"},{"location":"configuration/#general-configuration-options","title":"General configuration options","text":"app.log-configkube
-prefix should also get scanned by CLARA. Must be set to true when every namespace should be scanned, even when namespaces has the *
-wildcard.kube
-namespaces) set just the *
-wildcard as the only element. The *
needs to be in quotes.2024-01-01T00:00:00Z
)Equals
, Prefix
, Suffix
, Contains
)Equals
needs the same names, Prefix
and Suffix
need to have matching strings on the start or the end respectively, Contains
needs that one string is part of the other.true
, CLARA will define communications that go via a message broker directly between the components and removes the communications to the message broker. If false
it shows the communications via the message broker. true
, the endpoints of the components are filtered out before the export, to improve visibility in complex architectures.true
, the versions of the components are filtered out before the export, to reduce updates when components are often released.true
, CLARA will export the recovered architecture using the enabled exporters, even if the architecture is completely empty. This could be useful for debugging purposes.BMP
, DOT
, GIF
, JPG
, JPEG
, JSON
, PDF
, PNG
, SVG
, TIFF
)SVG
is known to work well and in most situations the best choice.generated/architecture.svg
Delete
or Modify
)app:\n log-config: true\n block-after-finish: false\n\naggregation:\n platforms:\n kubernetes:\n include-kube-namespaces: false\n namespaces:\n - abc\n - xyz\n aggregators:\n kube-api:\n enable: true\n dns:\n enable: true\n logs-since-time: 2024-02-01T00:00:00Z\n open-telemetry:\n enable: true\n listen-port: 7878\n listen-duration: 45 minutes\n syft-sbom:\n enable: true\n sbom-file-path: sbom/\n use-stored-sbom-files: false\n\nmerge:\n comparison-strategy: Equals\n show-messaging-communications-directly: true\n\nfilter:\n remove-component-endpoints: false\n remove-components-by-names:\n - otel-collector-service\n\nexport:\n on-empty: false\n exporters:\n graphviz:\n enable: true\n output-type: SVG\n output-file: generated/architecture.svg\n gropius:\n enable: true\n project-id: aaaaaaaa-1111-bbbb-2222-cccccccccccc\n graphql-backend-url: http://my.backend.com:8080/graphql\n graphql-backend-authentication:\n authentication-url: http://my.backend.com:3000/authenticate/oauth/xxxxxxxx-1111-yyyy-2222-zzzzzzzzzzzz/token\n client-id: ${CLARA_GROPIUS_GRAPHQL_CLIENT_ID}\n client-secret: ${CLARA_GROPIUS_GRAPHQL_CLIENT_SECRET}\n
"},{"location":"export/","title":"Export","text":"CLARA offers two different ways for exporting the aggregated Architecture:
Gropius is an open-source cross-component issue management system for component-based architectures. In order to enable managing cross-component dependencies, users can model component-based software architectures in a Gropius project, e.g. via the API. For more details on Gropius visit the GitHub Page.
For configuration options of the export please check out the configurations page.
"},{"location":"export/gropius/#data-model","title":"Data Model","text":"The data model of Gropius consists of components which can be specified with templates as well as relations between those components, also configurable via templates. A component must have a component template and a repository-URL in order to be added to a project, which resembles an architecture.
CLARA components are mapped to Gropius-components like this:
| CLARA Metamodel | Gropius Metamodel |
| --- | --- |
| InternalComponent | Component |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Name | \u00a0\u00a0\u00a0\u00a0Component.Name |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.IpAddress | \u00a0\u00a0\u00a0\u00a0Component.Description |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Version | \u00a0\u00a0\u00a0\u00a0Component.ComponentVersion |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Namespace | MISSING |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Endpoints | MISSING (Note, that Gropius is capable of modeling interfaces, yet due to a lack of time this is not performed in the current work.) |
| MISSING | \u00a0\u00a0\u00a0\u00a0Component.RepositoryURL (Example URL) |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Type | \u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Database | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Database Temp. |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Microservice | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Microservice Temp. |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Messaging | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Messaging Temp. |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0null | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Base Component Temp. |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Libraries | \u00a0\u00a0\u00a0\u00a0Components |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Version | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.ComponentVersion |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Name | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.Name |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Name | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.Description |
| MISSING | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate (Library Temp.) |
| \u00a0\u00a0\u00a0\u00a0InternalComponent.Library | \u00a0\u00a0\u00a0\u00a0Relation |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0InternalComponent.Version | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.Start |
| \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Version | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.End |
| MISSING | \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.RelationTemplate (Includes Temp.) |
| ExternalComponent | Component |
| \u00a0\u00a0\u00a0\u00a0ExternalComponent.Name | \u00a0\u00a0\u00a0\u00a0Component.Name |
| \u00a0\u00a0\u00a0\u00a0ExternalComponent.Domain | \u00a0\u00a0\u00a0\u00a0Component.Description |
| \u00a0\u00a0\u00a0\u00a0ExternalComponent.Type | \u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate (Misc Temp.) |
| \u00a0\u00a0\u00a0\u00a0ExternalComponent.Version | \u00a0\u00a0\u00a0\u00a0Component.ComponentVersion |
| Communication | Relation |
| \u00a0\u00a0\u00a0\u00a0Communication.Source.Version | \u00a0\u00a0\u00a0\u00a0Relation.Start |
| \u00a0\u00a0\u00a0\u00a0Communication.Target.Version | \u00a0\u00a0\u00a0\u00a0Relation.End |
| MISSING | \u00a0\u00a0\u00a0\u00a0Relation.RelationTemplate (Calls Temp.) |

The Gropius GraphQL API is utilized by CLARA in order to export the recovered architectures into a Gropius project.
"},{"location":"export/gropius/#export-flow","title":"Export Flow","text":"The export works sketched like this based on the respective configuration:
For all CRUD operations there are predefined GraphQL queries, which are transformed into Kotlin models using this GraphQL Gradle plugin and executed using this GraphQL Kotlin Spring client. The GraphQL queries are located in the clara-graphql directory.
"},{"location":"export/svg/","title":"SVG Export","text":"The GraphViz SVG exporter exports a superficial representation of the architecture as displayed in the example image below.
Legend:
For configuration options please check out the configurations page.
"},{"location":"filtering/","title":"Filtering","text":"Filtering is applied as the third step in the data pipeline, see concept. Filters can be added or removed in a plug-and-play fashion. For details see the configurations page.
"},{"location":"filtering/#filtering-options","title":"Filtering options","text":"Merging is applied as the second step in the data pipeline, as shown in concept. Merging is mandatory, as the results of the different aggregations need to be merged into a homogeneous data format to retrieve a holistic picture. Further, duplications are removed. For details on configuration possibilities see the configurations page.
"},{"location":"merging/#concept","title":"Concept","text":"The following concepts and data operations are applied in the merging step of CLARA:
In CLARA, the merging of two components detected by different aggregators is defined as merging a comparison object on top of the base object. In general, the components aggregated from the Kubernetes API are considered the base components and OpenTelemetry components are considered the compare components. This is because the Kubernetes API can be perceived as the ground truth about what is deployed in the cluster.
"},{"location":"merging/#comparing","title":"Comparing","text":"In the comparison step, a matching OpenTelemetry component is searched for every Kubernetes component. Matching is currently done by name only. CLARA can be configured to match only equal names or to also match if one name contains the other (e.g. cart-pod-12345 and cart).
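The name comparison can be sketched in Python as follows. The strategy names `Equals`, `Prefix`, `Suffix` and `Contains` are taken from the configuration options; treating `Prefix`, `Suffix` and `Contains` as symmetric (either name may be the shorter one) is an assumption of this sketch, and `names_match` is a hypothetical helper, not CLARA's implementation.

```python
def names_match(base: str, other: str, strategy: str = "Equals") -> bool:
    """Compare two component names according to the chosen strategy."""
    if strategy == "Equals":
        return base == other
    if strategy == "Prefix":
        # Assumption: it is enough if either name starts with the other.
        return base.startswith(other) or other.startswith(base)
    if strategy == "Suffix":
        # Assumption: it is enough if either name ends with the other.
        return base.endswith(other) or other.endswith(base)
    if strategy == "Contains":
        # One name is a substring of the other.
        return base in other or other in base
    raise ValueError(f"unknown comparison strategy: {strategy}")
```

With the example from the text, `cart-pod-12345` and `cart` do not match under `Equals` but do match under `Contains`.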
"},{"location":"merging/#merging_1","title":"Merging","text":"In the merging step both component objects from Kubernetes and from OpenTelemetry are merged into a new final component object. Thereby, the Kubernetes component is providing the service-name, Kubernetes namespace, IP-address, and if applicable the version. The OpenTelemetry Component provides the endpoints and most likely the service type (e.g. database).
"},{"location":"merging/#dealing-with-renamed-components","title":"Dealing with Renamed Components","text":"If a merged component was matched via \"contains\" pattern matching, it is likely that the final component has a different name than the OpenTelemetry component. Thus, the relations discovered between OpenTelemetry components need to be adjusted to match the new naming.
"},{"location":"merging/#leftover-components","title":"Leftover Components","text":"All components that could not be matched are simply mapped to a final component, with whatever attributes are available, to not lose any information.
"},{"location":"merging/#communications","title":"Communications","text":"Communications do not really have to be merged, as they are simply stacked upon each other in the exporter and do not contain any meta-information except source and target.
"},{"location":"merging/#adjusting-messaging-communications","title":"Adjusting Messaging Communications","text":"Communications that are tagged as messaging communications are also adjusted in the merger. CLARA can be configured to either show communications via a message broker or filter out the message broker and show the communications between the components directly. The latter can make it easier to understand the real communication flows of an application, especially if everything runs via a message broker.
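A simplified Python sketch of the second option is shown below. It models communications as plain (source, target) name pairs and connects every producer of a broker with every consumer of that broker. This is a deliberate simplification for illustration: it ignores messaging destinations such as topics or queues, and `bypass_broker` is a hypothetical function name.

```python
from typing import Iterable, Set, Tuple

Communication = Tuple[str, str]  # (source name, target name)


def bypass_broker(communications: Iterable[Communication],
                  broker: str) -> Set[Communication]:
    """Replace communications via the broker with direct producer-consumer links."""
    comms = set(communications)
    # Components that send to the broker and components that receive from it.
    producers = {src for src, dst in comms if dst == broker and src != broker}
    consumers = {dst for src, dst in comms if src == broker and dst != broker}
    # Simplification: link each producer to each consumer, skipping self-loops.
    direct = {(p, c) for p in producers for c in consumers if p != c}
    # Keep every communication that does not involve the broker at all.
    untouched = {(src, dst) for src, dst in comms if broker not in (src, dst)}
    return untouched | direct
```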
"},{"location":"setup/","title":"Setup instructions for CLARA step-by-step","text":"These instructions will walk you through the initial installation and setup of the CLARA project. A deployed instance of the Gropius project as well as access to a Kubernetes cluster are required.
"},{"location":"setup/#1-prerequisites","title":"1. Prerequisites","text":""},{"location":"setup/#11-getting-clara","title":"1.1. Getting CLARA","text":"Clone the CLARA repository.
git clone https://github.com/ccims/clara.git\n
or git clone git@github.com:ccims/clara.git\n
Java Installation
Make sure you have at least a Java 21 JVM installed and configured on your machine.
"},{"location":"setup/#12-kube-api","title":"1.2. kube-api","text":"ktunnel allows CLARA to stream data from inside the cluster to the outside, thus not needing to be deployed inside the cluster.
Either use homebrew:
brew tap omrikiei/ktunnel && brew install omrikiei/ktunnel/ktunnel\n
or fetch the binaries from the release page."},{"location":"setup/#2-aggregator-setup-and-configuration","title":"2. Aggregator Setup and Configuration","text":"CLARA relies on different aggregation components that each need individual preparation. Although no aggregator is mandatory, it is recommended to go through the setup of all of the following aggregators.
"},{"location":"setup/#21-opentelemetry-auto-instrumentation","title":"2.1 OpenTelemetry auto-instrumentation","text":"CLARA utilizes the OpenTelemetry auto-instrumentation to add spans to the cluster's communication.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.4/cert-manager.yaml\n
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml\n
kubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/configmap.yml\nkubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/deployment.yml \n
kubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/autoinstrumentation.yml\n
deployment.yaml
individually: spec:\n template:\n metadata:\n annotations: \n # choose one or more of the following for each deployment\n instrumentation.opentelemetry.io/inject-java: \"true\"\n instrumentation.opentelemetry.io/inject-dotnet: \"true\" \n instrumentation.opentelemetry.io/inject-go: \"true\" \n instrumentation.opentelemetry.io/inject-nodejs: \"true\" \n instrumentation.opentelemetry.io/inject-python: \"true\" \n
Ensure you can access and if necessary configure the kube-dns in the kube-system namespace. When using a managed cluster from a service provider, changes to core components of Kubernetes might not be allowed directly. Please consult the documentation of your respective provider.
kubectl logs -l k8s-app=kube-dns -n kube-system\n
[INFO] 10.244.0.19:35065 - 3179 \"A IN kubernetes.default.svc.cluster.local.svc.cluster.local. udp 72 false 512\" NXDOMAIN qr,aa,rd 165 0.0000838s\n
CLARA uses syft to generate SBOMs from container images.
brew install syft\n
All OS: curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin \n
<path-to-clara>/clara-app/src/main/resources/config.yml
config.yml
please insert your specific URLs and authorization information for accessing your deployed Gropius instance. Sensitive credentials are prepared to be set as environment variables.
./gradlew clean build standaloneJar\n
export CLARA_GROPIUS_GRAPHQL_CLIENT_ID=<your-client-id>\nexport CLARA_GROPIUS_GRAPHQL_CLIENT_SECRET=<your-client-secret>\nexport CLARA_GROPIUS_PROJECT_ID=<your-project-id>\n
java -jar clara-app/build/libs/clara-app-*.jar\n
ktunnel inject deployment otel-collector-deployment 7878 -n <your-namespace>\n
CLARA has been evaluated against the T2-Project Reference-Architecture (Microservices).
Follow this guideline step-by-step to recreate the evaluation of CLARA using the T2-Project, locally on a minikube cluster.
"},{"location":"validation/t2-reference-architecture/#step-by-step-setup-and-execution-instructions","title":"Step-By-Step Setup and Execution Instructions","text":"The setup consists of the Gropius setup, the minikube setup, the T2-Project setup, and the CLARA setup.
"},{"location":"validation/t2-reference-architecture/#1-gropius-setup","title":"1. Gropius Setup","text":""},{"location":"validation/t2-reference-architecture/#11-getting-gropius","title":"1.1 Getting Gropius","text":"Clone the Gropius repository recursively with all submodules.
git clone --recurse-submodules https://github.com/ccims/gropius.git\n
or git clone --recurse-submodules git@github.com:ccims/gropius.git\n
Docker Installation
Make sure you have a local container environment (e.g. Docker) installed and configured on your machine.
docker-compose -f docker-compose-testing.yaml up -d\n
Admin
in the top menu, then select OAuth2
on the left menu.+
to create a new OAuth2 client.CLARA
http://localhost:7878
Admin
account.requires secret
.is valid
.Create auth client
.CLARA
in the list.ID
-icon and copy the client-id and store it where you find it again.Projects
in the top menu.+
to create a Project:https://example.org
create project
.git clone https://github.com/ccims/template-importer.git\n
or git clone git@github.com:ccims/template-importer.git\n
npm i\nnpm run build\n
npm start <path-to-clara>/gropius_templates.json <your-client-id> <your-client-secret> http://localhost:4200\n
minikube start\n
kubectl ctx\n
clara
in the minikube cluster: kubectl create ns clara\n
git clone https://github.com/t2-project/devops.git\n
or git clone git@github.com:t2-project/devops.git\n
devops/k8s/t2-microservices/base
, where you find the deployment manifests for the T2-Project microservices.Deployment
part of the respective yaml file (except for the postgres services): spec:\n template:\n metadata:\n annotations: \n instrumentation.opentelemetry.io/inject-java: \"true\"\n
clara
as the target namespace.devops
-repository navigate back to devops/k8s
and execute the following to install the T2-Project into the cluster: chmod +x ./start-microservices.sh\n./start-microservices.sh clara\n
kubectl -n clara describe pod <any-pod>
and ensure they have OTLP
attributes inside the description-yaml.kubectl -n clara port-forward svc/ui 7000:80\n
CLARA is short for CLuster Architecture Recovery Assistent. CLARA can help by extracting component-based software architectures from applications deployed in Kubernetes clusters and exporting them into a visually appealing format. Thus, IT architects who need insights about the components actually deployed in the cluster can be assisted by using CLARA.
For details on the functionality of CLARA see the concept page. For information on configuration options of CLARA see the configurations page. For information on the validation of CLARA's functionality see the validation page for the T2 Reference Architecture.
CLARA is open-sourced under the MIT License.
"},{"location":"aggregation/","title":"Aggregation","text":"For aggregating data from the cluster four different data sources are used in CLARA:
Aggregations can be enabled/disabled, yet it is recommended to always use all available aggregators to get a holistic view of the examined architecture. For details see configurations.
"},{"location":"aggregation/platforms/kubernetes/api/","title":"API","text":"CLARA utilizes the Kubernetes API to retrieve basic information about the pods, services and deployments running in the cluster.
The fabric8 Kubernetes client is used by CLARA to communicate with the Kubernetes API.
"},{"location":"aggregation/platforms/kubernetes/api/#concept","title":"Concept","text":"The Kubernetes API is queried for pods and services from the configured namespace (see configurations). Pods are then matched to a service, and all services with their respective pods as well as all unmatched pods are passed into the datapipeline.
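The matching step can be illustrated with a short Python sketch. It assumes that pods are matched to services the way Kubernetes itself selects them, that is, a pod belongs to a service when all key/value pairs of the service's label selector appear in the pod's labels. The text does not state which criterion CLARA uses, so this is an assumption, and the plain dictionaries stand in for the real API objects.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def match_pods_to_services(services: Dict[str, Labels],
                           pods: Dict[str, Labels]
                           ) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group pod names by service and collect pods that match no service.

    services maps a service name to its label selector,
    pods maps a pod name to its labels.
    """
    matched: Dict[str, List[str]] = {name: [] for name in services}
    unmatched: List[str] = []
    for pod_name, labels in pods.items():
        owners = [
            service_name
            for service_name, selector in services.items()
            # A service without a selector does not select any pods.
            if selector and all(labels.get(k) == v for k, v in selector.items())
        ]
        if owners:
            for service_name in owners:
                matched[service_name].append(pod_name)
        else:
            unmatched.append(pod_name)
    return matched, unmatched
```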
"},{"location":"aggregation/platforms/kubernetes/dns/","title":"DNS","text":"CLARA can analyze the logs of CoreDNS (the default Kubernetes DNS server) to discover communication of components via DNS queries. For that feature to work correctly, it is crucial that the DNS server is configured to log DNS queries by enabling the log
plugin.
apiVersion: v1\nkind: ConfigMap\nmetadata:\n name: coredns\n namespace: kube-system\ndata:\n Corefile: |\n .:53 {\n log\n errors\n health\n ready\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n pods insecure\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . /etc/resolv.conf\n cache 30\n loop\n reload\n loadbalance\n }\n
Other DNS servers
Your cluster might come with additional DNS servers to reduce the load. A prominent example is the node-local-dns for caching DNS. There, you must also enable the log
plugin.
Compatible DNS servers
Because CLARA analyzes the logged DNS queries,
Currently, CLARA analyzes all logs from the pods with the labels k8s-app=kube-dns
or k8s-app=node-local-dns
in the namespace kube-system
.
Using a managed Kubernetes cluster from a service provider
When using a managed cluster from a service provider, changes to core components of Kubernetes might not be allowed directly. Please consult the documentation of your respective provider.
"},{"location":"aggregation/platforms/kubernetes/dns/#digitalocean","title":"DigitalOcean","text":"For DigitalOcean, the correct way of enabling logging is to create a special ConfigMap:
ConfigMap to activate query logging for CoreDNS in a Kubernetes cluster managed by DigitalOceanapiVersion: v1\nkind: ConfigMap\nmetadata:\n name: coredns-custom\n namespace: kube-system\ndata:\n log.override: |\n log\n
"},{"location":"aggregation/platforms/kubernetes/dns/#dns-debugging","title":"DNS debugging","text":"As described in the Kubernetes Documentation, you can use dnsutils to debug DNS resolution. For CLARA, this is also a simple way of creating DNS queries explicitly and checking if CLARA detects the communication. Just create a dnsutils-pod with the following manifest:
apiVersion: v1\nkind: Pod\nmetadata:\n name: dnsutils\n namespace: default\nspec:\n containers:\n - name: dnsutils\n image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.7\n command:\n - sleep\n - \"infinity\"\n imagePullPolicy: IfNotPresent\n restartPolicy: Always\n
Then you can use the following command to execute DNS queries:
kubectl exec -it dnsutils -n default -- nslookup google.com\n
Execute the following command to check the DNS server logs:
kubectl logs -l k8s-app=kube-dns -n kube-system\n
"},{"location":"aggregation/platforms/kubernetes/dns/#concept","title":"Concept","text":"The DNS log analysis uses the information obtained from the Kubernetes API to match the hostnames and IP addresses in a DNS log to components of the cluster. An example log line looks like this and reveals the source and target of a communication.
[INFO] 10.244.0.19:35065 - 3179 \"A IN kubernetes.default.svc.cluster.local.svc.cluster.local. udp 72 false 512\" NXDOMAIN qr,aa,rd 165 0.0000838s\n
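How such a log line yields source and target can be sketched as follows. This is a minimal example that assumes the default CoreDNS log format shown above; the function name is hypothetical, and the actual parsing in CLARA may differ.

```python
def parse_coredns_query(line):
    # Returns (source_ip, query_type, hostname, rcode), or None if the line
    # does not look like a query entry in the default CoreDNS log format.
    tokens = line.split()
    if len(tokens) < 12 or tokens[0] != '[INFO]':
        return None
    if not tokens[4].startswith('\"'):
        return None
    # The source is logged as ip:port; split at the last colon.
    source_ip, _, _port = tokens[1].rpartition(':')
    query_type = tokens[4].lstrip('\"')
    # Drop the trailing dot of the fully qualified name.
    hostname = tokens[6].rstrip('.')
    rcode = tokens[11]
    return source_ip, query_type, hostname, rcode


line = ('[INFO] 10.244.0.19:35065 - 3179 '
        '\"A IN kubernetes.default.svc.cluster.local.svc.cluster.local. '
        'udp 72 false 512\" NXDOMAIN qr,aa,rd 165 0.0000838s')
print(parse_coredns_query(line))
```

The source IP can then be looked up among the pod IPs known from the Kubernetes API, and the hostname among the service names.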
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/","title":"OpenTelemetry","text":"CLARA can utilize OpenTelemetry traces as data source for finding components and to some extent component types, as well as communications between components. For that feature to work correctly, it is crucial to have instrumented applications and an OpenTelemetry Collector running in the cluster as described below.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#concept","title":"Concept","text":"\"OpenTelemetry is an Observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs.\" Traces and metrics are generated by each component individually and are forwarded to an OpenTelemetry collector, which processes the telemetry data and distributes it to a backend which utilizes it. CLARA can be seen as such a backend, which offers a gRPC endpoint for the oTel-collector to forward the traces to. CLARA then iterates over the traces and extracts information about components and their communications from that.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#setup","title":"Setup","text":"When using OpenTelemetry for CLARA, you first need to ensure your software components are instrumented with OpenTelemetry traces. If not, consider using OpenTelemetry auto-instrumentation.
OpenTelemetry Semantic Conventions
Because OpenTelemetry traces' attributes are not standardized, it is recommended to use tracing with the OpenTelemetry semantic conventions for CLARA. If your services do not provide them, you can try to set up the OpenTelemetry auto-instrumentation on top of your system.
Second, ensure that an OpenTelemetry collector with the matching configuration is running in your cluster. Third, when you run CLARA on a local machine instead of deploying it in the cluster, you need to forward the traces from the OpenTelemetry collector to your local machine. The open-source tool ktunnel can be used to achieve this.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#opentelemetry-collector","title":"OpenTelemetry Collector","text":"The OpenTelemetry collector is a default component provided by OpenTelemetry itself. For CLARA only traces are used, thus the minimal configuration The image can be used to deploy a container with a suitable configuration as shown below. Examples for service and deployment configurations can be found in clara/deployment/open-telemetry-collector/deployment.yml
An example ConfigMap for the oTel-collector deploymentapiVersion: v1\nkind: ConfigMap\nmetadata:\n name: otel-collector-conf\n labels:\n app: otel-collector-conf\n component: otel-collector-conf\ndata:\n otel-collector-conf: |\n receivers:\n otlp:\n protocols:\n grpc:\n http:\n\n processors:\n\n exporters:\n otlp:\n endpoint: \"localhost:7878\"\n tls:\n insecure: true\n\n service:\n pipelines:\n traces:\n receivers: [otlp]\n processors: []\n exporters: [otlp]\n
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#ktunnel","title":"ktunnel","text":"ktunnel is an open-source tool that enables reverse port-forwarding to extract data out of Kubernetes clusters. In order to use CLARA on a local machine, a ktunnel sidecar can be attached to the OpenTelemetry collector deployment using ktunnel inject deployment otel-collector-deployment 7878 -n <namespace>
For further information see the ktunnel docs.
OpenTelemetry Auto-instrumentation can be used to generate OpenTelemetry traces on software components that are not instrumented themselves in Kubernetes clusters. This works by applying sidecar containers to each yet to be instrumented service that capture the network traffic and generate traces from that. For documentation on installation please see the official docs from OpenTelemetry.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#aggregation-algorithms","title":"Aggregation Algorithms","text":"After the OpenTelemetry aggregator has finished collecting the traces, the algorithm iterates over all traces and extracts architectural information from them. An OpenTelemetry span can be of one of five kinds: Producer, Consumer, Client, Server and Internal. Internal spans are ignored. Client and Server spans as well as Producer and Consumer spans are analyzed separately from each other, as described below.
"},{"location":"aggregation/platforms/kubernetes/opentelemetry/#client-server","title":"Client-Server","text":"Spans of Client and Server kind are analyzed the following way:
A client as well as a server span can disclose information about the sending component as well as the respective other component in the communication. Therefore, from each span two possible components are obtained, that receive all the information available.
As there is no definitive standard for the naming and the values of span-attributes, the following logic is applied to extract information from span attributes:
Producers and Consumers of message-oriented communications are specifically tagged, because there is no directly observable communication between the source and the target component. Based on the semantic conventions, however, a producer-span contains the messaging-destination which can be used to obtain the communication. Therefore, the recovery looks as follows:
As each analyzed span creates at least two component objects, those need to be merged into a consistent pattern. The merging is done the following way:
The component objects finally need to be mapped to the CLARA-wide internal component and communication representation. Component objects containing a service-name are mapped to an \"internal\" component object, components without a service-name are mapped to an \"external\" one. Communications are mapped if a matching component via service-name or hostname for source and target can be found.
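The mapping rule above can be sketched roughly as follows. The dict keys and function names are hypothetical and only illustrate the rule; they are not CLARA's actual data model.

```python
def map_components(component_objects):
    # Objects with a service name become internal components,
    # all others become external components.
    internal, external = [], []
    for obj in component_objects:
        if obj.get('service_name'):
            internal.append(obj)
        else:
            external.append(obj)
    return internal, external


def map_communications(raw_communications, component_objects):
    # A communication is kept only if both its source and its target can be
    # resolved to a known component via service name or hostname.
    known = set()
    for obj in component_objects:
        for key in ('service_name', 'hostname'):
            if obj.get(key):
                known.add(obj[key])
    return [(s, t) for s, t in raw_communications if s in known and t in known]


# Illustrative data only.
objects = [
    {'service_name': 'cart', 'hostname': 'cart.clara.svc'},
    {'hostname': 'api.example.org'},
]
internal, external = map_components(objects)
kept = map_communications(
    [('cart', 'api.example.org'), ('cart', 'unknown-host')], objects)
print(len(internal), len(external), kept)
```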
"},{"location":"aggregation/platforms/kubernetes/syft/","title":"SBOM","text":"CLARA can utilize anchore/syft to create SBOM files in SPDX format from the recovered components. This is done to extract the dependencies and external libraries of the recovered architecture.
"},{"location":"aggregation/platforms/kubernetes/syft/#concept","title":"Concept","text":"In order to get the library information of each component, CLARA passes the recovered image and version tag to syft. The syft binary then fetches the image from Docker Hub, analyzes its contents, and creates the SPDX files. Lastly, CLARA reads the SPDX files for the components and adds each library and its version to the respective component.
"},{"location":"aggregation/platforms/kubernetes/syft/#setup","title":"Setup","text":"Install the binary from anchore/syft for your respective OS: macOS:
brew install syft\n
All OS: curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin \n
For configuration options please see the configurations page."},{"location":"concept/","title":"Concept","text":"CLARA itself is a data pipeline that collects information about an application deployed in a Kubernetes cluster from different data sources (i.e. the Kubernetes API, the Kubernetes internal DNS server and OpenTelemetry traces), merges them, filters them and exports them to visually display the architecture of the examined application.
"},{"location":"concept/#datapipeline","title":"Datapipeline","text":"The datapipeline of CLARA consists of the four main steps:
The typical configuration for CLARA is in the YAML format. Below, each available option is explained. All options with a default value are optional.
Sensitive Information & Environment Variables
Sensitive information, like usernames and passwords, does not belong in configuration files! For that reason, each configuration option can be specified by an environment variable. Instead of the actual value, the BASH-like syntax ${NAME_OF_THE_ENV_VARIABLE}
can be used to specify the environment variable.
Interpolation and specifying default values is possible as well, just have a look here! There you can also find other ways to set up the configuration to be effective, like referencing and substituting other parts of the configuration to make it more DRY.
"},{"location":"configuration/#general-configuration-options","title":"General configuration options","text":"app.log-configkube
-prefix should also get scanned by CLARA. Must be set to true when every namespace should be scanned, even when namespaces has the *
-wildcard.kube
-namespaces) set just the *
-wildcard as the only element. The *
needs to be in quotes.2024-01-01T00:00:00Z
)Equals
, Prefix
, Suffix
, Contains
)Equals
needs the same names, Prefix
and Suffix
need to have matching strings on the start or the end respectively, Contains
needs that one string is part of the other.true
, CLARA will define communications that go via a message broker directly between the components and remove the communications to the message broker. If false
, it shows the communications via the message broker. true
, the endpoints of the components are filtered out before the export, to improve visibility in complex architectures.true
, the versions of the components are filtered out before the export, to reduce updates when components are often released.true
, CLARA will export the recovered architecture using the enabled exporters, even if the architecture is completely empty. This could be useful for debugging purposes.BMP
, DOT
, GIF
, JPG
, JPEG
, JSON
, PDF
, PNG
, SVG
, TIFF
)SVG
is known to work well and in most situations the best choice.generated/architecture.svg
Delete
or Modify
)app:\n log-config: true\n block-after-finish: false\n\naggregation:\n platforms:\n kubernetes:\n include-kube-namespaces: false\n namespaces:\n - abc\n - xyz\n aggregators:\n kube-api:\n enable: true\n dns:\n enable: true\n logs-since-time: 2024-02-01T00:00:00Z\n open-telemetry:\n enable: true\n listen-port: 7878\n listen-duration: 45 minutes\n syft-sbom:\n enable: true\n sbom-file-path: sbom/\n use-stored-sbom-files: false\n\nmerge:\n comparison-strategy: Equals\n show-messaging-communications-directly: true\n\nfilter:\n remove-component-endpoints: false\n remove-components-by-names:\n - otel-collector-service\n\nexport:\n on-empty: false\n exporters:\n graphviz:\n enable: true\n output-type: SVG\n output-file: generated/architecture.svg\n gropius:\n enable: true\n project-id: aaaaaaaa-1111-bbbb-2222-cccccccccccc\n graphql-backend-url: http://my.backend.com:8080/graphql\n graphql-backend-authentication:\n authentication-url: http://my.backend.com:3000/authenticate/oauth/xxxxxxxx-1111-yyyy-2222-zzzzzzzzzzzz/token\n client-id: ${CLARA_GROPIUS_GRAPHQL_CLIENT_ID}\n client-secret: ${CLARA_GROPIUS_GRAPHQL_CLIENT_SECRET}\n
"},{"location":"export/","title":"Export","text":"CLARA offers two different ways for exporting the aggregated Architecture:
Gropius is an open-source cross-component issue management system for component-based architectures. In order to enable managing cross-component dependencies, users can model component-based software architectures in a Gropius project, e.g. via the API. For more details on Gropius visit the GitHub Page.
For configuration options of the export please check out the configurations page.
"},{"location":"export/gropius/#data-model","title":"Data Model","text":"The data model of Gropius consists of components which can be specified with templates as well as relations between those components, also configurable via templates. A component must have a component and a repository-URL in order to be added to a project, which resembles an architecture.
CLARA components are mapped to Gropius-components like this:
CLARA Metamodel Gropius Metamodel InternalComponent Component \u00a0\u00a0\u00a0\u00a0InternalComponent.Name \u00a0\u00a0\u00a0\u00a0Component.Name \u00a0\u00a0\u00a0\u00a0InternalComponent.IpAddress \u00a0\u00a0\u00a0\u00a0Component.Description \u00a0\u00a0\u00a0\u00a0InternalComponent.Version \u00a0\u00a0\u00a0\u00a0Component.ComponentVersion \u00a0\u00a0\u00a0\u00a0InternalComponent.Namespace MISSING \u00a0\u00a0\u00a0\u00a0InternalComponent.Endpoints MISSING (Note, that Gropius is capable of modeling interfaces, yet due to a lack of time this is not performed in the current work.) MISSING \u00a0\u00a0\u00a0\u00a0Component.RepositoryURL (Example URL) \u00a0\u00a0\u00a0\u00a0InternalComponent.Type \u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Database \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Database Temp. \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Microservice \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Microservice Temp. \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Type.Messaging \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Messaging Temp. \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0null \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Base Component Temp. \u00a0\u00a0\u00a0\u00a0InternalComponent.Libraries \u00a0\u00a0\u00a0\u00a0Components \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Version \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.ComponentVersion \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Name \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.Name \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Name \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.Description MISSING \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate (Library Temp.) 
\u00a0\u00a0\u00a0\u00a0InternalComponent.Library \u00a0\u00a0\u00a0\u00a0Relation \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0InternalComponent.Version \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.Start \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Library.Version \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.End MISSING \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Relation.RelationTemplate (Includes Temp.) ExternalComponent Component \u00a0\u00a0\u00a0\u00a0ExternalComponent.Name \u00a0\u00a0\u00a0\u00a0Component.Name \u00a0\u00a0\u00a0\u00a0ExternalComponent.Domain \u00a0\u00a0\u00a0\u00a0Component.Description \u00a0\u00a0\u00a0\u00a0ExternalComponent.Type \u00a0\u00a0\u00a0\u00a0Component.ComponentTemplate (Misc Temp.) \u00a0\u00a0\u00a0\u00a0ExternalComponent.Version \u00a0\u00a0\u00a0\u00a0Component.ComponentVersion Communication Relation \u00a0\u00a0\u00a0\u00a0Communication.Source.Version \u00a0\u00a0\u00a0\u00a0Relation.Start \u00a0\u00a0\u00a0\u00a0Communication.Target.Version \u00a0\u00a0\u00a0\u00a0Relation.End MISSING \u00a0\u00a0\u00a0\u00a0Relation.RelationTemplate (Calls Temp.)The Gropius GraphQL API is utilized by CLARA in order to export the recovered architectures into a Gropius project.
"},{"location":"export/gropius/#export-flow","title":"Export Flow","text":"The export works sketched like this based on the respective configuration:
For all CRUD operations there are predefined GraphQL queries which are transformed into Kotlin Models using this GraphQl gradle plugin and executed using this GraphQL Kotlin Spring client. The GraphQL queries are located in the clara-graphql directory.
"},{"location":"export/svg/","title":"SVG Export","text":"The GraphViz SVG exporter exports a superficial representation of the architecture as displayed in the example image below.
Legend:
For configuration options please check out the configurations page.
"},{"location":"filtering/","title":"Filtering","text":"Filtering is applied as third step in the data pipeline, see concept. Filters can be added/removed by plug-and-play. For details see the configurations page.
"},{"location":"filtering/#filtering-options","title":"Filtering options","text":"Merging is applied as the second step in the data pipeline, as shown in concept. Merging is mandatory, as the results of the different aggregations need to be merged into a homogeneous data format to obtain a holistic picture. Further, duplicates are removed. For details on configuration possibilities see the configurations page.
"},{"location":"merging/#concept","title":"Concept","text":"The following concepts and data operations are applied in the merging step of CLARA:
In CLARA the merging of two detected components by different aggregators is defined as merging a comparison object on top of the base object. In general, the components aggregated from the Kubernetes API are considered as the base component and OpenTelemetry components are considered as compare components. This is the case, because the Kubernetes API can be perceived as the ground truth about what is deployed in the cluster.
"},{"location":"merging/#comparing","title":"Comparing","text":"In the comparison step, a matching OpenTelemetry component is searched for every Kubernetes component. Matching is currently done only by name. CLARA can be configured to match only equal names or to also match if one name contains the other (e.g. cart-pod-12345 and cart).
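The name comparison can be sketched as below, using the strategy names from the configuration (Equals, Prefix, Suffix, Contains). The function itself is hypothetical, and treating Prefix and Suffix as symmetric (either name may be the shorter one) is an assumption of this sketch.

```python
def names_match(a, b, strategy='Equals'):
    # Compare two component names with one of the configured strategies.
    if strategy == 'Equals':
        return a == b
    if strategy == 'Prefix':
        return a.startswith(b) or b.startswith(a)
    if strategy == 'Suffix':
        return a.endswith(b) or b.endswith(a)
    if strategy == 'Contains':
        return a in b or b in a
    raise ValueError('unknown comparison strategy: ' + strategy)


print(names_match('cart-pod-12345', 'cart', 'Equals'))
print(names_match('cart-pod-12345', 'cart', 'Contains'))
```

With Equals the pod name and the service name from the example do not match, while Contains treats them as the same component.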
"},{"location":"merging/#merging_1","title":"Merging","text":"In the merging step both component objects from Kubernetes and from OpenTelemetry are merged into a new final component object. Thereby, the Kubernetes component is providing the service-name, Kubernetes namespace, IP-address, and if applicable the version. The OpenTelemetry Component provides the endpoints and most likely the service type (e.g. database).
"},{"location":"merging/#dealing-with-renamed-components","title":"Dealing with Renamed Components","text":"If a merged component was matched via a \"contains\" pattern, it is likely that the final component has a different name than the OpenTelemetry component. Thus, the relations discovered between OpenTelemetry components need to be adjusted to match the new naming.
"},{"location":"merging/#leftover-components","title":"Leftover Components","text":"All components that could not be matched are simply mapped to a final component, with whatever attributes are available, to not lose any information.
"},{"location":"merging/#communications","title":"Communications","text":"Communications do not really have to be merged, as they are simply stacked upon each other in the exporter and do not contain any meta-information except source and target.
"},{"location":"merging/#adjusting-messaging-communications","title":"Adjusting Messaging Communications","text":"Communications that are tagged as messaging communications are also adjusted in the merger. CLARA can be configured to either show communications via a message broker or filter out the message broker and show the communications between the components directly. The latter can make it easier to understand the real communication flows of an application, especially if everything runs via a message broker.
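The effect of this option can be illustrated with a simplified sketch. It models communications as plain (source, target) pairs and ignores message destinations such as topics or queues, which is an assumption of this example and not necessarily how CLARA resolves producers and consumers. All names are hypothetical.

```python
def adjust_messaging(communications, brokers, show_directly):
    # communications: list of (source, target) pairs
    # brokers: set of component names that are message brokers
    if not show_directly:
        return sorted(set(communications))
    result = set()
    # Keep every communication that does not involve a broker.
    for source, target in communications:
        if source not in brokers and target not in brokers:
            result.add((source, target))
    # Replace producer -> broker -> consumer by producer -> consumer.
    for broker in brokers:
        producers = [s for s, t in communications if t == broker]
        consumers = [t for s, t in communications if s == broker]
        for producer in producers:
            for consumer in consumers:
                if producer != consumer:
                    result.add((producer, consumer))
    return sorted(result)


comms = [('orders', 'rabbitmq'), ('rabbitmq', 'payment'), ('ui', 'orders')]
direct = adjust_messaging(comms, {'rabbitmq'}, show_directly=True)
via_broker = adjust_messaging(comms, {'rabbitmq'}, show_directly=False)
print(direct)
print(via_broker)
```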
"},{"location":"setup/","title":"Setup instructions for CLARA step-by-step","text":"These instructions will walk you through the initial installation and setup of the CLARA project. A deployed instance of the Gropius project as well as access to a Kubernetes cluster are required.
"},{"location":"setup/#1-prerequisites","title":"1. Prerequisites","text":""},{"location":"setup/#11-getting-clara","title":"1.1. Getting CLARA","text":"Clone the CLARA repository.
git clone https://github.com/ccims/clara.git\n
or git clone git@github.com:ccims/clara.git\n
Java Installation
Make sure you have at least a Java 21 JVM installed and configured on your machine.
"},{"location":"setup/#12-kube-api","title":"1.2. kube-api","text":"ktunnel allows CLARA to stream data from inside the cluster to the outside, thus not needing to be deployed inside the cluster.
Either use homebrew:
brew tap omrikiei/ktunnel && brew install omrikiei/ktunnel/ktunnel\n
or fetch the binaries from the release page."},{"location":"setup/#2-aggregator-setup-and-configuration","title":"2. Aggregator Setup and Configuration","text":"CLARA relies on different aggregation components that each need individual preparation. Although no single aggregator is mandatory, it is recommended to go through the setup of all of the following aggregators.
"},{"location":"setup/#21-opentelemetry-auto-instrumentation","title":"2.1 OpenTelemetry auto-instrumentation","text":"CLARA utilizes the OpenTelemetry auto-instrumentation to add spans to the cluster's communication.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.4/cert-manager.yaml\n
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml\n
kubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/configmap.yml\nkubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/deployment.yml \n
kubectl -n <target-namespace> apply -f <path-to-clara>/deployment/open-telemetry-collector/autoinstrumentation.yml\n
deployment.yaml
individually: spec:\n template:\n metadata:\n annotations: \n # choose one or more of the following for each deployment\n instrumentation.opentelemetry.io/inject-java: \"true\"\n instrumentation.opentelemetry.io/inject-dotnet: \"true\" \n instrumentation.opentelemetry.io/inject-go: \"true\" \n instrumentation.opentelemetry.io/inject-nodejs: \"true\" \n instrumentation.opentelemetry.io/inject-python: \"true\" \n
Ensure you can access and if necessary configure the kube-dns in the kube-system namespace. When using a managed cluster from a service provider, changes to core components of Kubernetes might not be allowed directly. Please consult the documentation of your respective provider.
kubectl logs -l k8s-app=kube-dns -n kube-system\n
[INFO] 10.244.0.19:35065 - 3179 \"A IN kubernetes.default.svc.cluster.local.svc.cluster.local. udp 72 false 512\" NXDOMAIN qr,aa,rd 165 0.0000838s\n
CLARA uses syft to generate SBOMs from container images.
brew install syft\n
All OS: curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin \n
<path-to-clara>/clara-app/src/main/resources/config.yml
config.yml
please insert your specific URLs and authorization information for accessing your deployed Gropius instance. Sensitive credentials are prepared to be set as environment variables../gradlew clean build standaloneJar\n
export CLARA_GROPIUS_GRAPHQL_CLIENT_ID=<your-client-id>\nexport CLARA_GROPIUS_GRAPHQL_CLIENT_SECRET=<your-client-secret>\nexport CLARA_GROPIUS_PROJECT_ID=<your-project-id>\n
java -jar clara-app/build/libs/clara-app-*.jar\n
ktunnel inject deployment otel-collector-deployment 7878 -n <your-namespace>\n
CLARA has been evaluated against the T2-Project Reference-Architecture (Microservices).
Follow this guideline step-by-step to recreate the evaluation of CLARA using the T2-Project, locally on a minikube cluster.
"},{"location":"validation/t2-reference-architecture/#step-by-step-setup-and-execution-instructions","title":"Step-By-Step Setup and Execution Instructions","text":"The setup consists of the Gropius setup, the minikube setup, the T2-Project setup, and the CLARA setup.
"},{"location":"validation/t2-reference-architecture/#1-gropius-setup","title":"1. Gropius Setup","text":""},{"location":"validation/t2-reference-architecture/#11-getting-gropius","title":"1.1 Getting Gropius","text":"Clone the Gropius repository recursively with all submodules.
git clone --recurse-submodules https://github.com/ccims/gropius.git\n
or git clone --recurse-submodules git@github.com:ccims/gropius.git\n
Docker Installation
Make sure you have a local container environment (e.g. Docker) installed and configured on your machine.
docker-compose -f docker-compose-testing.yaml up -d\n
Admin
in the top menu, then select OAuth2
on the left menu.+
to create a new OAuth2 client.CLARA
http://localhost:7878
Admin
account.requires secret
.is valid
.Create auth client
.CLARA
in the list.ID
-icon, copy the client-id, and store it where you can find it again.Projects
in the top menu.+
to create a Project:https://example.org
create project
.git clone https://github.com/ccims/template-importer.git\n
or git clone git@github.com:ccims/template-importer.git\n
npm i\nnpm run build\n
npm start <path-to-clara>/gropius_templates.json <your-client-id> <your-client-secret> http://localhost:4200\n
minikube start\n
kubectl ctx\n
clara
in the minikube cluster: kubectl create ns clara\n
clara
as the target namespace.t2-deployment
and execute the following to install the T2-Project into the cluster: chmod +x ./start-microservices.sh\n./start-microservices.sh clara\n
kubectl -n clara describe pod <any-pod>
and ensure they have OTLP
attributes inside the description-yaml.kubectl -n clara port-forward svc/ui 7000:80\n