Add support for splunk-otel-java v2.x.x (#1349)
* Add support for splunk-otel-java v2.x.x

* For Java instrumentation update OTEL_EXPORTER_OTLP_ENDPOINT 4317 -> 4318

* patch

* Regenerate expected_java_traces.yaml

* Update Java trace tests

* Update CI/CD to only update the Java Docker image to v2.x versions

* temp

* Bump Java to v2.7.0

* Add CHANGELOG.md and UPGRADING.md entries

* doc update

* test update
jvoravong authored Aug 22, 2024
1 parent f6bb73a commit 0662515
Showing 8 changed files with 295 additions and 226 deletions.
12 changes: 12 additions & 0 deletions .chloggen/update-java-2x.yaml
@@ -0,0 +1,12 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: breaking
# The name of the component, or a single word describing the area of concern, (e.g. agent, clusterReceiver, gateway, operator, chart, other)
component: operator
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Bump java from v1.32.3 to v2.7.0 in helm-charts/splunk-otel-collector/values.yaml
# One or more tracking issues related to the change
issues: [1349]
# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: This is a major upgrade. If you use Java auto-instrumentation please review the [upgrade guidelines](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/UPGRADING.md#01053-01070)
2 changes: 1 addition & 1 deletion .github/workflows/update_docker_images.yaml
@@ -26,7 +26,7 @@ jobs:
component: 'operator'
yaml_file_path: 'helm-charts/splunk-otel-collector/values.yaml'
yaml_value_path: '.operator.instrumentation.spec.java'
filter: 'v1.'
filter: 'v2.'
- name: 'nodejs'
component: 'operator'
yaml_file_path: 'helm-charts/splunk-otel-collector/values.yaml'
68 changes: 68 additions & 0 deletions UPGRADING.md
@@ -1,5 +1,73 @@
# Upgrade guidelines

## 0.105.3 to 0.107.0

The Java instrumentation for Operator auto-instrumentation has been upgraded from v1.32.3 to v2.7.0.
This major update introduces several breaking changes. The migration guide and key changes below
outline the impact.

Please refer to the [Migration guide for OpenTelemetry Java 2.x](https://docs.splunk.com/observability/en/gdi/get-data-in/application/java/migrate-metrics.html)
to update your custom dashboards, detectors, or alerts using Java application telemetry data.

### Breaking Changes Overview
- Runtime metrics are now enabled by default, which can increase the number of metrics collected.
- The default protocol changed from gRPC to http/protobuf. For custom Java exporter endpoint
configurations, verify that you’re sending data to http/protobuf endpoints like this [example](https://github.com/signalfx/splunk-otel-collector-chart/blob/splunk-otel-collector-0.107.0/examples/enable-operator-and-auto-instrumentation/rendered_manifests/operator/instrumentation.yaml#L59).
- Span Attribute Name Changes:

| Old Attribute (1.x) | New Attribute (2.x) |
| ----------------------------- | ----------------------------- |
| http.method | http.request.method |
| http.status_code | http.response.status_code |
| http.request_content_length | http.request.body.size |
| http.response_content_length | http.response.body.size |
| http.target | url.path and url.query |
| http.scheme | url.scheme |
| http.client_ip | client.address |

- Metric Name Changes:

| Old Metric (1.x) | New Metric (2.x) |
|-------------------------------------------------------------------------|------------------------------------------------------|
| db.pool.connections.create_time | db.client.connections.create_time (Histogram, ms) |
| db.pool.connections.idle.max | db.client.connections.idle.max |
| db.pool.connections.idle.min | db.client.connections.idle.min |
| db.pool.connections.max | db.client.connections.max |
| db.pool.connections.pending_threads | db.client.connections.pending_requests |
| db.pool.connections.timeouts | db.client.connections.timeouts |
| db.pool.connections.idle | db.client.connections.usage[state=idle] |
| db.pool.connections.active | db.client.connections.usage[state=used] |
| db.pool.connections.use_time | db.client.connections.use_time (Histogram, ms) |
| db.pool.connections.wait_time | db.client.connections.wait_time (Histogram, ms) |
| runtime.jvm.buffer.count | jvm.buffer.count |
| runtime.jvm.buffer.total.capacity | jvm.buffer.memory.limit |
| runtime.jvm.buffer.memory.used | jvm.buffer.memory.usage |
| runtime.jvm.classes.loaded | jvm.class.count |
| runtime.jvm.classes.unloaded | jvm.class.unloaded |
| runtime.jvm.gc.concurrent.phase.time | jvm.gc.duration (Histogram, <concurrent gcs>) |
| runtime.jvm.gc.pause | jvm.gc.duration (<non-concurrent gcs>) |
| runtime.jvm.gc.memory.allocated \| process.runtime.jvm.memory.allocated | jvm.memory.allocated* |
| runtime.jvm.memory.committed | jvm.memory.committed |
| runtime.jvm.memory.max | jvm.memory.limit |
| runtime.jvm.gc.max.data.size | jvm.memory.limit{jvm.memory.pool.name=<long lived>} |
| runtime.jvm.memory.used | jvm.memory.used |
| runtime.jvm.gc.live.data.size                                           | jvm.memory.used_after_last_gc{jvm.memory.pool.name=<long lived>} |
| runtime.jvm.threads.daemon \| runtime.jvm.threads.live | jvm.thread.count |

- Dropped Metrics:
- executor.tasks.completed
- executor.tasks.submitted
- executor.threads
- executor.threads.active
- executor.threads.core
- executor.threads.idle
- executor.threads.max
- runtime.jvm.memory.usage.after.gc
- runtime.jvm.gc.memory.promoted
- runtime.jvm.gc.overhead
- runtime.jvm.threads.peak
- runtime.jvm.threads.states
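For dashboards or detectors that still reference the 1.x span attribute names, a collector `transform` processor can temporarily copy the new 2.x attributes back to the old names while migration is in progress. This is an optional compatibility sketch, not part of this change; the processor name and the exact set of statements are assumptions you would adapt to your pipeline:

```yaml
# Hypothetical compatibility shim: copy 2.x span attribute values back to
# their 1.x names so existing dashboards keep working during migration.
processors:
  transform/java-v1-compat:
    trace_statements:
      - context: span
        statements:
          - set(attributes["http.method"], attributes["http.request.method"]) where attributes["http.request.method"] != nil
          - set(attributes["http.status_code"], attributes["http.response.status_code"]) where attributes["http.response.status_code"] != nil
          - set(attributes["http.scheme"], attributes["url.scheme"]) where attributes["url.scheme"] != nil
```

Remove the shim once all dashboards and detectors reference the new attribute names, since it doubles the attributes stored per span.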

## 0.93.0 to 0.94.0

The `networkExplorer` option is removed.
Expand Up @@ -57,10 +57,19 @@ spec:
- name: OTEL_RESOURCE_ATTRIBUTES
value: splunk.zc.method=autoinstrumentation-go:v0.10.1-alpha
java:
image: ghcr.io/signalfx/splunk-otel-java/splunk-otel-java:v1.32.3
image: ghcr.io/signalfx/splunk-otel-java/splunk-otel-java:v2.7.0
env:
- name: OTEL_RESOURCE_ATTRIBUTES
value: splunk.zc.method=splunk-otel-java:v1.32.3
value: splunk.zc.method=splunk-otel-java:v2.7.0
- name: SPLUNK_OTEL_AGENT
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
# java auto-instrumentation uses http/proto by default, so data must be sent to 4318 instead of 4317.
# See: https://github.com/open-telemetry/opentelemetry-operator#opentelemetry-auto-instrumentation-injection
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: http://$(SPLUNK_OTEL_AGENT):4318
nginx:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.4
env:
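If an environment must keep sending OTLP over gRPC on 4317 (for example, when port 4318 is not reachable), the exporter protocol can be pinned explicitly instead of relying on the agent's new http/protobuf default. A sketch of the extra env entries, assuming the standard OpenTelemetry SDK environment variables:

```yaml
# Sketch: pin the Java agent back to OTLP/gRPC on 4317 instead of the
# new http/protobuf default on 4318. Only needed if 4318 is unavailable.
env:
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: grpc
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://$(SPLUNK_OTEL_AGENT):4317
```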
12 changes: 8 additions & 4 deletions functional_tests/functional_test.go
@@ -582,16 +582,15 @@ func testNodeJSTraces(t *testing.T) {
maskScopeVersion(expectedTraces)

err = ptracetest.CompareTraces(expectedTraces, *selectedTrace,
ptracetest.IgnoreResourceAttributeValue("process.pid"),
ptracetest.IgnoreResourceAttributeValue("container.id"),
ptracetest.IgnoreResourceAttributeValue("host.arch"),
ptracetest.IgnoreResourceAttributeValue("k8s.deployment.name"),
ptracetest.IgnoreResourceAttributeValue("k8s.pod.ip"),
ptracetest.IgnoreResourceAttributeValue("k8s.pod.name"),
ptracetest.IgnoreResourceAttributeValue("k8s.pod.uid"),
ptracetest.IgnoreResourceAttributeValue("k8s.replicaset.name"),
ptracetest.IgnoreResourceAttributeValue("os.version"),
ptracetest.IgnoreResourceAttributeValue("host.arch"),
ptracetest.IgnoreResourceAttributeValue("telemetry.sdk.version"),
ptracetest.IgnoreResourceAttributeValue("process.pid"),
ptracetest.IgnoreResourceAttributeValue("splunk.distro.version"),
ptracetest.IgnoreResourceAttributeValue("process.runtime.version"),
ptracetest.IgnoreResourceAttributeValue("process.command"),
@@ -600,8 +599,11 @@
ptracetest.IgnoreResourceAttributeValue("process.owner"),
ptracetest.IgnoreResourceAttributeValue("process.runtime.description"),
ptracetest.IgnoreResourceAttributeValue("splunk.zc.method"),
ptracetest.IgnoreSpanAttributeValue("net.peer.port"),
ptracetest.IgnoreResourceAttributeValue("telemetry.distro.version"),
ptracetest.IgnoreResourceAttributeValue("telemetry.sdk.version"),
ptracetest.IgnoreSpanAttributeValue("http.user_agent"),
ptracetest.IgnoreSpanAttributeValue("net.peer.port"),
ptracetest.IgnoreSpanAttributeValue("network.peer.port"),
ptracetest.IgnoreSpanAttributeValue("os.version"),
ptracetest.IgnoreTraceID(),
ptracetest.IgnoreSpanID(),
@@ -657,9 +659,11 @@ func testJavaTraces(t *testing.T) {
ptracetest.IgnoreResourceAttributeValue("host.arch"),
ptracetest.IgnoreResourceAttributeValue("telemetry.sdk.version"),
ptracetest.IgnoreResourceAttributeValue("telemetry.auto.version"),
ptracetest.IgnoreResourceAttributeValue("telemetry.distro.version"),
ptracetest.IgnoreResourceAttributeValue("splunk.distro.version"),
ptracetest.IgnoreResourceAttributeValue("splunk.zc.method"),
ptracetest.IgnoreResourceAttributeValue("service.instance.id"),
ptracetest.IgnoreSpanAttributeValue("network.peer.port"),
ptracetest.IgnoreSpanAttributeValue("net.sock.peer.port"),
ptracetest.IgnoreSpanAttributeValue("thread.id"),
ptracetest.IgnoreSpanAttributeValue("thread.name"),