-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTLP support via HTTP and GRPC #2
Conversation
Thanks for offering to help! This will be a lot to review, so let's this split out into parts, starting with http either send or receive side. Arbitrarily, maybe start with send/translate, side. Please use exact code style, naming conventions and test approach https://github.com/openzipkin/zipkin-gcp to keep review burden down. It will be easier on me and others maintaining this to look at how gcp is working and compare approach when future maintenance occurs. Also, we have years of 👍 on that repo. Note: If this needs to support the old cc @anuraaga @reta I'll try to watch over this, but as I'm also doing this on a volunteer basis please keep an eye out in case I disappear (not volunteering you, just not sure if you pay attention to this formerly dead repo) |
also for the PR that handles the http collector side, it would make sense to do comparisons with the otel exporter (cc @jack-berg) For example, send the same otel spans in this test, citing the source of course. Then, make sure the assertions are the same result. Citing this test allows future maintainers to keep track in case certain mappings change or new spec versions vary things. Using the same data helps for the same reason. While the project structure of this repo should be the same as zipkin-gcp for the server integration (module, docker, etc), the implementation of the http handlers will be best to follow zipkin itself. For example, here is zipkin's grpc related handlers and I would expect it easiest for this part of the implementation to be similar in code and tests (even though the otel collector additionally needs translation parts similar to gcp) Notice that the grpc collector (which would be the pattern both for http and grpc here) exposes a configurator bean for armeria. So, it doesn't need to extend collector. Basically the handler part is the implementation. This also allows users to not need to track or manage a separate port for these parts as they don't conflict with any of the zipkin server paths. https://github.com/openzipkin/zipkin/blob/master/zipkin-server/src/main/java/zipkin2/server/internal/ZipkinGrpcCollector.java hope these notes help |
finally (for now ;)), if there is a goal to support default ports and compat etc, I would go with otel not necessarily jaeger for comparisons. so, cite https://opentelemetry.io/docs/specs/otlp/#otlpgrpc-default-port and if doing integration tests that use docker, prefer parts in otel orgs vs jaeger for example. What's being supported here is OTLP x.x (vs jaeger's implementation of that) so we want to be as close to the source as possible for clarity. p.s. I would first to OTLP on our standard ports and then a follow-up if we want to make a separate listener, as it keeps the amount of code and review concise. |
import zipkin2.reporter.ClosedSenderException; | ||
import zipkin2.reporter.Encoding; | ||
|
||
public final class OtelGrpcSender extends BytesMessageSender.Base { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one high level comment, I wonder if we need a gRPC sender. FWIU, all OTel implementations support HTTP (including Zipkin collector here!) and it was made the default transport of OTel at some point post-launch (https://opentelemetry.io/docs/specs/otel/protocol/exporter/#specify-protocol), I would expect future backends to possibly have both or otherwise only support HTTP.
A okhttp-based HTTP sender (AFAICT it's not part of this PR) would feel a lot closer to zipkin's other senders than a gRPC one. It could also potentially be a base for a grpc-less gRPC sender if the need for that came but IMO it's probably not needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point! if we can rule out any protocol for a while, we should. as well we should follow, not lead the norms of spec implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the gRPC stuff entirely then
I have been copying zipkin-gcp all the time to this codebase. Is there anything specific that you find offending the rules in this PR? |
I've been following whatever you have set up there. I can follow Zipkin's part, no problem |
It's already there in the |
@marcingrzejszczak sorry if I caused confusion. my notes above were about how to go about it, and I hadn't done a review, yet. Mainly, I wanted to give you guidance including test (jaeger vs otlp etc) advice before I do a review as I want to conserve time mutually at that phase. There are probably parts of my comments you hadn't addressed, yet, so try to read carefully so we don't need to cover them in the actual PRs to be merged. This will save us both time and days, as I won't be able to allocate much time daily (as it is all free time). In other words focus less on the parts you feel you already did, and more in efforts to find things maybe you haven't. When ready, please peel off the sender/brave side (dropping grpc transport as advised) and lets get that part in first! you can either open another PR or drop the server side from this one. Cheers. |
fyi while tempting to simplify the ports by collocating the OTLP handler on the same port as zipkin's http and grpc stuff, I notice that the port seems important over there, for drop-in reasons I suspect. In other words, it may be best to keep the "standard" OTLP port vs wait until someone to ask for that. just a thought |
hey marcin. Can you pull the infrastructural/tidie changes into a different PR? This would include any modules that are renamed or deleted (e.g. removal of grpc). We can merge mechanical change quickly without a lot of focus. Next, you could extract a PR about the brave send/translate with only that part in a PR? So, just the brave AsyncSpanHandler part (brave.Handler) without a dependency on zipkin2.Span. Once we have that in, we can chunk something else into yet another PR notes below as I quickly browsed through some of the changes On translation, when I asked about using the same test data. Basically, make sure we reference where the data/scenarios of translation are coming from. Going beyond that to depend or copy otel classes may make this difficult to maintain. Some types of translation concerns I recall in other products are linked here, but note the below are zipkin2.Span and I think we shouldn't do that or start with it. Just the same approach for an encoder of mutableSpan https://github.com/openzipkin/zipkin-aws/blob/master/storage/xray-udp/src/test/java/zipkin2/storage/xray_udp/UDPMessageEncoderTest.java on tests is possible we can focus on decoding with otel's proto or something, to assert the translation. I mention this because I noticed copying abstraction classes from otel and I think that will be maintenance, and unlike the rest of the code in this org. Main thing is to capture and cite the scenarios. We dont need to make otel specific behavior tests for senders, in fact I'm not sure we need a sender anymore. My thoughts were that once we remove grpc, basically all we are left with is an encoder, and can use the existing httpurlconnectionsender or okhttp one, pointed at the OTLP http port. If anything, we could have a convenience wrapper to configure an http sender for otel. If we need a separate sender we should keep it small and be loud about why. Integration tests are ideally an existing http sender and the brave encoder and point it to some docker image which can read it back. We would throw a happy path mutableSpan and prove the validity of at least one of our translation scenarios. This is a toe-hold in case there is confusion later. I don't know what images are out there at the moment for this purpose. In general let's be careful to keep the code and dependencies small, so that we can get to mergable PRs. Less is best. |
ps I reworded my last comment hopefully to be more clear, so refresh and 🤞 it is indeed more clear. Looking forward to getting some of this landed soon! |
Hey, I'm planning to planning to separate these soon. I'll mark this as ready for review once I'm done with iterating. If it wasn't for me checking out the OTel test cases I wouldn't know that some special user agent is required (I had no knowledge of that really). I will be moving things around still but I think that I start having an overall idea of how this should look like. |
thanks, marcin. p.s. if we need to make a wrapper or instructions for setting user-agent very specifically, all for it. In fact we have an existing use case around user-agent that maybe is the right time to action openzipkin/zipkin-reporter-java#142 (e.g. default as mentioned but at the same time, this would allow a standard way for otel users to override it and/or this code to make a portable wrapper that does the same) |
OK I think everything is somehow tested one way or the other. Tomorrow I'll split this into a couple of PRs
I've ported tests from OTel, from zipkin-gcp so it looks better now but let's wait for the splitting to invest more time in reviewing |
Closing this in favour of |
BTW when querying Jaeger WITHOUT the user agent stuff still works so I don't know if this is actually mandatory 🤷 |
Probably otel-java tests test some of the SDK's expected behavior unrelated to the spec. It's one reason to, maybe inspire some unit tests from their translation logic, but not to treat them as-is. If we have an |
TODO