Issues with TLS Stream Dump #184

Open
mspublic opened this issue Mar 11, 2024 · 8 comments

@mspublic
Contributor

mspublic commented Mar 11, 2024

We have been running into a few issues with the TLS stream dump functionality. I followed the udp dump/exported_pdu instructions in wireshark.

  1. Wireshark is not able to fully parse the different streams (TCP/HTTP/etc). For example when you right click on an HTTP request then select follow-HTTP. It is unable to reassemble and follow the stream. Even if the dump has been running for a while with many requests wireshark identifies newer requests as an early tcp.stream (for example 1 or 2). This leads me to believe it's not able to properly discern between different TCP streams.

  2. The source and destination are of the proxy server and the udpdump receiver. Ideally these would be of the connecting client and remote server. Or at least between the proxy and the remote server.

  3. I believe due to issue 1 we are seeing HTTP [Malformed Packet] errors often.

Thanks for any help/suggestions!

@zh-jq-b
Member

zh-jq-b commented Mar 12, 2024

  1. Wireshark is not able to fully parse the different streams (TCP/HTTP/etc). For example when you right click on an HTTP request then select follow-HTTP. It is unable to reassemble and follow the stream. Even if the dump has been running for a while with many requests wireshark identifies newer requests as an early tcp.stream (for example 1 or 2). This leads me to believe it's not able to properly discern between different TCP streams.

Wireshark's HTTP follow code only supports TCP filters (see http_follow), and the exported_pdu layer does not register any tcp.* variables, so it is not possible to follow the stream.

  2. The source and destination are of the proxy server and the udpdump receiver. Ideally these would be of the connecting client and remote server. Or at least between the proxy and the remote server.

It's the first exported layer, which is added by udpdump. You can see the client address and the proxy address at the second exported_pdu layer which is generated by g3proxy.
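As a rough illustration of where those addresses live, the sketch below walks the tag/length/value options of a single exported_pdu header and returns the options plus the remaining payload. It is a minimal sketch, not g3proxy's or Wireshark's code: the tag numbers are taken from memory of Wireshark's `exported_pdu_tlvs.h` and should be verified against that header, and `parse_exported_pdu` is a hypothetical helper name.

```python
import ipaddress
import struct

# Tag numbers as recalled from Wireshark's exported_pdu_tlvs.h.
# Treat them as assumptions and verify against the header before relying on them.
TAG_END_OF_OPT = 0
TAG_IPV4_SRC = 20
TAG_IPV4_DST = 21
TAG_SRC_PORT = 25
TAG_DST_PORT = 26


def parse_exported_pdu(buf: bytes):
    """Split one exported_pdu layer into (options, payload).

    Each option is a 2-byte tag and a 2-byte length (both big-endian)
    followed by `length` bytes of value. The option list ends with tag 0.
    """
    options = {}
    offset = 0
    while True:
        if offset + 4 > len(buf):
            raise ValueError("truncated option header")
        tag, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        if offset + length > len(buf):
            raise ValueError("truncated option value")
        if tag == TAG_END_OF_OPT:
            offset += length  # normally zero for the end marker
            break
        options[tag] = buf[offset:offset + length]
        offset += length
    return options, buf[offset:]


def _tlv(tag: int, value: bytes) -> bytes:
    """Build one option; used only to create the sample input below."""
    return struct.pack("!HH", tag, len(value)) + value


# Hand-built sample layer with documentation addresses (not a real capture).
sample = (
    _tlv(TAG_IPV4_SRC, ipaddress.IPv4Address("192.0.2.10").packed)
    + _tlv(TAG_IPV4_DST, ipaddress.IPv4Address("198.51.100.7").packed)
    + _tlv(TAG_SRC_PORT, struct.pack("!I", 51234))
    + _tlv(TAG_DST_PORT, struct.pack("!I", 8080))
    + _tlv(TAG_END_OF_OPT, b"")
    + b"GET / HTTP/1.1\r\n"
)

options, payload = parse_exported_pdu(sample)
src = ipaddress.IPv4Address(options[TAG_IPV4_SRC])
dst = ipaddress.IPv4Address(options[TAG_IPV4_DST])
src_port = struct.unpack("!I", options[TAG_SRC_PORT])[0]
dst_port = struct.unpack("!I", options[TAG_DST_PORT])[0]
```

With a udpdump capture there are two such layers: the payload returned for the outer one (added by udpdump) would itself be parsed again to reach the layer generated by g3proxy.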

  3. I believe due to issue 1 we are seeing HTTP [Malformed Packet] errors often.

I haven't checked the details yet, but packet loss is expected since this is a UDP-based dump protocol.

@mspublic
Contributor Author

mspublic commented Mar 12, 2024

  1. Wireshark is not able to fully parse the different streams (TCP/HTTP/etc). For example when you right click on an HTTP request then select follow-HTTP. It is unable to reassemble and follow the stream. Even if the dump has been running for a while with many requests wireshark identifies newer requests as an early tcp.stream (for example 1 or 2). This leads me to believe it's not able to properly discern between different TCP streams.

Wireshark's HTTP follow code only supports TCP filters (see http_follow), and the exported_pdu layer does not register any tcp.* variables, so it is not possible to follow the stream.

Thank you for the link. I will investigate further how it tries to reconstruct the stream. I noticed you include “synthetic” TCP data (such as sequence numbers) in the packets you send, and I saw some tcp variables (sequence number, etc.) being registered in Wireshark. I wonder whether they could be aligned so that the stream can be fully followed.

I also tried modifying the wireshark side udpdump source code so it does not add the second layer of exported_pdu. It had no issues identifying the TCP data you send.

The pcap capture feature you have written is an extremely useful feature. Having more in depth data and being able to follow the streams would really be great.

  2. The source and destination are of the proxy server and the udpdump receiver. Ideally these would be of the connecting client and remote server. Or at least between the proxy and the remote server.

It's the first exported layer, which is added by udpdump. You can see the client address and the proxy address at the second exported_pdu layer which is generated by g3proxy.

This is my mistake. Local machine testing. I was thinking the server would be the remote server.
For our use having the remote HTTP server would be useful (as well as the HTTP client). Any easy suggestions on passing that info down to the udpdump?

  3. I believe due to issue 1 we are seeing HTTP [Malformed Packet] errors often.

I haven't checked the details yet, but packet loss is expected since this is a UDP-based dump protocol.

That would make sense. It is possible to do pcap over IP using a TCP pipe. PolarProxy does this (sending decrypted TLS over TCP).

https://www.netresec.com/?page=Blog&month=2022-05&post=Real-time-PCAP-over-IP-in-Wireshark
https://www.netresec.com/?page=PolarProxy

---
It sounds like the few issues we have identified above are known. We will start planning on how to modify G3 to have more functionality in PCAP export. If you have any requirements or suggestions please let us know.

Some background - we were leveraging ICAP but many sites have started using websockets which the ICAP protocol was not built to handle. So we started going down the PCAP route.

Thank you again @zh-jq - your work on this project is much appreciated.

@zh-jq-b
Member

zh-jq-b commented Mar 13, 2024

Thank you for the link. I will investigate further how it tries to reconstruct the stream. I noticed you include “synthetic” TCP data (such as sequence numbers) in the packets you send, and I saw some tcp variables (sequence number, etc.) being registered in Wireshark. I wonder whether they could be aligned so that the stream can be fully followed.

It may be possible if those tcp.* variables, especially tcp.stream, get registered in the exported_pdu code. Maybe we can try that and then contribute it back to Wireshark.

I also tried modifying the wireshark side udpdump source code so it does not add the second layer of exported_pdu. It had no issues identifying the TCP data you send.

The exported_pdu layer added by udpdump is also needed when multiple g3proxy processes send traffic to the same Wireshark instance. The src addr and dst addr may conflict between g3proxy processes on different hosts, so the correct way to calculate the stream id is to use g3proxy's dump addr + src addr + dst addr all together.

Don't forget to set the data type to exported_pdu when capturing.
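The stream id calculation described above can be sketched as follows. This is only an illustration of the keying idea, not code from g3proxy or Wireshark: `StreamKey` and `assign_stream_ids` are hypothetical names, and the addresses are made-up documentation values.

```python
from typing import Dict, Iterable, List, NamedTuple


class StreamKey(NamedTuple):
    """Key that distinguishes streams arriving at one receiver."""
    dump_addr: str  # address the g3proxy process sends its dump from (outer layer)
    src_addr: str   # source address from the g3proxy-generated layer
    dst_addr: str   # destination address from the g3proxy-generated layer


def assign_stream_ids(keys: Iterable[StreamKey]) -> List[int]:
    """Give every distinct key the next free id, in order of first appearance."""
    seen: Dict[StreamKey, int] = {}
    ids: List[int] = []
    for key in keys:
        if key not in seen:
            seen[key] = len(seen)
        ids.append(seen[key])
    return ids


# Two g3proxy hosts report the same src/dst pair; only dump_addr differs.
packets = [
    StreamKey("10.0.0.1:40000", "192.0.2.10:51234", "198.51.100.7:443"),
    StreamKey("10.0.0.2:40000", "192.0.2.10:51234", "198.51.100.7:443"),
    StreamKey("10.0.0.1:40000", "192.0.2.10:51234", "198.51.100.7:443"),
]
stream_ids = assign_stream_ids(packets)
```

Keying on src addr + dst addr alone would merge the first two packets into one stream even though they come from different proxies, which is why the dump address is part of the key.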

This is my mistake. Local machine testing. I was thinking the server would be the remote server. For our use having the remote HTTP server would be useful (as well as the HTTP client). Any easy suggestions on passing that info down to the udpdump?

It's still the same problem: src addr + remote addr may conflict even on the same host, because a client may connect to different g3proxy ports with the same remote target address.

The way to add more info is to add more fields in the exported_pdu layer, which need to be defined by wireshark first. I'm not sure whether they will accept it.

That would make sense. It is possible to do pcap over IP using a TCP pipe. PolarProxy does this (sending decrypted TLS over TCP).

I prefer to use UDP for this, as packet loss is also allowed in internal processing. It may be possible to replace the generated exported_pdu layer with IP+TCP, but then we would lose the ability to set dissector hints other than tcp.port.

It sounds like the few issues we have identified above are known. We will start planning on how to modify G3 to have more functionality in PCAP export. If you have any requirements or suggestions please let us know.

I think the exported_pdu code in Wireshark could be improved to match our needs. It would be great if you could help with that.

Some background - we were leveraging ICAP but many sites have started using websockets which the ICAP protocol was not built to handle. So we started going down the PCAP route.

It's fine if you don't use the block&modify feature. You may also use ICAP for the websocket upgrade request and then use udpdump to dump the traffic.

@zh-jq-b
Member

zh-jq-b commented Aug 27, 2024

Wireshark has now added support for following http/h2 streams: https://gitlab.com/wireshark/wireshark/-/issues/20010
You can test with the master version of Wireshark (or the released 4.4 version) and the master version of g3proxy.

@zh-jq-b
Member

zh-jq-b commented Aug 27, 2024

With commit 7d9577c, the addresses in the second exported_pdu frame will now be the client address and the remote address. To identify a unique stream, dump addr + src addr + dst addr + stream id should be used.

@zh-jq-b
Member

zh-jq-b commented Aug 27, 2024

Well, the Wireshark changes were reverted (https://gitlab.com/wireshark/wireshark/-/merge_requests/16985) after we found more problems, so I also have to revert the g3proxy code.

@mspublic
Contributor Author

mspublic commented Sep 3, 2024

Amazing work. Hopefully soon!

I also saw you are adding support for an "ICAP for Websockets". That looks really cool! Is there anything we can do to help test?

@zh-jq-b
Member

zh-jq-b commented Sep 4, 2024

I also saw you are adding support for an "ICAP for Websockets".

The HTTP upgrade request will be sent to the ICAP server. If the server agrees to upgrade to websocket, you will then need
https://github.com/bytedance/g3/blob/master/g3proxy/doc/protocol/helper/stream_detour.rst to inspect the websocket traffic by implementing your own stream detour service.

Is there anything we can do to help test?

I haven't tested the stream detour feature yet. You can start by implementing a simple echo service.
