libpcap: added raw filters #563

chemag · 2017-02-28T19:35:11Z

The goal of supporting raw filters* is to provide libpcap/tcpdump support
for generic BPF insns, including those that are not supported by libpcap
(e.g., the BPF_MOD/BPF_XOR ops in Linux, or any of the multiple ancillary
loads in Linux). It also allows testing new kernel extensions to the
BPF ISA without having to modify libpcap/tcpdump.

We provide support by modifying pcap_compile() so that it first checks
for raw filters. This works for expressions given on the command line,
and for expressions read from a file ("-F" option). Filters starting
with an integer and a valid separator (',' or '\n') are considered raw.
All other filters are considered (traditional) expressions.
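
As a rough illustration of the detection rule above (leading integer, then ',' or '\n'), here is a minimal C sketch. It is not the code from the patch: the helper name `looks_like_raw_filter` is hypothetical, and it checks only the leading count and separator, not the instructions that follow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libpcap): return 1 if the filter
 * string starts with a decimal integer immediately followed by ',' or
 * '\n', i.e. the rule described above for treating a filter as raw.
 */
static int
looks_like_raw_filter(const char *filter)
{
    const char *p = filter;

    assert(filter != NULL);

    /* Require at least one leading digit. */
    if (!isdigit((unsigned char)*p))
        return 0;
    while (isdigit((unsigned char)*p))
        p++;

    /* The count must be followed by one of the two separators. */
    return (*p == ',' || *p == '\n');
}
```

With this rule, "6,40 0 0 12,..." is classified as raw, while "icmp" falls through to the traditional expression compiler.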

We also make sure that filters compiled from raw filters are left for the
kernel to validate (added "skip_validate" to pcap_t).

*Raw filters are those generated by tcpdump -ddd. i.e.

$ ./tcpdump -ddd -i eth0 icmp
6
40 0 0 12
21 0 3 2048
48 0 0 23
21 0 1 1
6 0 0 65535
6 0 0 0

We also support replacing the newline characters with commas, so that
filters can be given inline.

$ ./tcpdump -ddd -i eth0 icmp |tr '\n' ','
6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0,

Some examples:
$ ./tcpdump -nn -i eth0 "6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0,"
$ ./tcpdump -nn -i eth0 -F ~/bpf/icmp.2.bpfraw
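
For readers unfamiliar with the -ddd format shown above, the following C sketch parses such a string (count, then one "code jt jf k" quadruple per instruction, separated by ',' or '\n') into an array. It is an illustration under stated assumptions, not the patch's parser: `struct raw_insn` and `parse_raw_filter` are hypothetical names, and the struct merely mirrors the layout of libpcap's `struct bpf_insn` so the example compiles without the pcap headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors libpcap's struct bpf_insn; redeclared so the sketch stands alone. */
struct raw_insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/*
 * Hypothetical parser: reads "<count><sep><code jt jf k><sep>..." where
 * <sep> is ',' or '\n'. Returns the number of instructions stored in
 * insns[], or -1 on malformed input or if the count exceeds max_insns.
 */
static int
parse_raw_filter(const char *s, struct raw_insn *insns, size_t max_insns)
{
    unsigned int count, code, jt, jf, k, i;
    int n;
    char sep;

    assert(s != NULL && insns != NULL);

    /* Leading instruction count followed by a separator. */
    if (sscanf(s, "%u%c%n", &count, &sep, &n) != 2 ||
        (sep != ',' && sep != '\n') ||
        count == 0 || count > max_insns)
        return -1;
    s += n;

    for (i = 0; i < count; i++) {
        if (sscanf(s, "%u %u %u %u%n", &code, &jt, &jf, &k, &n) != 4 ||
            code > UINT16_MAX || jt > UINT8_MAX || jf > UINT8_MAX)
            return -1;
        s += n;
        /* A separator follows each instruction; it may be
         * omitted only after the last one. */
        if (*s == ',' || *s == '\n')
            s++;
        else if (*s != '\0' || i + 1 != count)
            return -1;
        insns[i].code = (uint16_t)code;
        insns[i].jt = (uint8_t)jt;
        insns[i].jf = (uint8_t)jf;
        insns[i].k = (uint32_t)k;
    }
    return (int)count;
}
```

Passing the comma-separated ICMP filter above with a 16-entry array returns 6; a string such as "icmp" returns -1. The sketch does not reject trailing text after the last instruction.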

guyharris · 2017-02-28T19:43:22Z

That's a bit of a big Git comment. Should we discard everything starting with "*Raw filters are those generated by tcpdump -ddd. i.e.", and change the first paragraph to read

The goal of supporting raw filters (those generated by tcpdump -ddd) ...

chemag · 2017-02-28T19:45:15Z

I like large comments (it's a doc, after all), but please feel free to remove whatever you see fit.

guyharris · 2017-02-28T19:45:18Z

gencode.c

@@ -50,6 +50,10 @@

 #endif /* _WIN32 */

+#if defined(BSD) || (defined (__APPLE__) && defined(__MACH__))
+#include <net/bpf.h>


So why does this need to be included? We're dealing with raw filters, i.e. raw BPF machine code, so there shouldn't be any need to look at the definitions for the kernel's BPF interpreter.

(Leftover from the previous patch).

Done

guyharris · 2017-02-28T19:46:54Z

Is there any reason to run raw filters through the optimizer? Presumably if a user specifies a raw filter, they want exactly that chunk of BPF machine code to be used.

guyharris · 2017-02-28T19:48:10Z

Documentation for the user would belong in a man page; documentation for libpcap developers would belong in a comment in the code.

chemag · 2017-02-28T19:53:21Z

> Documentation for the user would belong in a man page; documentation for libpcap developers would belong in a comment in the code.

I removed all the test notes.

chemag · 2017-02-28T19:54:11Z

> Is there any reason to run raw filters through the optimizer? Presumably if a user specifies a raw filter, they want exactly that chunk of BPF machine code to be used.

AFAICT, we don't run the optimizer when we call pcap_compile_raw(). In that case, pcap_compile() returns before calling the optimizer

infrastation · 2017-03-01T15:31:06Z

Please note this pull request failed to build on MacOS.

guyharris · 2017-03-01T17:46:34Z

gencode.c

+
+	if (sscanf(bpf_string, "%hu%c", &bpf_len, &sp) != 2 ||
+			(sp != separator1 && sp != separator2) ||
+			bpf_len > BPF_MAXINSNS || bpf_len == 0) {


That's not a syntax error.

In addition, it requires that BPF_MAXINSNS be available in a header file, which it might or might not be, and the header file depends on the OS. Remember, this has to compile on, at minimum, various versions of:

- various Linux distributions
- various BSDs, including Darwin
- Solaris
- AIX
- HP-UX

and the header file on the system on which it's compiled might, or might not, match the system on which it's running.

If we're going to impose a limit on the number of instructions (the main reason for which would be to keep from allocating a huge blob of memory and eating up most of the address space or backing store), if we're going to use an unsigned short, we might as well just impose a limit of 65535 instructions and let the kernel complain if it has a lower limit.

Your suggestion is 's/BPF_MAXINSNS/65535/', right?

Note that my kernel (3.13.0) defines BPF_MAXINSNS to be 4096. I doubt there is an issue with moving from 4096 to 64k (raw filters are only used from tcpdump clients).

Another question: do you prefer me hardcoding the 64k constant, or something like:

#ifndef BPF_MAXINSNS
#define BPF_MAXINSNS 65536
#endif

Did the ifndef code. Also used 4096 instead of 64k (that's what the kernel defines).

guyharris · 2017-03-01T21:53:03Z

> Also used 4096 instead of 64k (that's what the kernel defines).

Actually, the kernel defines it as 512, as do the kernel and the kernel and the kernel and the kernel. The kernel also appears to do so, although Oracle's version may have diverged from the last OpenSolaris version.

Unfortunately, the kernel isn't open-source, so I can't post a URL. The kernel-mode driver also defines it as 512.

Oh, and the kernel doesn't have BPF, so it doesn't define it as anything.

Translation: there's no such thing as "the kernel" in the context of libpcap; there are a number of kernels it deals with - that's the whole point of libpcap; it hides, as best it can, the variety of packet capture mechanisms, so that code can be written to run on several different OSes with a minimum of platform-specific #ifdefs.

Note also that any one of those kernels might change the value in the future.

So my inclination is not to pay attention to what any particular kernel happens to choose, and not to make an effort to try to find the appropriate header on various different platforms. Just pick an arbitrary maximum, and give it a name other than BPF_MAXINSNS, to emphasize that it's a libpcap limit rather than any particular OS's limit.
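
A minimal sketch of that suggestion, assuming a made-up macro name and an arbitrary value: the limit is defined by the library itself rather than taken from any OS header, and an out-of-range count is reported separately from "this is not a raw filter", since the former is not a syntax error. `RAW_FILTER_MAX_INSNS`, `enum raw_count_status`, and `parse_raw_count` are all hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical libpcap-side cap. Deliberately not named BPF_MAXINSNS and
 * not taken from any OS header; the value is arbitrary. A kernel with a
 * lower limit can still reject the program when it is installed.
 */
#define RAW_FILTER_MAX_INSNS 4096

enum raw_count_status {
    RAW_COUNT_OK,            /* valid count, stored in *countp */
    RAW_COUNT_NOT_RAW,       /* no leading "<integer><separator>" */
    RAW_COUNT_OUT_OF_RANGE   /* raw filter, but count is 0 or too large */
};

static enum raw_count_status
parse_raw_count(const char *s, unsigned int *countp)
{
    unsigned int count;
    char sep;

    assert(s != NULL && countp != NULL);

    /* Anything not starting with "<integer><sep>" is not a raw filter. */
    if (sscanf(s, "%u%c", &count, &sep) != 2 ||
        (sep != ',' && sep != '\n'))
        return RAW_COUNT_NOT_RAW;

    /* A raw filter with a bad count gets its own error, not a syntax error. */
    if (count == 0 || count > RAW_FILTER_MAX_INSNS)
        return RAW_COUNT_OUT_OF_RANGE;

    *countp = count;
    return RAW_COUNT_OK;
}
```

Keeping the two failure cases apart lets the caller fall back to the expression compiler in the first case and print a specific message in the second.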

chemag · 2017-03-01T22:25:31Z

> Actually, the kernel defines it as 512, as do the kernel and the kernel and the kernel and the kernel ...

My bad. There's only one kernel for me :)

> So my inclination is not to pay attention to what any particular kernel happens to choose, and not to make an effort to try to find the appropriate header on various different platforms. Just pick an arbitrary maximum, and give it a name other than BPF_MAXINSNS, to emphasize that it's a libpcap limit rather than any particular OS's limit.

Done

chemag · 2017-03-03T20:20:31Z

Ping

chemag · 2017-03-10T19:48:35Z

Ping

chemag · 2017-03-28T02:14:22Z

One more ping...

mcr · 2019-04-26T14:12:32Z

if there is still interest, please rebase, thank you!

chemag · 2019-04-29T18:49:27Z

> if there is still interest, please rebase, thank you!

Done.

Tested again, and added a "Tested:" section to the patch comment.

Thanks!

chemag · 2019-04-29T19:18:33Z

Also, per a previous comment from Guy, I moved a chunk of the comment to the pcap_compile man page

gencode.c

The goal of supporting raw filters* is to provide libpcap/tcpdump support for generic BPF insns, including those that are not supported by libpcap (e.g., the BPF_MOD/BPF_XOR ops in Linux, or any of the multiple ancillary loads in Linux). It also allows testing new kernel extensions to the BPF ISA without having to modify libpcap/tcpdump.

We provide support by modifying pcap_compile() so that it first checks for raw filters. This works for expressions given on the command line, and for expressions read from a file ("-F" option). Filters starting with an integer and a valid separator (',' or '\n') are considered raw. All other filters are considered (traditional) expressions.

We also make sure that filters compiled from raw filters are left for the kernel to validate (added "skip_validate" to pcap_t).

Tested:

Ping 8.8.8.8 from another terminal. Then run:

```
$ ./tcpdump -n -i eno1 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:44:29.556006 IP 1.2.3.4 > 8.8.8.8: ICMP echo request, id 16624, seq 1, length 64
11:44:29.589406 IP 8.8.8.8 > 1.2.3.4: ICMP echo reply, id 16624, seq 1, length 64
11:44:30.557593 IP 1.2.3.4 > 8.8.8.8: ICMP echo request, id 16624, seq 2, length 64
11:44:30.590918 IP 8.8.8.8 > 1.2.3.4: ICMP echo reply, id 16624, seq 2, length 64
...
```

OK, this seems to be working.

Now let's try a raw filter:

```
$ ./tcpdump -ddd -i eth0 icmp |tr '\n' ','
6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0,
$ ./tcpdump -n -i eno1 "6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 262144,6 0 0 0,"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:45:04.820334 IP 1.2.3.4 > 8.8.8.8: ICMP echo request, id 16715, seq 1, length 64
11:45:04.853608 IP 8.8.8.8 > 1.2.3.4: ICMP echo reply, id 16715, seq 1, length 64
11:45:05.821841 IP 1.2.3.4 > 8.8.8.8: ICMP echo request, id 16715, seq 2, length 64
11:45:05.855157 IP 8.8.8.8 > 1.2.3.4: ICMP echo reply, id 16715, seq 2, length 64
...
```

infrastation · 2022-09-19T21:19:33Z

On one hand, merging these changes into libpcap would improve consistency in tcpdump: since it can print the compiled bytecode with -ddd, it would be reasonable if it could parse and use this compiled bytecode (as iptables -I INPUT -m bpf --bytecode does), and for that libpcap (but not the programs that use libpcap, as Guy explained) would have to recognize it.

On the other hand, the "ddd" format has a pitfall, in that it does not specify for which DLT the expression was compiled. This, for example, creates the space for things quietly going wrong when the end user compiles a filter using the popular DLT_EN10MB type, whereas in the iptables context above the bytecode is applied to what effectively is DLT_RAW.
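
To make the DLT pitfall concrete, here is a small C sketch that runs the ICMP bytecode from this thread against the same ICMP packet twice: once with its Ethernet header (as on DLT_EN10MB) and once starting at the IP header (as on DLT_RAW). The interpreter is a toy that handles only the four opcodes this filter uses; it is not libpcap's `bpf_filter()`, and all names (`struct insn`, `run_filter`, `eth_icmp_frame`) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

/* "tcpdump -ddd icmp" output from this thread (compiled for DLT_EN10MB). */
static const struct insn icmp_en10mb[] = {
    { 40, 0, 0, 12 },    /* ldh [12]    ; EtherType */
    { 21, 0, 3, 2048 },  /* jeq #0x800  ; IPv4? else reject */
    { 48, 0, 0, 23 },    /* ldb [23]    ; IP protocol (14 + 9) */
    { 21, 0, 1, 1 },     /* jeq #1      ; ICMP? else reject */
    { 6, 0, 0, 65535 },  /* ret #65535  ; accept */
    { 6, 0, 0, 0 },      /* ret #0      ; reject */
};

/* Ethernet + IPv4 + ICMP echo request, 1.2.3.4 > 8.8.8.8.
 * Checksums are left as zero; the filter never reads them. */
static const uint8_t eth_icmp_frame[] = {
    0x02, 0x00, 0x00, 0x00, 0x00, 0x01,             /* dst MAC */
    0x02, 0x00, 0x00, 0x00, 0x00, 0x02,             /* src MAC */
    0x08, 0x00,                                     /* EtherType IPv4 */
    0x45, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x00, /* ver/ihl .. frag */
    0x40, 0x01, 0x00, 0x00,                         /* ttl, proto=1, csum */
    0x01, 0x02, 0x03, 0x04,                         /* src 1.2.3.4 */
    0x08, 0x08, 0x08, 0x08,                         /* dst 8.8.8.8 */
    0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  /* ICMP echo request */
};

/* Toy interpreter: supports only ldh abs (40), ldb abs (48),
 * jeq k (21) and ret k (6). Returns the filter's return value. */
static uint32_t
run_filter(const struct insn *prog, size_t len,
           const uint8_t *pkt, size_t pktlen)
{
    uint32_t a = 0;
    size_t pc = 0;

    assert(prog != NULL && pkt != NULL);

    while (pc < len) {
        const struct insn *in = &prog[pc++];

        switch (in->code) {
        case 40:  /* A = big-endian 16-bit value at offset k */
            if ((size_t)in->k + 2 > pktlen)
                return 0;
            a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case 48:  /* A = byte at offset k */
            if ((size_t)in->k + 1 > pktlen)
                return 0;
            a = pkt[in->k];
            break;
        case 21:  /* jump relative to the next instruction */
            pc += (a == in->k) ? in->jt : in->jf;
            break;
        case 6:   /* return constant k */
            return in->k;
        default:  /* unsupported opcode: reject */
            return 0;
        }
    }
    return 0;
}
```

Run on the full frame, the filter accepts the packet. Run on the same bytes starting 14 bytes in, offset 12 now holds the first two bytes of the source address instead of the EtherType, so the filter rejects the very same ICMP packet without any error being reported.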

So perhaps this change and cBPF savefile address related, but distinct use cases, and the only coordination between the two should be that the complete API in the end makes as much sense as possible.

chemag mentioned this pull request Feb 28, 2017

libpcap: added raw filters #353

Merged

guyharris reviewed Feb 28, 2017

chemag force-pushed the master branch from b11bde1 to 9dd8496 Compare February 28, 2017 20:02

guyharris reviewed Mar 1, 2017

chemag force-pushed the master branch from 9dd8496 to 886efbe Compare March 1, 2017 21:36

chemag force-pushed the master branch from 886efbe to a059a06 Compare March 1, 2017 22:25

mcr added linux pcap-compiler labels Apr 26, 2019

mcr added this to the release-after-next milestone Apr 26, 2019

chemag closed this Apr 29, 2019

chemag force-pushed the master branch from a059a06 to d2e9a27 Compare April 29, 2019 17:48

chemag reopened this Apr 29, 2019

chemag force-pushed the master branch from 84e4232 to 95048d5 Compare April 29, 2019 19:17

guyharris reviewed Apr 29, 2019
