XDP is only one of the available eBPF network hooks. Another very important eBPF network hook is in the Linux Traffic Control (TC) system, available at both ingress and egress via the clsact qdisc.
To transfer information between XDP and the network stack there are a number of options. One option is for XDP to modify packet headers before they reach the network stack, e.g. popping/pushing headers to influence the RX-handler in the netstack, or rewriting the MAC source address and matching on that with an iptables rule.
Another option is XDP “metadata”. This metadata area can be written by an XDP program, and a TC-hook BPF program can read it and e.g. update fields in the SKB.
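A minimal sketch of this pattern (illustrative only, not the exact sample code): the XDP program reserves space in front of the packet with the `bpf_xdp_adjust_meta()` helper and writes a value there; the TC-ingress program reads it back via `skb->data_meta` and copies it into `skb->mark`. The `struct meta_info` layout and the value 42 are assumptions for illustration.

```c
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

struct meta_info {
	__u32 mark;	/* value transferred from XDP to the TC hook */
};

SEC("xdp")
int xdp_set_meta(struct xdp_md *ctx)
{
	struct meta_info *meta;
	void *data;

	/* Reserve sizeof(*meta) bytes of metadata in front of the packet */
	if (bpf_xdp_adjust_meta(ctx, -(int)sizeof(*meta)) < 0)
		return XDP_ABORTED;

	meta = (void *)(unsigned long)ctx->data_meta;
	data = (void *)(unsigned long)ctx->data;
	if ((void *)(meta + 1) > data)	/* bounds check for the verifier */
		return XDP_ABORTED;

	meta->mark = 42;	/* example value; real code would classify here */
	return XDP_PASS;
}

SEC("tc")
int tc_read_meta(struct __sk_buff *skb)
{
	struct meta_info *meta = (void *)(unsigned long)skb->data_meta;
	void *data = (void *)(unsigned long)skb->data;

	if ((void *)(meta + 1) > data)	/* no metadata present */
		return TC_ACT_OK;

	skb->mark = meta->mark;	/* transfer the XDP info into the SKB */
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
```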
In the kernel tree there is a BPF sample that shows how the XDP and TC-ingress hooks can cooperate: XDP stores info in the metadata area, and TC uses this metadata to set the SKB mark field. The XDP and TC BPF programs' code is in samples/bpf/xdp2skb_meta_kern.c. A shell script that loads both the XDP and TC programs via iproute2 is placed in samples/bpf/xdp2skb_meta.sh.
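For reference, loading such a combined XDP + TC object by hand with iproute2 looks roughly like this (the interface name eth0 and the ELF section names are placeholders; the shell script above uses the sample's actual section names):

```shell
# Attach the XDP program from the compiled object file
ip link set dev eth0 xdp obj xdp2skb_meta_kern.o sec xdp

# Create the clsact qdisc, which provides the TC ingress/egress hooks
tc qdisc add dev eth0 clsact

# Attach the TC BPF program at ingress in direct-action (da) mode
tc filter add dev eth0 ingress bpf da obj xdp2skb_meta_kern.o sec tc
```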
A real-world problem is traffic shaping causing lock congestion on the TC root qdisc lock (Google's servers experience this too; see article).
The XDP-project has a git repo demonstrating how to solve this:
It sets up the MQ (Multi-Queue) qdisc with an HTB shaper per TXQ. Then it uses XDP to redirect (via CPUMAP) the traffic to the CPU that is responsible for handling this egress traffic. In the TC clsact egress hook, a BPF program stamps the SKB with the appropriate HTB class id (via skb->queue_mapping), such that traffic shaping gets isolated per CPU.
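The egress stamping step can be sketched as follows (a simplified illustration, not the repo's actual code; the HTB major handle 1: and the 1:1 mapping from queue_mapping to HTB minor class id are assumptions). From a TC egress hook, a BPF program can read `skb->queue_mapping` and encode a class handle into `skb->priority`, which HTB uses for classification:

```c
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

/* A TC class handle is major:minor, with the major in the upper 16 bits */
#define TC_H_MAJ_SHIFT	16
#define HTB_MAJOR	0x0001	/* assumed HTB qdisc handle 1: */

SEC("tc")
int tc_egress_classify(struct __sk_buff *skb)
{
	/* Use the TXQ number chosen by the XDP/CPUMAP steering as the HTB
	 * minor class id, so each TXQ/CPU gets its own class and the qdisc
	 * lock contention stays per-CPU. */
	__u32 minor = skb->queue_mapping;

	skb->priority = (HTB_MAJOR << TC_H_MAJ_SHIFT) | minor;
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
```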
Do notice that this depends on a kernel feature that first became available in kernel v5.1, via kernel commit 74e31ca850c1.