Patrick McHardy's blog
Tue, 08 Jul 2008
VLAN update
Dave just merged the second part of my VLAN update for 2.6.27. It contained nothing particularly interesting: mainly minor cleanups, uninlining, ethtool support for querying offload settings and a minor fix for incorrect header pointer adjustments with software tagging. The more interesting part is contained in the final update, which I just sent out as an RFC. There are a number of inconsistencies when using VLAN with hardware acceleration with respect to visibility on packet sockets, which these patches attempt to cure. Linux supports three modes of VLAN hardware acceleration:
Since running tcpdump is an exception, fixing this behaviour should not affect or disable the optimizations provided by hardware acceleration. The approach taken by my patches is to promote the VLAN TCI from the cb to a full skb member, which avoids the netem corruption and keeps it intact within the packet socket code, which also uses the cb itself. A new member is added to the packet socket auxiliary data to store the VLAN TCI, allowing userspace to sense that a packet is actually a VLAN packet and (re)construct the VLAN tag. On RX, the hardware acceleration netif_{rx,receive_skb} wrappers store the VLAN TCI in the skb and manually invoke the ETH_P_ALL packet handlers before receiving the packet on the VLAN device. Combined with a patch for libpcap to perform the VLAN tag (re)construction, this fixes the first two issues.
One minor remaining issue is that socket filters for VLAN packets don't work as intended, since they expect a VLAN header. Since userspace needs to know about VLAN acceleration anyway, it seems reasonable to put the burden on userspace by providing a new filter instruction for getting the VLAN TCI from the skb's metadata and expecting it to construct the filters accordingly.
To fix the third issue, the drivers need to be modified to provide the desired semantics. Their initial state should be to filter out all VLANs. Currently most of them only enable filters when the first VLAN is added, which is clearly suboptimal, since before that point all VLAN packets are uninteresting, except in promiscuous mode. When new VLANs are added, the filters should be adjusted to allow their respective IDs; this is done correctly by all drivers. Finally, in promiscuous mode, all filters should be disabled.
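
The userspace side of the tag (re)construction described above can be sketched roughly as follows. This is a minimal illustration, not the actual libpcap patch: it assumes the TCI has already been read from the packet socket auxiliary data, and the function name `vlan_reinsert_tag` and the constants are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch, not taken from kernel or libpcap headers. */
#define SKETCH_MAC_LEN    12       /* destination MAC (6) + source MAC (6) */
#define SKETCH_VLAN_HLEN  4        /* TPID (2 bytes) + TCI (2 bytes) */
#define SKETCH_TPID_8021Q 0x8100u  /* 802.1Q tag protocol identifier */

/*
 * Rebuild a tagged Ethernet frame from an untagged frame plus a TCI that
 * was delivered out of band. The 4-byte 802.1Q tag is inserted right after
 * the two MAC addresses, in network byte order.
 *
 * Returns the length of the tagged frame written to `out`, or 0 if the
 * input is shorter than an Ethernet header or `out` is too small.
 */
size_t vlan_reinsert_tag(const uint8_t *frame, size_t len, uint16_t tci,
                         uint8_t *out, size_t out_cap)
{
    /* Need both MAC addresses plus the 2-byte EtherType. */
    if (len < SKETCH_MAC_LEN + 2 || out_cap < len + SKETCH_VLAN_HLEN)
        return 0;

    /* Copy destination and source MAC unchanged. */
    memcpy(out, frame, SKETCH_MAC_LEN);

    /* Insert TPID and TCI, most significant byte first. */
    out[SKETCH_MAC_LEN]     = (uint8_t)(SKETCH_TPID_8021Q >> 8);
    out[SKETCH_MAC_LEN + 1] = (uint8_t)(SKETCH_TPID_8021Q & 0xff);
    out[SKETCH_MAC_LEN + 2] = (uint8_t)(tci >> 8);
    out[SKETCH_MAC_LEN + 3] = (uint8_t)(tci & 0xff);

    /* The original EtherType and payload follow the tag. */
    memcpy(out + SKETCH_MAC_LEN + SKETCH_VLAN_HLEN,
           frame + SKETCH_MAC_LEN, len - SKETCH_MAC_LEN);

    return len + SKETCH_VLAN_HLEN;
}
```

A capture tool would call this only for packets whose auxiliary data indicates a VLAN TCI, and pass the rebuilt frame on to its usual decoding path.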
I'm halfway done modifying the drivers to provide this behaviour (all Intel drivers are finished). For most of the remaining ones I'm not sure about their current behaviour, since it's unclear whether the promiscuous mode offered by the hardware automatically disables VLAN filtering or not. Additionally, it turned out during testing that about half the drivers performing VLAN stripping didn't provide the full TCI but only the VLAN ID to the HW acceleration RX functions. The upper bits of the TCI contain the VLAN priority, which so far has been used only for ingress priority mappings; with my patches it also affects tcpdump visibility. The fixes for this are already in net-next-2.6.git.
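
To illustrate why passing only the VLAN ID loses information, here is a small sketch of the 802.1Q TCI layout: 3 bits of priority, 1 CFI bit and a 12-bit VLAN ID. The helper and macro names are invented for this example and are not the kernel's.

```c
#include <stdint.h>

/* 802.1Q TCI layout; names are hypothetical, chosen for this sketch. */
#define SKETCH_TCI_VID_MASK   0x0fffu  /* bits 0-11: VLAN ID */
#define SKETCH_TCI_CFI_MASK   0x1000u  /* bit 12: canonical format indicator */
#define SKETCH_TCI_PRIO_SHIFT 13       /* bits 13-15: priority */

/* Lower 12 bits: the VLAN ID. */
uint16_t tci_vlan_id(uint16_t tci)
{
    return (uint16_t)(tci & SKETCH_TCI_VID_MASK);
}

/* Bit 12: the CFI flag, returned as 0 or 1. */
uint8_t tci_cfi(uint16_t tci)
{
    return (uint8_t)((tci & SKETCH_TCI_CFI_MASK) != 0);
}

/* Upper 3 bits: the priority, in the range 0-7. */
uint8_t tci_priority(uint16_t tci)
{
    return (uint8_t)(tci >> SKETCH_TCI_PRIO_SHIFT);
}
```

For a TCI of 0xa064, `tci_priority` yields 5 and `tci_vlan_id` yields 100. A driver that hands over only the ID effectively passes 0x0064, so the priority reads back as 0.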
What's next
Since with these changes the skb is able to carry the VLAN TCI across layers, we are now able to provide VLAN acceleration to virtual network devices by adding a software fallback, similar to how TSO works. This would allow using hardware tagging from within network namespaces or other virtualized environments. Additionally there are two more inconsistencies that people have been complaining about: