My netfilter blog

Wed, 04 Aug 2004

IPv6 packet filter benchmarking

It seems like a German university is currently doing feature analysis and benchmarking of IPv6 packet filters. Coincidentally, I'm going to be near that university next week anyway, so I'll stop over for a short visit and help them with their ip6tables evaluation setup.

I would be very interested to see some numbers on ip6tables... as we just discovered at the networking conference in Portland, nobody seems to be doing benchmarking / profiling on the linux ipv6 code so far.

Sun, 25 Jul 2004

David Miller survived my 13-patch patchbomb

This is good news, DaveM accepted all the 13 netfilter related patches that I had pending for 2.6.9. The patches included a number of optimizations, the ctstat, connection-based accounting, tcp window tracking, and some conversions to new in-kernel-API (seq_file, module_param).

Now let's hope that 2.6.8 will be released soon and we can start the 2.6.9 cycle...

Fri, 23 Jul 2004

IPFIX / ulog integration

After some more in-depth study of the ipfix IETF drafts, I finally started coding. Having written the first dozens of lines, I discovered that on an abstract layer IPFIX doesn't do something too different from my good old ulogd. Ignoring the minor difference that ulogd deals with individual packets and ipfix with flows, the ulogd_iret_t structure is very similar to what ipfix templates are trying to describe.

So I now forked a ulogd2 branch off the current ulogd subversion tree and started to reorganize the tree.

For more flexibility, I am going for a stackable plugin infrastructure, where the sysadmin can configure stacks like: ULOG->ulogd_BASE->flow aggregation->ipfix-over-tcp-export or ctnetlink->ipfix-over-sctp-export.
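A stack like the ones above could be modelled as an ordered array of stages, each of which either passes a record on or stops it. This is only a rough userspace sketch under my own assumptions: `struct record`, `struct plugin`, `stack_run()` and the two example stages are hypothetical names for illustration, not the ulogd2 API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record handed from stage to stage (not ulogd's data model). */
struct record {
    unsigned long packets;
    unsigned long bytes;
};

/* One stage of a stack: returns 0 to pass the record on, nonzero to stop. */
struct plugin {
    const char *name;
    int (*process)(struct record *rec, void *state);
    void *state;
};

/*
 * Run a record through the stages in order. Returns the index of the
 * stage that stopped the record, or n if every stage passed it on.
 */
size_t stack_run(const struct plugin *stack, size_t n, struct record *rec)
{
    size_t i;

    assert(stack != NULL && rec != NULL);
    for (i = 0; i < n; i++) {
        if (stack[i].process(rec, stack[i].state) != 0)
            break;
    }
    return i;
}

/* Example stage: drop records without packets (stand-in for a filter). */
int drop_empty_stage(struct record *rec, void *state)
{
    (void)state;
    return rec->packets == 0;
}

/* Example stage: add the record to a running total (stand-in for aggregation). */
int aggregate_stage(struct record *rec, void *state)
{
    struct record *total = state;

    total->packets += rec->packets;
    total->bytes += rec->bytes;
    return 0;
}
```

A sysadmin-defined stack would then just be a different ordering of entries in the `struct plugin` array, with an export stage at the end.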

Thu, 22 Jul 2004

Merging 2.6.8-rc2 changes into patch-o-matic ng

I just started the boring job of merging 2.6.8-rc2 with patch-o-matic-ng... I'm happy that Jozsef, Martin and Patrick did this for the last couple of kernel releases. However, I need to get more into this job again in order to determine which patches still have to be submitted to the mainline kernel...

Expect some pom-ng breakage over the next couple of days...

Wed, 21 Jul 2004

Working towards IPFIX based on conntrack

I've written a patch to add 64bit packet and byte counters for both directions of every ip_conntrack. This should enable a clean and efficient implementation of flow based accounting, when combined with ctnetlink events and a userspace daemon picking up those events.

I need to study the IPFIX (IETF Working Group) specifications in more detail before writing the respective daemon...

The patch is apparently working, you can read the counters via /proc/net/ip_conntrack and also use a modified/extended/updated version of the 'connbytes' match.
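To illustrate what "64bit packet and byte counters for both directions" means in terms of data layout, here is a minimal userspace sketch. All names (`struct ct_acct`, `ct_acct_update()`, the `CT_DIR_*` constants) are made up for this example and are not the identifiers used in the actual patch; locking is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical direction indices for one tracked connection. */
enum ct_dir { CT_DIR_ORIGINAL = 0, CT_DIR_REPLY = 1, CT_DIR_MAX = 2 };

/* One pair of 64bit counters. */
struct ct_counter {
    uint64_t packets;
    uint64_t bytes;
};

/* Accounting state attached to a connection: one counter pair per direction. */
struct ct_acct {
    struct ct_counter dir[CT_DIR_MAX];
};

void ct_acct_init(struct ct_acct *acct)
{
    assert(acct != NULL);
    memset(acct, 0, sizeof(*acct));
}

/* Account one packet of 'len' bytes seen in direction 'dir'. */
void ct_acct_update(struct ct_acct *acct, enum ct_dir dir, uint32_t len)
{
    assert(acct != NULL);
    assert(dir == CT_DIR_ORIGINAL || dir == CT_DIR_REPLY);
    acct->dir[dir].packets += 1;
    acct->dir[dir].bytes += len;
}

/* Total bytes of the connection, both directions added up. */
uint64_t ct_acct_total_bytes(const struct ct_acct *acct)
{
    assert(acct != NULL);
    return acct->dir[CT_DIR_ORIGINAL].bytes + acct->dir[CT_DIR_REPLY].bytes;
}
```

With 64bit fields a byte counter can run past the 4GB mark without wrapping, which is what a long-lived flow needs.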

Tue, 20 Jul 2004

Pattern-matching API in the 2.6.x Kernel

There are various places in the kernel where we need to do some kind of pattern matching on the packet contents. Applications range from connection tracking helpers (looking for FTP PORT command, ...) over the 'string' match to intrusion detection systems.

Two years ago, Philippe Biondi came up with something called libqsearch. It implements a generic pattern matching API, supporting plugin-based algorithm implementations.

I now took the liberty of porting this into a 2.6.x kernel, resulting in lots of changes that make my qsearch port incompatible with what Philippe wrote. Anyway, I'm now in the process of combining this with Rusty's recent work on skb_walk() and skb_iter(), so we can pattern-match against a fragmented/nonlinear skb without any copy.
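The reason copy-free matching on a nonlinear packet is possible at all is that some search algorithms only need a small amount of state between bytes, so the data can be fed in one fragment at a time. The following userspace sketch uses Knuth-Morris-Pratt for this. It is my own illustration: it is neither libqsearch's API nor the skb_walk()/skb_iter() interface, and `struct matcher`, `matcher_init()` and `matcher_feed()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define PAT_MAX 64

/* Search state that survives from one fragment to the next. */
struct matcher {
    const unsigned char *pat;   /* pattern, must outlive the matcher */
    size_t len;                 /* pattern length */
    size_t fail[PAT_MAX];       /* KMP failure table */
    size_t state;               /* pattern bytes matched so far */
};

/* Prepare a matcher; returns 0 on success, -1 for an unusable pattern. */
int matcher_init(struct matcher *m, const void *pat, size_t len)
{
    size_t i, k = 0;

    assert(m != NULL && pat != NULL);
    if (len == 0 || len > PAT_MAX)
        return -1;
    m->pat = pat;
    m->len = len;
    m->state = 0;
    m->fail[0] = 0;
    for (i = 1; i < len; i++) {
        while (k > 0 && m->pat[i] != m->pat[k])
            k = m->fail[k - 1];
        if (m->pat[i] == m->pat[k])
            k++;
        m->fail[i] = k;
    }
    return 0;
}

/*
 * Feed one fragment. Returns 1 as soon as the pattern has been completed
 * (possibly starting in an earlier fragment), 0 otherwise.
 */
int matcher_feed(struct matcher *m, const void *chunk, size_t n)
{
    const unsigned char *p = chunk;
    size_t i;

    assert(m != NULL && (chunk != NULL || n == 0));
    for (i = 0; i < n; i++) {
        while (m->state > 0 && p[i] != m->pat[m->state])
            m->state = m->fail[m->state - 1];
        if (p[i] == m->pat[m->state])
            m->state++;
        if (m->state == m->len) {
            /* Fall back so the matcher stays valid for further input. */
            m->state = m->fail[m->len - 1];
            return 1;
        }
    }
    return 0;
}
```

A helper looking for an FTP PORT command would call `matcher_feed()` once per fragment and never has to linearize the payload into a temporary buffer.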

Tue, 22 Jun 2004

New iptables-1.2.11 and patch-o-matic-ng-20040621 release

I have just released iptables-1.2.11 and patch-o-matic-ng-20040621 on the netfilter homepage.

Seems like we'll never have an iptables release that doesn't introduce some severe bug that requires releasing another version immediately later. To some part, I blame the users. Seems like not enough of them try the CVS snapshots and report bugs back to us.

Wed, 21 Apr 2004

Doing lots of benchmarks / tuning / profiling lately

During the last weeks I've been working on tuning/benchmarking/profiling the Sun V20z dual opteron boxes for high-speed packet filtering purposes.

Some of my findings:

  • i386 kernels give you higher PPS than x86_64 (because sk_buff is smaller)
  • e1000 are way faster than tg3 boards (could be hardware or driver issue)
  • Intel PRO/1000MT Quad e1000 boards suck (apparently problems with the onboard PCI-X bridge)
  • Connection Tracking performance is not that bad...
  • ip_tables performance sucks, even if the ruleset is empty ?!?
  • 2.4.x has slightly worse results than 2.6.x if you use irq affinity, but really sucks if you don't, since the kernel doesn't balance irq's by itself (and irqbalance daemon only balances every 10 seconds)
  • You can route up to 1mpps at 64bytes packet size
  • ip_conntrack and iptable_filter together eat up at least 300kpps, leaving 700kpps as a result

Expect a more detailed report within the next weeks.

Wed, 07 Apr 2004

Some more ct_sync bug hunting

It seems like there's still a number of bugs left in ct_sync. I've spent the major part of the last three days hunting them down. Seems to be really hard ones, that only appear when compiled with recent gcc-3.2 versions... Learned a lot about objdump and strange x86 "instruction encoding artefacts", though.

Sun, 28 Mar 2004

Finally committing Pablo Neira's optimization patches

Subject says it all... I've found some time to review his patches. With some luck, DaveM will receive them later today.

Sat, 27 Mar 2004

revived the dropped table

After about two years in deep freeze, I revived the idea of a dropped table. For those of you who haven't heard about it in the past: The idea is to gather all packets that are dropped at any place within the network stack. This is very useful for auditing and debugging.

Userspace support is included in libiptc/iptables for ages, so all you need is patch-o-matic-ng from >= today.

Sun, 29 Feb 2004

Added a new 'licensing' section on the netfilter homepage

Since recently more and more vendors seem to disobey the terms of the GNU GPL, I decided to put some more detailed information on how to comply with this license online. It was written for the netfilter/iptables project, but should apply to any other GPL licensed free software project. You can find the section here.

Wed, 25 Feb 2004

Continued work on libiptc2

I finally found some time to work on what I call 'libiptc2'. It is basically a reimplementation of the 'chain cache' inside libiptc. This should remove the last O(n^2) complexities we have in there. While I would really enjoy working on new stuff like pkttables, this kind of work keeps me from doing it :(

Sun, 22 Feb 2004

conntrack and nat helpers in 2.6.x

The last couple of days I'm trying to finalize the first release of patch-o-matic-ng. Everything seems really close now. A lot of patchlets available for 2.4.x however are missing for 2.6.x kernels. Maybe the biggest and most important lack is for all conntrack/nat helpers.

The reason is that the semantics for those helpers have completely changed. They now get fed non-linear skb's by the conntrack core, which in turn means that they all need to copy the skb payload into some temporary buffer in order to search for some particular string (e.g. PORT command).

The conntrack core should definitely provide some function that is able to look for strings within a packet. Need to think more about this.

Fri, 20 Feb 2004

Submitting patches

I finally got around to initiating another one of my patch submission cycles. This means that DaveM is receiving a number of patches that have been pending in the netfilter patch-o-matic repository.

Apart from that, pom-ng needs some more work. It turns out I will have to do some perl scripting again.

A day of patch-o-matic-ng merging

Since there are slight syntactical and semantical differences in the API for iptables matches and targets between 2.4.x and 2.6.x kernels, a minimum of editing has to take place in order to make even the most simple 2.4.x extension work with 2.6.x. With more than 65 extensions in current pom-ng, this can take quite a while.

Apart from a minor bug in the perl module, we should now be ready for the first official pom-ng release. Finally, people will be able to use our extensions with a 2.6.x kernel.

Wed, 18 Feb 2004

redesign of dstlimit match

A couple of weeks ago I first published the dstlimit match. It provides an easy way of ratelimiting certain packets on a 'per destination ip' or 'per destination ip/port' tuple base.
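To make the "per destination ip" idea concrete, here is a rough userspace sketch of a rate limiter that keeps one token bucket per destination address in a small hash table. This is an assumption-laden illustration, not the dstlimit implementation: the names (`struct dl_table`, `dl_allow()`), the fixed table size, the linear probing and the millisecond clock passed in by the caller are all my own choices, and entries are never expired.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DL_SLOTS 256

/* One token bucket, keyed by destination IPv4 address. */
struct dl_entry {
    int used;
    uint32_t dst;
    uint64_t last_ms;   /* time of the last update */
    uint64_t credit;    /* budget in 1/1000 packets */
};

struct dl_table {
    uint32_t rate_per_sec;  /* sustained packets per second per destination */
    uint32_t burst;         /* maximum packets allowed back to back */
    struct dl_entry slot[DL_SLOTS];
};

void dl_init(struct dl_table *t, uint32_t rate_per_sec, uint32_t burst)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
    t->rate_per_sec = rate_per_sec;
    t->burst = burst;
}

/* Find the bucket for dst, or claim a free slot; NULL if the table is full. */
static struct dl_entry *dl_lookup(struct dl_table *t, uint32_t dst,
                                  uint64_t now_ms)
{
    uint32_t i, h = (dst * 2654435761u) % DL_SLOTS;

    for (i = 0; i < DL_SLOTS; i++) {
        struct dl_entry *e = &t->slot[(h + i) % DL_SLOTS];

        if (!e->used) {
            e->used = 1;
            e->dst = dst;
            e->last_ms = now_ms;
            e->credit = (uint64_t)t->burst * 1000;  /* start with a full bucket */
            return e;
        }
        if (e->dst == dst)
            return e;
    }
    return NULL;
}

/*
 * Returns 1 while packets to dst stay within the limit, 0 once the bucket
 * is empty. now_ms must not go backwards between calls.
 */
int dl_allow(struct dl_table *t, uint32_t dst, uint64_t now_ms)
{
    struct dl_entry *e;
    uint64_t cap;

    assert(t != NULL);
    e = dl_lookup(t, dst, now_ms);
    if (e == NULL)
        return 0;
    cap = (uint64_t)t->burst * 1000;
    /* rate_per_sec packets per 1000 ms equals rate_per_sec credit units per ms */
    e->credit += (now_ms - e->last_ms) * t->rate_per_sec;
    if (e->credit > cap)
        e->credit = cap;
    e->last_ms = now_ms;
    if (e->credit < 1000)
        return 0;
    e->credit -= 1000;
    return 1;
}
```

A 'per destination ip/port' variant would only differ in the key that is hashed and compared.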

However, it turned out that it had several flaws. One of them was that you could create two /proc/net/dstlimit/ files with the same name. procfs doesn't actually check if some file already exists when you want to create it (within the kernel). Several hours of research within the vfs (of which I have no idea) and conversation with some other kernel developers revealed that there is no reliable way to check if a specific file already exists. Even if there was, you would never be able to atomically check-and-create.

So in the end I had to implement some major changes in the dstlimit code. However, this again changed the kernel/userspace structure layout, so you will have to recompile both in order to use it.

Sat, 14 Feb 2004

The netfilter/iptables project is looking for a hardware donation

The project's mail/web/ftp/cvs/list/... servers are highly loaded, and as usual the load always increases. We're getting more list members, more downloads and more page views every month. However, our current hardware is not growing by itself. Thus, we need to buy a new machine soon.

All of the current (and past) hardware was bought from my personal wallet. While I could afford this in the past, I would very much like to see one of our corporate netfilter/iptables users step up and show his support for netfilter/iptables by donating a new machine. This would be an ideal opportunity to show the development community that you are not just using free software, but also putting in your part to make it work.

We have very specific needs with regard to the hardware we use: It has to be a 1U system, and non-x86. This basically leaves us with Sun UltraSPARC based systems, and the Apple XServe line. Both options would cost about EUR 3500 to 3800.

If you are interested in sponsoring such a system, please contact Harald to discuss the details. Thanks in advance.

Thu, 12 Feb 2004

Jozsef made my day by finishing pom-ng

Jozsef was kind enough to implement the missing features in patch-o-matic-ng. This is really great. It was one of the most important pending items on my TODO list.

This basically means that we are at the brink of the first official pom-ng release, enabling 2.6.x kernel users to benefit from the vast collection of netfilter/iptables features contained in patch-o-matic.

Fri, 06 Feb 2004

Idea of a new conntrack-based accounting system

There has been discussion about this before, but it now came to my mind (again).

If you want to do some accounting on linux based routers, you don't have any reasonable way of doing so. All you can do is

  • capture all packets, do any kind of evaluation later
    This is what you can do with nacctd, ULOGD/ulogd, and various other approaches. The problem is, that you collect an incredible amount of data which needs to be processed.
  • insert iptables rules, account only what you're really interested in
    This requires prior knowledge of exactly what you want to account. You immediately get the results, and it's not possible to do any arbitrary calculation at some later point.

So there is a need for something else: conntrack based accounting. The idea is: Let connection tracking count how many bytes+packets a connection has. When the connection terminates, the total amount is sent to some userspace process. This means you will have one record of accounting data per connection. In the worst case of extremely short-lived connections, you would end up with almost as much data as in the nacctd approach - but even then, significantly less processing for the actual accounting itself.

I haven't looked into the details yet, but even generating netflow data should be possible quite easily this way.

As for the implementation, a single set of counters should be sufficient. Adding per-cpu counters doesn't make sense, since the cache lines of the conntrack entry have to be valid on the current cpu anyway. We're also already under ip_conntrack_lock, so writing two more counters per packet shouldn't be that expensive. Per-cpu counters also don't make sense if they are within the same cacheline...

One set of counters would have to be: bytes for each direction, packets for each direction. They could be u_int32_t, since almost all connections have less than 4GB traffic these days.

more work on the failover code

I'm getting more and more of the failover code done. It now implements conntrack exemption (NOTRACK) for the sync device, and also blocks all incoming/outgoing network traffic on any node that is currently in 'slave' state. This means that all interfaces can be configured, any applications can be running, sockets bound, ... - but none of that will be visible to the network until the node is propagated to master state.
This needs explicit support for new netfilter hooks in the core network stack (I call them l2hooks, other people NETFILTER_PACKET).

Main parts that are missing:

  • Correctly deal with sync packet loss situations
  • Replicate expectations (needs conntrack expect notifications)
  • Testing on SMP systems, there might be locking bugs

Tue, 03 Feb 2004

A quiet week for my weblog

This is going to be a quiet week in this weblog. I'm currently at

Wed, 28 Jan 2004

Ulogd is becoming a flow accounting subsystem

Some nice Russian guy wrote a patch to add BSD-like ipacct flow accounting to ulogd. This is something I had on my wishlist for quite some time.

He has written an OUTPUT plugin that does all the flow accounting and file-writing itself. However, I have an idea of how this could be implemented in a more generic way: Implement flow accounting as an interpreter, and return a pointer to a struct flowinfo via a new ulog_iret_t. This way any output plugin could reference flow information for the current flow.

More work on the failover code

Currently Astaro is paying me for my development on the netfilter conntrack failover code. That's what I'm supposed to be working on, at the least... I should stop reading my email in the morning, because otherwise my whole day will be filled with other stuff that just results from reading emails.

Anyway, the failover has been progressing, slowly but steadily. I should expect some working code any day now.

Thanks again to Astaro for still funding this, despite me delaying it over and over again.

Trying to make 2.6.x IPsec and conntrack/nat work

Spent some time thinking about how to possibly solve the long standing problem with conntrack/NAT and the 2.6.x in-kernel AH/ESP implementation.
The recent discussion on netfilter-devel was quite productive, although most of my ideas turned out to be technically infeasible :(
For example, iptables cannot attach the same CHAIN to multiple HOOKS. That would be so neat. Would somebody remind me that that has to go into pkttables?
Anyway, I've now written a surprisingly small (but still ugly) patch that should do about 60% of the solution upon which we agreed on the mailinglist.
Unfortunately, I don't have the time to set up a full ipsec testbed right now, so I have to rely on others to test it..

Sun, 25 Jan 2004

Bought three interesting books

During my stay in NYC I went to the NYU computer bookstore, just for browsing, not looking for anything in particular. In the end, I spent more than 150 bucks on three books:

  • Telecommunications Technologies Reference (ISBN 1-58705-036-6)
    This makes an excellent read for somebody with an Internet background who wants to learn about the general architecture of modern telephone systems, SS7, frame relay, ATM, SONET/SDH, ISDN BRI/PRI protocol layers, encodings, multiplexing, ...
  • 802.11 Wireless LAN Fundamentals (ISBN 1-58705-077-3)
    A comprehensive guide on the 802.11 standards, ranging from MAC to PHY layer, advancing to the encoding and modulation techniques used. It also covers roaming, Mobile IP, WPA, WEP, 802.1x. A good read for those who want to learn more about the 802.11 family.
  • Practical VoIP
    A book about the VOCAL implementation of SIP/SDP user agent/proxy/gateway functionality, with solutions to interconnect with H.323 and MGCP. It also includes introductions to the respective protocols; however, having already read the SIP relevant RFCs, I skipped that part.

Fri, 09 Jan 2004

Final work on new netfilter homepage

The last section of the homepage (security advisories) has now been converted. The security advisories in their text form are just placed into a certain directory, and some makefile, perl-script and docbook-xml magic takes care of the rest.

With some luck, the new homepage will be online tomorrow.

Thu, 08 Jan 2004

More work on the new website and

I've finished the scripts for auto-generation of the mirrors.html page from the dns zone file, and the HOWTO-link-generation similar to what the current netfilter homepage has. Also done some final tweaking of the style sheets.

With regard to the blosxom configuration: I've now finished some nice blosxom templates ('flavours', as blosxom likes to call them) that resemble the exact layout of the docbook-website generated netfilter homepage... in fact, it is using the same css :)

Wed, 07 Jan 2004

libiptc2 woes

After quite some time, a posting on the netfilter-devel list reminded me of my unfinished work on libiptc2. The problem with old libiptc is that it has n^2 complexity when adding rules to an in-memory ruleset. This slows down iptables-restore with large rulesets.

Old libiptc has a so-called chain cache that contains pointers to the start of each chain within the ruleset blob. This chaincache has to die, and libiptc2 needs a totally separate representation of the ruleset: every rule is a malloc()ed chunk of memory, put into a linked list (which builds a chain; the chains are in turn kept in a linked list). Only at the iptc_commit() stage is this libiptc-internal representation compiled into the ruleset blob.
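The data structure described above could look roughly like the sketch below: chains and rules live in linked lists with tail pointers, so appending a rule is a constant-time operation, and only the commit step walks everything once to produce a contiguous blob. Heavy assumptions apply: the names (`struct ruleset`, `chain_append()`, `ruleset_commit()`) are invented for this example, rule bodies are treated as opaque bytes, and the "blob" is a plain concatenation rather than the real kernel ruleset layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One rule: an individually allocated chunk with an opaque body. */
struct rule {
    struct rule *next;
    size_t size;            /* length of data[] in bytes */
    unsigned char data[];
};

/* One chain: a linked list of rules, with a tail pointer for O(1) append. */
struct chain {
    struct chain *next;
    char name[32];
    struct rule *head, *tail;
};

/* The whole ruleset: a linked list of chains. Zero-initialize before use. */
struct ruleset {
    struct chain *chains, *last;
    size_t total;           /* sum of all rule body sizes */
};

struct chain *ruleset_add_chain(struct ruleset *rs, const char *name)
{
    struct chain *c;

    assert(rs != NULL && name != NULL);
    c = calloc(1, sizeof(*c));
    if (c == NULL)
        return NULL;
    strncpy(c->name, name, sizeof(c->name) - 1);
    if (rs->last)
        rs->last->next = c;
    else
        rs->chains = c;
    rs->last = c;
    return c;
}

/* Append a rule to a chain without touching any other rule. */
int chain_append(struct ruleset *rs, struct chain *c,
                 const void *body, size_t size)
{
    struct rule *r;

    assert(rs != NULL && c != NULL && body != NULL);
    r = malloc(sizeof(*r) + size);
    if (r == NULL)
        return -1;
    r->next = NULL;
    r->size = size;
    memcpy(r->data, body, size);
    if (c->tail)
        c->tail->next = r;
    else
        c->head = r;
    c->tail = r;
    rs->total += size;
    return 0;
}

/* Compile the lists into one contiguous blob; the caller frees it. */
unsigned char *ruleset_commit(const struct ruleset *rs, size_t *out_len)
{
    unsigned char *blob;
    const struct chain *c;
    const struct rule *r;
    size_t off = 0;

    assert(rs != NULL && out_len != NULL);
    blob = malloc(rs->total ? rs->total : 1);
    if (blob == NULL)
        return NULL;
    for (c = rs->chains; c != NULL; c = c->next) {
        for (r = c->head; r != NULL; r = r->next) {
            memcpy(blob + off, r->data, r->size);
            off += r->size;
        }
    }
    *out_len = off;
    return blob;
}

/* Release all chains and rules and reset the ruleset to empty. */
void ruleset_free(struct ruleset *rs)
{
    struct chain *c, *cn;
    struct rule *r, *rn;

    assert(rs != NULL);
    for (c = rs->chains; c != NULL; c = cn) {
        cn = c->next;
        for (r = c->head; r != NULL; r = rn) {
            rn = r->next;
            free(r);
        }
        free(c);
    }
    memset(rs, 0, sizeof(*rs));
}
```

Adding n rules this way costs n constant-time appends plus one linear pass at commit time, instead of reshuffling the blob on every insertion.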

Let's hope Andre Uratsuka Manoel will find the time to continue this work, since I really don't even know where to start with my ever-growing TODO list :(

Mon, 05 Jan 2004

netfilter developer diaries

I've started to use blosxom as the designated tool for the upcoming netfilter developer diaries. If the test phase works out well, every netfilter/iptables developer will have the possibility to host their own homepage including a blosxom-enabled blog on this server.

netfilter homepage v3 using docbook-website

Over the last couple of weeks I've converted the netfilter website to docbook-website. Let's hope this will be the last and final re-design of our project website.

Copyright (C) 2004 The netfilter webmaster Harald Welte