Patrick McHardy's blog
Thu, 03 Jul 2008
GARP/GVRP
Just finished the GARP/GVRP patches and sent them out. I love the feeling when you can finally delete a tree that has been lying around for ages :) Unfortunately I have way too many of these.
A few words on GARP/GVRP for those not familiar with it. GARP is the Generic Attribute Registration Protocol and is specified in IEEE 802.1D. It is used to register and propagate attributes through the active spanning tree topology. Examples of these attributes include multicast link layer addresses and VLAN IDs. A bridge can use the attributes to configure filtering; a host can perform source pruning, meaning it can avoid sending, for instance, multicast frames no one is interested in. Source pruning requires the host to be a full participant, however. My implementation covers only the applicant-only participant model, meaning it supports only the client side. The full participant belongs in userspace since it may have to create network devices etc.
GVRP is the GARP VLAN Registration Protocol, specified in IEEE 802.1Q. As the name implies, it is used to register VLANs. This is supported by many switches, even the really cheap ones. The current implementation doesn't enable GVRP by default since we're missing a way to disable it when a VLAN device gets added to a bridge. I'll probably fix that shortly; for now it has to be enabled manually using iproute (ip link set eth0.1000 type vlan gvrp on).
Dynamically sized qdisc class hashes
Just sent out the second version of my dynamically sized qdisc class hash patches. They are intended to solve scalability problems when using a large number of classes with CBQ/HTB/HFSC (and soon DRR). Currently, all of these use a fixed hash size of 16. There are mainly two cases where this matters:
Now I'm off to do some final cleanups of the GARP/GVRP patches so I can hopefully send them out today as well.
Mon, 30 Jun 2008
Bored kids
These two kids were looking for something fun to do on Saturday evening. They first tried to climb the wall of the Freiburg theater (visible in the back of the picture), but the left one only made it halfway up. Very disappointed, he stated he felt emasculated and both of them left. He tried to regain his manliness a few minutes later when they returned with a stolen table from the bar next door and used it to slide down the stairs.
Getting ready to launch ...
They made it down the stairs. The left one is really having fun, the right one is also beginning to feel like a man again.
End of the ride.
Also don't miss out on the movie of the waitress slapping them around and their flight. Thanks to Elena for the pictures and redacting the face of an innocent bystander :)
Mon, 23 Jun 2008
Companies to avoid
Warning: long rant. Note to any company mentioned below: this is *my* opinion and my opinion only.
Just added Lenovo to the list of companies I won't buy from anymore. I got a Z61p about 9 months ago and have had nothing but trouble since. Air circulation seems to be broken by design: from the beginning the graphics card overheated to over 100° Celsius, then started making squeaky sounds and showing flickering moving lines across the entire screen, before shutting down completely (not the notebook, only the card). There are plenty of reports on the internet of people having similar problems. The processor also often heats up until it reaches the shutdown temperature when doing CPU intensive work. A technician tried to fix it by replacing some parts, but without any success. Additionally, on my trip to LinuxTag, the fan broke and the notebook refused to boot. This cured itself a few days later, but now it sounds like a rusty lawn mower. Since last weekend, it doesn't detect the battery anymore, even though a voltmeter shows it's working perfectly fine. Sad, back when IBM was still producing ThinkPads I never had trouble.
The other two companies on this list of pride are (there are more, but most of them are irrelevant since they are either almost broke or small enough to avoid easily):
- Deutsche Telekom and all their subsidiaries. This is the most ridiculous company I've ever seen, the highlights of their doings include:
- HP, for not fulfilling their service obligations for fixing my notebook. They first sent some clown from Deutsche Telekom to fix it, who broke it even worse. His second attempt was also unsuccessful, after which HP simply closed the request. Every time I reopened it, it was closed again without further comment. They even had the impudence to ask for my satisfaction with their service - which was pretty obvious from looking at the request. Also sad, because I liked the notebook and there are not many alternatives if you want a big display, but such behaviour is unacceptable.
Sat, 21 Jun 2008
Too busy to blog
Since I've been slacking with updating this blog lately, and probably will continue to do so for the next week, here's another drawing from Elena from a couple of weeks ago.
Tue, 17 Jun 2008
iptables 1.4.1.1 released
Just released iptables 1.4.1.1, a pure bugfix release for regressions reported against 1.4.1. Besides this, I'm mainly in bugfix mode for 2.6.26 currently, which is keeping me pretty busy.
Tue, 10 Jun 2008
iptables 1.4.1 release
Finally released iptables 1.4.1 this morning. I had the impression the -rc phase worked pretty well this time and hoped we had shaken out all the bugs. Unfortunately this hope wasn't fulfilled, the first regression report came in only 5 hours later. It's nothing terribly important, just a cosmetic problem when printing IPv6 masks, but I guess I'll release a 1.4.1.1 bugfix release in a few days.
Thu, 05 Jun 2008
Release delays
The iptables 1.4.1 release got delayed a bit by my notebook breaking 5 minutes after getting on the train to LinuxTag, so I couldn't do any real work the entire last week. I hoped we could test the header fixes last week and release on Monday, but they really need some wider testing, so I'll release another -rc today and hopefully the final release in about a week. On the kernel side, I'm working on getting the things I would like to merge in 2.6.27 into shape. The netfilter things in my queue so far are mostly minor cleanups and feature additions, with the exception of ebtables IPv6 support from Kuo-lang Tseng. The non-netfilter things are:
iptables release status
I managed to push out an iptables release candidate last week and another one this week. There are some issues with endian-annotated types in the netfilter headers when using ancient linux/types.h versions that need to be fixed before a final release, but we'll hopefully have a final 1.4.1 release by next Monday.
Leaving for LinuxTag
I'll be leaving for my train to Berlin now. Amazingly it took me more time to decide what to put on my notebook than what to pack in my suitcase :) Hope to see you there ...
Fri, 23 May 2008
Flying outside
did not work too well ...
Geek toys
Received a bunch of things to play with this morning. A customer wants me to develop a multipath tunneling protocol. The basic idea is to encapsulate packets, distribute them over N paths, decapsulate them on the other side and restore ordering using sequence numbers. This will allow a single connection to use the combined upstream and downstream bandwidth of all paths. I wrote a prototype some time ago, but it's very basic at this point and missing all the fancy features, like microflow separation, delay aware distribution, dead path detection, etc. I used to have two internet connections for a couple of months, but canceled the second one in March. Since there's nothing like real-life testing, I ordered a second cable connection again, which was installed this morning. So now I have two 32/2.5 Mbit connections, but only one of them is used currently. I probably won't be able to resist the urge to work on this for long :)
Additionally I received some VoIP testing equipment. Innovaphone kindly provided me with a H.323/SIP test setup consisting of 2 * 3 different telephone models, two PBXs and some test scenarios. We already support a lot of scenarios, my goal is to extend this as far as possible within reason. The tricky parts are things like call transfers crossing two NATs:
Phone1-\                                            /------Phone3
        -[Registrar1]-[NAT1]-{ }-[NAT2]-[Registrar2]-
Phone2-/                                            \------Phone4
When Phone1 calls Phone3 and the call is transferred to Phone2, we
currently fail completely (for so far unknown reasons). The ideal
outcome would be that NAT1 detects that the transferred call originated
in the local network of Phone1 and the RTP streams are set up between
Phone1 and Phone2 directly. This not only reduces latency, it avoids
having internal calls go over the Internet. For normal calls between
Phone1 and Phone2 using an external registrar this already works,
provided that the registrar doesn't decide to proxy the calls.
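Coming back to the multipath tunnel idea from the beginning of this entry: below is a minimal sketch of the sequence-number reordering, written in Python for readability and not taken from the prototype. The Sender and Receiver classes, the round-robin distribution and the heap-based reorder buffer are all assumptions made for illustration. Packet loss, dead path detection and delay aware distribution are left out entirely.

```python
import heapq
from itertools import count


class Sender:
    """Hypothetical sender: tags packets with a sequence number and
    spreads them round-robin over n_paths paths."""

    def __init__(self, n_paths):
        self.n_paths = n_paths
        self.seq = count()

    def encapsulate(self, payload):
        seq = next(self.seq)
        # Returns (path index, tunnel packet); the tunnel packet carries
        # the sequence number next to the original payload.
        return seq % self.n_paths, (seq, payload)


class Receiver:
    """Hypothetical receiver: buffers out-of-order packets and releases
    them strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = []  # min-heap of (seq, payload)

    def receive(self, packet):
        heapq.heappush(self.pending, packet)
        delivered = []
        # Release every packet that is next in line.
        while self.pending and self.pending[0][0] == self.next_seq:
            delivered.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return delivered
```

If the second path is faster, packets 1 and 3 arrive before 0 and 2; the receiver holds them back until the gaps are filled. A real implementation would also need a timeout so a lost packet cannot stall the stream forever.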
Speaking of geek toys, ThinkGeek has an incredibly fun toy.
The two helicopters are controlled using IR remote controls. They can only go up and down by controlling main rotor speed and spin around the rotor axis using the rear rotor, but have some small constant forward movement, which makes it possible to fly them quite precisely. The most important feature however is that they can shoot at each other using IR. On the first hit the helicopter spins a bit, on the second one it loses power for a short period of time, on the third hit it completely loses power and goes down. This is accompanied by shooting sounds. Unfortunately they break pretty fast, I wrecked four of them within only a few days - well actually three of them, the fourth one was last seen flying over my neighbour's garden :) Being hooked on the fun, I got a different model, which is controllable on all axes and supposed to be more robust.
Unfortunately it also has a lot more power, so I managed to wreck another one within hours: one gear broke, the tail bent to a 45° angle and then broke, the flight bar also took some damage. To be fair, it really is more robust (and luckily you can also order replacement parts), I'm just not used to the two two-dimensional controls. Instead of powering down when getting too close to the walls, I tried directing it away by pushing all sticks in the desired direction, causing it to increase power and crash into the wall. I have some replacement parts and a second helicopter, but I'll try to resist flying inside again until I can get some training in a less dangerous environment. Regardless of these problems, they are really great fun :)
Thu, 22 May 2008
LinuxTag
I'll be visiting LinuxTag in Berlin next week, probably the entire day of Thursday and Friday until sometime in the afternoon. Until a couple of years ago, I used to visit LinuxTag annually, but then the quality of the presentations declined, with a lot of the topics being along the lines of "We're company XYZ and we're using Linux, yay". This year the talks look more interesting again, and I'm looking forward to meeting with Harald and DaveM. Anyone interested in meeting and discussing some netfilter or networking related topics, drop me an email, there should be plenty of time.
Fri, 16 May 2008
Netfilter move to git almost completed
I've completed the move of (most of) the netfilter repositories to git today. I still need to change the email notification script to make the commit emails more readable. They don't look very nice by default and I made it even worse. For today I've reached my limit of shell scripts I can look at though. These were the last SVN repositories I was using. I'm tempted to leave a long rant about SVN, but it's probably better to simply forget about it as quickly as possible :)
Next I'll see if I can manage to roll a release candidate for iptables. We're currently releasing too infrequently. Since we're usually merging at least one new extension or revision per kernel release, there also should be one iptables release per kernel version, so users can actually use new things. The ideal time for this would be shortly before kernel releases, since that allows us to merge userspace extensions for things targeted at the next kernel release early enough so they can be used for testing. So that's what I'll try to do in the future. Luckily we didn't merge anything requiring new userspace extensions during the last merge window, so we won't need a new release for 2.6.26.
Wed, 14 May 2008
Illustrations
Apparently my blog is too boring to read, so Elena kindly offered to illustrate it. Since this spares me from writing some actual content, I gladly accepted. What you're seeing below is me resting in a deck chair, enjoying a Rothaus beer, exhaling some unidentified fumes and apparently being haunted by thoughts of ip_route_me_harder() :)
Wed, 07 May 2008
Summer office
The weather has been great the past days, so I set up my summer workplace :)
Working outside is really pleasant after a month of almost constantly grey sky. Below the balcony there's a small stream, and hundreds of birds sit in the trees and sing, which makes an amazing scenery.
I sent out a first batch of HIFN fixes today to avoid causing too many conflicts in the series in case something turns up during review. Caught a good time during which both Evgeniy and Herbert were responsive and it only took about an hour to get all patches reviewed, fix a minor bug and get them merged. The remaining ones are hopefully in shape by tomorrow, the descriptor accounting still needs a bit more work. Herbert also merged some patches from Loc Ho today for async hashing support, which is cool because I already started adding hashing support to the HIFN driver until I noticed the CryptoAPI doesn't support it asynchronously yet :)
Also sent out a few netfilter patches and fixed a slightly embarrassing bug in the macvlan driver. It would crash the kernel on module unload because cleanup was performed incorrectly, causing the kernel to jump to a NULL function pointer when receiving the next packet on the underlying device. I wonder why I've never noticed this.
Tue, 06 May 2008
Fighting the HIFN driver
What I initially hoped would be just a simple fix for a few arithmetic errors in the driver for the HIFN 795x crypto accelerator cards turned into a week-long struggle, accompanied by at least a hundred crashes and reboots.
The initial bug manifested itself by going into an endless loop when the CryptoAPI issued a request for less data than the full scatterlist, caused by an integer underflow while calculating the remaining amount of data to be processed. The fix was straightforward: only use the minimum of the scatterlist size and the crypto request size. While at it, I also fixed some endian bugs, missing error propagation for errors that shouldn't happen, but did because of the underflow, and some overly strict data alignment checks.
Testing looked good, no more crashes, but surprisingly the testcases of the tcrypt module using algorithms provided by HIFN randomly failed. This turned out to be caused by an incorrect return value indicating synchronous processing to the CryptoAPI, while the request was in fact processed asynchronously. So when the result was not already available when returning from the driver, testcases failed.
After fixing the tcrypt failures, next was some real-life testing using IPsec. The first attempt resulted in an immediate crash in crypto_authenc_genivc(). This one was fixed fairly quickly, the asynchronous completion handler interpreted a pointer as an incorrect structure. The second attempt looked more promising, no crashes, packets went through and looked like IPsec. The remote side failed to parse them however; a closer look revealed that they were incorrectly constructed and had 16 bytes of garbage at the end. From my last attempt to fix the driver I remembered that this was most likely caused by the CBC modes not initializing the initialization vector size. Naively, I changed the driver to properly initialize the ivsize. To my surprise, attempting to add SAs using cbc(aes) now failed with -ENOENT.
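As a side note on the integer underflow mentioned at the start of this entry, here is a minimal sketch of the effect and of the min() fix. It is Python that emulates C unsigned 32-bit arithmetic, not the driver code, and the function names and the example lengths are made up for illustration.

```python
U32_MASK = 0xFFFFFFFF  # emulate a C "unsigned int" that wraps around


def remaining_after_buggy(request_len, sg_lengths):
    """Subtract the full length of every scatterlist element from the
    remaining byte count, wrapping like unsigned arithmetic in C."""
    remaining = request_len
    for sg_len in sg_lengths:
        remaining = (remaining - sg_len) & U32_MASK
    return remaining


def chunks_fixed(request_len, sg_lengths):
    """Process only min(element length, remaining bytes) per element and
    stop as soon as the request is satisfied."""
    remaining = request_len
    chunks = []
    for sg_len in sg_lengths:
        if remaining == 0:
            break
        n = min(sg_len, remaining)
        chunks.append(n)
        remaining -= n
    return chunks, remaining
```

With a 100 byte request over two 64 byte elements, the buggy variant ends up with a huge "remaining" value instead of zero, which is the kind of state that keeps a loop waiting for more data forever. The fixed variant processes 64 and 36 bytes and finishes with nothing left.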
Figuring out the reason took me almost an entire day. When the ivsize is already initialized, the CryptoAPI attempts to spawn a new instance of the algorithm. Algorithms are identified by name, possibly combined with modes, like cbc(aes). When spawning new algorithms, however, the driver name is used for the lookup, which in the case of HIFN was "hifn-aes" for all AES modes, causing the lookup to return the ofb(aes) algorithm instead of cbc(aes). Using unique driver names for the different algorithm modes fixed this problem.
While chasing this bug, I noticed some DMA memory corruption issues in the HIFN driver. When a request contained more than a single scatterlist element, the driver programmed the hardware to perform one crypt operation per scatterlist element, but each for the full request size, corrupting the memory after the end of the buffer. The fix for this was a bit more involved: using the correct length also requires performing only a single operation for all scatterlist elements, since the source and destination descriptors don't necessarily have identical lengths. This complicates keeping track of free descriptor entries. Previously, each operation needed exactly one command, source, destination and result descriptor. With only a single operation, it needs one command and result descriptor and a varying number of source and destination descriptors. On the upside, this reduces the number of interrupts per request to exactly one instead of one per scatterlist element and gets rid of some atomic operations. Additionally tcrypt can now detect destination buffer corruption for cipher tests.
Continuing testing with IPsec, things now looked better, packets were properly sized and the receive side worked properly. Outgoing packets were still dropped by the receiver however. Looking more closely at the packets showed that they contained what looked like a block of unencrypted data at the end. Additionally there still were some rare random crashes in the CryptoAPI.
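The driver name collision described above is easy to reproduce in a toy model. The following Python sketch is not how the CryptoAPI is implemented; the Registry class, its first-match lookup and the unique names "hifn-ofb-aes" and "hifn-cbc-aes" are assumptions chosen only to show why a shared driver name returns the wrong mode.

```python
class Registry:
    """Toy algorithm registry, searched by driver name in registration order."""

    def __init__(self):
        self.algs = []  # list of (algorithm name, driver name)

    def register(self, alg_name, driver_name):
        self.algs.append((alg_name, driver_name))

    def lookup_by_driver(self, driver_name):
        # The first registered entry with a matching driver name wins.
        for alg_name, drv in self.algs:
            if drv == driver_name:
                return alg_name
        return None


# All AES modes share one driver name: the lookup cannot tell them apart.
shared = Registry()
shared.register("ofb(aes)", "hifn-aes")
shared.register("cbc(aes)", "hifn-aes")

# One (hypothetical) driver name per mode: the lookup is unambiguous.
unique = Registry()
unique.register("ofb(aes)", "hifn-ofb-aes")
unique.register("cbc(aes)", "hifn-cbc-aes")
```

In the shared case a lookup for "hifn-aes" yields ofb(aes) regardless of which mode was wanted, while in the unique case "hifn-cbc-aes" resolves to cbc(aes).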
The crashes were caused by a missing check for end-of-scatterlist in one of the CryptoAPI scatterlist helpers, the unencrypted block of data by an off-by-one in the eseqiv sequence number generator. Both problems were fixed by Herbert Xu. The first victory - IPsec now worked properly using ping. TCP connections stalled after a short period however.
Half a day later, I also figured out the reason for the stalls. The HIFN driver needs to keep some context for each request since it processes them asynchronously. The driver used the global per-transform storage for this context instead of the per-request storage, corrupting existing contexts when more than one request was outstanding. Even in flood mode, ping exhibits ping-pong behaviour, waiting for a reply before sending the next request, which is why it wasn't affected by this problem. With this also fixed, IPsec seemed to be working properly, at least on the HIFN side. There still appears to be some corruption of the XFRM CB with asynchronous processing, causing outgoing tunnel mode packets to be sent without IP_DF, but that should be easily fixed.
Next was testing with dm-crypt, for which I actually purchased the card. Testing worked fine while debugging was enabled, without debugging it reproducibly crashed in the device mapper code. This was fairly nasty to debug since enabling debugging stopped the bug from happening. After following lots of dead ends and some suggestions from Evgeniy, I found the cause: when no descriptors are currently available, the request is queued and processed once enough descriptors are available again. The queue length is limited (in the case of HIFN to 1); when the limit is reached, the behaviour depends on the flags specified by the caller. When using CRYPTO_TFM_REQ_MAY_SLEEP, the caller goes to sleep and waits for notification from the driver when it's ready to accept more requests.
When dequeuing the crypto queue, asynchronous crypto drivers need to check for backlogged clients and wake them before continuing processing. This was missing from the HIFN driver, causing it to call the dm-crypt completion handler for a request that wasn't fully initialized. With this bug also fixed, dm-crypt survived a 24 hour stress test. I'm a bit reluctant at this point to use it for real data though, all those bugs didn't exactly instill confidence.
The patches are in an almost upstream-submittable state, just the descriptor accounting needs some minor cleanup. I hope to get this done today or tomorrow and then attend to the huge backlog in my inbox that has grown over the past week.
On the netfilter front, nothing too exciting has happened during the last two weeks. 2.6.25 appears to have gone pretty well, netfilter-wise, except for one nasty hashing regression on ARM, fixed by Philip Craig. The number of patches merged during the 2.6.26 merge window was smaller than usual, the highlights are:
I'm particularly happy about finally managing to merge the SIP helper patches, which I had queued for almost 9 months. If you've tried using it and it didn't work, now is a good time to try again and submit bug reports :)
Overcoming laziness
I decided to give blogging another try. My last attempt failed after just one or two entries because of me being too lazy to actually write something, but since I enjoy reading other people's blogs, I hope I can keep the motivation up a bit longer this time :)