1% of the Internet crashed the other day …

You might have read this in some newspaper, though unless you are very unlucky I doubt you really noticed: a few routers stopped working the way they were supposed to and caused failures in the Internet’s control information.

You can read more on HITECHfourm. If you read the article, or scroll down near the end, you will see that they say “it will happen again”.

The scary thing is that they are so right. The Border Gateway Protocol (BGP) was not made to handle errors in the data stream, and when they do happen it basically uses the ejection seat: it drops the whole session (imagine your car not having a brake, but an ejection seat instead .. wuuuuussshhhhh).
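
To make the ejection seat a bit more concrete, here is a toy sketch in Python. It is not real BGP: the comma-separated "wire format" and the names `Session`, `MalformedUpdate`, `parse_update` and `handle_update` are all made up for illustration. The only point it tries to show is the behaviour described above, where one unparseable update costs you the whole session and every route learned over it.

```python
from dataclasses import dataclass, field


class MalformedUpdate(Exception):
    """Raised when an update cannot be parsed (hypothetical name)."""


@dataclass
class Session:
    """Toy stand-in for a BGP session with one peer (not a real API)."""
    peer: str
    established: bool = True
    routes: set = field(default_factory=set)  # prefixes learned from this peer


def parse_update(raw: bytes) -> set:
    """Toy parser: expects comma-separated prefixes such as b"10.0.0.0/8".

    This format is an assumption for the sketch, not the BGP wire format.
    """
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError as exc:
        raise MalformedUpdate("undecodable update") from exc
    prefixes = {p for p in text.split(",") if p}
    if not prefixes or any("/" not in p for p in prefixes):
        raise MalformedUpdate(f"bad prefix list: {text!r}")
    return prefixes


def handle_update(session: Session, raw: bytes) -> None:
    """Apply an update, or pull the ejection-seat handle if it is malformed."""
    try:
        session.routes |= parse_update(raw)
    except MalformedUpdate:
        # The "session down" answer: a single bad message tears down the
        # session and throws away every route learned from this peer.
        session.established = False
        session.routes.clear()


session = Session(peer="192.0.2.1")
handle_update(session, b"10.0.0.0/8,172.16.0.0/12")  # fine, two routes learned
handle_update(session, b"\xff\xfe")                  # garbage, session goes down
```

After the second call the session is down and the two perfectly good routes from the first update are gone as well, which is exactly the brutal part.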

Sure, the Internet uses re-routing, redundancy etc. to give itself proper seat-belts, brakes, ABS, ESP and whatever-the-latest-gimmick-in-cars-is-today.

Why has it not already been changed, you might ask?

That is a fair question, and I’m sure there are more answers than there are BGP people, but one is that with proper re-routing and the like in place, it is not number one on the operators’ wish list, or at least not on the managers’. I remember a similar incident a few years ago, originating from middle America, where another major system also handled “a special update” incorrectly and pulled the ejector-seat handle. The technical people at a major US ISP were, besides being happy to know what the problem was and that it could be stopped, also very interested in finding a way to stop using the brutal “session down” answer to unwanted stuff.

BGP engineers from different vendors all agreed that there should be better solutions out there, but no one from the provider ever actually put the subject on the agenda.

However, the real question is why vendors and IETF people do not just fix it, but that I’ll leave in the air for another day 😉
