Millions of people have been prevented from visiting dozens of Microsoft websites today.
Here are my notes on what happened. Briefly, four Microsoft computers somewhere in Redmond aren't working properly:
* a.root-servers.net for microsoft.com, msnbc.com and others points to four DNS servers
* those DNS servers are dns4.cp.msft.net through dns7.cp.msft.net
* all four are alive: they respond to ping requests
* that netblock appears to be owned by microsoft, so this is almost certainly not a hacker attack
* the DNS servers seem to be physically close together, a terrible design decision, with IP addresses from just 207.46.138.11 to 207.46.138.21. they could even be in the same machine room.
* those DNS servers don't respond to dns lookup requests
* therefore, things are screwed and people can't get through.
* other affected sites: expedia.com, slate.com, encarta.com, passport.com
* that is, unless your computer knows the ip address for microsoft.com etc. because your isp/corporation/university has it cached
* but caches expire, so microsoft properties have been fading from the web all day
* the web servers are working fine; microsoft.com is at http://207.46.230.218/
* the first person to identify the problem seems to be sean donelan at 11:05 pm PT last night
* even though hotmail.com uses other DNS servers, it's still affected. reason: it redirects to http://lc1.law13.hotmail.passport.com/cgi-bin/login (per my attempt to connect to port 80)
* my mail to microsoft.com addresses goes through fine, except to exchange.microsoft.com addresses, which had intermittent errors. mail seems to be working because the DNS servers are still responding to requests for MX records.
* normally when a website can't be reached, internet explorer defaults to auto.search.msn.com, which, ironically, is also offline. talk about a catastrophic failure. (this is one of the risks of moving services, like error messages and search functionality, to the net.)
* at 4:26 pm ET, microsoft.com was still offline for me.
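Two of the points above can be sketched in a few lines of Python: how close together the DNS servers' addresses are, and why sites fade out gradually as caches expire. This is only an illustration. The /24 boundary is my assumption about how to measure "close together" (I don't know how Microsoft actually carves up that netblock), and the TTL values are hypothetical, not measured from any real resolver.

```python
import ipaddress

# Endpoints of the address range quoted in the notes above.
LOW = ipaddress.ip_address("207.46.138.11")
HIGH = ipaddress.ip_address("207.46.138.21")


def share_prefix(a, b, prefix_len=24):
    """Return True if both addresses sit in the same network of this prefix length.

    The default of /24 is an assumption for illustration, not the real allocation.
    """
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return b in net


# Hypothetical remaining cache lifetimes (seconds) at five different ISPs
# at the moment the authoritative servers stopped answering.
HYPOTHETICAL_TTLS = [3600, 7200, 14400, 43200, 86400]


def caches_still_valid(ttls, elapsed_seconds):
    """Count caches whose stored record has not yet expired."""
    return sum(1 for ttl in ttls if ttl > elapsed_seconds)


if __name__ == "__main__":
    print("same /24:", share_prefix(LOW, HIGH))
    print("addresses apart:", int(HIGH) - int(LOW))
    for hours in (0, 2, 6, 24):
        alive = caches_still_valid(HYPOTHETICAL_TTLS, hours * 3600)
        print(f"after {hours:>2}h: {alive} of {len(HYPOTHETICAL_TTLS)} caches can still resolve")
```

With every authoritative server in one narrow address range, nothing refreshes those caches once they run out, so the number of people who can still reach the sites only goes down as the day wears on.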
One Microsoft representative blamed ICANN, which, as we can tell from the above, has nothing to do with the problems:
http://www.idg.net/ic_386962_1793_1-1681.html
Microsoft has yet to pin down the cause of the DNS error. "It can
be a system or human error, but somebody could also have done this
intentionally," De Jonge said. "We don't manage the DNS ourselves,
it is a system controlled by the Internet Corporation for Assigned
Names and Numbers (ICANN) with worldwide replicas."
That said, this remains a mystery. Why would it take so long to get even one of those computers back online? Any network admins want to speculate?
-Declan