Cable Forum - View Single Post

qasdfdsaq · 19-01-2012, 11:58

Quote:

Originally Posted by Sephiroth

As many techies will know, you can algorithmically re-route when there is a node failure; this puts pressure on other nodes but is in any case standard everywhere these days.
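For anyone wanting to see the re-routing idea in miniature, here is a rough sketch. It is not any ISP's actual routing protocol: the six-node topology, the node names and the `shortest_path` helper are all made up for illustration, and a plain breadth-first search stands in for whatever a real network would run.

```python
from collections import deque

def shortest_path(links, src, dst, failed=frozenset()):
    """Fewest-hops path from src to dst that avoids any failed nodes (BFS)."""
    if src in failed or dst in failed:
        return None
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in links.get(node, ()):
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no surviving route

# Hypothetical topology: two disjoint routes between A and F.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}

normal = shortest_path(LINKS, "A", "F")                   # goes via B and D
rerouted = shortest_path(LINKS, "A", "F", failed={"B"})   # forced via C and E
print(normal, rerouted)
```

With B down, all the A-to-F traffic lands on C and E, which is the "pressure on other nodes" part.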

Plus you can put in physical resilience so that another device takes over with the same IP address in case of failure of a key device. I suspect that didn't happen. (If there is a site failure then, of course, this measure is not effective.)

The trick is to do proper reliability analysis, identify the potential critical items and design the risk out accordingly. Just so you know, this is one of the day job things I do.
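As a very rough illustration of the arithmetic behind that kind of analysis, here is a sketch using the textbook series/parallel availability formulas. The 0.999 figure, the three-hop chain and the function names are assumptions picked for the example, not numbers from any real network, and the formulas assume failures are independent.

```python
from math import prod

def series(*availabilities):
    """Every component must be up: multiply the availabilities."""
    return prod(availabilities)

def parallel(*availabilities):
    """Only one component needs to be up: 1 minus the chance that all are down."""
    return 1 - prod(1 - a for a in availabilities)

ROUTER = 0.999  # hypothetical availability of one device

# Three critical devices in a chain, no redundancy.
single = series(ROUTER, ROUTER, ROUTER)

# Same chain, but each critical device has an independent standby.
pair = parallel(ROUTER, ROUTER)
redundant = series(pair, pair, pair)

print(f"no redundancy:   {single:.6f}")
print(f"with standbys:   {redundant:.6f}")
```

Because it assumes independence, a model like this does not cover anything that takes out both devices of a pair at once.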

Yes, most sensible departments, including mine, do a combination of all 3 of the above. Although that said, I've seen our whole network taken down more often by human error and critical software bugs than actual hardware failure.

I see no excuse for this level of failure in any major ISP, particularly not one of the largest in the UK. And especially not lingering effects several days on as some people seem to be reporting...