MAJOR NETWORK ISSUE (17 Jan 2012)
17-01-2012, 20:33
|
#136
|
|
Ice Cold
Join Date: Oct 2006
Location: West Yorkshire
Age: 48
Services: XL TV
M Phone
1000MB BB
Posts: 1,568
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
It's working at about 75% at the moment here in Leeds; not back to 100% yet.
|
|
|
17-01-2012, 20:35
|
#137
|
|
Inactive
Join Date: Oct 2008
Location: Norwich
Age: 37
Services: Company LLU internet, soon-to-be company FTTC internet at 56Mb/20Mb!
Posts: 1,895
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by qasdfdsaq
Yes but a simple "If you are calling about the current service outage, we're aware of it already" would massively reduce the number of support staff required.
|
Easily done, provided you can actually connect to the phone system rather than the NTS service Virgin use for their 150 and 08454541111 numbers.
If you had a few hundred thousand customers bombarding your support number for answers, wouldn't you sweat a little? It's the reason most have automated menus on high-capacity servers to deal with that load. Evidently this was just a little too much for the poor PBX to handle!
Quote:
Originally Posted by qasdfdsaq
And a lot of sites have a "backup" high volume, low complexity system for reporting major outages - e.g. reverting to a single line of text instead of failing completely.
|
An alternative for the 503 Server Error if you will, like some sites' cute 404 errors. This'd be fine if their internet service hadn't degraded to the point that the error was worth a pretty page for.
Quote:
Originally Posted by qasdfdsaq
Having a contingency plan to deal with major outages is all part of being a major service provider.
|
I'm not sure hardware failure comes under that list of first things to check. Evidently something big was up with their internet service when leased lines fail as well - it must have been a very major switch failure within their core network for customers to have still been connected at the UBRs but not routing the traffic, even the VOD failed and that's supposed to be internal.
Quote:
Originally Posted by qasdfdsaq
Actually I was getting a "Site too busy" response from Cable Forum most of the time while VM's own forums were slow, but functional.
|
God bless Paul M's ability to build a stable forum to take the load of 5000+ angry cable customers wanting answers. This place is non-profit and survives on donations.
By the way - slow is better than "unavailable". At least if it's slow you can get somewhere.
|
|
|
17-01-2012, 20:35
|
#138
|
|
Inactive
Join Date: Oct 2006
Location: Right here!
Posts: 22,315
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
I must admit that when I found this site unavailable due to an overloaded server I assumed Alan Fry had posted another of his legendary 'plans'.....
|
|
|
17-01-2012, 20:37
|
#139
|
|
Inactive
Join Date: Apr 2010
Services: VM 200Mb BB, Sky Q Silver 2TB
Posts: 739
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by braysoj1
It is down here in Keighley; this is from my mobile.
|
All back up and running now though.
|
|
|
17-01-2012, 20:58
|
#140
|
|
Inactive
Join Date: May 2007
Location: St Albans, Herts
Services: TV XL w/ All Sky HD Collections. 2xTivo, 1x VHD
Phone XL
Broadband XXL
Posts: 91
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by Turkey Machine
God bless Paul M's ability to build a stable forum to take the load of 5000+ angry cable customers wanting answers. This place is non-profit and survives on donations. 
|
Actually the forum did bomb out a few times with 'too many connection' warnings from vBulletin. These usually happen if there is a really high load on the server.
I can understand it on a site like this, but for VM's site to get 503 errors isn't acceptable. They could easily host the site in a cloud failover distribution; the costs would be negligible for VM to do so.
---------- Post added at 21:58 ---------- Previous post was at 21:54 ----------
Quote:
Originally Posted by Traduk
My modem went off at midnight last night for an hour which usually means VM working somewhere and as it came back with a different IP I thought it was re-segmentation. However performance was below par and pings longer than usual.
Early this afternoon as surfing started to fall apart and changes in DNS servers made no difference
|
Sounds a bit like mine. Woke up this morning with a new IP. Service was fine for most of the morning but was noticeably slower this afternoon, with regular drops and "server not found" notices. Did a bunch of worldwide ping and speed tests and didn't find any problems. Shortly after this most sites failed to load. I thought it might be OpenDNS at first, as I've got our network set to use them; however, obviously it wasn't.
VOD wasn't affected in any way, however.
(St Albans AL4 via Hemel)
|
|
|
17-01-2012, 20:59
|
#141
|
|
Permanently Banned
Join Date: Jan 2009
Location: In a world of no buffering!!
Services: Samsung V+ XL TV
XL Phone
30Mb Superhub
Samsung Galaxy 3 32GB sd card In a world of no buffering!
Posts: 20,915
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by rmwebs
Actually the forum did bomb out a few times with 'too many connection' warnings from vBulletin. These usually happen if there is a really high load on the server.
I can understand it on a site like this, but for VM's site to get 503 errors isn't acceptable. They could easily host the site in a cloud failover distribution; the costs would be negligible for VM to do so.
|
Have you read about the fault on the network on the community forum, instead of making assumptions?
CLICK ME
|
|
|
17-01-2012, 21:04
|
#142
|
|
cf.mega poster
Join Date: Sep 2003
Posts: 12,048
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
service back to its usual 60% now.
|
|
|
17-01-2012, 21:13
|
#143
|
|
cf.geek
Join Date: Jul 2010
Location: Newcastle
Posts: 785
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Not noticed any issues in Newcastle
|
|
|
17-01-2012, 21:24
|
#144
|
|
cf.mega poster
Join Date: Aug 2004
Posts: 11,207
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by Chrysalis
service back to its usual 60% now.
|
Lol.
I'm glad I missed all the fun, was at work till it was all fixed.
So anyone know what the fuss was about? Someone mentioned routing hardware failure (which is pretty embarrassing, as a major ISP should have backup routes on just about everything) and someone else mentioned an aircon failure in Poplar?
|
|
|
17-01-2012, 21:56
|
#145
|
|
Inactive
Join Date: Jun 2007
Location: Bristol UK
Services: 2 x TIVO 500GB, XL TV, BB XL 60MB
Posts: 25
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
It sounds like this is either fixed or getting fixed - good news from my perspective. I was in work when the failure hit (Virgin Media leased line) and had all sorts of complaints - I worked out when I could not call Virgin it must be a Virgin fault but good to know it should have gone away before I get back into the office tomorrow! At home I haven't seen any disruption (VM cable) - but that might be because I was in work dealing with complaints during the major blackout!
Good work Virgin for getting it fixed, fingers crossed it doesn't happen again any time soon!
|
|
|
17-01-2012, 21:59
|
#146
|
|
Inactive
Join Date: Oct 2008
Location: Norwich
Age: 37
Services: Company LLU internet, soon-to-be company FTTC internet at 56Mb/20Mb!
Posts: 1,895
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by rmwebs
I can understand it on a site like this, but for VM's site to get 503 errors isn't acceptable. They could easily host the site in a cloud failover distribution; the costs would be negligible for VM to do so.
|
Wrong completely. Have you *ANY* idea how much bandwidth actually costs for ISPs? (clue - it's not cheap. Try £10 per MegaBIT-per-second as a rough minimum, with 100 MegaBIT being the minimum size for bandwidth charges, not to mention transit costs (LINX/LoNAP for UK traffic and Cogent et al [expensive] for the worldwide traffic)).
Your domestic 50Mb connection you pay for is a contended service. If you have a 50Mb leased line you pay the high premium to have that service switched on, all the time, and bandwidth reserved exclusively for you. 
---------- Post added at 22:59 ---------- Previous post was at 22:57 ----------
Quote:
Originally Posted by qasdfdsaq
Lol.
I'm glad I missed all the fun, was at work till it was all fixed.
So anyone know what the fuss was about? Someone mentioned routing hardware failure (which is pretty embarrassing, as a major ISP should have backup routes on just about everything) and someone else mentioned an aircon failure in Poplar?
|
Backup routes yes, hardware failure is rare, and when it does happen it requires an engineer to go to the rack(s) in question and manually switch the hardware. That takes time because you have to do it slowly, carefully and properly. Not ham-fistedly so some script-kiddie in his mum's basement can continue his Warcraft campaign.
|
|
|
17-01-2012, 22:09
|
#147
|
|
Wisdom & truth
Join Date: Jul 2009
Location: RG41
Services: RG41: 1Gig VOLT
Rutland: Gigaclear 400/400
Posts: 12,929
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Failover is usually soft in my world - the backup protocols are all there on Cisco kit. A SPOF is not how VM's network is designed otherwise you'd see everyone routing through that point in the traceroutes.
The engineer goes to the racks to replace or reset the kit in question after reading the logs and when the emergency change manager decides would be the best time.
__________________
Seph.
My advice is at your risk.
|
|
|
17-01-2012, 22:11
|
#148
|
|
cf.mega poster
Join Date: Aug 2004
Posts: 11,207
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by Turkey Machine
Quote:
Originally Posted by rmwebs
I can understand it on a site like this, but for VM's site to get 503 errors isn't acceptable. They could easily host the site in a cloud failover distribution; the costs would be negligible for VM to do so.
|
Wrong completely. Have you *ANY* idea how much bandwidth actually costs for ISPs? (clue - it's not cheap. Try £10 per MegaBIT-per-second as a rough minimum, with 100 MegaBIT being the minimum size for bandwidth charges, not to mention transit costs (LINX/LoNAP for UK traffic and Cogent et al [expensive] for the worldwide traffic)).
Your domestic 50Mb connection you pay for is a contended service. If you have a 50Mb leased line you pay the high premium to have that service switched on, all the time, and bandwidth reserved exclusively for you. 
|
How is this at all relevant to 503 errors on VM's website?
Internal access to VM's website has nothing to do with transit or peering; they wouldn't have to pay anyone for anything. In the worst case all they'd have to do is rent a £30/month server in someone else's datacentre to serve up "Yes, the site really is down" messages. £30 a month really is negligible for a company the size of VM.
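To make the "£30/month server" idea concrete, here is a minimal sketch of such a fallback responder using only Python's standard library. It answers every GET with a 503 status and a single line of plain text. The message wording, the `Retry-After` value and the names `OutageHandler` and `make_server` are assumptions for illustration, not anything VM actually runs.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical outage notice; the wording is an assumption.
OUTAGE_MESSAGE = b"Yes, the site really is down. We are aware of the fault.\n"


class OutageHandler(BaseHTTPRequestHandler):
    """Answer every GET with a tiny plain-text 503 response."""

    def do_GET(self) -> None:
        self.send_response(503)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(OUTAGE_MESSAGE)))
        # Hint to clients to wait before retrying; 600 seconds is an arbitrary choice.
        self.send_header("Retry-After", "600")
        self.end_headers()
        self.wfile.write(OUTAGE_MESSAGE)

    def log_message(self, format: str, *args) -> None:
        # Suppress per-request logging to keep the work per hit minimal.
        pass


def make_server(port: int = 8080) -> HTTPServer:
    """Build the fallback server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), OutageHandler)
```

Running `make_server().serve_forever()` starts it on localhost. Because the response is one static line, the cost per request stays tiny, which is the point being argued above.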
Quote:
|
Backup routes yes, hardware failure is rare, and when it does happen it requires an engineer to go to the rack(s) in question and manually switch the hardware. That takes time because you have to do it slowly, carefully and properly. Not ham-fistedly so some script-kiddie in his mum's basement can continue his Warcraft campaign.
|
Dunno about yours or VM's setups, but in our environment backup routes (and pretty much backup everything else) kick in automatically. Last major outage we had on our primary route, nobody even noticed except the net-ops guys, even support hadn't heard a thing from either side till I told them.
Having to manually fail-over faulty hardware in this day and age is pretty backwards.
In any case, the failure of a single router or any individual piece of hardware should not be able to cause anything as severe as this. Loss of an entire datacentre due to aircon failure however, could be a justifiable cause, though quite what was going on with the A/C would raise a few questions in itself.
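As a toy illustration of "backup routes kick in automatically", the sketch below picks the first healthy route from a preference-ordered list. Real networks do this with dynamic routing protocols on the kit itself; the `Route` type, the `healthy` flag and the route names here are hypothetical and only show the selection logic.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Route:
    name: str      # hypothetical label, e.g. "primary" or "backup"
    healthy: bool  # assumed to be updated by some external health check


def pick_route(routes: Sequence[Route]) -> Optional[Route]:
    """Return the first healthy route in preference order, or None if all are down."""
    for route in routes:
        if route.healthy:
            return route
    return None
```

With the primary marked unhealthy, `pick_route` returns the backup without anyone touching a rack. It returns `None` only when every route is down, which is the "loss of an entire datacentre" case.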
[Edit]
Yeah, what Seph said.
|
|
|
17-01-2012, 22:44
|
#149
|
|
Dr Pepper Addict
Cable Forum Admin
Join Date: Oct 2003
Location: Nottingham
Age: 63
Services: IDNet FTTP (1000M), Sky Q TV, Sky Mobile, Flextel SIP
Posts: 30,580
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Quote:
Originally Posted by qasdfdsaq
Actually I was getting a "Site too busy" response from Cable Forum most of the time while VM's own forums were slow, but functional.
|
We have a number of limits in place to stop anything bringing the site down completely. One of them cuts the forum to that "site busy" message if the server load exceeds a preset value. That limit is normally set to about 6.00 (the server normally runs at about 1.50 when busy). Today the load hit over 9.00 at one point, so that "safety valve" kicked in until the load fell again. I actually raised it at one point to let more people on.
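The "safety valve" described above can be sketched roughly as follows. This is not the forum's actual code (vBulletin is PHP); it is a Python illustration that assumes the limit is compared against the one-minute load average, with the 6.00 figure taken from the post. The function names are hypothetical.

```python
import os

# Preset limit from the post; the admin can raise it to let more people on.
BUSY_THRESHOLD = 6.00


def should_show_busy_page(load: float, threshold: float = BUSY_THRESHOLD) -> bool:
    """Return True when the load average exceeds the preset threshold."""
    return load > threshold


def current_load() -> float:
    """One-minute load average (Unix only; raises OSError where unavailable)."""
    return os.getloadavg()[0]


def handle_request(load: float) -> str:
    # Hypothetical handler: shed load early instead of letting the server fall over.
    if should_show_busy_page(load):
        return "Site too busy"
    return "forum page"
```

At the usual busy load of about 1.50 the forum is served as normal; at the 9.00 seen during the outage the busy message is returned until the load falls back under the limit.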
Quote:
Originally Posted by rmwebs
Actually the forum did bomb out a few times with 'too many connection' warnings from vBulletin. These usually happen if there is a really high load on the server.
|
"Too many connections" actually has no connection with server load. It's purely down to the connection limit on MySQL. Again, we have this set so that the whole thing can't run away with itself and die.
We hit record concurrent guest and member figures, and survived very well.
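The distinction between the two limits can be shown with a small sketch: a connection cap rejects on a count of open connections, regardless of how loaded the machine is. MySQL enforces this through its `max_connections` setting; the `ConnectionLimiter` class below is a hypothetical Python stand-in to show the behaviour, not how MySQL or vBulletin implements it.

```python
import threading


class TooManyConnections(Exception):
    """Raised when the connection cap has been reached."""


class ConnectionLimiter:
    """Cap the number of simultaneously open connections."""

    def __init__(self, max_connections: int) -> None:
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self) -> None:
        # Fail fast instead of queueing, so work cannot pile up without bound.
        if not self._slots.acquire(blocking=False):
            raise TooManyConnections("Too many connections")

    def release(self) -> None:
        self._slots.release()
```

Once every slot is taken, the next caller gets the error even if the server is otherwise idle, and a slot frees up as soon as any connection is released.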
__________________
Baby, I was born this way.
|
|
|
18-01-2012, 04:40
|
#150
|
|
Inactive
Join Date: Apr 2008
Location: Nottingham
Services: XL 60/3
Posts: 356
|
re: MAJOR NETWORK ISSUE (17 Jan 2012)
Not sure if it's related or not but my hub has been resetting itself for the last few hours now, several times while reading this thread..
|
|
|