CityFibre have now provided us with a reason for the outage.
Human error. A CityFibre network engineer failed to follow standard process when commissioning a new service, and caused a misconfiguration that resulted in widespread disruption to all layer two provisioned CityFibre services nationwide.
This affected all Giganet CityFibre services, including our ELITE (leased lines) as well as FTTP broadband services.
The misconfiguration was rolled back by CityFibre network engineers as soon as they realised that human error was the cause. The 30-minute delay in rectifying the situation was due to their investigation into where the problem originated.
CityFibre will be conducting a full investigation to ensure this cannot happen again.
12:27 - 13:04 on Wednesday 14th October 2020
Human error happens. Even with the automation systems that many of us operate, there’s often a human touch somewhere (even if it’s in designing the automation systems). It was disappointing that this incident occurred and that it took as long as it did to restore full service.
We’ll be following up with CityFibre to ensure that they have improved monitoring so that, in the event of future misconfigurations, these are rolled back sooner.
Our monitoring systems highlighted the problem quickly, and after a few minutes of our own internal troubleshooting, we escalated and raised the fault to CityFibre. We also raised the Status Page incident within 5 minutes of the incident starting, as can be seen from the timeline of events.
We will also be examining CityFibre's configuration routines in more detail to understand how ‘routine’ their configuration change was on the 14th and, if it was routine, why it caused all services to go offline. Naturally, ‘business as usual’ provisioning tasks are usually extremely low-risk and only impact a single circuit at a time.
Customers who had a managed automated failover service add-on from Giganet would have been unaffected during this incident (aside from BGP failover timers of up to 180 seconds).
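As a rough illustration of where a figure of "up to 180 seconds" can come from, the sketch below assumes the timer in question is a BGP hold timer of 180 seconds with keepalives sent every 60 seconds. Those values and the function name are illustrative assumptions, not a description of the actual Giganet configuration.

```python
# Hypothetical sketch: how long a BGP speaker may take to notice a dead peer.
# Assumption: the "180 second" figure is a BGP hold timer; the report does not say so.

HOLD_TIME_S = 180                  # assumed hold timer
KEEPALIVE_S = HOLD_TIME_S // 3     # keepalives conventionally sent at one third of the hold time


def detection_delay(seconds_since_last_keepalive: float,
                    hold_time: float = HOLD_TIME_S) -> float:
    """Seconds until the hold timer expires if the path fails right now.

    The hold timer restarts whenever a keepalive (or update) is received,
    so the remaining time is the hold time minus the time already elapsed.
    """
    if not 0 <= seconds_since_last_keepalive <= hold_time:
        raise ValueError("elapsed time must be between 0 and the hold time")
    return hold_time - seconds_since_last_keepalive


# Failure immediately after a keepalive arrives: the full hold time elapses.
worst_case = detection_delay(0)
# Failure just before the next keepalive was due: less time remains.
best_case = detection_delay(KEEPALIVE_S)
```

Under these assumptions, traffic would begin moving to the backup path somewhere between `best_case` and `worst_case` seconds after the primary path failed, which is consistent with a delay of "up to" the timer value.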
We apologise for the disruption that this outage caused.