LINX LON1 peering incident - Tuesday 23rd March 2021
Resolved
Peering via the LINX LON1 LAN has remained stable since we re-enabled peering on Thursday.

We are marking this incident as ‘resolved’ on our systems, as this connection has been operating normally.
LINX are still progressing the issue with their network vendor, but they now know what causes the problem and can mitigate it until the vendor releases a fix.
Posted Mar 27, 2021 - 15:00 GMT
Update
We have now re-enabled our LINX LON1 port following a positive update from LINX engineering.
Traffic is now being exchanged via our LON1 port.

LINX together with their vendor have confirmed that they can re-create the problem they saw on Tuesday in a lab environment, and understand the process that triggers the brief interruption in service. They also understand how to mitigate the problem.

We therefore have the confidence to re-enable our LINX LON1 port, as this will provide optimal routing and lower latency for customers' connections.

LINX are still working with their vendor to investigate the root cause.

We shall continue to monitor the network for the next day.
Posted Mar 25, 2021 - 14:58 GMT
Update
The LINX LON1 network continues to be stable and LINX are still engaging with vendor TAC to understand the root cause of the problem.

An update from LINX this morning informs us that their vendor has made good progress in identifying a corruption issue, and that they are continuing to investigate its root cause.

We continue to maintain traffic away from the LON1 LAN for the time being.
All other peering and transit uplinks are operating nominally, with plenty of capacity.

We will re-evaluate our LINX LON1 peering tomorrow.
Posted Mar 25, 2021 - 09:19 GMT
Update
LINX continue to confirm that their LON1 peering LAN is stable following their 22:00 update. However, we shall continue not to advertise our routes to peers for the time being.

LINX are continuing to engage with their vendor's engineering and development team to understand the root cause that triggered the layer two forwarding problems earlier.

We shall review our LINX LON1 peering tomorrow.
Posted Mar 23, 2021 - 22:46 GMT
Monitoring
We have continued to keep traffic away from LINX LON1 whilst LINX continue to engage with their vendor TAC regarding this afternoon's network incident on their LON1 peering LAN.
Our peering to LINX LON2 and LONAP, as well as our upstream Tier 1 transit links, are all operating nominally.

By way of a further update and timeline of events:
LINX have informed us that the LINX LON1 network has been stable and passing traffic since 15:40.
However, out of an abundance of caution, we will keep traffic away from this peering port for the moment.

LINX report that a minority of their connected member ports experienced brief moments of instability from 11:15 to 11:28 and from 11:41 to 11:57, following routine configuration changes they were making on certain devices in their network (not the same device we connect to).
Our customers will only have noticed connectivity issues during these periods if their traffic was trying to reach a network on these affected devices.
However, at 15:32, after some additional debugging was added to the LINX LON1 LAN on vendor TAC instructions to get to the bottom of the earlier issues, LINX saw a much larger interruption in traffic flows. This time it affected our port as well as other ISPs and networks across the LAN.
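
For reference, the length of the two earlier instability windows reported by LINX can be worked out from the times above. The following is only an illustrative sketch: the helper name `minutes_between` and the `WINDOWS` list are our own for this example and are not taken from any LINX tooling.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM times (hypothetical helper)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# The two windows of brief instability reported by LINX on 23rd March.
WINDOWS = [("11:15", "11:28"), ("11:41", "11:57")]

durations = [minutes_between(start, end) for start, end in WINDOWS]
total_minutes = sum(durations)

for (start, end), minutes in zip(WINDOWS, durations):
    print(f"{start} - {end}: {minutes} minutes")
print(f"Total: {total_minutes} minutes")
```

This gives windows of 13 and 16 minutes, 29 minutes in total, during which only a minority of member ports were affected.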

According to LINX LON1 traffic stats (https://portal.linx.net/), at ~15:32, traffic levels on LON1 dropped from 4.2Tb/s to 0.47Tb/s, before recovering to 3Tb/s at 16:00.
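
To put those figures in proportion, the drop and partial recovery can be expressed as percentages of the pre-incident level. This is a rough sketch using only the three approximate figures quoted above; the function and constant names are our own and purely illustrative.

```python
def percent_drop(before_tbps: float, after_tbps: float) -> float:
    """Fall from `before_tbps` to `after_tbps`, as a percentage of `before_tbps`."""
    return (before_tbps - after_tbps) / before_tbps * 100.0


def percent_of_baseline(baseline_tbps: float, current_tbps: float) -> float:
    """Current traffic level as a percentage of the baseline level."""
    return current_tbps / baseline_tbps * 100.0


# Approximate LON1 traffic levels quoted from the LINX stats, in Tb/s.
BASELINE_TBPS = 4.2    # just before ~15:32
TROUGH_TBPS = 0.47     # lowest level after the interruption
RECOVERED_TBPS = 3.0   # level at 16:00

drop = percent_drop(BASELINE_TBPS, TROUGH_TBPS)
recovered = percent_of_baseline(BASELINE_TBPS, RECOVERED_TBPS)

print(f"Drop at ~15:32: {drop:.1f}%")
print(f"Level at 16:00: {recovered:.1f}% of pre-incident traffic")
```

On these figures, traffic across LON1 fell by roughly 89% and had returned to roughly 71% of its earlier level by 16:00.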

At 15:51 we took the decision to migrate traffic away from LINX LON1. Many other ISPs and networks have also taken this step, as can be seen from the reduced traffic levels.

We will continue to monitor the situation and restore connectivity to LINX LON1 once we're satisfied stability has returned.
Posted Mar 23, 2021 - 18:33 GMT
Investigating
LINX (London Internet Exchange) are currently experiencing a network incident on their LON1 LAN.
They are engaging with their vendor TAC.

We have temporarily disabled our peering to LINX LON1 whilst the issues are ongoing, to avoid connectivity issues for our customers.
We have resilient peering to LINX LON2 and LONAP, and these are currently operating nominally.
Posted Mar 23, 2021 - 15:58 GMT
This incident affected: M12 Giganet - Internet Services (Giganet Core - Peering).