Hyonix - London Servers Degraded Performance – Incident details

London Servers Degraded Performance

Resolved
Operational
Started 8 months ago
Lasted 10 days

Affected

Virtualization Hosts

Degraded performance from 3:51 PM to 9:41 PM, Operational from 9:41 PM to 9:41 PM

London, UK (LON1)

Degraded performance from 3:51 PM to 9:41 PM, Operational from 9:41 PM to 9:41 PM

Updates
  • Resolved

    This incident has been resolved: the datacenter has replaced the AHU, and cooling temperatures are back within the operational range.

  • Update

    The datacenter has sent us the following update:

    > The faulty AHU (F4) is partly operational, and contractors will be returning tomorrow to begin work on the faulty compressor circuit. We're optimistic that it will be fully operational by close of business this Thursday.

    Our team will update this status once we receive any news from them.

  • Monitoring

    After several hours of running the portable cooling units, all servers and network equipment are now within their operating temperature ranges.

    We are currently awaiting word from datacenter operations about the malfunctioning cooling unit. The projected resolution time remains unchanged: end of day on Monday the 18th.

  • Identified

    The datacenter is currently mitigating the issue by installing portable cooling units while waiting for third-party vendors to arrive and troubleshoot or replace the failed unit.

    Temperatures are starting to decrease to acceptable values. The majority of nodes should be working normally; however, a few nodes are still throttling.

    We expect this issue to be resolved by the end of the day. Our team will post updates as we receive more information from the datacenter operators.

    Our apologies for any inconvenience caused.

  • Investigating

    Our team is aware that London servers are experiencing severe performance issues. The datacenter has overheated, which has caused performance on all servers to be throttled.

    The root cause is that a cooling unit (AHU) has failed and redundancy did not kick in. We are currently checking with the datacenter operators and will post an update here as soon as we have new information.

    No host nodes have gone down so far. We will monitor the incident closely and will send notifications if any nodes go down.