MyForexVPS - NY Connectivity issues – Incident details

All systems operating normally

NY Connectivity issues

Resolved
Major outage
Started more than 2 years ago
Lasted 2 days

Affected

New York Datacenter

Major outage from 2:00 PM to 1:53 AM

Updates
  • Resolved

    All servers are up.

    We want to remind you to log in to your VPS server and run your MT4/MT5 terminals. If you are still having issues, please contact support.

  • Update

    Our entire infrastructure is back up in New York. We are monitoring the network for any issues.

  • Update

    Our pod has been energized. Things are coming online. We'll post more info as this progresses.

  • Monitoring

    DC Update:

    Our onsite team is currently bringing our UPS Systems online. We now have UPS-4 and UPS-R online. While bringing up UPS-1, we ran into an issue and we are unable to bring it online. We have engaged with our vendor and are finding a workaround to deliver power downstream to customer cabinets.
    
    Due to the issue with UPS-1, we will not be automatically powering up all customer cabinets. We will be reaching out to you to let you know if you are part of the group of customers on unprotected/non-UPS backed utility power so that you can make the decision whether to energize your cabinets at that time.
    
    We are currently at 50% for completion toward bringing the site back online and the revised ETA for bringing up the critical infrastructure systems is approximately 3 hours. The current time frame for when clients will be able to come back onsite is approximately 10:30PM EDT.
    
    We are currently sourcing materials to bring our fire system fully online and do not have an ETA for completion. Because of this and fire marshal compliance we will only be allowed to have supervised escorted customer access when we finish bringing up the critical infrastructure systems. We will have additional personnel onsite to assist us with this escort policy.
    

    Our cabinets are not among those fed by UPS-1.

  • Update

    DC Update:

    Our onsite team is currently bringing our UPS Systems online. We have our UPS vendor onsite assisting us with this. We have brought UPS-4 online and are currently charging its associated battery system. While bringing UPS-R online we have run into a minor issue that we are currently investigating.
    We are currently at 35% for completion toward bringing the site back online and the revised ETA for bringing up the critical infrastructure systems is approximately 5 hours. We are still planning for an evening time frame when clients will be able to come back on site.
    In parallel with re-energizing our power systems, we have been working on our fire system as well. We are currently sourcing materials to bring our fire system fully online and do not have an ETA for completion. Because of this and fire marshal compliance we will only be allowed to have supervised escorted customer access when we finish bringing up the critical infrastructure systems. We are currently sourcing additional personnel to assist us with this escort policy.

  • Update

    DC update:

    Our onsite team is currently bringing our UPS Systems online. We have our UPS vendor onsite assisting with this as we bring up UPS-4 and then UPS-R after that.
    As these systems are brought online, we will concurrently work on bringing carriers up.
    We are currently at 30% for completion toward bringing the site back online and the revised ETA for bringing up the critical infrastructure systems is approximately 6 hours. We are still planning for an evening time frame when clients will be able to come back on site.

  • Update

    DC update:

    Our onsite team has energized the primary electrical equipment that powers the site, enabling us to bring our mechanical plant online. We are currently cooling the facility.

    As we monitor for stability, we are focused on bringing up our electrical systems. In starting this process, we have identified an issue with powering up our fire panel as well as power systems that were powered by UPS3. While this will cause us a delay, we are working with our vendors for remediation.

    We are currently at 25% for completion toward bringing the site back online and the revised ETA for bringing up the critical infrastructure systems is approximately 7 hours. We are still planning for an evening time frame when clients will be able to come back on-site. We will send out additional information regarding access to the facility and remote hands assistance and we will notify you once client access to the facility is permitted.

  • Update

    DC's hourly update:

    We have completed the full site inspection with the fire marshal and the electrical inspector and utility power has been restored to the site.

    We are now working to restore critical systems and our onsite team has energized the primary electrical equipment that powers the site. Concurrently, we are beginning work to bring the mechanical plant online. Additional engineers from other facilities are on site this morning to expedite site turn up.

    The ETA for bringing up the critical infrastructure systems is approximately 5 hours.

    We are planning for a late afternoon/early evening time frame when clients will be able to come back on site.

  • Update

    Datacenter update:

    Our site inspection this morning went well, and we have been granted authorization to restore utility power to the site. We are currently working on re-energizing the facility. Our onsite team is working with the fire marshal and electrical inspectors to ensure electrical system safety as we prepare to bring utility power back to the site.

    Once that is completed, we will work towards bringing up our critical infrastructure systems. This will take approximately 5 hours.

    While we are working on that, we will also be working on our fire/life safety systems as we need to replace some smoke detectors and have a full inspection of the fire system prior to allowing customers to enter the facility.

    We will be sending out hourly updates as we make progress on bringing the facility back online.

  • Update

    Preliminary update from our CSM:

    I heard the preliminary inspection is good and we are taking steps to energize the property now.

    I’m waiting for the official update from DC Ops. More to come.

    We'll post more info as soon as we have it, in addition to our power-up plan.

  • Update

    As communicated in the previous post, we are awaiting further updates from the DC and Fire departments.
    The fire department's inspection is scheduled to begin at 9 AM EDT (1 PM GMT).
    We will know more once the inspection is complete. We hope they are satisfied with the state of the cleanup and the safety systems.

    There is a lot of pressure on the datacenter to resume operations ASAP as many other companies have been impacted by this incident.

  • Update

    We have received the following disappointing response from the datacenter:

    The New York datacenter remains powered down at this time per the fire marshal.

    We have just finished the meeting with the fire marshal, electrical inspectors, and our onsite management. We have made great progress on cleaning. After reviewing it, the fire marshal has asked us to clean additional spaces and to replace some components of the fire system. They have set a time to come back and review these requests at 9am EDT Wednesday. We are working with our vendors to comply fully with these new requests and are bringing in additional cleaning personnel to meet the fire marshal's deadline.

    In preparation for being able to allow clients onsite, the fire marshal has stated that we need to perform a full test of the fire/life safety systems which will be done after utility power has been restored and fire system components replaced. We have these vendors standing by for this work tomorrow.

    Assuming that all goes as planned, the earliest that clients will be allowed back into the site to power up their servers would be late in the day Wednesday.

    We are working to see what alternatives we have, if any.

  • Update

    As we have not heard back about the results of the fire marshal/electric utility/DC ops meeting, we have pinged for an update.

  • Update

    Datacenter update:

    The New York data center remains powered down at this time per the fire marshal.

    Site management, the fire marshal, and electrical contractors are currently meeting to review the progress of the cleaning effort, with the aim of getting approval from the fire marshal to re-energize the site.

  • Update

    Access update for our team from the datacenter:

    VP of DC Ops will be sending out instructions for re-entry to the site. If all goes as planned, it will be around 6:00/7:00PM. We need to re-energize the critical infra at the site and get it cooled down prior to giving customer access. This will take 4 to 5 hours, assuming the fire marshal gives the all clear.

  • Update

    Mid-day update from the datacenter:

    The New York data center remains powered down at this time per the fire marshal. We continue to clean and ready the site for final approval by the fire marshal in order to re-energize the facility's critical equipment. Site management, the fire marshal, and electrical contractors will be meeting at 2PM EDT in an attempt to receive approval from the fire marshal to re-energize the site. We do not foresee any issues that would result in not receiving such approval.

    Re-energizing critical equipment will take 4-5 hours. After this process, we will be energizing customer circuits and powering on all customer equipment. We will provide updates as to when customers will be allowed in the facility once approved by the fire marshal.

  • Update

    Current status update from DC Ops:

    Our remediation vendor and our team have worked through the night to clean the UPS units at the request of the fire marshal. They have made significant progress and we hope to have the cleaning completed by mid-day, at which time we will engage the fire marshal to review the site. Following their review, we hope to get a sign-off from them so that we can start the re-energizing process. The re-energizing process can take 4-5 hours, as we need to turn up the critical infrastructure prior to any servers.

  • Update

    Current status from the datacenter below. We've asked for that 8 AM EDT update, as the time frame has come and gone.

    Power remains off at our data center in New York as per the local fire marshal.
    Datacenter update:
    After reviewing the site, the fire marshal is requiring that we extensively clean the UPS devices and rooms before they will allow us to re-energize the site. We have a vendor at the site currently who will be performing that cleanup.

    We will continue to provide updates as we receive them.

  • Update

    Datacenter management has informed us that power might be restored by 8:00 AM EDT.

  • Update

    Statement from the datacenter itself:

    Power remains off at our data center in New York as per the local fire marshal.

    We have had an electrical failure in one of our redundant UPS units, which started to smoke and then caused a small fire in the UPS room. The fire department was dispatched and the fire was extinguished quickly. The fire department subsequently cut power to the entire data center and disabled our generators while they and the utility verify the electrical system. We have been working with both the fire department and the utility to expedite this process.

    We are currently waiting on the fire marshal and local utility to reenergize the site. We are completely dependent upon their inspection and approval. We are hoping to get an update that we can share in several hours.

    At the current time, the fire department is controlling access to the building and we will not be able to let customers in.

  • Identified

    We've received the update: an isolated fire in a UPS in an electrical room was detected and put out by fire suppression. The local fire department arrived on the scene and cut the power to the building, in line with NEC guidelines and, likely, local laws and general best practices for firefighters. This caused the down -> up -> down cycle noted earlier today.

    Datacenter electricians are currently on site to perform repair work on the UPS, but they are waiting for permission from the fire department to enter the building.

    Once the electrical work is complete, the power will be applied to the HVAC to subcool the facility, which will take an estimated 3-4 hours, and at that point, power will be restored to data halls, which will bring our network and servers back online.

    The datacenter manager gave a best-case ETA of tomorrow morning, July 11th, for power to be restored to data halls.

  • Investigating

    There is no connectivity to servers hosted in the New York datacenter.
    We are currently investigating this incident.