Platform.sh

Is Platform.sh Down Right Now? Check if there is a current outage ongoing.

Platform.sh is currently Operational

Last checked from Platform.sh's official status page

Historical record of incidents for Platform.sh

Report: "Unscheduled Maintenance on ch-1.platform.sh"

Last update
resolved

The CH-1 region required an unscheduled maintenance period due to load concerns between 0900 and 1100 UTC. Our Operations team increased the capacity of the underlying hosts to provide additional capacity to the gateways. Projects hosted on CH-1 may have experienced performance issues with Console WebUI, SSH, and Git Integration access to projects, as well as connection issues with deployment activities during this maintenance.

Report: "Partial Outage on FR-3"

Last update
investigating

We have detected an issue affecting service on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service.

Report: "Partial Outage on au.platform.sh"

Last update
monitoring

We have detected some timeouts with the API services on the au.platform.sh region.
- API, console, CLI, SSH and deployments are affected
- Live sites are not affected

Our operations team is investigating.

Report: "Partial Outage on EU-5"

Last update
investigating

Status moved back to investigating due to potential issues.

monitoring

Alerts have resolved. We are monitoring this situation.

identified

The issue has been identified and a fix is being implemented.

investigating

We noticed a partial outage on only a few projects. We are currently investigating the issue.

Report: "Partial Outage on fr-3.platform.sh"

Last update
postmortem

# What Happened

Recent maintenance on the fr-3.platform.sh region caused some subsystems to be unresponsive. This resulted in an outage of the project console, API, and subsystems in the fr-3.platform.sh region. Live environments on Grid or Dedicated infrastructure were not affected.

# Customer Impact

From 2025-06-03 20:00 UTC to 2025-06-04 05:26 UTC, some customers may have had trouble accessing the project console, project API, SSH, and submitting deployments and backups.

# What was done to resolve the incident

Our team corrected the internal states of those subsystems in order to make them operational again.

resolved

This incident has been resolved.

monitoring

Services have been restored and we are monitoring to ensure stability.

identified

We have detected an issue affecting service on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service. The console and project services (deployments) are currently unavailable for some projects on the region. Production site uptime is NOT affected. We will update you as soon as we have further information.

Report: "Partial Outage on fr-3.platform.sh"

Last update
Identified

We have detected an issue affecting service on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service. The console and project services (deployments) are currently unavailable for some projects on the region. Production site uptime is NOT affected. We will update you as soon as we have further information.

Report: "Routine Maintenance in fr-3.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the fr-3.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in de-2.platform.sh"

Last update
Scheduled

We will be performing maintenance in the de-2.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Report: "Routine Maintenance in fr-4.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the fr-4.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in fr-1.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the fr-1.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in eu-5.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the eu-5.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Scheduled Maintenance – Accounts Service"

Last update
Completed

The scheduled maintenance has been completed.

Update

Scheduled maintenance is still in progress. We will provide updates as necessary.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing scheduled database maintenance on the Accounts service to improve system performance and reliability.

During this time, certain account-related functionalities may be temporarily unavailable, including project creation, ticket submission, user management, provisioning, billing, and updating billing addresses, payment methods, SSH keys, and profile pictures.

All customer environments will remain available, and the project Console will continue to function normally throughout the maintenance window.

If you have any questions or concerns, please don't hesitate to reach out via our Discord channel at https://discord.gg/PkMc2pVCDV.

Report: "Routine Maintenance in eu-4.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the eu-4.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in au-2.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the au-2.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in ca-1.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

Platform.sh has scheduled a maintenance window in the ca-1.platform.sh region.

The host servers that run the region will be rebooted and/or upgraded. Downtime during this maintenance is expected only for a small proportion of the region’s projects running on the affected host.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in eu.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the eu.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Partial Outage on EU-5"

Last update
postmortem

A host in the grid failed, and the automatic evacuation of its running containers also failed. The Operations team manually evacuated those containers to other hosts, and the faulty host was replaced with a new one.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and Engineers are working on a fix.

investigating

We are continuing to investigate this issue.

investigating

We noticed a few sites on alert due to a host service issue on EU-5. This is for Grid only. Our Engineers are working on it.

Report: "Routine Maintenance in au.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the au.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Partial Outage on EU-5"

Last update
Postmortem
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Identified

The issue has been identified and Engineers are working on a fix.

Update

We are continuing to investigate this issue.

Investigating

We noticed a few sites on alert due to a host service issue on EU-5. This is for Grid only. Our Engineers are working on it.

Report: "Routine Maintenance in us-3.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the us-3.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in ch-1.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the ch-1.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in us-4.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the us-4.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in us-2.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the us-2.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in uk-1.platform.sh"

Last update
Update

The scheduled maintenance has been completed.

Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing routine maintenance in the uk-1.platform.sh region to renew our TLS certificates as part of our standard rotation process.

All projects and environments will remain operational during this maintenance; however, you may experience brief interruptions or increased latency at certain points.

If you have any questions or concerns about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also reach out to us via our public chat channel at https://chat.platform.sh.

Thank you for your understanding as we work to ensure the security and reliability of our services.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in eu-3.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing routine maintenance in the eu-3.platform.sh region to renew our TLS certificates as part of our standard rotation process.

All projects and environments will remain operational during this maintenance; however, you may experience brief interruptions or increased latency at certain points.

If you have any questions or concerns about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also reach out to us via our public chat channel at https://chat.platform.sh.

Thank you for your understanding as we work to ensure the security and reliability of our services.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in eu-2.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the eu-2.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Routine Maintenance in us.platform.sh"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

We will be performing maintenance in the us.platform.sh region to release the latest features, bug fixes, and performance updates.

All projects and environments will continue to function during the upgrade; however, at one point each project will be reallocated, resulting in a short interruption. Integrations and webhooks may fail during the maintenance, and you may notice increased deployment times.

If you have any questions about this maintenance, please contact Platform.sh Support by logging in to https://accounts.platform.sh and visiting the Support Center. You can also access our public chat channel using https://chat.platform.sh.

Sincerely,
Platform.sh Support Team

Report: "Partial Outage on us-3"

Last update
postmortem

## **What happened**

Our monitoring systems detected slow heartbeat responses from several storage nodes. Initial analysis showed network latency or disk I/O contention on specific nodes. This incident did not affect Dedicated infrastructure.

## **Customer Impact**

Customers were impacted between 08:20 and 09:00 UTC on 2025-05-16. Some live sites in the affected region experienced an outage, and customers were unable to access the console or conduct any deployments.

## **What was done to resolve the incident**

The storage back-end healed itself.

resolved

This incident has been resolved.

monitoring

Our systems are now stable and we are continuously monitoring them.

identified

The issue has been identified and Engineering team is working on it.

investigating

We have detected an issue affecting service on the US-3 region. Our Operations team has been notified and is currently working to restore service. Projects on the affected region may experience downtime or slowness. We will update you as soon as we have further information.

investigating

We identified a partial outage on our US-3 Grid projects.

Report: "Partial Outage on ca-1.platform.sh"

Last update
resolved

A hardware issue with our upstream provider caused a single grid host to reboot unexpectedly. As a result, we had to manually fail-over the environments running on that host. We are investigating why the automatic fail-over did not take place and will ensure that the expected fail-over occurs in future. Customers may have experienced outages of up to 26 minutes during this incident.

Report: "Partial Outage on FR-3.platform.sh"

Last update
postmortem

# What Happened

Our upstream provider for region FR-3.platform.sh had an unexpected network equipment issue during scheduled maintenance.

# Customer Impact

Between 2025-04-17 01:36 UTC and 03:21 UTC, services including project console UI, API, deployments, SSH and backups, as well as customer environments (including live ones), were unavailable.

# What was done to resolve the incident

We received a recovery report from our upstream provider and have restored the affected services and environments. Services and environments in region FR-3.platform.sh should now be available and accessible.

Incident resolved.

resolved

This incident has been resolved.

monitoring

Our team has successfully restored the availability of those affected environments following the recovery report from the upstream provider. All FR-3 environments should now be accessible.

identified

Our upstream provider has confirmed that a network equipment issue is impacting their scheduled maintenance.

identified

Our upstream provider is reporting unspecified problems with their infrastructure. Our team will be keeping a close eye on the updates from our upstream provider and will take corrective actions as soon as possible.

investigating

We have detected an issue affecting service on the FR-3 region. Our Operations team has been notified and is currently working to restore service. Environments on affected regions may be unavailable. We will update you as soon as we have further information.

Report: "Partial outage of Console and Deployment services on FR-3 region"

Last update
postmortem

# What Happened

A networking issue was found in region FR-3.platform.sh, affecting the subsystems responsible for handling console UI, project API, SSH and deployment functions. Customers may have experienced issues while accessing the project console, interacting with the project API, logging into environments with SSH, and submitting deployments and backups.

# Customer Impact

From 2025-03-25 17:58 **UTC** to 2025-03-26 01:43 **UTC**, some customers were unable to access console, environments or submit any deployments.

There was no impact on environments or production sites.

# What was done to resolve the incident

Our team has restored the availability of the subsystems responsible for handling console UI, project API and deployment functions.

Incident resolved.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Our Operations team has identified a networking issue and deployed a fix to restore the availability of the related subsystems.

investigating

We have detected an issue affecting service on the FR-3.platform.sh region. Some customers may not be able to submit deployments or access the console UI and/or project API. Our Operations team has been notified and is actively recovering the availability of those services. This issue does not affect availability of any environments, only the API services for the project. There is no site downtime as a result of this issue.

Report: "Degraded Performance of Console and Deployment services on AU, AU-2 regions."

Last update
postmortem

# What happened

A recent upgrade to regions AU.platform.sh and AU-2.platform.sh was unable to fully purge the stale network states within our routing infrastructure. As a result, client programs within the subsystems responsible for handling deployments, console and project API ran busy-looping processes that retried invalid TCP connections indefinitely and consumed all the available CPU resources.

This caused the console to freeze, slow loading times when using project API functions, and delays in deployments or backups.

After further investigation, the Operations team deployed an emergency fix to eliminate those busy-looping processes. Our team has stopped the scheduled upgrades to other regions and is still working on a permanent fix.

# Customer impact

From 2025-03-20 12:50 **UTC** to 2025-03-21 09:00 **UTC**, customers may have experienced delays when loading console, triggering deployments or taking backups for their projects in regions AU.platform.sh and AU-2.platform.sh.

From 2025-03-21 07:27 **UTC** to 2025-03-21 07:55 **UTC**, there may have been issues with outgoing connections in AU.platform.sh environments, including live ones, due to our emergency fix deployment.

# What was done to resolve the incident

The Operations team removed the busy-looping processes in the subsystems for deployments, console, and project API functions and deployed an emergency fix to the affected regions.

The incident has been resolved.
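The postmortem above does not say how the affected clients were written or what the fix looked like. Purely as an illustration of the failure mode it describes (retrying an invalid TCP connection in a tight loop until CPU is exhausted), here is a minimal Python sketch of the usual alternative: a bounded number of attempts with capped exponential backoff and jitter. The function name `connect_with_backoff` and all of its parameters are hypothetical and are not taken from Platform.sh code.

```python
import random
import socket
import time


def connect_with_backoff(host, port, max_attempts=5, base_delay=0.1,
                         max_delay=5.0, timeout=1.0,
                         connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, retrying a bounded number of times.

    Hypothetical helper for illustration only. `connect` and `sleep`
    are injectable so the retry policy can be exercised without a network.
    """
    for attempt in range(max_attempts):
        try:
            return connect((host, port), timeout=timeout)
        except OSError:
            # Give up after the last attempt instead of looping forever.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff capped at max_delay, with full jitter,
            # so a failing endpoint is not hammered in a tight loop.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
```

With these defaults a caller makes at most five connection attempts and sleeps between them, so a permanently invalid endpoint produces an error for the caller to handle rather than a process that spins indefinitely.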

resolved

This incident has been resolved.

monitoring

After applying an emergency fix, we can see that the services are now operational for both AU and AU-2 regions. Our Operations team is still actively monitoring the services and implementing a permanent fix for this issue.

identified

AU is now fully recovered. Our engineers are still implementing a permanent fix for this issue.

identified

The outgoing connection issues have been corrected after deploying an emergency fix.

identified

We are continuing to work on a fix for this issue.

identified

We noticed that some customer environments are failing to make outgoing connections. Our engineers are still actively working on the recovery.

identified

AU-2 is now fully recovered.

identified

We have detected an issue affecting service on the AU and AU-2 regions. Our Operations team has been notified and is currently working to resolve the issue. Projects on affected regions may experience delays when loading console, triggering deployments or taking backups. This issue does not affect availability of any environments, only the API services for the project. There is no site downtime as a result of this issue.

Report: "Outage on EU-5"

Last update
postmortem

WHAT HAPPENED

An incident at our upstream provider (AWS) affecting networking resulted in outages in the EU-5 region.

CUSTOMER IMPACT

Sites were unavailable from 23:30 UTC Feb 13 to 01:00 UTC Feb 14.

WHAT WAS DONE TO RESOLVE THE INCIDENT

Fixes implemented by the upstream provider resolved the incident.

resolved

The upstream problem has been resolved, and we are not receiving any further alerts. As a result, we are going to mark the incident as resolved.

monitoring

Affected projects are back online as upstream provider has implemented fixes - we will continue to monitor the situation.

investigating

We have detected an issue at our upstream provider affecting service on the EU-5 region. This issue affects multiple production sites as well as development environments. Access to your site, project UI as well as Git and SSH access may be affected. We will update you as soon as we have further information.

Report: "Partial Outage on EU-5 region"

Last update
resolved

We have not seen any further storage alerts in the region and are marking the incident as resolved.

monitoring

The Operations team has implemented fixes and the storage infrastructure is no longer in a degraded state. We have not received new outage reports; however, we will continue to monitor the situation.

identified

Our team has added additional storage nodes and is actively monitoring the region.

investigating

We have detected degraded performance due to updates to storage infrastructure affecting service on the EU-5 region. Our Operations team has been notified and is currently investigating. Projects on the region may experience web request time-outs. We will update you as soon as we have further information.

Report: "HTTP Traffic reporting in console unavailable on Upsun"

Last update
resolved

Our engineers have completed the roll-out of the fix and HTTP Traffic reporting is now working in the Upsun console.

investigating

Our engineers are continuing to investigate this issue. We believe we have identified a mitigation path, and are working to roll it out at an appropriate time. We will continue to provide updates here as we have new information to offer.

investigating

We have detected an issue affecting HTTP Traffic reporting in the Upsun console. Our Engineers are investigating and working to resolve this issue. No site services are impacted by this issue; only reporting in the Upsun console is affected. We will update you as soon as we have further information.

Report: "Console issues when creating variables."

Last update
resolved

Our Team has deployed the fix for the issue and the incident is now resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently seeing an issue with our console when users are trying to create variables. Our Teams are actively investigating the issue. Please use the CLI to add variables for now. https://docs.platform.sh/development/variables/set-variables.html#create-project-variables

Report: "Reports of phishing emails from @internal.platform.sh"

Last update
resolved

This incident has been resolved.

investigating

We are aware of reports of spam/phishing emails being sent from @internal.platform.sh. Out of an abundance of caution we recommend not opening these emails or clicking on any links. We're investigating the issue with our email provider Sendgrid and are working to resolve this issue. For any questions please contact our support team by creating a ticket. Information on how to do that is in our documentation: https://docs.platform.sh/learn/overview/get-support.html

Report: "Partial Outage on us-2"

Last update
postmortem

## **What happened**

We detected a degradation of one or more physical storage drives in the us-2 region. As a result, one host went down and containers exhibited unresponsiveness and/or poor performance. After investigation, our engineers evacuated the containers to bring them back online.

## **Customer Impact**

Between 2024-11-20 13:10 and 13:55 UTC, containers of Platform.sh Grid customers had disk read/write issues.

## **What was done to resolve the incident**

Our team quickly evacuated the containers to another host, then re-opened / restarted the affected containers and restored availability.

resolved

This incident has been resolved now.

monitoring

A fix has been implemented and we are monitoring it.

identified

The issue has been identified and we are working on a fix.

investigating

We have detected an issue affecting services on the us-2 region. Our Operations team has been notified and is currently working to restore service. Projects may experience downtime. We will update you as soon as we have further information.

Report: "Partial Outage on FR-3"

Last update
postmortem

## **What happened**

We identified issues on hosts with git containers. This led to a Console and API outage in the FR-3 region. This incident did not affect website availability on Grid or Dedicated infrastructure. After investigation, our engineers identified the affected host, rebooted it, and made sure all services were up after the reboot.

## **Customer Impact**

Between 2024-11-22 11:00 and 14:00 UTC, customers were unable to access the console or run any deployments. There was no impact on environments or production sites.

## **What was done to resolve the incident**

Our team quickly found the offending host and rebooted it. They then re-opened / restarted the affected containers and restored availability. The Console, API and Auth subsystem outages on the platform are now resolved.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting service on the FR-3 region. Our Operations team has been notified and is currently working to restore service. Users of projects on the affected region may be unable to log in to the console, list environments via the command line, or push changes. We will update you as soon as we have further information.

Report: "Disk issue in eu-5"

Last update
postmortem

## What happened

A few of our customers' containers had disk issues in EU-5, which affected read/write operations for those containers. After investigation, our engineers found that one OSD (Ceph storage) had slow ops and hit its IOPS limit. That one OSD was changed to the io2 type to mitigate the issue.

## **Customer Impact**

Between 2024-11-25 09:45 and 10:05 UTC, containers of Platform.sh Grid customers in eu-5 had read/write issues.

## **What was done to resolve the incident**

Our team quickly fixed the disk, re-opened / restarted the affected containers, and restored availability. This incident is now resolved.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the situation.

identified

We have identified a slow disk in eu-5 and a fix has been implemented.

investigating

We are currently investigating a disk issue in eu-5.

Report: "Partial Outage on Dedicated Clusters"

Last update
postmortem

## **What happened**

Underlying services on a small number of projects were found to be in an unhealthy state and caused application errors.

## **Customer Impact**

Sites were not accessible for up to 2 hours and 30 minutes.

## **What was done to resolve the incident**

A configuration fix was applied to ensure site services can run without hiccups.

resolved

Services on some Dedicated clusters were found to be in an unhealthy state, causing site outages.

Report: "Outage on Orange Flexible Engine dedicated hardware"

Last update
resolved

This incident has been resolved.

monitoring

We're no longer seeing issues with our upstream provider and all affected sites have stabilized. We're continuing to investigate this with our provider, as well as continuing to monitor for any further issues affecting site availability.

investigating

We're seeing further connectivity issues on affected clusters. We're still working with our provider to investigate and resolve the issue.

investigating

Affected sites are coming back online. We are still investigating the issue with our provider.

investigating

Our monitoring has detected issues with our cloud infrastructure provider, which affect all sites hosted on Orange Flexible Engine dedicated hardware. Our operations team has been notified, and they are investigating the issue with our provider. Projects hosted on the underlying provider Orange Flexible Engine may experience connectivity issues to and from the cluster nodes. We will update you as soon as we have further information.

Report: "Degraded performance on Console"

Last update
postmortem

## **What happened**

A configuration error prevented some users from accessing the web console.

## **Customer Impact**

Users were not able to access their projects through the web console. Access through the CLI was not impacted.

## **What was done to resolve the incident**

A configuration fix was applied to ensure the console loads for all users.

resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

Some users may experience intermittent unavailability of the UI (Console) and CLI. Environment availability is NOT impacted. We will update you as soon as we have further information.

Report: "User Accounting outage"

Last update
postmortem

#### WHAT HAPPENED

Data was affected during planned disk maintenance of the Accounts system.

#### CUSTOMER IMPACT

No live sites were impacted. Customers would not have been able to access projects (through console, CLI, SSH) or perform deployments from 2024-10-17 01:49:10 UTC to 2024-10-17 02:50:19 UTC. Any User Accounting changes made between those times may need to be redone.

#### WHAT WAS DONE TO RESOLVE THE INCIDENT

Data was restored using a backup taken before the start of maintenance.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We have detected an issue affecting User Accounting, which may remain affected for the next hour. Our Operations team has been notified and is currently working to restore service. Live sites are not affected. We will update you as soon as we have further information.

Report: "Partial outage on de-2.platform.sh"

Last update
postmortem

## **What happened**

Our rate limiter started throttling connections between the projects hosted in the de-2 region, resulting in connections timing out. This incident only affected multi-app projects and projects using a microservices architecture.

## **Customer Impact**

Sites were intermittently available from 13:40 to 14:52 UTC. This incident did not affect Dedicated clusters.

## **What was done to resolve the incident**

The regional connection limit has been raised.

resolved

We've seen no further issues and the de-2 region is stable.

monitoring

Affected sites are recovering. We're continuing to monitor the situation.

identified

We've identified an issue in our region gateway configuration and a fix is being deployed.

investigating

We have detected an issue affecting service on the DE-2 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience downtime. We will update you as soon as we have further information.

Report: "Partial Outage on US-3 region"

Last update
postmortem

## **What happened**

Our upstream infrastructure provider rebooted one of our regional gateways.

## **Customer Impact**

Sites were intermittently available from 12:12 to 12:21 UTC. This incident did not affect Dedicated clusters.

## **What was done to resolve the incident**

The incident resolved itself once the gateway came back online.

resolved

This incident has been resolved.

identified

All of the region alerts have cleared. We are still investigating the root cause.

investigating

We have detected an issue affecting service on the US-3 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may be unreachable. We will update you as soon as we have further information.

Report: "Partial Outage on fr-3.platform.sh"

Last update
postmortem

## **What happened**

Our rate limiter started throttling connections between the region and our API, resulting in an inability to load the Console UI and use the CLI in some projects. This incident did not affect live environments on Grid or Dedicated infrastructure.

## **Customer Impact**

Between 2024-09-12 11:32 and 12:19 UTC, some customers were unable to load the Console UI and use the CLI. There was no impact on environments or production sites.

## **What was done to resolve the incident**

We have raised the connection limit between the region and API.

resolved

What happened: Our rate limiter started throttling connections between the region and our API, resulting in an inability to load the Console UI and use the CLI for some of the projects. This incident did not affect live environments on Grid or Dedicated infrastructure.

Customer Impact: Between 2024-09-12 07:25 and 10:25 UTC, some customers were unable to load the Console UI and use the CLI. There was no impact on environments or production sites.

What was done to resolve the incident: We have raised the connection limit between the region and API.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We have detected an issue affecting our console and CLI on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service. Users of some projects on the affected region may be unable to access the Console UI, connect via SSH, or use the CLI. We will update you as soon as we have further information.

Report: "Partial Outage on fr-3.platform.sh"

Last update
postmortem

## **What happened**

Our rate limiter started throttling connections between the region and our API, resulting in an inability to load the Console UI and use the CLI in some projects. This incident did not affect live environments on Grid or Dedicated infrastructure.

## **Customer Impact**

Between 2024-09-12 07:25 and 10:25 UTC, some customers were unable to load the Console UI and use the CLI. There was no impact on environments or production sites.

## **What was done to resolve the incident**

We have raised the connection limit between the region and API.

resolved

This incident has been resolved.

investigating

We have detected an issue affecting our console and CLI on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service. Users of projects on the affected region may be unable to access the Console UI, connect via SSH, or use the CLI. We will update you as soon as we have further information.

Report: "Partial Outage on CA-1"

Last update
postmortem

## **What happened**

A large increase in incoming connections to the CA-1 region caused sites in the region to become intermittently available. This incident did not affect Dedicated clusters.

## **Customer Impact**

Sites were intermittently available from 15:42 to 16:07 UTC.

## **What was done to resolve the incident**

We have taken steps to block malicious traffic.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting service on the CA-1 region. Our Operations team has been notified and is currently working to restore service. Projects on the affected region may experience timeouts and slow response times. We will update you as soon as we have further information.

Report: "Dedicated clusters hosted in Orange FE are alerting"

Last update
postmortem

**What happened**

Our upstream infrastructure provider encountered network issues that resulted in packet loss.

**Customer Impact**

Some projects running on dedicated infrastructure suffered degraded performance for around an hour and a half.

**What was done to resolve the incident**

The upstream provider applied a fix.

resolved

This incident has been resolved.

monitoring

Orange FE has implemented a fix and we are monitoring the results. The alerts on our monitoring have cleared.

identified

We have contacted Orange FE for further investigation.

investigating

We are investigating timeout alerts from dedicated clusters hosted in Orange FE.

Report: "Console & API Issue in regions EU-5 and FR-4"

Last update
postmortem

**WHAT HAPPENED**

We identified an issue in some build hosts which affected metadata and metrics in affected projects.

**CUSTOMER IMPACT**

No live sites were impacted unless a deployment encountered an error during this period. Some customers may have been unable to access, or may have experienced degraded performance with, the console, CLI, SSH, metrics and deployments from 2024-09-19 08:33 to 2024-09-23 23:01 UTC.

**WHAT WAS DONE TO RESOLVE THE INCIDENT**

The underlying issue was corrected and any projects that did not heal automatically were corrected using a recent backup.

resolved

This incident has been resolved.

monitoring

We have seen no new instances of issues related to this incident, but we will continue in the monitoring phase for a short time longer.

monitoring

We are continuing to monitor for any further issues.

monitoring

Platform.sh teams have resolved the issue regarding the affected projects. Console, CLI and API-related tasks should be working for all the projects in affected regions. We will be monitoring the results closely.

identified

The recovery work is still in progress for both regions.

identified

We are getting new issue reports for this incident. Platform.sh teams are now actively working on recovering those affected projects.

monitoring

Platform.sh teams have resolved the issue regarding the affected projects. Console, CLI and API-related tasks should be working for all the projects in affected regions. We will be monitoring the results closely.

identified

Our teams are still reviewing the issues, fixing affected projects, and working on a permanent fix.

identified

After further internal review, we have noticed region FR-4 is also affected by this incident. The recovery work is still in progress for both regions.

identified

We are continuing to work on a fix for this issue.

identified

Another issue has been identified and a fix is being implemented.

monitoring

Platform.sh teams have deployed the mitigation for this issue and are monitoring its effects.

identified

Platform.sh teams are continuing to deploy the mitigation for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating an issue affecting the Console & API on some projects in the eu-5 region. Our operations teams are aware of the issue and are taking measures to correct it. Affected projects will not be able to access their project's Console or perform API related tasks.

Report: "Outage on Orange Flexible Engine dedicated hardware."

Last update
postmortem

## **What happened**

Our upstream infrastructure provider encountered network issues that resulted in packet loss.

## **Customer Impact**

Some projects running on dedicated infrastructure suffered degraded performance for less than an hour.

## **What was done to resolve the incident**

The upstream provider applied a fix.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

A networking equipment issue has been identified with our upstream provider and a fix is being implemented. There may still be further connectivity issues until the issue has been corrected.

investigating

Our upstream provider is still investigating the networking issue.

investigating

Our monitoring has detected issues with our cloud infrastructure provider, which affect all sites hosted on Orange Flexible Engine dedicated hardware. Our operations team has been notified, and they are investigating the issue with our provider. Projects hosted on the underlying provider Orange Flexible Engine may experience connectivity issues to and from the cluster nodes. We will update you as soon as we have further information.

Report: "Outage on fr-4.platform.sh"

Last update
postmortem

## **What happened**

Our cloud partner suffered network connectivity issues.

## **Customer Impact**

Access to websites, Console and API was disrupted for some customers from 13:24 UTC to 15:10 UTC on 26 September 2024.

## **What was done to resolve the incident**

The cloud provider implemented mitigation to restore capacity. Communications from the cloud partner were actively monitored until confirmation of resolution.

resolved

The resolution applied by our cloud partner has been effective; all systems are fully functional.

identified

Our cloud partners have released an update that a networking issue has occurred and that mitigation actions have been deployed to restore service to their customers. Platform.sh teams are continuing to monitor this corrective action and will reach out to any remaining impacted customers from this event.

investigating

Platform.sh teams are continuing to investigate this issue with our cloud partners.

investigating

We have detected an issue affecting service on the fr-4.platform.sh region. This issue affects multiple production sites as well as development environments. Access to your site, project UI as well as Git and SSH access may be affected. This outage does not affect Dedicated Enterprise Clusters. We will update you as soon as we have further information.

Report: "Console and CLI issues on FR-4.platform.sh"

Last update
postmortem

**WHAT HAPPENED**

Hosts on this region entered a degraded state.

**CUSTOMER IMPACT**

Some customers experienced intermittent slowness and long response times while using console, CLI, SSH and submitting deployments from 2024-09-18 09:56 to 2024-09-19 14:21 UTC. However, existing live sites were not impacted.

**WHAT WAS DONE TO RESOLVE THE INCIDENT**

The degraded hosts have been restored, and additional capacity was added to the region. Services such as the console, CLI, SSH, and deployments shouldn't experience slowness anymore.

resolved

We have received no new reports related to this incident. Your projects and environments should now function as expected.

monitoring

A fix has been implemented at 2024-09-18 19:04 UTC and we are continuing to monitor the results.

identified

The slowness is re-occurring and we are working on a permanent fix.

monitoring

We have finished implementing the fix and we are monitoring the results.

identified

We are still working on implementing the fix.

identified

The issue has been identified and a fix is being implemented. Live sites aren't being affected by this incident.

investigating

We have detected an issue affecting service on the FR-4 region. Our Operations team has been notified and is currently working to restore service. Affected projects may experience limited access to web console and CLI services, as well as unexpectedly long deployment times and difficulty accessing services via SSH connection. Deployed production sites are not affected at this time. We would recommend suspending deployments to environments on the affected region. We will update you as soon as we have further information.

Report: "Console and CLI issues on FR-4.platform.sh"

Last update
postmortem

**WHAT HAPPENED**

Hosts on this region entered a degraded state.

**CUSTOMER IMPACT**

Some customers experienced intermittent slowness and long response times while using console, CLI, SSH and submitting deployments from 2024-09-18 09:56 to 2024-09-19 14:21 UTC. However, existing live sites were not impacted.

**WHAT WAS DONE TO RESOLVE THE INCIDENT**

The degraded hosts have been restored, and additional capacity was added to the region. Services such as the console, CLI, SSH, and deployments shouldn't experience slowness anymore.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We continue to experience high CPU usage on build hosts. We are doubling build capacity and rebalancing projects to recover from API/CLI slowness.

investigating

We have detected an issue affecting service on the FR-4 region. Our Operations team has been notified and is currently working to restore service. Affected projects may experience limited access to web console and CLI services, as well as unexpectedly long deployment times and difficulty accessing services via SSH connection. Deployed production sites are not affected at this time. We would recommend suspending deployments to environments on the affected region. We will update you as soon as we have further information.

Report: "Partial Outage on UK-1"

Last update
postmortem

**What happened**

A cloud provider incident ([https://status.cloud.google.com/incidents/ETJGhvY9Xaktw7tgi8dF](https://status.cloud.google.com/incidents/ETJGhvY9Xaktw7tgi8dF)) led to connectivity loss in the London GCP region, making projects on the uk-1 region temporarily inaccessible.

**Customer Impact**

Between 13:23 and 13:33 UTC on 2024-08-12, some environments were inaccessible due to the GCP incident in the UK-1 region.

resolved

We've seen no further issues; however, we continue to investigate for the post-mortem report.

monitoring

There have been no further alerts and we are monitoring the region. We are still investigating for the post-mortem report.

identified

We are continuing to work on a fix for this issue.

identified

All alerts have cleared and we are continuing to investigate.

investigating

We have detected an issue affecting service on the UK-1 region. Our Operations team has been notified and is currently working to restore service. Projects on the affected region may experience downtime. We will update you as soon as we have further information.

Report: "Partial Outage on US-3 region"

Last update
postmortem

### What happened

A host on the region entered a degraded state. This incident affected live environments on a single host on our Grid infrastructure.

### Customer Impact

Between 2024-07-31 06:30 and 06:58 UTC, environments on the degraded host were impacted while their projects moved to another host, which resulted in a site outage for those environments.

### What was done to resolve the incident

The degraded host was recovered, after which activities on impacted environments completed successfully and service to the impacted sites was restored.

resolved

This incident has been resolved.

monitoring

We have isolated the single host causing this issue and projects are now online and responding.

investigating

We have detected an issue affecting service on the US-3 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience web request timeouts. We will update you as soon as we have further information.

Report: "Partial Outage on CA-1 region"

Last update
postmortem

### What happened

A host on the region entered a degraded state.

### Customer Impact

Environments on the degraded host were impacted by stuck activities (such as a backup activity), which resulted in a site outage for the environments in question.

### What was done to resolve the incident

The degraded host was recovered, after which the stuck activities on impacted environments completed successfully and service to the impacted sites was restored.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are investigating an issue that has resulted in outages for some sites hosted on the CA-1 region. Our Operations team has been notified and is currently working to restore service. We will update you as soon as we have further information.

Report: "Deployments failing on Dedicated Enterprise Gen 2 clusters"

Last update
postmortem

## **What happened**

A recent code change in the deployment daemon running on Dedicated Enterprise Gen 2 clusters resulted in an inability to finish deployments.

## **Customer Impact**

The issue was first observed at 08:54 UTC and a fix was rolled out at 12:27 UTC.

## **What was done to resolve the incident**

The deployment daemon has been patched.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting deployments on Dedicated Enterprise Gen 2 clusters. Our Operations team has been notified and is currently working to restore service. Please refrain from deployments until further notice. We will update you as soon as we have further information.

Report: "API server outage"

Last update
postmortem

## What happened

After recent maintenance work, the API server was not able to communicate with internal database servers.

## Customer Impact

For a period of 40 minutes, customers were unable to use the CLI or console, connect to their projects with SSH, or perform other activities such as submitting deployments. The availability of live sites was not affected.

## What was done to resolve the incident

Our team quickly discovered the DB connection issues and made those DB servers available to the API server.

resolved

The CLI, console and API services should now function as expected. This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We have detected an issue affecting CLI, console and API usage. Our engineers are actively investigating this issue.

Report: "Partial Outage on UK-1"

Last update
postmortem

### What happened

We observed an extreme spike in traffic to our UK-1 region, which affected the availability of some projects.

### Customer Impact

Some projects may have been unavailable for around 30 minutes until mitigation actions were in place.

### What was done to resolve the incident

Our team investigated the issue and implemented mitigations to restore normal service.

resolved

We recently observed an extreme spike in traffic to our UK-1 region.

Report: "Partial Outage on FR-4"

Last update
postmortem

## **What happened**

A large increase in incoming connections to the FR-4 region caused sites in the region to become intermittently available. This incident did not affect Dedicated clusters.

## **Customer Impact**

Sites were intermittently available from 12:00 to 14:00 UTC.

## **What was done to resolve the incident**

We have added extra gateway hosts to distribute the load and taken steps to block traffic.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting service on the FR-4 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience intermittent connectivity issues or timeouts. We will update you as soon as we have further information.

Report: "Partial Outage on EU-5"

Last update
postmortem

## **What happened**

A large increase in incoming connections to the EU-5 region caused sites in the region to become intermittently available. This incident did not affect Dedicated clusters.

## **Customer Impact**

Sites were intermittently available from 14:30 to 15:00 UTC. Some customers may have also had issues connecting to the project management console during that time.

## **What was done to resolve the incident**

We have added extra gateway hosts to distribute the load and taken steps to block traffic.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting service on the EU-5 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience intermittent connectivity issues or timeouts. This also impacts the project management console for some users. Other regions will not have any impacts to site availability. We will update you as soon as we have further information.

Report: "Accounts / Auth systems outage"

Last update
postmortem

### WHAT HAPPENED

Based on reports from our monitoring system, the internal accounts and auth subsystems were not able to give timely responses to certain API calls. Our team identified a lock contention issue within the project information database system. This incident did NOT affect live environments on Grid or Dedicated infrastructure. However, SSH, CLI and console features including deployments may have been temporarily unavailable.

### CUSTOMER IMPACT

Between 2024-05-25 02:28 and 03:56 UTC, some customers may have experienced issues while using SSH, the console, or the CLI, or while submitting deployments. This incident did NOT affect live environments on Grid or Dedicated infrastructure.

### WHAT WAS DONE TO RESOLVE THE INCIDENT

Our team restarted the deadlocked processes to make the accounts and auth subsystems available again. The Accounts and Auth subsystem outages are now resolved. Further investigation of this lock contention issue will be conducted and a fix will be implemented to optimize our subsystems further.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue. This issue does not impact existing live sites / environments. However, several functions including SSH, console, CLI may not be functioning as expected.

Report: "Partial Outage on CA-1 region"

Last update
postmortem

**What Happened**

We detected a substantial drop in connections reaching the CA-1 region from 14:29 UTC to 15:39 UTC on May 11th, 2024.

**Customer Impact**

Customers experienced intermittent availability during this time as connections were dropped and requests failed to reach the CA-1 origin.

**Incident Resolution**

We suspect a transient network failure with our upstream provider, or another network layer upstream, was responsible for this incident.

resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We have detected an issue affecting service on the CA-1 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience unresponsive sites. We will update you as soon as we have further information.

Report: "Partial Outage on EU 4"

Last update
postmortem

### What Happened:

We detected a partial outage on the EU-4 region, and our investigation identified a host in the region that was not operating normally.

### Customer Impact:

A small number of customers may have experienced an outage on any environment residing on the abnormal host. This may have included Production environments.

### Incident Resolution:

Our operations and engineering teams isolated the host and manually fixed any environment that did not recover automatically.

resolved

All affected projects and clusters have been successfully recovered.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We have detected an issue affecting service on the EU 4 region. Our Operations team has been notified and is currently working to restore service. We will update you as soon as we have further information.

Report: "Partial Outage on FR-1 region"

Last update
postmortem

**What Happened:**

On May 11th, at about 14:50 UTC, our upstream provider experienced an internal incident that affected the availability of their Virtual Machines (VMs) and Object Storage Devices (OSDs).

**Customer Impact:**

This incident affected all container types (production, staging, development) in the FR-1 region, impacting multiple projects. Affected projects experienced container outages, including production containers in some cases.

**Incident Resolution:**

During the incident, our engineers actively communicated with the upstream provider to ensure a swift resolution. On May 12th, at about 20:30 UTC, following the upstream provider's full recovery, our engineers ensured all production environments were fully operational. On May 13th, at about 07:00 UTC, our engineers conducted a comprehensive review of all affected components to confirm their full functionality and address any outstanding technical issues.

resolved

This incident has been resolved.

monitoring

All services are currently operational. Our engineering team is conducting a comprehensive review of all affected components to confirm their full functionality and address any outstanding technical issues.

monitoring

All affected projects' data and environments have been recovered successfully. Some metrics services may still be unavailable.

monitoring

Our upstream provider has recovered impacted hosts. We are monitoring the results.

identified

Most impacted hosts and projects have been fully recovered. The upstream provider is still working on full recovery.

identified

Our hosts have been recovered and all projects are operational. The upstream provider is still working on full recovery of their data centre.

identified

The upstream provider has recovered our core hosts and the projects are back online.

identified

We are continuing to work with our upstream provider on recovery.

identified

We are continuing to work with our upstream provider on recovery.

identified

Our upstream provider has encountered an unexpected temperature increase which required the emergency shutdown of the servers impacted by this incident. They are actively working on recovery.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and we are working on recovering the affected hosts.

investigating

We have detected an issue affecting service on the FR-1 region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience slowness and unresponsive sites. We will update you as soon as we have further information.

Report: "Downtime for some sites on the CA-1 region"

Last update
postmortem

**What Happened**:

The incident was triggered when the incoming request gateway on cluster CA-1 depleted its resources, caused by an unforeseen surge in usage. This led to the gateway being unable to process incoming traffic effectively.

**Customer Impact**:

Approximately 50 projects were affected during a 30-minute period, experiencing intermittent downtime ranging from 5 to 10 minutes. This disruption impacted the operational functionality of these projects, leading to degraded service availability.

**Incident Resolution**:

To address this issue, we have permanently increased the resource capacity of the incoming request gateway to better handle similar spikes in traffic in the future. Additionally, we have identified the source of the unexpected usage and are implementing measures to prevent a recurrence of this problem.

resolved

All services are fully operational. No further downtime was registered in the past 2 hours.

monitoring

During our initial investigation, we identified that the downtime could have been related to resource constraints on our incoming gateway. We have permanently increased the gateway capacity to prevent similar issues in the future.

investigating

We have detected an issue that intermittently impacted approximately 50 production environments between 07:15 and 07:45 UTC. The affected environments experienced downtime ranging from 5 to 10 minutes. We are actively investigating the issue.

Report: "API and CLI timeouts"

Last update
postmortem

**WHAT HAPPENED**

Users were unable to interact with projects through the web console and CLI due to issues with the authentication backend. This incident did not affect live environments on Grid or Dedicated infrastructure. After investigation, our engineers identified errors in the logs, which led to the following remedial actions:

* Backend was restarted
* Underlying hosts were replaced
* Additional capacity was added

**CUSTOMER IMPACT**

Between 2024-05-01 01:48 UTC and 03:21 UTC, customers were unable to access the console or use the CLI. There was no impact on environments or production sites.

**WHAT WAS DONE TO RESOLVE THE INCIDENT**

The authentication backend was restarted and additional capacity was added.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

The issue has recurred and our team is investigating.

monitoring

We identified an issue with our Auth system. Our team has performed some work to restore service availability and we're monitoring the situation.

investigating

We are investigating a report of outages on our API and CLI

Report: "Partial Outage on eu-2.platform.sh"

Last update
postmortem

**WHAT HAPPENED**

Environments were left in unexpected states during the planned maintenance event for region [eu-2.platform.sh](http://eu-2.platform.sh): [https://status.platform.sh/incidents/97ls23652mws](https://status.platform.sh/incidents/97ls23652mws)

Affected environments, including live sites, were not available during the incident. This also includes functionality such as SSH access and deployments.

Dedicated clusters were **NOT** affected by this incident.

**CUSTOMER IMPACT**

Some customers' environments were unavailable for up to 9 hours and 12 minutes in total (worst-case estimate for all live and non-live environments).

Start: 2024-04-24 21:50 **UTC**

End: 2024-04-25 07:02 **UTC**

**WHAT WAS DONE TO RESOLVE THE INCIDENT**

Our engineers corrected the unexpected states, and the environments should now function as expected. Our engineering team will also investigate a potential internal bug responsible for the unexpected states that caused this incident. We will improve the internal subsystems to minimize the negative impact of planned maintenance events in the future.

resolved

Our engineers have deployed a fix to correct the issues. All environments in this region should now function as expected.

identified

All live / default environments should now function as expected.

identified

We have detected an issue affecting service on the eu-2.platform.sh region. Our Operations team has been notified and is currently working to restore service. Projects on affected regions may experience outages. We will update you as soon as we have further information.

Report: "Gateways unresponsive on uk-1.platform.sh"

Last update
postmortem

**What Happened**

On March 29, 2024, at 07:23 UTC, a large increase in incoming connections to the [uk-1.platform.sh](http://uk-1.platform.sh) region caused the gateways to become unresponsive. Availability of sites/environments in the region was impacted. This incident did not affect Dedicated clusters.

**Customer Impact**

Grid projects on the [uk-1.platform.sh](http://uk-1.platform.sh) region were affected, resulting in temporary outages and reduced performance for customer sites. Site availability and performance were briefly degraded.

**What Was Done to Resolve the Incident**

To address the issue, our engineers first blocked the majority of the malicious (DDoS) traffic and then made various configuration and software changes in our internal gateway systems to handle these situations better. They continued monitoring the changes to ensure system stability and performance.

resolved

This incident has been resolved.

monitoring

A large increase in incoming connections to the uk-1.platform.sh region caused the gateways to become unresponsive. Traffic has been blocked and we're seeing gateway performance returning to normal. We're continuing to monitor the situation and will take additional action if needed.

Report: "Partial Outage on fr-3.platform.sh"

Last update
postmortem

## What happened

Our upstream provider's planned network maintenance caused an unexpected interruption to our services in region `fr-3.platform.sh`.

## Customer Impact

All environments in region `fr-3.platform.sh` were temporarily unreachable. Some of those environments may have experienced outages of more than 1.5 hours.

## What was done to resolve the incident

Based on internal monitoring alerts, our team migrated all affected environments to healthy hosts. Our upstream provider also resolved the network issues, allowing our infrastructure to function as expected.

resolved

This incident has been resolved.

monitoring

Our upstream provider is still working on their planned network maintenance. Given the unexpected interruption to our services, we will continue to monitor this region for an extended period. All environments should now function as expected. Please submit a support ticket if you need further assistance from our team.

monitoring

Our team has successfully migrated the affected environments to working hosts and will monitor the situation with our upstream provider.

identified

Our team is still actively working on migrating the environments away from the affected hosts.

identified

There is a technical issue stemming from our upstream provider. They are actively investigating and recovering the availability of the service.

identified

We have detected an issue affecting service on the fr-3.platform.sh region. Our Operations team has been notified and is currently working to restore service. Projects in the affected region may experience outages. We will update you as soon as we have further information.

Report: "Gateways unresponsive on uk-1.platform.sh"

Last update
postmortem

## **What happened**

A surge in incoming connections to the uk-1.platform.sh region rendered the gateways unresponsive, affecting the availability of sites and environments within the region. This incident did not affect Dedicated clusters.

## **Customer Impact**

During the incident, which commenced at 07:23 UTC on 2024-03-29, certain sites in the uk-1.platform.sh region experienced unavailability. Sites were restored within 10 minutes, with normal operations resuming by 07:33 UTC.

## **What was done to resolve the incident**

To mitigate the impact, we swiftly implemented measures to block the majority of the malicious (DDoS) traffic targeting the region.

## **Short term mitigations**

Our engineering team is actively analyzing the attack patterns to fortify the resilience of our incoming request gateways, including reviewing rate limit configurations and adding further gateway capacity.

resolved

This incident has been resolved.

monitoring

A large increase in incoming connections to the uk-1.platform.sh region caused the gateways to become unresponsive. Traffic has been blocked and we're seeing gateway performance returning to normal. We're continuing to monitor the situation and will take additional action if needed.

Report: "Gateways unresponsive on us-4.platform.sh"

Last update
postmortem

## What happened

A large increase in incoming connections to the `us-4.platform.sh` region caused the gateways to become unresponsive. Availability of sites/environments in the region was negatively impacted. This incident did not affect Dedicated clusters.

## Customer impact

Some sites in the `us-4.platform.sh` region were unavailable during the incident (starting at 04:23 **UTC** on 2024-03-27). Most affected sites recovered within 18 minutes (by 04:41 **UTC**).

## What was done to resolve the incident

We have blocked some malicious (DDoS) traffic to the region.

resolved

Alerts for the region have cleared following filtering of the excess traffic, and the region is again stable.

monitoring

Traffic has been blocked and we're seeing gateway performance returning to normal. We're continuing to monitor the situation and will take additional action if needed.

identified

A large increase in incoming connections to the us-4.platform.sh region caused the gateways to become unresponsive again. Our Operations team has been notified and is currently investigating the issue.