Sauce Labs

Is Sauce Labs Down Right Now? Check whether there is an ongoing outage.

Sauce Labs is currently Operational

Last checked from Sauce Labs's official status page

Historical record of incidents for Sauce Labs

Report: "2025-June-13 Resolved Service Incident"

Last update
resolved

Between 07:30 and 09:00 UTC, there was an issue with the Sauce Labs Dashboard user interface, which caused problems accessing apps and test results in all datacenters. After remedial action, this issue was resolved.

Report: "2025-May-6 Resolved Service Incident"

Last update
postmortem

**Dates:** Tuesday May 6th 2025, 03:38 - 03:58 UTC
**What happened:** RDC devices were unavailable in the EU-Central-1 datacenter.
**Why it happened:** There was a DNS caching failure during a third-party provider's maintenance.
**How we fixed it:** Availability was restored automatically after the instances completed their start-up.
**What we are doing to prevent it from happening again:** We will update our DNS caching for better fault tolerance.

resolved

Between 03:38 - 03:58 UTC, Real Devices in our EU data center were unavailable. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2024-October-1 Service Incident"

Last update
postmortem

**Dates:** Monday September 30 2024, 15:53 - 16:10 UTC
**What happened:** The Sauce Labs dashboard for the EU-Central-1 datacenter was not accessible.
**Why it happened:** The gateway in front of [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) was misconfigured.
**How we fixed it:** A rollback was executed to the previous known working version.
**What we are doing to prevent it from happening again:** Improved synthetic monitoring for [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/).
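For illustration, a synthetic check of the kind mentioned above can be as simple as a periodic HTTP probe of the public endpoint. The sketch below is generic, not Sauce Labs' actual monitoring; the alert path and cadence are assumptions.

```python
# Minimal synthetic probe for a dashboard endpoint (illustrative sketch).
import requests

URL = "https://app.eu-central-1.saucelabs.com/"

def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the endpoint answers with a non-error status in time."""
    try:
        resp = requests.get(url, timeout=timeout, allow_redirects=True)
        return resp.status_code < 400
    except requests.RequestException:
        return False

if __name__ == "__main__":
    if not probe(URL):
        # In a real monitor this would page the on-call engineer.
        print(f"ALERT: {URL} is unreachable or returning errors")
    else:
        print(f"OK: {URL} is healthy")
```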

resolved

This incident has been resolved.

monitoring

Access to support portal has been restored. We continue to monitor.

identified

We are continuing to monitor the status and will provide an update once the third-party issue is resolved.

identified

We are continuing to monitor the status and will provide an update once the third-party issue is resolved.

identified

A third-party provider is currently experiencing an outage which is impacting support.saucelabs.com. If this is an emergency and you wish to reach the Sauce Labs Support team, please email us at support.outage@saucelabs.com.

Report: "2025-February-21 Service Incident"

Last update
postmortem

**Dates:** Saturday February 22nd 2025, 00:00 - 04:46 UTC
**What happened:** Customers were unable to create Sauce Connect tunnels in our US-West-1 and EU-Central-1 regions.
**Why it happened:** The SSL certificate for the Sauce Connect frontend had expired.
**How we fixed it:** The certificate was renewed and deployed to the affected regions.
**What we are doing to prevent it from happening again:** We are improving our alerting around SSL certificate expiration.
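Alerting on certificate expiry is commonly implemented as a small probe that reports the days remaining on a live endpoint's certificate. A minimal sketch follows; the hostname and 30-day threshold are placeholder assumptions, not Sauce Labs' configuration.

```python
# Report how many days remain before a TLS certificate expires (illustrative sketch).
import socket
import ssl
import time

def days_until_expiry(host: str, port: int = 443) -> int:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    expires_at = ssl.cert_time_to_seconds(cert["notAfter"])
    return int((expires_at - time.time()) // 86400)

if __name__ == "__main__":
    # Placeholder host and threshold, not an actual alerting rule.
    host = "saucelabs.com"
    remaining = days_until_expiry(host)
    if remaining < 30:
        print(f"ALERT: certificate for {host} expires in {remaining} days")
    else:
        print(f"OK: certificate for {host} is valid for {remaining} more days")
```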

resolved

All services are now operating as normal. This incident is resolved.

identified

We have identified the issue and have taken remedial action. We are monitoring.

investigating

We are continuing to investigate this issue.

investigating

New Sauce Connect tunnels cannot be created in our US-West-1 & EU-Central-1 datacenters. We are investigating.

Report: "2025-April-2 Service Incident"

Last update
postmortem

**Dates:** Tuesday April 1st 2025, 14:54 UTC - Wednesday April 2nd 2025, 13:45 UTC
**What happened:** iOS tests using Sauce Connect failed to start.
**Why it happened:** There was a mismatch in the SSL certificate's expiration dates.
**How we fixed it:** Re-ordered new SSL certificates.
**What we are doing to prevent it from happening again:** We are adding monitoring for the SSL certificates.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

monitoring

We have identified the root cause and have deployed a fix for this issue. We are monitoring.

investigating

We are continuing to see iOS simulator test failures when using Sauce Connect in the US-West-1 & EU-Central-1 Data Centers. We are actively investigating.

investigating

We are seeing failing iOS simulator tests when using Sauce Connect in US-West-1 & EU-Central-1 Data Centers. We are investigating.

Report: "2025-March-27 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 26th 2025, 19:15 - Thursday March 27th 2025, 14:30 UTC
**What happened:** An internal cache for mobile applications in one datacenter was unable to reach its origin, causing artifacts to age out and no longer be available to be served.
**Why it happened:** A change to the cache's routing prevented it from reaching its origin.
**How we fixed it:** The routing change was reverted.
**What we are doing to prevent it from happening again:** Additional monitoring and alerting has been implemented to identify this issue more quickly in the future.
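For context, one common way to keep a cache serving artifacts while its origin is unreachable is a "stale-if-error" policy: expired entries are returned rather than discarded when a refresh fails. The toy sketch below illustrates that behaviour only; it is not the actual cache implementation, and the TTL is an assumption.

```python
# Toy "stale-if-error" cache: prefer fresh data, but fall back to an expired
# entry rather than failing when the origin cannot be reached (illustrative).
import time

class StaleIfErrorCache:
    def __init__(self, fetch_from_origin, ttl_seconds=3600):
        self._fetch = fetch_from_origin      # callable(key) -> bytes, may raise
        self._ttl = ttl_seconds
        self._entries = {}                   # key -> (value, stored_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry and time.time() - entry[1] < self._ttl:
            return entry[0]                  # fresh hit
        try:
            value = self._fetch(key)         # refresh from origin
            self._entries[key] = (value, time.time())
            return value
        except Exception:
            if entry:                        # origin unreachable: serve the stale copy
                return entry[0]
            raise                            # nothing cached at all
```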

resolved

App storage errors have been resolved. All services are fully operational.

investigating

We are seeing failures due to an issue with our storage service affecting the Virtual and Real Device Clouds, as well as saucectl and Mobile App Distribution, in our US-West Data Center. We are investigating.

Report: "2025-May-14 Service Incident 1"

Last update
resolved

We have identified the root cause and deployed a fix for this issue. This incident is resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating an issue where video recordings for Android tests are intermittently missing in our US West Datacenter. The issue began on May 13th at approximately 13:00 CEST.

Report: "2025-May-14 Service Incident 1"

Last update
Investigating

We are currently investigating an issue where video recordings for Android tests are intermittently missing in our US West Datacenter. The issue began on May 13th at approximately 13:00 CEST.

Report: "2025-May-14 Service Incident"

Last update
resolved

After taking remedial action, app uploads and testing are now both working correctly. This incident is resolved.

investigating

We are currently seeing errors in uploading and installing apps in our US-East-4 Datacenter. We are investigating.

Report: "2025-May-14 Service Incident"

Last update
Investigating

We are currently seeing errors in uploading and installing apps in our US-East-4 Datacenter. We are investigating.

Report: "2025-May-6 Resolved Service Incident"

Last update
Resolved

Between 03:38 - 03:58 UTC, Real Devices in our EU data center were unavailable. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-23 Resolved Service Incident"

Last update
resolved

Between 16:55 - 17:25 UTC, we experienced Android Real Device Test failures in our EU data center. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-23 Resolved Service Incident"

Last update
Resolved

Between 16:55 - 17:25 UTC, we experienced Android Real Device Test failures in our EU data center. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-2 Service Incident"

Last update
Investigating

We are seeing failing iOS simulator tests when using Sauce Connect in US-West-1 & EU-Central-1 Data Centers. We are investigating.

Report: "2025-March-6 Service Incident"

Last update
postmortem

**Dates:** Thursday March 6th 2025, 12:00 - 12:22 UTC
**What happened:** Customers were unable to start new Android Real Device tests in all datacenters.
**Why it happened:** During a deployment, a race condition occurred during the reallocation of devices from "old" device pools to "new" device pools, which caused devices to become unavailable in both pools.
**How we fixed it:** The "old" pools were shut down, releasing the device lock and allowing the "new" pools to acquire it.
**What we are doing to prevent it from happening again:** We are improving our device allocation processes.
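In simplified terms, the race described above is two pools contending for the same device lock during a rollover, and the fix amounts to making the old pool release the lock before the new pool tries to acquire it. The sketch below uses made-up names and a plain mutex to illustrate the failure mode; it is not the actual allocator.

```python
# Simplified illustration of the device-pool handoff: the new pool can only
# acquire a device after the old pool has released its lock (made-up names).
import threading

device_lock = threading.Lock()

class DevicePool:
    def __init__(self, name):
        self.name = name
        self.holds_device = False

    def acquire_device(self, timeout=5.0):
        # If another pool still holds the lock, this times out and the device
        # looks "unavailable" to both pools -- the race seen in the incident.
        if device_lock.acquire(timeout=timeout):
            self.holds_device = True
            return True
        return False

    def shutdown(self):
        # Shutting the old pool down releases the lock so the new pool can take over.
        if self.holds_device:
            device_lock.release()
            self.holds_device = False

old_pool, new_pool = DevicePool("old"), DevicePool("new")
old_pool.acquire_device()
old_pool.shutdown()                # without this step, the next call would time out
assert new_pool.acquire_device()
```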

resolved

After taking remedial action, Android Real Devices are now available in all our datacenters. This incident is resolved.

investigating

We are currently experiencing significantly reduced availability of Android Real Devices in all our datacenters. We are investigating.

Report: "2025-Feb-17 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday February 17th 2025, 12:52 - 13:53 UTC
**What happened:** Real Device tests in the EU-Central-1 datacenter were experiencing issues reaching internet destinations.
**Why it happened:** A new internet provider was introduced that was experiencing issues with their network.
**How we fixed it:** A rollback was executed to move traffic off of the affected internet provider.
**What we are doing to prevent it from happening again:** We engaged the provider and they corrected configuration issues with their network.

resolved

Between 12:52 and 13:53 UTC, some Real Device tests in EU-Central-1 Data center experienced failures reaching internet destinations because of an issue on an upstream ISP network. We have identified the issue and taken remedial action.

Report: "2025-January-21 Service Incident"

Last update
postmortem

**Dates:** Tuesday January 21st 2025, 12:40 - 13:44 UTC
**What happened:** Tests using iOS Real Devices experienced failures to download and install apps in the eu-central-1 and us-west-1 datacenters.
**Why it happened:** A misconfiguration with the resigner service caused errors when communicating with the app storage.
**How we fixed it:** A rollback was executed, restoring configuration to the previous working state.
**What we are doing to prevent it from happening again:**
* Improve canary deployments for each Real Device Cloud region.
* Improve end-to-end testing for canary deployments.
* Improve SLOs for Real Device application installs.

resolved

After taking remedial action, apps are now installing correctly in all datacenters. This incident is resolved.

investigating

We are currently seeing App installation errors when trying to run iOS tests on our US-West-1 & EU-Central-1 Datacenter. We are investigating

Report: "2025-January-3 Service Incident"

Last update
postmortem

**Dates:** Friday 3 January 2025, 09:52 - 11:07 UTC
**What happened:** Customers were unable to access test results in the Web UI for our US-West-1 datacenter.
**Why it happened:** A defect was introduced during a product deployment.
**How we fixed it:** A rollback was executed to the previous working version.
**What we are doing to prevent it from happening again:** We are creating additional checks for the authentication method upgrades.

resolved

After taking remedial action, Test Results are now available again in all data centers. This incident is resolved.

investigating

We are currently seeing an issue with Live and automated test results not being displayed on the test results page in our US-West-1 data center. We are investigating.

Report: "2025-January-8 Service Incident"

Last update
postmortem

**Dates:** Wednesday 8 January 2025, 11:00 - 12:25 UTC
**What happened:** We were experiencing intermittent issues with Virtual device tests missing test assets in the US-West-1 region.
**Why it happened:** The asset uploader service was experiencing HTTP errors uploading assets to our backend storage.
**How we fixed it:** The third-party provider resolved an issue with their infrastructure.
**What we are doing to prevent it from happening again:** We are improving the caching and retry logic of the asset uploader service to prevent recurrence.
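Retry logic of the kind described is typically a bounded retry with exponential backoff around the upload call. The generic sketch below shows that pattern; the endpoint, retry budget, and back-off schedule are assumptions, not the asset uploader's actual behaviour.

```python
# Upload with bounded retries and exponential backoff on transient HTTP errors
# (illustrative sketch; endpoint and retry budget are assumptions).
import time
import requests

def upload_asset(url: str, payload: bytes, max_attempts: int = 4) -> requests.Response:
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.put(url, data=payload, timeout=30)
            if resp.status_code < 500:
                return resp                      # success or non-retryable client error
        except requests.RequestException:
            pass                                 # network error: treat as retryable
        if attempt < max_attempts:
            time.sleep(delay)                    # back off before the next attempt
            delay *= 2
    raise RuntimeError(f"upload failed after {max_attempts} attempts")
```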

resolved

After taking remedial action, test assets are being retained successfully in all tests. This incident is resolved.

investigating

We are currently seeing intermittent issues with test assets not being retained for Virtual device tests in our US-West-1 data center. We are investigating.

investigating

We are continuing to investigate this issue.

investigating

We are currently seeing that Live and Automated test results are intermittently not being retained for Virtual Device tests in our US-West-1 data center. We are investigating.

investigating

We are currently seeing that Live and automated test results are intermittently not being retained in our US-West-1 data center. We are investigating.

Report: "2024-October-11 Resolved Service Incident 2"

Last update
postmortem

**Dates:** Friday October 11th 2024, 09:19 - 19:19 UTC
**What happened:** Real Device tests using Sauce Connect 5 tunnels experienced failures in all datacenters.
**Why it happened:** An improperly tested change to the Sauce Connect service was rolled out.
**How we fixed it:** The Sauce Connect service was rolled back to a previously working version.
**What we are doing to prevent it from happening again:** Fixes to post-deploy checks are planned to prevent this issue from recurring.

resolved

Between 9:33 and 19:21 UTC, Real Device tests running via Sauce Connect tunnel (version 5) experienced failures in the US-West-1, EU-Central-1, and US-East-4 data centers. We have identified the issue and taken remedial action.

Report: "2024-November-19 Service Incident"

Last update
postmortem

**Dates:** Tuesday November 19th 2024, 21:30 - 23:30 UTC
**What happened:** Android real devices were unable to run app tests in all datacenters.
**Why it happened:** An unexpected policy change by Google caused MDM-managed devices to be locked down due to policy violations around accessibility services.
**How we fixed it:** We rolled back enablement of the TalkBack accessibility service.
**What we are doing to prevent it from happening again:** We opened a support case with Google to get further information on why this policy change happened, and have begun investigating running our own MDM solution.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

Android real devices in the US West 1, EU Central 1, and US East datacenters are unable to run automated Appium and Espresso app tests and manual app tests. We are investigating.

Report: "2024-December-18 Service Incident"

Last update
postmortem

**Dates:** Wednesday December 18th 2024, 09:05 - 11:31 UTC
**What happened:** Customers were unable to purchase self-service plans through the billing page.
**Why it happened:** Breaking changes were introduced between our billing service client SDK and the third-party billing provider.
**How we fixed it:** The SDK for the third-party billing provider was updated in our service.
**What we are doing to prevent it from happening again:** We're working with the third-party billing provider to better understand their versioning practices and ensure we remain in line with them.

resolved

Customers are able to purchase Self-Service plans from our dashboard again. This issue is resolved.

investigating

There is currently an issue with purchasing self-service plans from our dashboard. We are investigating.

Report: "2024-October-30 Service Incident"

Last update
postmortem

**Dates:** Wednesday October 30th 2024, 03:45 - 21:06 UTC
**What happened:** Intermittent errors occurred when starting or using Sauce Connect tunnels in the US West datacenter.
**Why it happened:** A service responsible for creating new bindings between the tunnel endpoints and test VMs experienced timeouts on some hosts.
**How we fixed it:** Service was restored after clearing bindings on the affected hosts.
**What we are doing to prevent it from happening again:** Monitoring and alerting for when the service is unable to create new bindings has been improved. The root cause of the condition is under investigation.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

monitoring

We have taken remedial action and are seeing an improvement with Sauce Connect tunnel startup and allocation in the US-West-1 data center. We are monitoring.

investigating

We are currently seeing intermittent issues with Sauce Connect tunnel startup and allocation in the US-West-1 data center. We are investigating.

Report: "2024-December-12 Resolved Service Incident"

Last update
postmortem

**Dates:** Thursday December 12 2024, 12:40 - 13:30 UTC
**What happened:** Video recordings for Android Real Device tests were not showing in test results.
**Why it happened:** A code deployment contained a broken dependency.
**How we fixed it:** A rollback was executed to the previous known working version.
**What we are doing to prevent it from happening again:** A check for broken dependencies was added to the deployment.

resolved

Between 12:40 UTC and 13:20 UTC, video recordings were missing in test reports for Android tests, affecting all regions. We have identified the issue and taken remedial action.

Report: "2024-September-30 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday September 30 2024, 15:53 - 16:10 UTC
**What happened:** The Sauce Labs WebUI in the EU Central 1 datacenter was inaccessible.
**Why it happened:** The gateway in front of [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) was misconfigured.
**How we fixed it:** The gateway was rolled back to the previous working version.
**What we are doing to prevent it from happening again:** Synthetic monitoring for [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) has been improved.

resolved

Between 15:53 - 16:11 UTC, the Sauce Labs dashboard was unavailable in our EU Data Center. Access has been restored and all services are operational.

Report: "2024-October-11 Resolved Service Incident"

Last update
postmortem

**Dates:** Friday October 11 2024, 04:30 - 04:57 UTC
**What happened:** Real Devices in our US East 4 datacenter were unavailable due to a lack of internet connectivity at the site.
**Why it happened:** Both network providers used by the US East datacenter were down.
**How we fixed it:** One of the network providers restored service, allowing RDC devices to become available.
**What we are doing to prevent it from happening again:** Our Network team is looking into alternate IP transit providers to improve reliability.

resolved

Between 04:30 and 04:57 UTC, we were experiencing elevated error rates and reduced device availability for Real device tests in our US-East data center. The issue is now resolved. All services are fully operational.

Report: "2025-March-27 Service Incident"

Last update
Resolved

App storage errors have been resolved. All services are fully operational.

Investigating

We are seeing failures due to issue with our storage service affecting Virtual and Real Device Cloud as well as any saucectl and Mobile App Distribution in our US-West Data Center. We are investigating

Report: "Sauce Labs Planned Maintenance"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

Sauce Labs will have a two hour previously scheduled maintenance window on March 22nd, starting at 17:00 UTC and ending at 19:00 UTC. During this maintenance window, we will be making updates to our infrastructure and services in our US West data center. These maintenance actions may cause portions of the service (including running automated and manual tests) to be unavailable for up to two hours.

Report: "2025-January-7 Service Incident"

Last update
resolved

Access to *.testfairy.com via Chromium-based browsers has been restored. This incident is resolved.

investigating

There is currently an intermittent issue with accessing *.testfairy.com pages via Chromium-based browsers, preventing app distribution. Safari browsers can still access these sites. We are investigating.

investigating

There is currently an issue with accessing *.testfairy.com pages via Chromium-based browsers, preventing app distribution. Safari browsers can still access these sites. We are investigating.

investigating

We are currently seeing *.testfairy.com website pages throwing full-page red alert error messages. We are investigating.

Report: "2024-09-15 Resolved Service Incident"

Last update
postmortem

**Dates:** Sunday September 15th, 2024 17:02 - 20:15 UTC
**What happened:** Our Real Device (RDC) availability dropped below 80% across our US West and US East regions.
**Why it happened:** The issue was caused by a failure in our certificate renewal process, which was set to run automatically but failed to run as expected.
**How we fixed it:** We fixed the problem by restarting the impacted service, which automatically renewed the necessary certificates. This brought our device availability back to normal.
**What we are doing to prevent it from happening again:** We have initiated an investigation to review and fix the renewal process script to ensure it functions correctly in the future.

resolved

Between 17:02 and 20:15 UTC, automated tests executed on our real devices in the US West Data Center experienced an increased error rate due to reduced availability of the devices. The issue has been identified. All services are back to normal.

Report: "2024-August-12 Service Incident"

Last update
postmortem

**Dates:** Saturday August 10th 2024, 19:00 - Monday August 12th 2024, 21:13 UTC
**What happened:** Following scheduled network upgrades, connectivity between Real Devices and our IPsec VMs was disrupted. Additionally, a misconfiguration in Sauce Connect caused iOS Real Device Live Testing and app installations to fail.
**Why it happened:** During the network upgrade, a community tag was missing from a subset of production routes, causing connectivity issues between IPsec VMs and Real Devices. Additionally, a misconfiguration in Sauce Connect caused iOS Live Testing connectivity issues.
**How we fixed it:** To resolve the IPsec issue, we added the missing tags. To resolve the Sauce Connect with iOS Real Devices issue, we rolled back the Sauce Connect service configuration.
**What we are doing to prevent it from happening again:** We're adding additional end-to-end testing and alerting for our IPsec and Sauce Connect on iOS products.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing real device iOS installation failures, showing "Unable to Verify App" in the US-West Data Center. We are investigating.

Report: "2024-July-4 Service Incident"

Last update
postmortem

**Dates:** Thursday July 4th, 2024 13:13 - 14:06 UTC
**What happened:** An incident occurred due to I/O errors on the database node, causing connectivity issues within our infrastructure. This affected Virtual jobs and Tunnel services in the US-West region, leading to disruptions in Sauce Labs products.
**Why it happened:** Our job orchestration service failed to connect to a backend data store.
**How we fixed it:** The data store issues were resolved, restoring connectivity.
**What we are doing to prevent it from happening again:** We're undertaking several initiatives, including process improvements and enhanced monitoring, outlining steps to fix the issue and prevent it from happening again.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing elevated error rates in Desktop and Virtual Mobile tests using the US-West-1 datacenter. We are currently investigating.

investigating

All jobs in our US-West-1 Data Center are failing to start. We are investigating.

Report: "2024-September-9 Service Incident"

Last update
postmortem

**Dates:** Friday September 6 2024, 20:30 UTC - Monday September 9 2024, 11:57 UTC
**What happened:** RDC devices in the US-West datacenter had intermittent connectivity issues.
**Why it happened:** An unexpected, intermittent DNS resolution failure began occurring after a new DNS server was added.
**How we fixed it:** The faulty DNS server was removed.
**What we are doing to prevent it from happening again:** Additional monitoring for this particular network failure was added.
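Monitoring for this class of failure can be as simple as a periodic probe that alerts when a well-known name stops resolving. The sketch below is illustrative only; the probe name and alert path are assumptions, not the monitoring that was actually added.

```python
# Basic DNS health probe: alert if a well-known name stops resolving
# (illustrative sketch; the probe name and cadence are assumptions).
import socket

def can_resolve(hostname: str) -> bool:
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False

if __name__ == "__main__":
    if not can_resolve("saucelabs.com"):
        print("ALERT: DNS resolution is failing")
    else:
        print("OK: DNS resolution is healthy")
```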

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

We are currently seeing intermittent network connectivity issues in Live and Automated Real Device tests in the US-West-1 Data Center. We are still investigating.

investigating

We are currently seeing intermittent network connectivity issues in iOS Live and Automated Real Device tests in the US-West-1 Data Center. We are investigating.

Report: "2024-August-15 Service Incident"

Last update
postmortem

**Dates:** Thursday Aug 15th 2024, 17:24 - 19:05 UTC
**What happened:** The Sauce Labs Single Sign-on service suffered a failure causing customers using SSO to be unable to log in to the UI. This incident did not impact automated testing.
**Why it happened:** An invalid Sauce Labs IdP configuration was deployed.
**How we fixed it:** We rolled back the configuration for the Sauce Labs IdP, restoring service.
**What we are doing to prevent it from happening again:** We've deployed stricter validation in our Single Sign-on service.

resolved

This incident has been resolved.

investigating

We have received multiple reports of users experiencing issues accessing our dashboard via SSO. We are investigating.

Report: "2024-July-8 Service Incident"

Last update
postmortem

**Dates:** Monday July 8th 2024, 18:21 - 19:13 UTC
**What happened:** Many pages of our website were unreachable or prohibitively slow. Potential customers couldn't visit our website, and existing customers who use our site as a login area may have been hindered.
**Why it happened:** Our CMS provider, Contentful, was experiencing an outage, which meant that we couldn't generate pages.
**How we fixed it:** Contentful resolved the outage.
**What we are doing to prevent it from happening again:** We are switching our website from on-demand rendering to build-time rendering; this means the pages are built ahead of time and served from a CDN, so there is no need for real-time rendering.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

We are currently seeing saucelabs.com website pages loading slowly and throwing errors. We are investigating.

Report: "2024-June-14 Service Incident"

Last update
postmortem

**Dates:** Friday June 14th 2024, 10:30 - 19:20 UTC
**What happened:** Customers may have received intermittent 503 errors when running tests.
**Why it happened:** A required network component on a node crashed during automated restarts and did not automatically recover.
**How we fixed it:** We recovered the impacted node, which allowed the network component to start correctly.
**What we are doing to prevent it from happening again:** We have improved the alerting for the network component to ensure automated upgrades and restarts complete correctly.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing intermittent issues when running automated tests in our US West Data center. Users would encounter the "503 Service Unavailable" error on their end. We are investigating.

Report: "2024-April-30 Resolved Service Incident"

Last update
postmortem

**Dates:** Tuesday April 30 2024, 14:40 - 15:05 UTC
**What happened:** New tests in the Virtual Device Cloud would not start.
**Why it happened:** A change for one service caused an unhandled exception in an upstream service.
**How we fixed it:** We rolled back the change introduced in a recent deployment.
**What we are doing to prevent it from happening again:** We are reviewing and improving our processes for introducing changes that may affect dependent services.

resolved

Between 14:40 and 15:05 UTC, we were seeing elevated error rates when starting Desktop and Virtual device tests in the EU-Central-1 and US-West-1 data centers. The issue is now resolved. All services are fully operational.

Report: "2024-May-28 Service Incident"

Last update
postmortem

**Dates:** Tuesday May 28 2024, 13:42 - 14:41 UTC
**What happened:** There was low availability of iOS 17 devices in all Sauce Labs datacenters.
**Why it happened:** A false positive check marked some devices as offline.
**How we fixed it:** The check was fixed.
**What we are doing to prevent it from happening again:** We've implemented additional guard-rails and reduced the time to deploy.

resolved

After taking remedial action, the availability of iOS 17 real devices has returned to normal. This incident is resolved.

monitoring

After taking remedial action, the availability of iOS 17 real devices is returning to normal. We are currently monitoring.

investigating

We are experiencing reduced availability of iOS 17 Real devices in all our data centres. We are investigating.

Report: "2024-June-7 Service Incident"

Last update
postmortem

**Dates:** Friday June 7 2024, 12:16 - 13:54 UTC
**What happened:** SC4 tunnels were intermittently not starting in US-West.
**Why it happened:** An unexpected, intermittent DNS resolution failure began occurring after upgrading a hypervisor on which the primary DNS server was running.
**How we fixed it:** We repaired the primary DNS server and returned it to service.
**What we are doing to prevent it from happening again:** We're evaluating an alternative DNS deployment.

resolved

After taking remedial action, Sauce Connect 4 tunnels are starting successfully in our US-West data center. All services are fully operational.

monitoring

We have fixed the root cause, and Sauce Connect 4 tunnels are starting successfully in our US-West data center. We are monitoring.

identified

We have identified the root cause and are working on implementing a fix.

investigating

Sauce Connect 4 tunnels are intermittently failing to start in our US-West data center. We are investigating.

Report: "2024-May-29 Service Incident"

Last update
postmortem

**Dates:** Wednesday May 29 2024, 15:50 - 20:00 UTC
**What happened:** Tests leveraging Sauce Connect 5 Tunnels created during the time of the incident failed.
**Why it happened:** A change was released to introduce new functionality that unintentionally prevented newly-created Sauce Connect 5 Tunnels from passing traffic.
**How we fixed it:** The deployment was reverted and Sauce Connect 5 Tunnels created during the time of the incident were shut down.
**What we are doing to prevent it from happening again:** We're investing in additional validation checks for Sauce Connect 5.

resolved

Between 15:50 and 23:30 UTC, some customers running tests with Sauce Connect 5 tunnels that were created between 15:50 and 21:00 UTC may have experienced test failures in the US-West-1 and EU-Central-1 Data Centers. The error in the tests would be reported as "Infrastructure Error -- The Sauce VMs failed to start the browser or device". The issue is now resolved. All services are fully operational.

Report: "2024-May-3 Service Incident"

Last update
postmortem

**Dates:** Friday May 3 2024, 10:36 - 12:02 UTC
**What happened:** There was limited access to the Sauce Labs UI for customers in our US-East-4 datacenter.
**Why it happened:** A renewed SSL certificate was not applied properly.
**How we fixed it:** The certificate was applied correctly.
**What we are doing to prevent it from happening again:** We have improved our certificate expiration monitoring for this environment.

resolved

After taking remedial action, access to the dashboard and manual testing in our US-East-4 datacenter has been restored. This incident is resolved.

investigating

We are currently experiencing an issue accessing our web application in the US-East-4 datacenter; this affects access to the Sauce Labs dashboard and manual testing. We are investigating.

Report: "2024-May-2 Resolved Service Incident"

Last update
postmortem

**Dates:** Thursday May 2 2024, 09:30 - 11:00 UTC
**What happened:** There was reduced availability of Real devices for customers.
**Why it happened:** An internal API change resulted in devices being incorrectly marked as unavailable.
**How we fixed it:** We rolled back the change introduced in a recent deployment.
**What we are doing to prevent it from happening again:** We are updating the dependency to improve error handling.

resolved

Between 9:30 AM and 11:00 AM UTC, we were experiencing decreasing availability of Real Devices in all data centers. This has now been resolved and device availability is at normal levels.

Report: "2024-April-17 Service Incident"

Last update
postmortem

**Dates:** Wednesday April 17th 2024, 15:50 - 16:55 UTC
**What happened:** Automated and live tests for virtual and real device clouds in the US West region failed.
**Why it happened:** An invalid configuration was deployed that prevented connections from being accepted. Additionally, following the change being reverted, an internal component was unable to appropriately handle the sudden surge in traffic.
**How we fixed it:** We rolled back the configuration to the previous version and restarted an internal service.
**What we are doing to prevent it from happening again:** We are evaluating strategies for more rapidly scaling the affected internal service.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are aware of reports of intermittent errors when attempting to start automated and live tests in our US-West-1 data center. We are investigating.

Report: "2024-April-14 Service Incident"

Last update
postmortem

**Dates:** Sunday April 14th 2024, 17:05 - 18:40 UTC
**What happened:** Customers were unable to access the Web UI or API for Sauce Labs services.
**Why it happened:** Our internal and external API Gateways were unable to process traffic.
**How we fixed it:** We restarted the API Gateway service.
**What we are doing to prevent it from happening again:** We are working to improve our runbooks for similar issues in the future.

resolved

After taking remedial action, services in all datacenters are operating as normal. This incident is resolved.

investigating

We are seeing issues accessing and running tests on the Sauce Labs platform in all datacenters; this is affecting all services. We are currently investigating.

Report: "2024-March-27 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 27th 2024, 10:20 - 11:55 UTC
**What happened:** Availability of real devices running iOS 14.x - 16.x was decreased, which may have resulted in an inability to obtain a device during periods of increased usage.
**Why it happened:** A third-party library was updated. This update introduced a minor bug that exposed another non-recoverable bug, resulting in specific versions of iOS appearing offline in our system.
**How we fixed it:** We rolled back the library update.
**What we are doing to prevent it from happening again:** We are fixing the triggered defect in our code as well as improving the detection of offline devices during deployment of the affected software.

resolved

After taking remedial action, iOS Real devices are available in all our data centers. This incident is now resolved.

investigating

We are experiencing reduced availability of iOS Real devices older than iOS 17 in all our data centers. We are investigating.

Report: "2024-Mar-25 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday March 25th 2024, 14:47 - 14:59 UTC
**What happened:** Automated and live tests for virtual and real device clouds in the US West region failed.
**Why it happened:** An invalid configuration was deployed that prevented connections from being accepted.
**How we fixed it:** We rolled back the configuration to the previous version.
**What we are doing to prevent it from happening again:** The underlying cause for the misconfiguration of the service, a template logic error, was fixed.

resolved

Between 14:48 - 14:58 UTC, a configuration change on our side resulted in failures for tests reaching ondemand.us-west-1.saucelabs.com. We have identified the issue and taken remedial action.

Report: "2024-March-6 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 6th 2024, 12:43 - 19:00 UTC
**What happened:** Test artifact uploads and downloads intermittently failed or were slower to complete, resulting in issues displaying test results.
**Why it happened:** A subsystem involved in artifact uploads would not properly recover from some failures, resulting in leaked connections and connection pool exhaustion. Once fully exhausted, uploads were no longer possible.
**How we fixed it:** We identified the bug in the code and deployed a fix.
**What we are doing to prevent it from happening again:** We have fixed the specific bug and implemented monitoring for similar symptoms.
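Leaked connections of the kind described usually come from error paths that skip the release step; the classic guard is to return the connection in a finally block (or context manager). The generic sketch below illustrates that pattern only; it is not the affected subsystem's code, and the names are made up.

```python
# Generic guard against leaking pooled connections on error paths (illustrative).
import queue

class ConnectionPool:
    def __init__(self, connections):
        self._free = queue.Queue()
        for conn in connections:
            self._free.put(conn)

    def acquire(self, timeout=5.0):
        return self._free.get(timeout=timeout)   # raises queue.Empty when exhausted

    def release(self, conn):
        self._free.put(conn)

def upload_with(pool, do_upload):
    conn = pool.acquire()
    try:
        return do_upload(conn)     # may raise on a failed upload
    finally:
        pool.release(conn)         # always returned, even on failure -- no leak
```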

resolved

This incident has been resolved.

investigating

We are currently experiencing intermittent issues loading real device test reports on the dashboard in our US West Data Center. We are investigating.

Report: "2024-March-11 Service Incident"

Last update
postmortem

**Dates:** Saturday March 9th 2024, 20:00 UTC - Monday March 11th, 13:22 UTC
**What happened:** Some Windows, Linux, and Android Emulator workloads in our US-West-1 data center experienced elevated error rates.
**Why it happened:** A hardware instruction silently failed to be programmed in a network device's TCAM. This resulted in a subset of traffic from one of our data centers being unable to reach the public Internet.
**How we fixed it:** We forced a reprogramming of the TCAM instructions in question.
**What we are doing to prevent it from happening again:** We are investigating ways to better understand silent failures in this portion of our infrastructure.

resolved

After taking remedial action, we are seeing error rates return to normal levels. This incident is now resolved.

investigating

We are seeing elevated error rates in Desktop tests in the US-West-1 datacenter. We continue to investigate.

investigating

We are currently seeing elevated error rates in Desktop tests in the US-West-1 datacenter. We are investigating.

Report: "2024-March-14 Resolved Service Incident"

Last update
resolved

Between 4:00 - 6:30 AM UTC, we were experiencing errors retrieving test artifacts for test results in our US-West-1 datacenter. This has now been resolved and all test results are available; no data loss was incurred.

Report: "2024-March-12 Service Incident"

Last update
resolved

After taking remedial action, Samsung Real devices are available in all our data centers. This incident is now resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We continue to experience reduced availability of Samsung Real devices in our Public pool in the US-West-1 and EU-Central-1 data centers. We continue to investigate.

investigating

We continue to experience reduced availability of Samsung Real devices in our Public pool in the US-West-1 and EU-Central-1 data centers. Investigation is underway.

investigating

We are experiencing reduced availability of Samsung Real devices in our US-West-1 and EU-Central-1 data centers. We continue to investigate.

investigating

We are experiencing reduced availability of Samsung Real devices in the US-West-1 data center. We are investigating.

Report: "2024-February-29 Service Incident"

Last update
postmortem

**Dates:** Thursday February 29th 2024, 16:25 UTC - Friday March 1st, 00:15 UTC
**What happened:** iOS Simulators in our US-West-1 data center intermittently experienced longer startup times and in some cases failed to acquire a session, resulting in test failure.
**Why it happened:** An increase in utilization throughout the Mac Cloud created additional pressure and lack of availability for some images, which resulted in higher-than-normal startup times. In some cases, the startup times were high enough to cause test failure due to abandonment of the session.
**How we fixed it:** Burst capacity was leveraged to provide some relief. Later, utilization decreased to normal levels.
**What we are doing to prevent it from happening again:** We are working to both increase capacity and decrease startup times for iOS Simulators.

resolved

After taking remedial action, error rates in Desktop and Virtual Device tests are back to normal levels. This incident is resolved.

investigating

We are currently seeing elevated error rates in Desktop and Virtual device tests in the US-West-1 datacenter. We are still investigating.

investigating

We are currently seeing elevated error rates in Desktop and Virtual device tests in the US-West-1 datacenter. We are currently investigating.

Report: "2024-February-26 Service Incident"

Last update
postmortem

**Dates:** Monday, February 26th, 2024 23:08 UTC - Tuesday, February 27th, 2024 02:56 UTC
**What happened:** Increased network load led to test slowness or failures affecting Real Device tests in our US-West-1 data center.
**Why it happened:** During the troubleshooting of an increase in network load in our US-West-1 data center, a network failover event was executed.
**How we fixed it:** The network failover and additional traffic shaping stabilized services.
**What we are doing to prevent it from happening again:** Improvements to the runbook for manually initiated failovers.

resolved

After taking remedial action, Real device availability and error rates are back to normal in the US-West-1 data center. All services are fully operational.

monitoring

After taking remedial action, Real device availability and error rates are returning to normal levels in the US-West-1 data center. We are currently monitoring.

investigating

We are currently seeing elevated error rates and reduced device availability in Real device tests in our US-West-1 datacenter. We are currently investigating.

Report: "2024-February-19 Service Incident"

Last update
postmortem

**Dates:** Monday February 19th 2024, 17:30 UTC - Monday February 19th, 23:06 UTC
**What happened:** Real Devices in our US-West-1 data center running iOS 17 were degraded and may intermittently have disconnected a user. Additionally, at times, iOS 17 Real Devices may have been unavailable for use.
**Why it happened:** Real Devices running iOS 17 experienced an unknown gradual degradation over time.
**How we fixed it:** Real Devices running iOS 17 were restarted.
**What we are doing to prevent it from happening again:** We are working to improve our observability of iOS 17 and improving iOS 17-specific design elements for Real Devices.

resolved

After taking remedial action, iOS 17 Real devices are available in US-West-1 data center. All services are fully operational.

identified

We have identified the cause and are taking remedial action.

investigating

iOS 17 Real devices are currently unavailable in the US-West-1 data center. We are investigating.

Report: "2024-February-27 Service Incident"

Last update
postmortem

**Dates:** Saturday February 24th 2024, 01:00 UTC - Wednesday February 28th 2024, 23:14 UTC
**What happened:** Increased network load led to intermittent test slowness or failures.
**Why it happened:** An update was automatically triggered when a customer test was executed, which led to increased network load.
**How we fixed it:** Deployed new images for impacted operating systems.
**What we are doing to prevent it from happening again:** Ensure new images do not auto-update and improve alerting on this condition.

resolved

Real device availability and error rates for automated and live browser and virtual mobile device tests are within the normal levels in the US-West-1 data center.

investigating

We are seeing elevated error rates for automated and live browser and virtual mobile device tests and intermittently unavailable real devices in our US-West datacenter. We are investigating.

Report: "2024-March-5 Service Incident"

Last update
resolved

This incident has been resolved.

monitoring

After taking remedial action, the redirect is working as expected. We are currently monitoring.

investigating

We've identified redirect issues for Mobile Beta login on our Dashboard in both US-West-1 and EU-Central-1 Datacenters. Investigation is underway.

Report: "2024-February-20 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday February 19th 2024, 11:20 UTC - Tuesday February 20th, 10:15 UTC
**What happened:** Some Emulator and Simulator sessions with typically long startup times were unable to start.
**Why it happened:** A service was updated to improve its performance. This service handles waiting additional time for a VM to become available. Its internal logic was modified to process a single wait request, but sometimes additional wait requests are necessary.
**How we fixed it:** The deployment was reverted.
**What we are doing to prevent it from happening again:** We are investigating simulating slower startup times to ensure we can effectively identify similar regressions in the future.

resolved

Between 2024-Feb-19 11:20 UTC and 2024-Feb-20 10:20 UTC, error rates for desktop and virtual mobile device tests were elevated in our US-West-1 and EU-Central-1 data centers. We have taken remedial action. All services are fully operational.

Report: "2024-February-12 Service Incident"

Last update
postmortem

**Dates:** Sunday February 11th 2024, 13:35 - Monday February 12th 2024, 09:00 UTC
**What happened:** Over the course of the incident, iOS 17 device availability decreased until all devices were unavailable.
**Why it happened:** A change in behavior in Xcode resulted in non-exiting child processes that eventually caused the app to become unresponsive.
**How we fixed it:** Child processes were reaped across the service's fleet.
**What we are doing to prevent it from happening again:** We are working to improve the reliability of services involved in managing iOS 17 real devices.

resolved

After taking remedial action, iOS 17 Real devices are now available again. This incident is resolved.

investigating

iOS 17 Real devices are currently unavailable in the US-West-1 and EU-Central-1 data centers. We are investigating.

Report: "2024-February-8 Service Incident"

Last update
postmortem

**Dates:** Thursday February 8th 2024, 12:48 - 15:15 UTC
**What happened:** Test execution using Sauce Orchestrate was unavailable.
**Why it happened:** An internal queue used by the Live Logging feature of Sauce Orchestrate ran out of space due to an unexpected interaction when configuring a size limit that resulted in reserving more space than was available in the queue cluster.
**How we fixed it:** Existing streams were pruned and configurations were updated to use a message limit instead of a size limit.
**What we are doing to prevent it from happening again:** We are improving our monitoring of the internal queue service.

resolved

After taking remedial action, we are seeing normal execution of Sauce Orchestrate tests on the US-West-1 data center. This incident is resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are seeing elevated error rates for tests running via Sauce Orchestrate in the US-West-1 Datacenter. We are investigating.