Sauce Labs

Is Sauce Labs Down Right Now? Check whether there is an ongoing outage.

Sauce Labs is currently Operational

Last checked from Sauce Labs's official status page

Historical record of incidents for Sauce Labs

Report: "2025-June-13 Resolved Service Incident"

Last update
resolved

Between 07:30 and 09:00 UTC, there was an issue with the Sauce Labs Dashboard user interface, which caused problems accessing apps and test results in all datacenters. After remedial action, this issue was resolved.

Report: "2025-May-6 Resolved Service Incident"

Last update
postmortem

**Dates:** Tuesday May 6th 2025, 03:38 - 03:58 UTC
**What happened:** RDC devices were unavailable in the EU-Central-1 datacenter.
**Why it happened:** There was a DNS caching failure during a third-party provider's maintenance.
**How we fixed it:** Availability was restored automatically after the instances completed their start-up.
**What we are doing to prevent it from happening again:** We will update our DNS caching for better fault tolerance.

resolved

Between 03:38 - 03:58 UTC, Real Devices in our EU data center were unavailable. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2024-October-1 Service Incident"

Last update
postmortem

**Dates:** Monday September 30 2024, 15:53 - 16:10 UTC
**What happened:** The Sauce Labs dashboard for the EU-Central-1 datacenter was not accessible.
**Why it happened:** The gateway in front of [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) was misconfigured.
**How we fixed it:** A rollback was executed to the previous known working version.
**What we are doing to prevent it from happening again:** Improved synthetic monitoring for [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/).
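For illustration, a synthetic check of the kind mentioned above can be as simple as a periodic HTTP probe of the public endpoint. The sketch below is generic, not Sauce Labs' actual monitoring; the alert path and cadence are assumptions.

```python
# Minimal synthetic probe for a dashboard endpoint (illustrative sketch).
import requests

URL = "https://app.eu-central-1.saucelabs.com/"

def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the endpoint answers with a non-error status in time."""
    try:
        resp = requests.get(url, timeout=timeout, allow_redirects=True)
        return resp.status_code < 400
    except requests.RequestException:
        return False

if __name__ == "__main__":
    if not probe(URL):
        # In a real monitor this would page the on-call engineer.
        print(f"ALERT: {URL} is unreachable or returning errors")
    else:
        print(f"OK: {URL} is healthy")
```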

resolved

This incident has been resolved.

monitoring

Access to support portal has been restored. We continue to monitor.

identified

We are continuing to monitor the status and will provide an update once the third-party issue is resolved.

identified

We are continuing to monitor the status and will provide an update once the third-party issue is resolved.

identified

A third-party provider is currently experiencing an outage which is impacting support.saucelabs.com. If this is an emergency and you wish to reach the Sauce Labs Support team, please email us at support.outage@saucelabs.com.

Report: "2025-February-21 Service Incident"

Last update
postmortem

**Dates:** Saturday February 22nd 2025, 00:00 - 04:46 UTC
**What happened:** Customers were unable to create Sauce Connect tunnels in our US-West-1 and EU-Central-1 regions.
**Why it happened:** The SSL certificate for the Sauce Connect frontend had expired.
**How we fixed it:** The certificate was renewed and deployed to the affected regions.
**What we are doing to prevent it from happening again:** We are improving our alerting around SSL certificate expiration.
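Alerting on certificate expiry is commonly implemented as a small probe that reports the days remaining on a live endpoint's certificate. A minimal sketch follows; the hostname and 30-day threshold are placeholder assumptions, not Sauce Labs' configuration.

```python
# Report how many days remain before a TLS certificate expires (illustrative sketch).
import socket
import ssl
import time

def days_until_expiry(host: str, port: int = 443) -> int:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    expires_at = ssl.cert_time_to_seconds(cert["notAfter"])
    return int((expires_at - time.time()) // 86400)

if __name__ == "__main__":
    # Placeholder host and threshold, not an actual alerting rule.
    host = "saucelabs.com"
    remaining = days_until_expiry(host)
    if remaining < 30:
        print(f"ALERT: certificate for {host} expires in {remaining} days")
    else:
        print(f"OK: certificate for {host} is valid for {remaining} more days")
```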

resolved

All services are now operating as normal. This incident is resolved.

identified

We have identified the issue and have taken remedial action. We are monitoring.

investigating

We are continuing to investigate this issue.

investigating

New Sauce Connect tunnels cannot be created in our US-West-1 & EU-Central-1 datacenters. We are investigating.

Report: "2025-April-2 Service Incident"

Last update
postmortem

**Dates:** Tuesday April 1st 2025, 14:54 UTC - Wednesday April 2nd 2025, 13:45 UTC
**What happened:** iOS tests using Sauce Connect failed to start.
**Why it happened:** There was a mismatch in the SSL certificate's expiration dates.
**How we fixed it:** Re-ordered new SSL certificates.
**What we are doing to prevent it from happening again:** We are adding monitoring for the SSL certificates.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

monitoring

We have identified the root cause and have deployed a fix for this issue. We are monitoring.

investigating

We are continuing to see iOS simulator test failures when using Sauce Connect in the US-West-1 & EU-Central-1 Data Centers. We are actively investigating.

investigating

We are seeing failing iOS simulator tests when using Sauce Connect in US-West-1 & EU-Central-1 Data Centers. We are investigating.

Report: "2025-March-27 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 26th 2025, 19:15 - Thursday March 27th 2025, 14:30 UTC
**What happened:** An internal cache for mobile applications in one datacenter was unable to reach its origin, causing artifacts to age out and no longer be available to be served.
**Why it happened:** A change to the cache's routing prevented it from reaching its origin.
**How we fixed it:** The routing change was reverted.
**What we are doing to prevent it from happening again:** Additional monitoring and alerting has been implemented to identify this issue more quickly in the future.
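For context, one common way to keep a cache serving artifacts while its origin is unreachable is a "stale-if-error" policy: expired entries are returned rather than discarded when a refresh fails. The toy sketch below illustrates that behaviour only; it is not the actual cache implementation, and the TTL is an assumption.

```python
# Toy "stale-if-error" cache: prefer fresh data, but fall back to an expired
# entry rather than failing when the origin cannot be reached (illustrative).
import time

class StaleIfErrorCache:
    def __init__(self, fetch_from_origin, ttl_seconds=3600):
        self._fetch = fetch_from_origin      # callable(key) -> bytes, may raise
        self._ttl = ttl_seconds
        self._entries = {}                   # key -> (value, stored_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry and time.time() - entry[1] < self._ttl:
            return entry[0]                  # fresh hit
        try:
            value = self._fetch(key)         # refresh from origin
            self._entries[key] = (value, time.time())
            return value
        except Exception:
            if entry:                        # origin unreachable: serve the stale copy
                return entry[0]
            raise                            # nothing cached at all
```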

resolved

App storage errors have been resolved. All services are fully operational.

investigating

We are seeing failures due to an issue with our storage service affecting the Virtual and Real Device Clouds, as well as saucectl and Mobile App Distribution, in our US-West Data Center. We are investigating.

Report: "2025-May-14 Service Incident 1"

Last update
resolved

We have identified the root cause and deployed a fix for this issue. This incident is resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating an issue where video recordings for Android tests are intermittently missing in our US West Datacenter. The issue began on May 13th at approximately 13:00 CEST.

Report: "2025-May-14 Service Incident 1"

Last update
Investigating

We are currently investigating an issue where video recordings for Android tests are intermittently missing in our US West Datacenter. The issue began on May 13th at approximately 13:00 CEST.

Report: "2025-May-14 Service Incident"

Last update
resolved

After taking remedial action, app uploads and testing are now both working correctly. This incident is resolved.

investigating

We are currently seeing errors in uploading and installing apps in our US-East-4 Datacenter. We are investigating.

Report: "2025-May-14 Service Incident"

Last update
Investigating

We are currently seeing errors in uploading and installing apps in our US-East-4 Datacenter. We are investigating.

Report: "2025-May-6 Resolved Service Incident"

Last update
Resolved

Between 03:38 - 03:58 UTC, Real Devices in our EU data center were unavailable. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-23 Resolved Service Incident"

Last update
resolved

Between 16:55 - 17:25 UTC, we experienced Android Real Device Test failures in our EU data center. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-23 Resolved Service Incident"

Last update
Resolved

Between 16:55 - 17:25 UTC, we experienced Android Real Device Test failures in our EU data center. After taking remedial action, the issue has been resolved. All services are fully operational.

Report: "2025-April-2 Service Incident"

Last update
Investigating

We are seeing failing iOS simulator tests when using Sauce Connect in US-West-1 & EU-Central-1 Data Centers. We are investigating.

Report: "2025-March-6 Service Incident"

Last update
postmortem

**Dates:** Thursday March 6th 2025, 12:00 - 12:22 UTC
**What happened:** Customers were unable to start new Android Real Device tests in all datacenters.
**Why it happened:** During a deployment, a race condition occurred during the reallocation of devices from "old" device pools to "new" device pools, which caused devices to become unavailable in both pools.
**How we fixed it:** The "old" pools were shut down, releasing the device lock and allowing the "new" pools to acquire it.
**What we are doing to prevent it from happening again:** We are improving our device allocation processes.
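In simplified terms, the race described above is two pools contending for the same device lock during a rollover, and the fix amounts to making the old pool release the lock before the new pool tries to acquire it. The sketch below uses made-up names and a plain mutex to illustrate the failure mode; it is not the actual allocator.

```python
# Simplified illustration of the device-pool handoff: the new pool can only
# acquire a device after the old pool has released its lock (made-up names).
import threading

device_lock = threading.Lock()

class DevicePool:
    def __init__(self, name):
        self.name = name
        self.holds_device = False

    def acquire_device(self, timeout=5.0):
        # If another pool still holds the lock, this times out and the device
        # looks "unavailable" to both pools -- the race seen in the incident.
        if device_lock.acquire(timeout=timeout):
            self.holds_device = True
            return True
        return False

    def shutdown(self):
        # Shutting the old pool down releases the lock so the new pool can take over.
        if self.holds_device:
            device_lock.release()
            self.holds_device = False

old_pool, new_pool = DevicePool("old"), DevicePool("new")
old_pool.acquire_device()
old_pool.shutdown()                # without this step, the next call would time out
assert new_pool.acquire_device()
```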

resolved

After taking remedial action, Android Real Devices are now available in all our datacenters. This incident is resolved.

investigating

We are currently experiencing significantly reduced availability of Android Real Devices in all our datacenters. We are investigating.

Report: "2025-Feb-17 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday February 17th 2025, 12:52 - 13:53 UTC
**What happened:** Real Device tests in the EU-Central-1 datacenter were experiencing issues reaching internet destinations.
**Why it happened:** A new internet provider was introduced that was experiencing issues with their network.
**How we fixed it:** A rollback was executed to move traffic off of the affected internet provider.
**What we are doing to prevent it from happening again:** We engaged the provider and they corrected configuration issues with their network.

resolved

Between 12:52 and 13:53 UTC, some Real Device tests in EU-Central-1 Data center experienced failures reaching internet destinations because of an issue on an upstream ISP network. We have identified the issue and taken remedial action.

Report: "2025-January-21 Service Incident"

Last update
postmortem

**Dates:** Tuesday January 21st 2025, 12:40 - 13:44 UTC
**What happened:** Tests using iOS Real Devices experienced failures to download and install apps in the eu-central-1 and us-west-1 datacenters.
**Why it happened:** A misconfiguration with the resigner service caused errors when communicating with the app storage.
**How we fixed it:** A rollback was executed, restoring configuration to the previous working state.
**What we are doing to prevent it from happening again:**
* Improve canary deployments for each Real Device Cloud region.
* Improve end-to-end testing for canary deployments.
* Improve SLOs for Real Device application installs.

resolved

After taking remedial action, apps are now installing correctly in all datacenters. This incident is resolved.

investigating

We are currently seeing App installation errors when trying to run iOS tests on our US-West-1 & EU-Central-1 Datacenter. We are investigating

Report: "2025-January-3 Service Incident"

Last update
postmortem

**Dates:** Friday 3 January 2025, 09:52 - 11:07 UTC
**What happened:** Customers were unable to access test results in the Web UI for our US-West-1 datacenter.
**Why it happened:** A defect was introduced during a product deployment.
**How we fixed it:** A rollback was executed to the previous working version.
**What we are doing to prevent it from happening again:** We are creating additional checks for the authentication method upgrades.

resolved

After taking remedial action, Test Results are now available again in all data centers. This incident is resolved.

investigating

We are currently seeing an issue with Live and automated test results not being displayed on the test results page in our US-West-1 data center. We are investigating.

Report: "2025-January-8 Service Incident"

Last update
postmortem

**Dates:** Wednesday 8 January 2025, 11:00 - 12:25 UTC
**What happened:** We were experiencing intermittent issues with Virtual device tests missing test assets in the US-West-1 region.
**Why it happened:** The asset uploader service was experiencing HTTP errors uploading assets to our backend storage.
**How we fixed it:** The third-party provider resolved an issue with their infrastructure.
**What we are doing to prevent it from happening again:** We are improving the caching and retry logic of the asset uploader service to prevent recurrence.
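Retry logic of the kind described is typically a bounded retry with exponential backoff around the upload call. The generic sketch below shows that pattern; the endpoint, retry budget, and back-off schedule are assumptions, not the asset uploader's actual behaviour.

```python
# Upload with bounded retries and exponential backoff on transient HTTP errors
# (illustrative sketch; endpoint and retry budget are assumptions).
import time
import requests

def upload_asset(url: str, payload: bytes, max_attempts: int = 4) -> requests.Response:
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.put(url, data=payload, timeout=30)
            if resp.status_code < 500:
                return resp                      # success or non-retryable client error
        except requests.RequestException:
            pass                                 # network error: treat as retryable
        if attempt < max_attempts:
            time.sleep(delay)                    # back off before the next attempt
            delay *= 2
    raise RuntimeError(f"upload failed after {max_attempts} attempts")
```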

resolved

After taking remedial action, test assets are being retained successfully in all tests. This incident is resolved.

investigating

We are currently seeing intermittent issues with test assets not being retained for Virtual device tests in our US-West-1 data center. We are investigating.

investigating

We are continuing to investigate this issue.

investigating

We are currently seeing that Live and Automated test results are intermittently not being retained for Virtual Device tests in our US-West-1 data center. We are investigating.

investigating

We are currently seeing that Live and automated test results are intermittently not being retained in our US-West-1 data center. We are investigating.

Report: "2024-October-11 Resolved Service Incident 2"

Last update
postmortem

**Dates:** Friday October 11th 2024, 09:19 - 19:19 UTC
**What happened:** Real Device tests using Sauce Connect 5 tunnels experienced failures in all datacenters.
**Why it happened:** An improperly tested change to the Sauce Connect service was rolled out.
**How we fixed it:** The Sauce Connect service was rolled back to a previously working version.
**What we are doing to prevent it from happening again:** Fixes to post-deploy checks are planned to prevent this issue from recurring.

resolved

Between 9:33 and 19:21 UTC, Real Device tests running via Sauce Connect tunnel (version 5) experienced failures in the US-West-1, EU-Central-1, and US-East-4 data centers. We have identified the issue and taken remedial action.

Report: "2024-November-19 Service Incident"

Last update
postmortem

**Dates:** Tuesday November 19th 2024, 21:30 - 23:30 UTC
**What happened:** Android real devices were unable to run app tests in all datacenters.
**Why it happened:** An unexpected policy change by Google caused MDM-managed devices to be locked down due to policy violations around accessibility services.
**How we fixed it:** We rolled back enablement of the TalkBack accessibility service.
**What we are doing to prevent it from happening again:** We opened a support case with Google to get further information on why this policy change happened, and have begun investigating running our own MDM solution.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

Android real devices in the US West 1, EU Central 1, and US East datacenters are unable to run automated Appium and Espresso app tests and manual app tests. We are investigating.

Report: "2024-December-18 Service Incident"

Last update
postmortem

**Dates:** Wednesday December 18th 2024, 09:05 - 11:31 UTC
**What happened:** Customers were unable to purchase self-service plans through the billing page.
**Why it happened:** Breaking changes were introduced between our billing service client SDK and the third-party billing provider.
**How we fixed it:** The SDK for the third-party billing provider was updated in our service.
**What we are doing to prevent it from happening again:** We're working with the third-party billing provider to better understand their versioning practices and ensure we remain in line with them.

resolved

Customers are able to purchase Self-Service plans from our dashboard again. This issue is resolved.

investigating

There is currently an issue with purchasing self-service plans from our dashboard. We are investigating.

Report: "2024-October-30 Service Incident"

Last update
postmortem

**Dates:** Wednesday October 30th 2024, 03:45 - 21:06 UTC
**What happened:** Intermittent errors occurred when starting or using Sauce Connect tunnels in the US West datacenter.
**Why it happened:** A service responsible for creating new bindings between the tunnel endpoints and test VMs experienced timeouts on some hosts.
**How we fixed it:** Service was restored after clearing bindings on the affected hosts.
**What we are doing to prevent it from happening again:** Monitoring and alerting for when the service is unable to create new bindings has been improved. The root cause of the condition is under investigation.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

monitoring

We have taken remedial action and are seeing an improvement with Sauce Connect tunnel startup and allocation in the US-West-1 data center. We are monitoring.

investigating

We are currently seeing intermittent issues with Sauce Connect tunnel startup and allocation in the US-West-1 data center. We are investigating.

Report: "2024-December-12 Resolved Service Incident"

Last update
postmortem

**Dates:** Thursday December 12 2024, 12:40 - 13:30 UTC
**What happened:** Video recordings for Android Real Device tests were not showing in test results.
**Why it happened:** A code deployment contained a broken dependency.
**How we fixed it:** A rollback was executed to the previous known working version.
**What we are doing to prevent it from happening again:** A check for broken dependencies was added to the deployment.

resolved

Between 12:40 UTC and 13:20 UTC, video recordings were missing in test reports for Android tests, affecting all regions. We have identified the issue and taken remedial action.

Report: "2024-September-30 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday September 30 2024, 15:53 - 16:10 UTC
**What happened:** The Sauce Labs WebUI in the EU Central 1 datacenter was inaccessible.
**Why it happened:** The gateway in front of [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) was misconfigured.
**How we fixed it:** The gateway was rolled back to the previous working version.
**What we are doing to prevent it from happening again:** Synthetic monitoring for [https://app.eu-central-1.saucelabs.com/](https://app.eu-central-1.saucelabs.com/) has been improved.

resolved

Between 15:53 - 16:11 UTC, the Sauce Labs dashboard was unavailable in our EU Data Center. Access has been restored and all services are operational.

Report: "2024-October-11 Resolved Service Incident"

Last update
postmortem

**Dates:** Friday October 11 2024, 04:30 - 04:57 UTC
**What happened:** Real Devices in our US East 4 datacenter were unavailable due to a lack of internet connectivity at the site.
**Why it happened:** Both network providers used by the US East datacenter were down.
**How we fixed it:** One of the network providers restored service, allowing RDC devices to become available.
**What we are doing to prevent it from happening again:** Our Network team is looking into alternate IP transit providers to improve reliability.

resolved

Between 04:30 and 04:57 UTC, we were experiencing elevated error rates and reduced device availability for Real device tests in our US-East data center. The issue is now resolved. All services are fully operational.

Report: "2025-March-27 Service Incident"

Last update
Resolved

App storage errors have been resolved. All services are fully operational.

Investigating

We are seeing failures due to issue with our storage service affecting Virtual and Real Device Cloud as well as any saucectl and Mobile App Distribution in our US-West Data Center. We are investigating

Report: "Sauce Labs Planned Maintenance"

Last update
Completed

The scheduled maintenance has been completed.

In progress

Scheduled maintenance is currently in progress. We will provide updates as necessary.

Scheduled

Sauce Labs will have a two hour previously scheduled maintenance window on March 22nd, starting at 17:00 UTC and ending at 19:00 UTC. During this maintenance window, we will be making updates to our infrastructure and services in our US West data center. These maintenance actions may cause portions of the service (including running automated and manual tests) to be unavailable for up to two hours.

Report: "2025-January-7 Service Incident"

Last update
resolved

Access to *.testfairy.com via Chromium-based browsers has been restored. This incident is resolved.

investigating

There is currently an intermittent issue with accessing *.testfairy.com pages via Chromium-based browsers, preventing app distribution. Safari browsers can still access these sites. We are investigating.

investigating

There is currently an issue with accessing *.testfairy.com pages via Chromium-based browsers, preventing app distribution. Safari browsers can still access these sites. We are investigating.

investigating

We are currently seeing *.testfairy.com website pages throwing full-page red alert error messages. We are investigating.

Report: "2024-09-15 Resolved Service Incident"

Last update
postmortem

**Dates:** Sunday September 15th, 2024 17:02 - 20:15 UTC
**What happened:** Our Real Device (RDC) availability dropped below 80% across our US West and US East regions.
**Why it happened:** The issue was caused by a failure in our certificate renewal process, which was set to run automatically but failed to run as expected.
**How we fixed it:** We fixed the problem by restarting the impacted service, which automatically renewed the necessary certificates. This brought our device availability back to normal.
**What we are doing to prevent it from happening again:** We have initiated an investigation to review and fix the renewal process script to ensure it functions correctly in the future.

resolved

Between 17:02 and 20:15 UTC, automated tests executed on our real devices in the US West Data Center experienced an increased error rate due to reduced availability of the devices. The issue has been identified. All services are back to normal.

Report: "2024-August-12 Service Incident"

Last update
postmortem

**Dates:** Saturday August 10th 2024, 19:00 - Monday August 12th 2024, 21:13 UTC
**What happened:** Following scheduled network upgrades, connectivity between Real Devices and our IPsec VMs was disrupted. Additionally, a misconfiguration in Sauce Connect caused iOS Real Device Live Testing and app installations to fail.
**Why it happened:** During the network upgrade, a community tag was missing from a subset of production routes, causing connectivity issues between IPsec VMs and Real Devices. Additionally, a misconfiguration in Sauce Connect caused iOS Live Testing connectivity issues.
**How we fixed it:** To resolve the IPsec issue, we added the missing tags. To resolve the Sauce Connect with iOS Real Devices issue, we rolled back the Sauce Connect service configuration.
**What we are doing to prevent it from happening again:** We're adding additional end-to-end testing and alerting for our IPsec and Sauce Connect on iOS products.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing real device iOS installation failures, showing "Unable to Verify App" in the US-West Data Center. We are investigating.

Report: "2024-July-4 Service Incident"

Last update
postmortem

**Dates:** Thursday July 4th, 2024 13:13 - 14:06 UTC
**What happened:** An incident occurred due to I/O errors on the database node, causing connectivity issues within our infrastructure. This affected Virtual jobs and Tunnel services in the US-West region, leading to disruptions in Sauce Labs products.
**Why it happened:** Our job orchestration service failed to connect to a backend data store.
**How we fixed it:** The data store issues were resolved, restoring connectivity.
**What we are doing to prevent it from happening again:** We're undertaking several initiatives, including process improvements and enhanced monitoring, outlining steps to fix the issue and prevent it from happening again.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing elevated error rates in Desktop and Virtual Mobile tests using the US-West-1 datacenter. We are currently investigating.

investigating

All jobs in our US-West-1 Data Center are failing to start. We are investigating.

Report: "2024-September-9 Service Incident"

Last update
postmortem

**Dates:** Friday September 6 2024, 20:30 UTC - Monday September 9 2024, 11:57 UTC
**What happened:** RDC devices in the US-West datacenter had intermittent connectivity issues.
**Why it happened:** An unexpected, intermittent DNS resolution failure began occurring after a new DNS server was added.
**How we fixed it:** The faulty DNS server was removed.
**What we are doing to prevent it from happening again:** Additional monitoring for this particular network failure was added.
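Monitoring for this class of failure can be as simple as a periodic probe that alerts when a well-known name stops resolving. The sketch below is illustrative only; the probe name and alert path are assumptions, not the monitoring that was actually added.

```python
# Basic DNS health probe: alert if a well-known name stops resolving
# (illustrative sketch; the probe name and cadence are assumptions).
import socket

def can_resolve(hostname: str) -> bool:
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False

if __name__ == "__main__":
    if not can_resolve("saucelabs.com"):
        print("ALERT: DNS resolution is failing")
    else:
        print("OK: DNS resolution is healthy")
```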

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

We are currently seeing intermittent network connectivity issues in Live and Automated Real Device tests in the US-West-1 Data Center. We are still investigating.

investigating

We are currently seeing intermittent network connectivity issues in iOS Live and Automated Real Device tests in the US-West-1 Data Center. We are investigating.

Report: "2024-August-15 Service Incident"

Last update
postmortem

**Dates:** Thursday Aug 15th 2024, 17:24 - 19:05 UTC
**What happened:** The Sauce Labs Single Sign-on service suffered a failure causing customers using SSO to be unable to log in to the UI. This incident did not impact automated testing.
**Why it happened:** An invalid Sauce Labs IdP configuration was deployed.
**How we fixed it:** We rolled back the configuration for the Sauce Labs IdP, restoring service.
**What we are doing to prevent it from happening again:** We've deployed stricter validation in our Single Sign-on service.

resolved

This incident has been resolved.

investigating

We have received multiple reports of users experiencing issues accessing our dashboard via SSO. We are investigating.

Report: "2024-July-8 Service Incident"

Last update
postmortem

**Dates:** Monday July 8th 2024, 18:21 - 19:13 UTC
**What happened:** Many pages of our website were unreachable or prohibitively slow. Potential customers couldn't visit our website, and existing customers who use our site as a login area may have been hindered.
**Why it happened:** Our CMS provider, Contentful, was experiencing an outage, which meant that we couldn't generate pages.
**How we fixed it:** Contentful resolved the outage.
**What we are doing to prevent it from happening again:** We are switching our website from on-demand rendering to build-time rendering; this means the pages are built ahead of time and served from a CDN, so there is no need for real-time rendering.

resolved

After taking remedial action all services are operating as normal. This incident is resolved.

investigating

We are currently seeing saucelabs.com website pages loading slowly and throwing errors. We are investigating.

Report: "2024-June-14 Service Incident"

Last update
postmortem

**Dates:** Friday June 14th 2024, 10:30 - 19:20 UTC
**What happened:** Customers may have received intermittent 503 errors when running tests.
**Why it happened:** A required network component on a node crashed during automated restarts and did not automatically recover.
**How we fixed it:** We recovered the impacted node, which allowed the network component to start correctly.
**What we are doing to prevent it from happening again:** We have improved the alerting for the network component to ensure automated upgrades and restarts complete correctly.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are currently seeing intermittent issues when running automated tests in our US West Data center. Users would encounter the "503 Service Unavailable" error on their end. We are investigating.

Report: "2024-April-30 Resolved Service Incident"

Last update
postmortem

**Dates:** Tuesday April 30 2024, 14:40 - 15:05 UTC
**What happened:** New tests in the Virtual Device Cloud would not start.
**Why it happened:** A change for one service caused an unhandled exception in an upstream service.
**How we fixed it:** We rolled back the change introduced in a recent deployment.
**What we are doing to prevent it from happening again:** We are reviewing and improving our processes for introducing changes that may affect dependent services.

resolved

Between 14:40 and 15:05 UTC, we were seeing elevated error rates when starting Desktop and Virtual device tests in the EU-Central-1 and US-West-1 data centers. The issue is now resolved. All services are fully operational.

Report: "2024-May-28 Service Incident"

Last update
postmortem

**Dates:** Tuesday May 28 2024, 13:42 - 14:41 UTC
**What happened:** There was low availability of iOS 17 devices in all Sauce Labs datacenters.
**Why it happened:** A false positive check marked some devices as offline.
**How we fixed it:** The check was fixed.
**What we are doing to prevent it from happening again:** We've implemented additional guard-rails and reduced the time to deploy.

resolved

After taking remedial action, the availability of iOS 17 real devices has returned to normal. This incident is resolved.

monitoring

After taking remedial action, the availability of iOS 17 real devices is returning to normal. We are currently monitoring.

investigating

We are experiencing reduced availability of iOS 17 Real devices in all our data centres. We are investigating.

Report: "2024-June-7 Service Incident"

Last update
postmortem

**Dates:** Friday June 7 2024, 12:16 - 13:54 UTC
**What happened:** SC4 tunnels were intermittently not starting in US-West.
**Why it happened:** An unexpected, intermittent DNS resolution failure began occurring after upgrading a hypervisor on which the primary DNS server was running.
**How we fixed it:** We repaired the primary DNS server and returned it to service.
**What we are doing to prevent it from happening again:** We're evaluating an alternative DNS deployment.

resolved

After taking remedial action, Sauce Connect 4 tunnels are starting successfully in our US-West data center. All services are fully operational.

monitoring

We have fixed the root cause, and Sauce Connect 4 tunnels are starting successfully in our US-West data center. We are monitoring.

identified

We have identified the root cause and are working on implementing a fix.

investigating

Sauce Connect 4 tunnels are intermittently failing to start in our US-West data center. We are investigating.

Report: "2024-May-29 Service Incident"

Last update
postmortem

**Dates:** Wednesday May 29 2024, 15:50 - 20:00 UTC
**What happened:** Tests leveraging Sauce Connect 5 Tunnels created during the time of the incident failed.
**Why it happened:** A change was released to introduce new functionality that unintentionally prevented newly-created Sauce Connect 5 Tunnels from passing traffic.
**How we fixed it:** The deployment was reverted and Sauce Connect 5 Tunnels created during the time of the incident were shut down.
**What we are doing to prevent it from happening again:** We're investing in additional validation checks for Sauce Connect 5.

resolved

Between 15:50 and 23:30 UTC, some customers running tests with Sauce Connect 5 tunnels that were created between 15:50 and 21:00 UTC may have experienced test failures in the US-West-1 and EU-Central-1 Data Centers. The error in the tests would be reported as "Infrastructure Error -- The Sauce VMs failed to start the browser or device". The issue is now resolved. All services are fully operational.

Report: "2024-May-3 Service Incident"

Last update
postmortem

**Dates:** Friday May 3 2024, 10:36 - 12:02 UTC
**What happened:** There was limited access to the Sauce Labs UI for customers in our US-East-4 datacenter.
**Why it happened:** A renewed SSL certificate was not applied properly.
**How we fixed it:** The certificate was applied correctly.
**What we are doing to prevent it from happening again:** We have improved our certificate expiration monitoring for this environment.

resolved

After taking remedial action, access to the dashboard and manual testing in our US-East-4 datacenter has been restored. This incident is resolved.

investigating

We are currently experiencing an issue accessing our web application in the US-East-4 datacenter; this affects access to the Sauce Labs dashboard and manual testing. We are investigating.

Report: "2024-May-2 Resolved Service Incident"

Last update
postmortem

**Dates:** Thursday May 2 2024, 09:30 - 11:00 UTC
**What happened:** There was reduced availability of Real devices for customers.
**Why it happened:** An internal API change resulted in devices being incorrectly marked as unavailable.
**How we fixed it:** We rolled back the change introduced in a recent deployment.
**What we are doing to prevent it from happening again:** We are updating the dependency to improve error handling.

resolved

Between 9:30 AM and 11:00 AM UTC, we were experiencing decreasing availability of Real Devices in all data centers. This has now been resolved and device availability is at normal levels.

Report: "2024-April-17 Service Incident"

Last update
postmortem

**Dates:** Wednesday April 17th 2024, 15:50 - 16:55 UTC
**What happened:** Automated and live tests for virtual and real device clouds in the US West region failed.
**Why it happened:** An invalid configuration was deployed that prevented connections from being accepted. Additionally, following the change being reverted, an internal component was unable to appropriately handle the sudden surge in traffic.
**How we fixed it:** We rolled back the configuration to the previous version and restarted an internal service.
**What we are doing to prevent it from happening again:** We are evaluating strategies for more rapidly scaling the affected internal service.

resolved

After taking remedial action, all services are operating as normal. This incident is resolved.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are aware of reports of intermittent errors when attempting to start automated and live tests in our US-West-1 data center. We are investigating.

Report: "2024-April-14 Service Incident"

Last update
postmortem

**Dates:** Sunday April 14th 2024, 17:05 - 18:40 UTC
**What happened:** Customers were unable to access the Web UI or API for Sauce Labs services.
**Why it happened:** Our internal and external API Gateways were unable to process traffic.
**How we fixed it:** We restarted the API Gateway service.
**What we are doing to prevent it from happening again:** We are working to improve our runbooks for similar issues in the future.

resolved

After taking remedial action, services in all datacenters are operating as normal. This incident is resolved.

investigating

We are seeing issues accessing and running tests on the Sauce Labs platform in all datacenters; this is affecting all services. We are currently investigating.

Report: "2024-March-27 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 27th 2024, 10:20 - 11:55 UTC
**What happened:** Availability of real devices running iOS 14.x - 16.x was decreased, which may have resulted in an inability to obtain a device during periods of increased usage.
**Why it happened:** A third-party library was updated. This update introduced a minor bug that exposed another non-recoverable bug, resulting in specific versions of iOS appearing offline in our system.
**How we fixed it:** We rolled back the library update.
**What we are doing to prevent it from happening again:** We are fixing the triggered defect in our code as well as improving the detection of offline devices during deployment of the affected software.

resolved

After taking remedial action, iOS Real devices are available in all our data centers. This incident is now resolved.

investigating

We are experiencing reduced availability of iOS Real devices older than iOS 17 in all our data centers. We are investigating.

Report: "2024-Mar-25 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday March 25th 2024, 14:47 - 14:59 UTC
**What happened:** Automated and live tests for virtual and real device clouds in the US West region failed.
**Why it happened:** An invalid configuration was deployed that prevented connections from being accepted.
**How we fixed it:** We rolled back the configuration to the previous version.
**What we are doing to prevent it from happening again:** The underlying cause for the misconfiguration of the service, a template logic error, was fixed.

resolved

Between 14:48 - 14:58 UTC, a configuration change on our side resulted in failures for tests reaching ondemand.us-west-1.saucelabs.com. We have identified the issue and taken remedial action.

Report: "2024-March-6 Service Incident"

Last update
postmortem

**Dates:** Wednesday March 6th 2024, 12:43 - 19:00 UTC
**What happened:** Test artifact uploads and downloads intermittently failed or were slower to complete, resulting in issues displaying test results.
**Why it happened:** A subsystem involved in artifact uploads would not properly recover from some failures, resulting in leaked connections and connection pool exhaustion. Once fully exhausted, uploads were no longer possible.
**How we fixed it:** We identified the bug in the code and deployed a fix.
**What we are doing to prevent it from happening again:** We have fixed the specific bug and implemented monitoring for similar symptoms.
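Leaked connections of the kind described usually come from error paths that skip the release step; the classic guard is to return the connection in a finally block (or context manager). The generic sketch below illustrates that pattern only; it is not the affected subsystem's code, and the names are made up.

```python
# Generic guard against leaking pooled connections on error paths (illustrative).
import queue

class ConnectionPool:
    def __init__(self, connections):
        self._free = queue.Queue()
        for conn in connections:
            self._free.put(conn)

    def acquire(self, timeout=5.0):
        return self._free.get(timeout=timeout)   # raises queue.Empty when exhausted

    def release(self, conn):
        self._free.put(conn)

def upload_with(pool, do_upload):
    conn = pool.acquire()
    try:
        return do_upload(conn)     # may raise on a failed upload
    finally:
        pool.release(conn)         # always returned, even on failure -- no leak
```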

resolved

This incident has been resolved.

investigating

We are currently experiencing intermittent issues loading real device test reports on the dashboard in our US West Data Center. We are investigating.

Report: "2024-March-11 Service Incident"

Last update
postmortem

**Dates:** Saturday March 9th 2024, 20:00 UTC - Monday March 11th, 13:22 UTC
**What happened:** Some Windows, Linux, and Android Emulator workloads in our US-West-1 data center experienced elevated error rates.
**Why it happened:** A hardware instruction silently failed to be programmed in a network device's TCAM. This resulted in a subset of traffic from one of our data centers being unable to reach the public Internet.
**How we fixed it:** We forced a reprogramming of the TCAM instructions in question.
**What we are doing to prevent it from happening again:** We are investigating ways to better understand silent failures in this portion of our infrastructure.

resolved

After taking remedial action, we are seeing error rates return to normal levels. This incident is now resolved.

investigating

We are seeing elevated error rates in Desktop tests in the US-West-1 datacenter. We continue to investigate.

investigating

We are currently seeing elevated error rates in Desktop tests in the US-West-1 datacenter. We are investigating.

Report: "2024-March-14 Resolved Service Incident"

Last update
resolved

Between 4:00 - 6:30 AM UTC, we were experiencing errors retrieving test artifacts for test results in our US-West-1 datacenter. This has now been resolved and all test results are available; no data loss was incurred.

Report: "2024-March-12 Service Incident"

Last update
resolved

After taking remedial action, Samsung Real devices are available in all our data centers. This incident is now resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We continue to experience reduced availability of Samsung Real devices in our Public pool in the US-West-1 and EU-Central-1 data centers. We continue to investigate.

investigating

We continue to experience reduced availability of Samsung Real devices in our Public pool in the US-West-1 and EU-Central-1 data centers. Investigation is underway.

investigating

We are experiencing reduced availability of Samsung Real devices in our US-West-1 and EU-Central-1 data centers. We continue to investigate.

investigating

We are experiencing reduced availability of Samsung Real devices in the US-West-1 data center. We are investigating.

Report: "2024-February-29 Service Incident"

Last update
postmortem

**Dates:** Thursday February 29th 2024, 16:25 UTC - Friday March 1st, 00:15 UTC
**What happened:** iOS Simulators in our US-West-1 data center intermittently experienced longer startup times and in some cases failed to acquire a session, resulting in test failure.
**Why it happened:** An increase in utilization throughout the Mac Cloud created additional pressure and lack of availability for some images, which resulted in higher-than-normal startup times. In some cases, the startup times were high enough to cause test failure due to abandonment of the session.
**How we fixed it:** Burst capacity was leveraged to provide some relief. Later, utilization decreased to normal levels.
**What we are doing to prevent it from happening again:** We are working to both increase capacity and decrease startup times for iOS Simulators.

resolved

After taking remedial action, error rates in Desktop and Virtual Device tests are back to normal levels. This incident is resolved.

investigating

We are currently seeing elevated error rates in Desktop and Virtual device tests in the US-West-1 datacenter. We are still investigating.

investigating

We are currently seeing elevated error rates in Desktop and Virtual device tests in the US-West-1 datacenter. We are currently investigating.

Report: "2024-February-26 Service Incident"

Last update
postmortem

**Dates:** Monday, February 26th, 2024 23:08 UTC - Tuesday, February 27th, 2024 02:56 UTC
**What happened:** Increased network load led to test slowness or failures affecting Real Device tests in our US-West-1 data center.
**Why it happened:** During the troubleshooting of an increase in network load in our US-West-1 data center, a network failover event was executed.
**How we fixed it:** The network failover and additional traffic shaping stabilized services.
**What we are doing to prevent it from happening again:** Improvements to the runbook for manually initiated failovers.

resolved

After taking remedial action, Real device availability and error rates are back to normal in the US-West-1 data center. All services are fully operational.

monitoring

After taking remedial action, Real device availability and error rates are returning to normal levels in the US-West-1 data center. We are currently monitoring.

investigating

We are currently seeing elevated error rates and reduced device availability in Real device tests in our US-West-1 datacenter. We are currently investigating.

Report: "2024-February-19 Service Incident"

Last update
postmortem

**Dates:** Monday February 19th 2024, 17:30 UTC - Monday February 19th, 23:06 UTC
**What happened:** Real Devices in our US-West-1 data center running iOS 17 were degraded and may intermittently have disconnected a user. Additionally, at times, iOS 17 Real Devices may have been unavailable for use.
**Why it happened:** Real Devices running iOS 17 experienced an unknown gradual degradation over time.
**How we fixed it:** Real Devices running iOS 17 were restarted.
**What we are doing to prevent it from happening again:** We are working to improve our observability of iOS 17 and improving iOS 17-specific design elements for Real Devices.

resolved

After taking remedial action, iOS 17 Real devices are available in US-West-1 data center. All services are fully operational.

identified

We have identified the cause and are taking remedial action.

investigating

iOS 17 Real devices are currently unavailable in the US-West-1 data center. We are investigating.

Report: "2024-February-27 Service Incident"

Last update
postmortem

**Dates:** Saturday February 24th 2024, 01:00 UTC - Wednesday February 28th 2024, 23:14 UTC
**What happened:** Increased network load led to intermittent test slowness or failures.
**Why it happened:** An update was automatically triggered when a customer test was executed, which led to increased network load.
**How we fixed it:** Deployed new images for impacted operating systems.
**What we are doing to prevent it from happening again:** Ensure new images do not auto-update and improve alerting on this condition.

resolved

Real device availability and error rates for automated and live browser and virtual mobile device tests are within the normal levels in the US-West-1 data center.

investigating

We are seeing elevated error rates for automated and live browser and virtual mobile device tests and intermittently unavailable real devices in our US-West datacenter. We are investigating.

Report: "2024-March-5 Service Incident"

Last update
resolved

This incident has been resolved.

monitoring

After taking remedial action, the redirect is working as expected. We are currently monitoring.

investigating

We've identified redirect issues for Mobile Beta login on our Dashboard in both US-West-1 and EU-Central-1 Datacenters. Investigation is underway.

Report: "2024-February-20 Resolved Service Incident"

Last update
postmortem

**Dates:** Monday February 19th 2024, 11:20 UTC - Tuesday February 20th, 10:15 UTC
**What happened:** Some Emulator and Simulator sessions with typically long startup times were unable to start.
**Why it happened:** A service was updated to improve its performance. This service handles waiting additional time for a VM to become available. Its internal logic was modified to process a single wait request, but sometimes additional wait requests are necessary.
**How we fixed it:** The deployment was reverted.
**What we are doing to prevent it from happening again:** We are investigating simulating slower startup times to ensure we can effectively identify similar regressions in the future.

resolved

Between 2024-Feb-19 11:20 UTC and 2024-Feb-20 10:20 UTC, error rates for desktop and virtual mobile device tests were elevated in our US-West-1 and EU-Central-1 data centers. We have taken remedial action. All services are fully operational.

Report: "2024-February-12 Service Incident"

Last update
postmortem

**Dates:** Sunday February 11th 2024, 13:35 - Monday February 12th 2024, 09:00 UTC
**What happened:** Over the course of the incident, iOS 17 device availability decreased until all devices were unavailable.
**Why it happened:** A change in behavior in Xcode resulted in non-exiting child processes that eventually caused the app to become unresponsive.
**How we fixed it:** Child processes were reaped across the service's fleet.
**What we are doing to prevent it from happening again:** We are working to improve the reliability of services involved in managing iOS 17 real devices.

resolved

After taking remedial action, iOS 17 Real devices are now available again. This incident is resolved.

investigating

iOS 17 Real devices are currently unavailable in the US-West-1 and EU-Central-1 data centers. We are investigating.

Report: "2024-February-8 Service Incident"

Last update
postmortem

**Dates:** Thursday February 8th 2024, 12:48 - 15:15 UTC
**What happened:** Test execution using Sauce Orchestrate was unavailable.
**Why it happened:** An internal queue used by the Live Logging feature of Sauce Orchestrate ran out of space due to an unexpected interaction when configuring a size limit that resulted in reserving more space than was available in the queue cluster.
**How we fixed it:** Existing streams were pruned and configurations were updated to use a message limit instead of a size limit.
**What we are doing to prevent it from happening again:** We are improving our monitoring of the internal queue service.

resolved

After taking remedial action, we are seeing normal execution of Sauce Orchestrate tests on the US-West-1 data center. This incident is resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are seeing elevated error rates for tests running via Sauce Orchestrate in the US-West-1 Datacenter. We are investigating.