Historical record of incidents for Rigor
Report: "Runners unavailable in the Stockholm region"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
Rigor runners in the Stockholm region were sporadically operational between 6:10pm EST and 8:10pm EST due to an AWS EC2 instance outage in the eu-north-1 region. We are currently monitoring to ensure service has been fully restored before considering the incident resolved.
Report: "Increased error rates in Optimization performance tests"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We've identified the root cause and are working on getting a fix out.
We are currently investigating elevated error rates on some Optimization scans. Previously successful scans still have data available, but new scans may fail to complete.
Report: "Delayed Check Runs in Taichung City, Taiwan"
Last update: This incident has been resolved.
We believe that we have fixed the issue and are monitoring the location.
We are experiencing delayed test execution in our Taichung City, Taiwan location. We will provide updates as soon as possible.
Report: "Delays in Check Run Results"
Last update: Our upstream provider, AWS, has resolved the issue. All Rigor operations have remained normal after the fix.
We are continuing to monitor the issue from our upstream provider, AWS. At this point in time, Rigor check scheduling has returned to normal. For additional details on the outage and expected resolution time, see https://health.aws.amazon.com/health/status.
The upstream provider has identified the issue and is working on a fix. Checks are still running, but there may continue to be delayed runs and alerts until the incident is resolved.
We are currently investigating issues running checks from all monitoring locations due to an upstream provider issue. Runs are currently processing, but there may be gaps in check data and delayed alerts until this issue is resolved.
Report: "Google Chrome Upgrade for Real Browser Checks"
Last update: Splunk Synthetic Monitoring updated Google Chrome to version 125 for Real Browser Checks on July 18 at 12:30 PM EDT. We periodically auto-update to newer versions of Google Chrome when available. Due to differences between browser versions, check behavior or timings can sometimes change and may require updates to your check steps.
Report: "Delayed Check Runs in Iowa, United States"
Last update: This incident has been resolved.
We are experiencing delayed test execution in our Iowa, United States location. This appears to be the result of a provider outage. We will provide updates as soon as possible.
Report: "Delays in Check Scheduling"
Last update: This incident has been resolved.
The issue has been fixed and checks are being scheduled on time. You may see gaps in runs between 11:52 AM and 12:10 PM Eastern Time (15:52 - 16:10 UTC).
We're currently investigating delays in our check scheduling service. There may be delays or gaps in run data until this issue is resolved.
Report: "Old Runner Version Launched in New South Wales, Australia"
Last update: During a recent attempt to resolve an ongoing issue, an outdated version of our Runner was launched. Runs were not dropped or delayed. This only affected our New South Wales, Australia location.
Report: "Delayed Check Runs in New South Wales"
Last update: We are experiencing delayed test execution in our New South Wales location. Users may see gaps between check runs in this location and metrics from those runs may be higher than normal.
Report: "Delayed Check Runs in Charleston, South Carolina"
Last update: This incident has been resolved.
We are experiencing delayed test execution in our Charleston, South Carolina location. Users may see gaps between check runs in this location and metrics from those runs may be higher than normal.
Report: "Delayed Check Runs in Charleston, South Carolina"
Last update: This incident has been resolved.
We are experiencing delayed test execution in our Charleston, South Carolina location. Users may see gaps between check runs in this location and metrics from those runs may be higher than normal.
Report: "Delayed Check Runs in Johannesburg, South Africa"
Last update: Azure has completed their remediation and this incident has been resolved.
Azure has updated the estimated time to complete fix to 14:00 UTC on 15 Mar 2024.
Azure is currently estimating 4 hours for a fix. Metrics may still be higher in the Johannesburg, South Africa location, but we are processing jobs in real time. We'll continue to monitor the situation.
The remediation efforts are currently estimated to take at least 12 hours to complete. We may continue to see degraded performance during that time.
The provider has identified that the issue involves multiple fiber cables on the west coast of Africa as well as ongoing fiber cable cuts in the Red Sea. This has impacted capacity for multiple providers and the public Internet across all of Africa. For status updates, please see https://azure.status.microsoft/
We are experiencing delayed test execution in our Johannesburg, South Africa location. Users may see gaps between check runs in this location and metrics from those runs may be higher than normal. We will monitor the provider's status page and provide updates.
Report: "Delayed Monitoring Check Results"
Last update: This incident has been resolved.
We're currently investigating delays in processing Monitoring check results from all locations. No check data has been lost, but alerts may be delayed and you may see gaps in your check data until this data has been processed. We are working to identify the root cause and will update here when it is resolved.
Report: "Delayed Check Runs in Miami"
Last update: This incident has been resolved.
We are currently investigating delayed check runs from our Miami monitoring location.
Report: "Checks running from Illinois location delayed"
Last update: Checks running from the Illinois location were delayed. This was resolved as of Oct 7, 6:20 AM EDT.
Report: "Delayed Check Runs in Miami"
Last update: This incident has been resolved.
We are currently investigating delayed check runs from our Miami monitoring location.
Report: "Check results are not appearing in web interface"
Last update: This incident has been resolved. There was no data loss; however, check results and alerting may have been delayed.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Increased error rate when saving Real Browser checks"
Last update: On June 6, 2023 at 1:36 PM UTC, an internal database issue on the Rigor platform may have impacted business transaction names from Real Browser checks. Our engineering team was able to identify the issue and fully resolve it the same day. This impacted a limited number of customers. If you think you may have been impacted, please reach out to your support representative.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating increased error rates when creating or editing Real Browser checks. You may not be able to create new browser checks or edit existing browser checks until the issue is resolved. Other check types are unaffected.
Report: "Increased response times in Illinois"
Last update: This incident has been resolved. You may see gaps or elevated response times in Illinois run data between approximately 9:09 AM and 11:30 AM EDT (13:09 to 15:30 UTC). While remediating the issue, some runners were inadvertently deployed with a previous runner version. The latest version has been restored, but you may see some Illinois runs in your check history with an old runner version between 10:53 AM and 11:37 AM EDT (14:53 to 15:37 UTC).
A fix has been implemented and we are monitoring the results.
We are currently investigating increased response times and delayed runs in our Illinois monitoring location.
Report: "Intermittent errors for API checks"
Last update: This incident has been resolved.
We are investigating changes to API check responses. Some API checks may see intermittent errors while we address the issue.
Report: "AWS service disruptions in us-east-1"
Last update: AWS has resolved their issue. "[03:42 PM PDT] Between 11:49 AM PDT and 3:37 PM PDT, we experienced increased error rates and latencies for multiple AWS Services in the US-EAST-1 Region. Our engineering teams were immediately engaged and began investigating. We quickly narrowed down the root cause to be an issue with a subsystem responsible for capacity management for AWS Lambda, which caused errors directly for customers (including through API Gateway) and indirectly through the use of other AWS services. Additionally, customers may have experienced authentication or sign-in errors when using the AWS Management Console, or authenticating through Cognito or IAM STS. Customers may also have experienced issues when attempting to initiate a Call or Chat to AWS Support, as of 2:47 PM, the issue initiating calls and chats to AWS Support was resolved. By 1:41 PM, the underlying issue with the subsystem responsible for AWS Lambda was resolved. At that time, we began processing the backlog of asynchronous Lambda invocations that accumulated during the event, including invocations from other AWS services. As of 3:37 PM, the backlog was fully processed. The issue has been resolved and the service is operating normally."
AWS is reporting service disruptions in us-east-1. We believe Rigor is operating normally, but we are investigating.
Report: "Filmstrips may not be available and instead return - "Due to file size limit, the filmstrip for this page cannot be shown.""
Last update: This incident has been resolved.
Rollback is complete and filmstrips are now working as expected.
We suspect a recent deployment may be causing this issue, and we are in the process of rolling it back.
We are currently investigating this issue.
Report: "Slow response times in Miami, Florida"
Last update: We observed elevated response times for some checks running in our Miami, Florida monitoring location between Jan 26 11:00 PM EST and Feb 10 4:00 AM EST. We are working with our upstream provider to identify the root cause of the issue.
Report: "Checks not being scheduled across all providers"
Last update: AWS ECS had a temporary issue which caused a service to become unavailable. All services recovered without intervention and should be back to normal.
There was a short period during which checks did not run as scheduled.
Report: "Investigating failures related to Chrome update"
Last update: A bug was identified during a Chrome upgrade yesterday which impacted some Chrome Real Browser checks from 16:30 UTC to 21:30 UTC. A fix was identified and deployed, and all checks should now be operating normally.
We are continuing to monitor for any further issues.
A fix has been deployed and we are monitoring the results.
A fix has been implemented and we are monitoring the results.
We are deploying a fix and working to verify everything is working as expected.
A fix has been implemented and will be deployed shortly.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
We believe we have identified the issue and are testing a fix.
We are currently investigating this issue.
Report: "Azure locations with degraded performance"
Last update: An Azure outage, Tracking ID VSG1-B90, affected multiple regions. Most locations recovered without intervention; however, a few locations were preemptively scaled up to speed recovery.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Outage in Belgium location"
Last update: This incident has been resolved.
Our internal service monitoring indicates that there may be delayed execution of checks from our Belgium location. Our engineering team is monitoring the location and will provide additional information as it is available.
Report: "Rigor Optimization Intermittent failures"
Last update: This incident has been resolved.
We have identified the issue and are working to get a fix out.
We are still investigating this issue. It seems to be occurring on specific hosts in several regions. We have two potential workarounds: retry multiple times, or try running in another location. We are continuing to investigate.
We are seeing an increased error rate when running Optimization snapshots. A small number of tests are affected, and recurring tests are impacted more often than snapshots initiated through the Optimization UI. We are troubleshooting the issue. We will give an update here once a fix is available.
Report: "Checks not running in multiple locations"
Last update: This incident has been resolved.
The affected locations have been rolled back to a previous version and checks are currently running on schedule from all locations. Our engineers have identified the root cause and are taking steps to ensure the issue does not happen again. You may see gaps in your data from these locations between approximately 5:30pm ET (21:30 UTC) on October 19 and 3:00pm ET (19:00 UTC) on October 25th.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are currently investigating missing runs for all check types in the following GCP locations: Los Angeles, Belgium, Taiwan, Finland, and South Carolina.
We are currently investigating missing runs for all check types in our Los Angeles, California location. Data and alerts from this location may be missing or delayed until the issue is resolved.
Report: "Outage in Miami location"
Last update: This incident has been resolved.
We are currently processing checks in real time. We will continue to monitor to make sure everything is processing correctly.
We believe we have identified the issue and have started processing work again in the Miami location. We will update when we are processing checks in real time.
Our internal service monitoring indicates that there is a delayed execution of checks from our Miami location. Our engineering team is monitoring the location and will provide additional information as it is available.
Report: "Outage in Chile location"
Last update: This incident has been resolved.
The issue has been resolved with our provider and we are processing on schedule now.
Our internal service monitoring indicates that there may be delayed execution of checks from our Chile location. Our engineering team is monitoring the location and will provide additional information as it is available.
Report: "Outage in Chile location"
Last update: The issue has been resolved with our provider and we are processing on schedule now.
Our internal service monitoring indicates that there may be delayed execution of checks from our Chile location. Our engineering team is monitoring the location and will provide additional information as it is available.
Report: "Spot Instances Dropped Causing Work Task Queue Stacked"
Last update: The number of spot instances returned to normal and we have marked this incident as resolved.
We are continuing to monitor for any further issues after relaunching the correct version of the autoscaling template.
We have identified that the issue was caused by the wrong version of the autoscaling template being launched.
We are currently investigating this issue.
Report: "Network Instability in Shanghai"
Last update: The network instability has been resolved.
We are currently experiencing network instability in our Shanghai location. This may result in failed checks due to networking timeouts. We will update as we gather more info.
Report: "Network Instability in Shanghai"
Last update: The network instability has been resolved.
We are currently experiencing network instability in our Shanghai location. This may result in failed checks due to networking timeouts. We will update as we gather more info.
Report: "Optimization website periodically inaccessible"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Optimization website down"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Increased latency for one location in north central US"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
Queues are being processed normally in north central US (Iowa). We'll continue to monitor the situation.
We are currently experiencing longer than average queues for one location in north central US (Iowa) and are investigating.
Report: "Our monitoring services have indicated a delay in performance in Mexico City resulting in delays in executing checks. Our team is investigating and will provide additional information as it's available."
Last update: This incident has been resolved.
Our hosting provider resolved an issue and everything is now functioning as expected; we will continue to monitor closely.
We are currently investigating this issue.
Report: "Critical outage ongoing"
Last update: This incident has been resolved.
A fix has been implemented and services are back online. Continuing to monitor.
A fix has been implemented and many realms are returning to service. Continuing to monitor.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
Root cause has been identified. Engineers are working on rolling out a fix.
We are continuing to investigate this issue.
We are currently investigating an issue causing latency and queue build up in multiple regions. Runs may be delayed.
We are currently investigating an issue causing latency and queue build up in multiple regions. Runs may be delayed.
Report: "Degraded performance in Argentina"
Last update: This incident has been resolved.
Our team has re-enabled check execution in the Argentina region and is monitoring health and capacity for any potential issues.
Our investigation indicates there are internet backbone connectivity issues. These issues appear to be regional and outside of the control of our provider. Until these connectivity issues have been corrected we have placed the Argentina location into maintenance mode. In this mode, checks will not be scheduled for execution in this location. Any work already scheduled may still be executed.
Our monitoring services have indicated a delay in performance in Argentina resulting in delays in executing checks. Our team is investigating and will provide additional information as it's available.
Report: "Minor gaps in check data"
Last update: This incident has been resolved.
An AWS outage is causing delays in running checks in all locations. You may see gaps in your data or delayed alerts until the issue is resolved.
Report: "Increased Latency Observed and Resolved"
Last update: We encountered an issue this morning that resulted in check delays in all locations between approximately 8:45 and 10:05 ET (13:45 - 15:05 UTC). You may see gaps in graphs during this timeframe. The incident has now been resolved and all checks are running on schedule.
Report: "There is an outage in the providers at the Chile location"
Last update: This incident has been resolved.
Our internal service monitoring indicates that there may be delayed execution of checks from our Chile location. Our engineering team is monitoring the location and will provide additional information as it is available.
Report: "Increased Check Failures in Los Angeles, California"
Last update: Checks running from our Los Angeles, California monitoring location may have seen increased response times or failure rates due to an issue with our provider in that location. This incident affected runs that occurred between approximately 5:00 PM Feb 7 and 3:00 AM Feb 8 EST (22:00 Feb 7 - 8:00 Feb 8 UTC).
Report: "Delayed Uptime Check Results"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results. Uptime check results and alert notifications are no longer delayed, but you may see dropped runs between approximately 6:30pm Feb 7 and 12:00am Feb 8 Eastern Time (23:30 Feb 7 - 05:00 Feb 8 UTC).
We are continuing to see delays in Uptime check results and are still investigating the root cause. You may see gaps in your check data or receive delayed alert notifications until the issue is resolved.
We are currently investigating delayed results processing for Uptime Checks. You may see gaps in your check data or receive delayed alert notifications until the issue is resolved.
Report: "Delayed Check Runs in Kuala Lumpur, Malaysia"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to see delayed runs in Kuala Lumpur due to a capacity issue with our upstream provider. We will provide an update here when our provider has resolved the issue.
We are continuing to see delayed runs in Kuala Lumpur due to a capacity issue with our upstream provider. We will provide an update here when our provider has resolved the issue.
We are experiencing delayed test execution in our Kuala Lumpur location. We are working with our upstream provider to identify the cause of the issue. We will provide updates as soon as possible.
Report: "Delayed Check Runs in N. California"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are experiencing delayed test execution in our N. California location. We are working to identify the cause of the issue and will provide updates as soon as possible.
Report: "Provider outage causing increased error rates."
Last update: This incident has been resolved.
At this time most services have recovered and metric data is no longer delayed, but we are continuing to monitor the provider outage until the issue is fully resolved.
We have identified a provider outage that is affecting multiple services. Currently checks are running on schedule, but metric data may be delayed. Domain performance report data may also be delayed.
We are currently investigating an increase in errors being reported from our application. This increase in errors appears to be linked to a provider outage in the eastern region of the United States and seems to be impacting our checks across many regions. We will continue to investigate and update this status page as we learn more.
Report: "Increased error rates in west coast North America"
Last update: This incident has been resolved.
The issues identified earlier today appear to have been resolved. We are continuing to monitor the situation, but our service is currently operating as expected.
We are investigating increased error rates in west coast regions across several of our cloud providers. Checks are still running as expected, but we will continue to monitor the situation and update this status page as we learn more.