Historical record of incidents for Sleuth
Report: "Website down"
Last update: We experienced minor issues during the deploy of a new version, which caused our website and services to be unavailable for a period of 4 minutes.
Report: "Website down"
Last updateWe experienced minor issues during the deploy of a new version, which caused our website and services to be unavailable for a period of 4 minutes.
Report: "Service unavailability"
Last update: We experienced service unavailability for a period of 7 minutes, caused by network configuration issues.
Report: "Reduced performance across the board"
Last update: Why, oh why is 'ANALYZE VERBOSE' not automatically run on a major Postgres upgrade :( (we are all green across the board, and even faster, if you can believe it)
A fix has been implemented and we are monitoring the results.
We identified a database issue related to the recent upgrade, and performance seems to be returning to normal. We will continue to monitor the situation.
We are currently investigating this issue.
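The update above refers to running ANALYZE after a major Postgres upgrade, since pg_upgrade does not carry planner statistics across versions. Below is a minimal sketch of what such a post-upgrade step could look like, assuming psycopg2 and a placeholder DSN; it is not Sleuth's actual tooling.

```python
# Hypothetical post-upgrade step: rebuild planner statistics after a major
# Postgres upgrade, since pg_upgrade does not migrate them automatically.
import psycopg2


def refresh_statistics(dsn: str) -> None:
    conn = psycopg2.connect(dsn)
    conn.autocommit = True  # run ANALYZE outside an explicit transaction block
    try:
        with conn.cursor() as cur:
            # ANALYZE VERBOSE rebuilds table statistics for the whole database
            # and logs progress; without fresh statistics the planner can pick
            # very slow plans, which matches the degraded performance above.
            cur.execute("ANALYZE VERBOSE")
    finally:
        conn.close()


if __name__ == "__main__":
    refresh_statistics("dbname=app host=localhost")  # placeholder DSN
```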
Report: "Delayed deploy and impact processing"
Last update: This incident has been resolved.
The underlying infrastructure issue has been resolved and Sleuth is again fully operational. We're still actively monitoring the situation.
The underlying AWS infrastructure problem was identified. To ensure data consistency we will delay processing of deploy and impact data until the issues are resolved.
We are currently experiencing a degradation of service due to infrastructure network issues. Please stand by as we investigate a resolution.
Report: "Performance Degraded"
Last update: This incident has been resolved.
We are currently investigating degraded performance of the Sleuth application.
Report: "Degraded Application Performance"
Last update: This incident has been resolved.
The web application and deploy processing are no longer experiencing a performance degradation and we are actively monitoring them.
The Sleuth application is still experiencing delays in deploy processing and slower-than-normal load times. We have mitigated one root cause and are still investigating the continued performance issues.
We are currently investigating degraded performance of the Sleuth website and deploy processing.
Report: "Deploys incorrectly marked as rolled back"
Last update: We have corrected data for all impacted rollback deploys.
We are working to correct the incorrectly marked rollback deploys.
A bug in the system caused deploys processed between 2023-02-02 22:31 UTC and 2023-02-03 09:24 UTC to be incorrectly marked as rolled back. The team is working on correcting the affected deploys.
Report: "Immediate Session Expiration: All Users"
Last update: This morning, a security attack vector was discovered by a paid independent researcher. There is no evidence that this attack vector has been exploited. This has been addressed by our team and the vector has been closed at this time. Out of an abundance of caution, we have logged out all Sleuth users. The only action required from you is logging back in the next time you access Sleuth. Thank you for your understanding and continued trust. Please reach out to us if you have any questions regarding this matter.
Report: "We're experiencing a delay in actions processing"
Last update: Action execution has been running normally and has completely stabilized. The incident has been resolved.
We have stabilized the execution of affected actions. We are continuing to monitor the performance, but you should be seeing normal behavior with action execution and Slack messages.
We have identified the issue causing a slowdown in the following areas:
* Sleuth actions evaluations
* Slack message delivery
* PR locking
You will experience delayed Sleuth actions evaluations, Slack message delivery, and PR locking. All actions are still being registered and will be executed at a later time. No deploy data is being lost.
Report: "Impact collection is delayed"
Last update: Impact collection has been running normally and has completely stabilized. The incident has been resolved.
We have stabilized impact collection and it's running normally now.
We are currently investigating an issue causing us to collect impact at a delayed rate.
Report: "We're experiencing a delay in detecting deploys from CI/CD integrations"
Last update: Deploy detection via CI/CD integrations is now operating normally. We've implemented a workaround that allows us to mitigate this kind of issue going forward, and the provider has completed their maintenance.
We've identified an issue with processing deploys from CI/CD integrations. One of our supported CI/CD providers is undergoing maintenance, which revealed an issue with how we handle this situation. You will experience delayed deploy detection through CI/CD providers while we mitigate the issue. Webhook deploy processing is still functioning, but is also slightly delayed.
Report: "We're seeing site-wide slowdowns, we're investigating an increase in DB operations"
Last update: This incident has been resolved.
We've identified the problem and remediated it. A few very long-running queries got stuck in our database, which had negative follow-on effects. We've killed the offending queries and removed the code that triggered them. We'll be following up with a change to prevent this kind of issue in the near future.
We are seeing slowdowns related to increased queries against our database. We are investigating the cause.
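The remediation described above (killing stuck long-running queries) is typically done via pg_stat_activity. A minimal sketch, assuming psycopg2 and a hypothetical 10-minute threshold; this is not Sleuth's actual remediation script.

```python
# Hypothetical sketch: find queries that have been running longer than a
# threshold and terminate the backends holding them.
import psycopg2

LONG_RUNNING = """
    SELECT pid, now() - query_start AS runtime, query
      FROM pg_stat_activity
     WHERE state = 'active'
       AND now() - query_start > interval '10 minutes'
       AND pid <> pg_backend_pid()
"""


def kill_stuck_queries(dsn: str) -> None:
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(LONG_RUNNING)
            for pid, runtime, query in cur.fetchall():
                print(f"terminating pid={pid} runtime={runtime}: {query[:80]}")
                # pg_terminate_backend ends the backend running the stuck query
                cur.execute("SELECT pg_terminate_backend(%s)", (pid,))
```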
Report: "Impact tracking has been suspended for a short time"
Last update: We have re-enabled impact collection for all.
We are seeing some issues related to collecting impact. We've temporarily suspended impact collection and will re-enable it within a few hours.
Report: "We are seeing some issues with our Redis instance"
Last update: We're back to fully operational.
We identified the issue. Our background tasks were creating keys that weren't being cleaned up and eventually chewed up most of our storage. We've cleared those keys and are putting in place a way to stop this from happening moving forward. The service is now restored to normal and we are monitoring.
Users may see some sporadic errors and impact processing may be delayed. We are investigating the issue.
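The root cause above (background-task keys that were never cleaned up) is commonly avoided by giving such keys an expiry. A minimal sketch with redis-py, using hypothetical key names and TTL values; not Sleuth's actual task code.

```python
# Hypothetical sketch: write background-task bookkeeping keys with a TTL so
# Redis reclaims them automatically instead of letting them accumulate.
import redis

r = redis.Redis(host="localhost", port=6379)


def record_task_result(task_id: str, payload: bytes) -> None:
    # ex=3600 expires the key after one hour; without an expiry, keys from
    # completed tasks pile up until they exhaust the instance's storage.
    r.set(f"task:result:{task_id}", payload, ex=3600)


def sweep_legacy_keys() -> int:
    # One-off cleanup for keys created before TTLs were introduced.
    removed = 0
    for key in r.scan_iter(match="task:result:*"):
        if r.ttl(key) == -1:  # -1 means the key exists but has no expiry
            r.delete(key)
            removed += 1
    return removed
```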
Report: "Website down"
Last update: Website unresponsive; investigating.
Report: "We're seeing a slowdown on all provided services"
Last update: We identified the issue and the site is back to fully functional. Our primary DB was running low on disk IOPS credits. We've increased our RDS instance size and storage size, which has significantly increased our available IOPS.
We've identified an issue with our DB such that we are seeing slower performance than usual. The site is still operational but is running at a reduced capacity.
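The fix above (resizing the RDS instance and its storage to raise available IOPS) maps to an RDS modification call. A minimal sketch with boto3, using placeholder identifier, instance class, and storage values; not the actual change Sleuth made.

```python
# Hypothetical sketch: scale up an RDS instance and its allocated storage,
# which raises the baseline IOPS available to gp2-backed volumes.
import boto3

rds = boto3.client("rds")

response = rds.modify_db_instance(
    DBInstanceIdentifier="primary-db",  # placeholder identifier
    DBInstanceClass="db.m5.xlarge",     # placeholder target instance class
    AllocatedStorage=500,               # gp2 baseline IOPS scales with size (3 IOPS/GiB)
    ApplyImmediately=True,              # apply now instead of the next maintenance window
)
print(response["DBInstance"]["DBInstanceStatus"])
```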
Report: "We are experiencing downtime related to a bad migration"
Last update: We've successfully re-run an updated version of the migration and all systems are back to normal.
We have cleared the problem. A migration locked our main deploys table and killing the initiating process did not clear the lock. We've cleared the lock and the site has resumed its normal functioning. We're monitoring to make sure everything is completely back to normal.
We are continuing to investigate this issue.
We're investigating the cause now.
Report: "Site unavailable due to a bad deploy"
Last update: We've fully resolved the issue.
We are continuing to monitor for any further issues.
We have rolled back the bad change and the site is available again. We are monitoring and will update this incident as we learn more.
We are investigating site issues that seem due to a bad code deploy. We will update as soon as we have more details.
Report: "Impact collection is delayed"
Last update: This incident has been resolved.
We have rolled out a fix and impact collection has returned to normal. We're just monitoring a bit before we resolve this incident.
We've identified an issue with our impact collection being delayed for new deploys. We're working on a fix and will update this incident as we progress.
Report: "We're having trouble with our background jobs"
Last update: The behavior of the application has returned to normal. The issue was a bad deploy where we changed the threading model of our background jobs. Some of the libraries we depend on were not supported in this new model. We have reverted to the old model for now.
We are having trouble with our background processing. We're working on a fix.
Report: "We are seeing an issue collecting data from sources that are authenticated via an API key"
Last update: We've restored the integrations and everything is working again. If you see any issues please contact support.
We have identified the issue and are working to restore normal operations. Integrations that are authenticated via API key are affected. This includes Jira, Sentry, Rollbar, Honeybadger, Datadog, and CircleCI. We will be able to restore full service once we've worked through the root cause.
Report: "Performing a major server upgrading"
Last update: Service is back to normal.
We've run into issues with the upgrade and are in the process of rolling back.
We're currently performing a major server upgrade. The service will be unavailable for about 15 minutes.