Historical record of incidents for Rootly Production
Report: "Experiencing issues connecting to Slack"
Last update: Slack has resolved their incident: https://slack-status.com/2025-05/7b32241eb41a54aa.
An upstream Slack incident has resulted in certain users being unable to load the platform. https://slack-status.com/2025-05/7b32241eb41a54aa
Report: "Intermittent Latency for Status Pages"
Last update: This incident has been resolved.
We're investigating user reports of public status pages not loading.
Report: "Intermittent Latency in Web Application"
Last update: This issue has been resolved.
Rootly On-Call and the ability to create incidents are working as expected.
We are continuing to investigate this issue.
We are seeing intermittent latency in the web application. The team is investigating.
Report: "Degraded Performance Across the Web Platform and Slack"
Last update: The issue affecting Web and Slack performance has been resolved. All services are functioning normally. We appreciate your patience during this incident and apologize for any inconvenience caused.
We are currently investigating an issue impacting the performance of Slack and web-based services. Users may experience interruptions in accessing the Web Platform and Rootly Slack commands.
Report: "Continued Slack Outage (3rd Party) Impacting Rootly Slack App"
Last update: Slack reported this morning that the issues impacting their app have been resolved. However, historical events are still being replayed through their system queue. Stay up to date with Slack: https://slack-status.com/2025-02/1b757d1d0f444c34.
We are still seeing a small number of errors on Slack's end due to their outage, which is impacting /rootly commands in Slack. Stay up to date with Slack: https://slack-status.com/2025-02/1b757d1d0f444c34.
Report: "Slack Outage (3rd Party) Impacting Rootly Slack App"
Last update: Error rates have come down significantly. There are scattered delays when creating channels. The /rootly command is now working as intended. Stay up to date with Slack: https://slack-status.com/2025-02/1b757d1d0f444c34.
Our engineering team is continuing to monitor closely. We've pushed a few changes to modify our retry logic, even though the Slack API is still reporting an outage. We are continuing to look for alternative methods. Stay up to date with Slack: https://slack-status.com/2025-02/1b757d1d0f444c34.
Latency and failure rates from Slack remain quite high: https://slack-status.com/2025-02/1b757d1d0f444c34.
Slack is experiencing an incident that is impacting slash commands (e.g. /rootly new) and messaging capabilities: https://slack-status.com/2025-02/1b757d1d0f444c34. This means button interactions and messages are delayed or will fail completely. Some users have also reported being unable to load Slack at all. Rootly On-Call and Web Platform remain unaffected.
Report: "Delayed Workflows"
Last update: This incident has been resolved.
The issue has been resolved and latencies are back to normal. We are currently monitoring. Rootly On-Call and incident creation remain unaffected.
A handful of workflows are currently delayed due to latency in our queue. We are investigating the issue. Rootly On-Call and incident creation remain unaffected.
Report: "Login Issues on Web Platform"
Last update: Login issues have been resolved.
We are investigating an issue affecting customers' ability to log into Rootly.com. On-Call, Slack, and incident creation remain unaffected.
Report: "Web Platform and Slack Inaccessible"
Last update: Our Web Platform and Slack were inaccessible for roughly 4 minutes before a fix was applied. Rootly On-Call and paging were unaffected.
Report: "Incident Creation from Slack Unavailable (Web Platform + On-Call Not Impacted)"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
Incident creation from Slack is impacted by an upstream provider issue, but the Web Platform and Rootly On-Call are not impacted.
Report: "Delayed Workflow Runs"
Last update: We've made an upgrade that has caused slight queuing of workflow runs, mainly affecting customers who are actively trialling Rootly.
Report: "Partial Outage Across Web Platform and Slack"
Last update: The incident has been resolved. Thank you for your patience.
We are investigating a partial outage preventing some users from accessing the Web Platform and Slack. Rootly On-Call and paging are not impacted.
Report: "Delayed Sending of Emails"
Last update: This incident has been resolved. It was caused by an upstream third-party issue at SendGrid. Rootly availability and incident creation were not impacted.
Some emails sent through Rootly are queuing in SendGrid and processing after sitting in the queue for extended periods of time. We are currently debugging with SendGrid support.
Report: "Degraded Performance Across the Web Platform and Slack"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are experiencing degraded performance across the application. Incident creation is not impacted, and On-Call is not impacted.
Report: "Degraded Performance Across the Web Platform and Slack"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
Report: "Rootly.com is not loading"
Last update: This incident has been resolved.
Report: "Intermittent Failures with User Sign In and Incident Creation"
Last update: This incident has been resolved. An RCA will follow and be sent to customers.
The issue has been identified and the team is aware of the root cause. A fix is being implemented.
We are continuing to investigate this issue. Slack, Web, and API are currently impacted. Rootly On-Call remains online without impact.
We are continuing to investigate this issue.
The team is currently investigating issues with users signing into the Rootly Web Platform.
Report: "Incident timeline loading slower than normal"
Last update: This issue has been resolved.
Our incident timeline on our Web Platform is loading slower than usual. Incident creation is unaffected.
Report: "Incident Slack channel creation impact"
Last update: This incident has been resolved.
Incident Slack channel creation is currently backed up. We are deploying a rollback. Incident creation on the Rootly Web Platform is working normally.
Report: "Outage impacting Rootly Slack App"
Last update: This incident has been resolved.
We have identified the issue and are currently deploying a fix.
Report: "Workflow runs are delayed"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are seeing a backlog of workflows queued waiting to be executed. The team is currently shipping a fix.
Report: "Workflow runs are delayed"
Last update: This incident has been resolved.
We are seeing a backlog of workflows queued waiting to be executed. The team is currently shipping a fix.
The team has noticed latency in workflow workers.
Report: "Delays on Slack and Web Platform"
Last update: Looks like everything is working as normal, including incident creation in Slack. We had an upstream hiccup caused by Cloudflare.
We are continuing to investigate this issue. Our engineering team has a fix currently being deployed.
We are noticing our background jobs processing slower than normal. Any workflows queued to run are behind schedule. Our engineering team has identified the issue and is currently deploying a fix.
Report: "Incident search on Web Platform missing results"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Report: "Slack Outage (3rd Party) Impacting Rootly Slack App"
Last update: Slack has restored full functionality to their services and has resolved their incident. Rootly's services are fully restored as a result. Details can be found on Slack's status page here: https://status.slack.com/2023-07/08e3781ccbef33d5
Slack is currently experiencing an outage, which in turn is affecting Rootly's Slack application. Expect Slack-related actions (including those outside of Rootly's Slack app) to be affected (e.g. channel creation, notifications, etc.). The Rootly.com Web App continues to function as normal, and you can still create incidents from there. Updates from their incident: https://status.slack.com/2023-07/08e3781ccbef33d5
Report: "Partial Outage on Slack Impacting Rootly Users"
Last update: Slack has resolved their incident affecting channel creation and general access. Workflows, channels, and Slack access are no longer experiencing issues. You may need to reload Slack (Cmd/Ctrl + Shift + R) to see the fix on your end. Details on Slack's status page here: https://status.slack.com/2023-06/20df18536cc9524b
Slack continues to have a partial incident affecting channel creation and general access. More details: https://status.slack.com/2023-06/20df18536cc9524b. Rootly's Web Platform continues to work normally. We continue to track and monitor closely.
Slack is currently having a partial incident affecting channel creation and general access. We are noticing a handful of workflows related to Slack failing for some users. More details: https://status.slack.com/2023-06/20df18536cc9524b. Rootly's Web Platform continues to work normally. We continue to track and monitor closely.
Report: "Creation of Slack Channels is Failing"
Last update: This incident has been resolved.
The fix has been deployed and we are monitoring for further issues.
We are continuing to work on a fix for this issue.
A fix has been identified and is currently being deployed.
We are currently investigating this issue.
Report: "Small percentage of users seeing channel loading and adding user errors in Slack due to Slack related incident"
Last update: This incident has been resolved as Slack has pushed a fix. See details: https://status.slack.com/2023-05/f1df0cac8d8d8d68.
Slack is having an incident that is affecting a small percentage of users. The app is behaving slower than normal for some users in those cases. Slack tracking details: https://status.slack.com/2023-05/f1df0cac8d8d8d68. Retrying resolves most issues, and the Rootly Web Platform remains operational.
Report: "Users experiencing login issue using SSO authentication"
Last update: Some users experienced issues logging into Rootly.com via SSO between May 19, 2023 12:00 PM UTC and May 19, 2023 3:15 PM UTC. The issue has been resolved.
Report: "Incident Creation Dialog on Slack Timing Out"
Last update: This incident has been resolved.
Minor - Users creating an incident via /incident new on Slack are receiving a "We had some trouble connecting. Try again?" error. The incident channel is still being successfully created in the background. The rest of the platform is working normally.
Report: "Atlassian (Jira + Confluence) and Zoom Integration Failing to Load"
Last update: This incident has been resolved. Customers specifically impacted (<2%) will be contacted by their CSM directly to re-integrate if needed (no impact to existing workflows). The root cause was a bug in our auto-refresh logic for expired tokens. This has also been resolved with additional guardrails.
Minor - We are investigating an issue where Atlassian (Jira + Confluence) and Zoom integrations are failing. Incident creation is working normally. We believe this could be due to token expiry issues.
Report: "Delayed Incident Creation on Slack and Workflow Queues"
Last update: This incident has been resolved.
We pushed a fix and are now monitoring.
We are noticing that Slack channel creation and workflow runs are delayed due to backed-up queues.
Report: "Slack Incident Creation Errors"
Last update: This incident has been resolved.
We've noticed Slack Incident creation is failing for some users. Web incident creation is still fine.
Report: "Users reporting not being able to create incidents"
Last update: This incident has been resolved.
The queues have been cleared.
Our worker queues are backed up. The team is working on a fix.
We are currently investigating this issue.
Report: "Upstream Cloud Provider Issues"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
Rootly's upstream provider is experiencing intermittent issues.
Report: "Latency in Workflow Workers"
Last update: This incident has been resolved.
The team has noticed latency in workflow workers. A fix has been made and we are now monitoring the situation.
Report: "[Slack] Some users reporting not able to create incidents"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "[Web + Slack ] Some users reporting not able to create incidents"
Last update: This incident has been resolved.
A fix has been deployed.
We are currently investigating this issue.
Report: "Delayed Workflow Processing"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are noticing that a few customers' workflows are taking longer than expected to process (although they still complete successfully).
Report: "Web Platform and Slack unavailable for some regions"
Last update: This incident has been resolved.
Web Platform and Slack are up and running. We will continue to monitor. This was related to Cloudflare issues.
We are currently investigating this issue.
Report: "API unavailable to some regions"
Last update: This incident has been resolved.
A fix has been implemented, and the API is available. We will continue to monitor.
The team is investigating reports of the API being unavailable.
Report: "Intermittent Incident Creation in Slack"
Last update: The incident has been resolved.
We are currently investigating reports of intermittent ability to create incidents in Slack. Web Platform (Rootly.com) is not impacted.
Report: "Some users are unable to access Web Platform and API"
Last update: This incident has been resolved.
We identified the issue and applied a fix. Web Platform and API are back online. We will monitor for any additional issues.
The team is investigating issues with accessing Web Platform and API.
Report: "Outage"
Last update: This incident has been resolved.
We pushed a workaround on our end and everything is now back online.
We are continuing to work on a fix for this issue.
We identified that the incident is due to a DNS issue from our upstream service provider. We will continue providing updates.
We are currently experiencing an outage across our application and will share more details shortly. Apologies for the inconvenience!