Historical record of incidents for Atlas
Report: "Customers may experience delays or failures receiving emails"
Last update: We were experiencing cases of degraded performance for outgoing emails from Confluence, Jira Work Management, Jira Service Management, Jira, Opsgenie, Trello, Atlassian Bitbucket, Guard, Jira Align, Jira Product Discovery, Atlas, Compass, and Loom Cloud customers. The system is recovering and mail is being processed normally as of 16:45 UTC. We will continue to monitor system performance and will provide more details within the next hour.
Report: "Unable to invite new users due to missing recaptcha token"
Last update: Between 08:00 UTC and 11:54 UTC, we experienced problems with the invitation of new users for Cloud customers on admin.atlassian.com. The issue has been resolved and the service is operating normally.
We continue to work on resolving the invitation workflow in admin.atlassian.com. We have identified the root cause and performed changes in the environment to mitigate the issue.
We are investigating reports of intermittent errors for some Atlassian customers when they are trying to invite users using their admin panels (admin.atlassian.com). We will provide more details once we identify the root cause.
Report: "Some users cannot access https://home.atlassian.com"
Last update: The fix for this issue has been deployed to production and testing was successfully completed. We have been monitoring for the past 30 minutes and are seeing no new instances of this issue.
We have identified an issue with https://home.atlassian.com that impacts authentication of users who do not have Projects or Goals and have a fix now deployed to production. We are testing this issue now but expect that this incident has been concluded.
We are investigating reports of intermittent errors for some users on Atlas for Cloud customers. When being redirected to home.atlassian.com, customers are landing on an error page (Something went wrong). We are working on resolving the issue.
Report: "Some products are hard down"
Last update: Between 03-07-2024 20:08 UTC and 03-07-2024 20:31 UTC, Atlas experienced downtime. The issue has been resolved, and the service is operating normally.
We have mitigated the problem and continue looking into the root cause. The outage lasted from 8:08pm UTC to 8:31pm UTC on 03/07. We are now monitoring closely.
We are investigating an issue with <FUNCTIONALITY IMPACTED> that is impacting <SOME/ALL> Atlassian, Atlassian Partners, Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira, Opsgenie, Atlassian Developer, Atlassian (deprecated), Trello, Atlassian Bitbucket, Guard, Jira Align, Jira Product Discovery, Atlas, Atlassian Analytics, and Rovo Cloud customers. We will provide more details within the next hour.
Report: "Error responses across multiple Cloud products"
Last update: Between 22:18 UTC and 22:56 UTC, we experienced errors for multiple Cloud products. The issue has been resolved and the service is operating normally.
We are investigating an issue with error responses for some Cloud customers across multiple products. We have identified the root cause and expect recovery shortly.
Report: "Intermittent errors and slow user experience when trying to access Atlas"
Last update: This incident has been resolved.
We are continuing to monitor and to investigate the issue to determine root cause. No further impact since recovery 9 hours ago.
Atlas became responsive a while ago and everything appears to be fine. The root cause is currently unknown, so we are moving to Monitoring until we understand what caused the underlying issue.
We are currently investigating the issue to determine root cause and resolve.
Report: "Admin Portal Feature Access Issue"
Last update: Between 6:30 AM UTC and 9:50 AM UTC, we experienced failures in accessing some features from the Admin Portal. The issue has been resolved and the service is operating normally.
We are investigating an issue causing failures in accessing some features from the Admin Portal, which is impacting some of our Cloud customers. We have identified the root cause and anticipate recovery shortly.
Report: "Investigating new product purchasing"
Last update: Between 28th Feb 2024 23:15 UTC and 29th Feb 2024 00:05 UTC, we experienced an issue with new product purchasing for all products. All newly signed-up products have been successfully provisioned, and we have confirmed that the issue has been resolved and the service is operating normally.
We are investigating an issue with new product purchasing that is impacting all products. Customers adding new cloud products may have experienced a long waiting page or an error page after attempting to add a product. We have mitigated the root cause and are working to resolve impact for customers who attempted to add a product during the impact period. We will provide more details within the next hour.
Report: "Service Disruptions Affecting Atlassian Products"
Last update:

### **Summary**

On February 14, 2024, between 20:05 UTC and 23:03 UTC, Atlassian customers on the following cloud products encountered a service disruption: Access, Atlas, Atlassian Analytics, Bitbucket, Compass, Confluence, Ecosystem apps, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, Opsgenie, StatusPage, and Trello. As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names used for internal service-to-service connections. Active domain names were incorrectly deleted during this event. This impacted all cloud customers across all regions. The issue was identified and resolved through the rollback of the faulty deployment to restore the domain names and Atlassian systems to a stable state. The time to resolution was two hours and 58 minutes.

### **IMPACT**

External customers started reporting issues with Atlassian cloud products at 20:52 UTC. The impact of the failed change led to performance degradation or in some cases, complete service disruption. Symptoms experienced by end-users were unsuccessful page loads and/or failed interactions with our cloud products.

### **ROOT CAUSE**

As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names that were being used for internal service-to-service connections. Active domain names were incorrectly deleted during this operation.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. The detection was delayed because existing testing & monitoring focused on service health rather than the entire system’s availability. To prevent a recurrence of this type of incident, we are implementing the following improvement measures:

* Canary checks to monitor the entire system availability.
* Faster rollback procedures for this type of service impact.
* Stricter change control procedures for infrastructure modifications.
* Migration of all DNS records to centralised management and stricter access controls on modification to DNS records.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support
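The review above names canary checks for whole-system availability but does not say how they are built. As a rough illustration only, the sketch below runs a set of end-to-end probes and reports which ones failed. The function name `run_canary`, the probe names, and the `min_success_ratio` threshold are assumptions made for this example, not Atlassian's tooling.

```python
def run_canary(probes, min_success_ratio=1.0):
    """Run each end-to-end probe and return (healthy, failed_probe_names).

    `probes` maps a hypothetical probe name to a zero-argument callable that
    returns a truthy value on success. A probe that raises counts as failed.
    """
    failed = []
    for name, probe in probes.items():
        try:
            ok = bool(probe())
        except Exception:
            ok = False  # an exception is treated the same as a failed check
        if not ok:
            failed.append(name)
    # With no probes configured there is nothing to fail, so report healthy.
    ratio = 1 - len(failed) / len(probes) if probes else 1.0
    return ratio >= min_success_ratio, failed
```

A probe here would exercise a full user-visible path (for example, loading a page that depends on internal name resolution) rather than a single service's health endpoint, which is the gap the review describes.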
We experienced increased errors on Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Align, Jira Product Discovery, Atlas, Compass, and Atlassian Analytics. The issue has been resolved and the services are operating normally.
We have identified the root cause of the Service Disruptions affecting all Atlassian products and have mitigated the problem. We are now monitoring this closely.
We have identified the root cause of the increased errors and have mitigated the problem. We continue to work on resolving the issue and are monitoring this closely.
We are investigating reports of intermittent errors for all Cloud Customers across all Atlassian products. We will provide more details once we identify the root cause.
Report: "HOT-106981: Outage in Atlassian Intelligence functionality in multiple products"
Last update: Between 23:45 UTC and 00:30 UTC, we experienced an outage in some Atlassian Intelligence features for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Jira Product Discovery, Atlas, and Compass. The issue has been resolved and the service is operating normally.
We have identified the root cause of the increased errors and have mitigated the problem. We are now monitoring closely.
We are investigating an issue with Atlassian Intelligence that is impacting some Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Jira Product Discovery, Atlas, and Compass Cloud customers. We will provide more details within the next hour.
Report: "Atlassian's cross product user search service is currently degraded."
Last update:

### **SUMMARY**

On Dec 18, 2023, between 12:29 p.m. and 3:35 p.m. UTC, Atlassian's cloud customers using Atlas, Bitbucket Cloud, Compass, Confluence Cloud, Jira Service Management, Jira Software, Jira Work Management, and Jira Product Discovery were unable to search for users or use the "@mention" functionality. Customers' user search results failed or were delayed as Atlassian's service returning user search results was degraded in several regions. The incident originated from a computationally intensive operation that was triggered multiple times in rapid succession, resulting in degraded performance of Atlassian's user search service across several regions. Notably, customers in the EU west region were most affected. The incident was detected within 2 minutes by automated monitoring, and our team promptly took action by recovering unhealthy systems and scaling up the service's infrastructure temporarily. The resolution process concluded in 3 hours and 06 minutes.

### **IMPACT**

The overall impact was on Dec 18, 2023, between 12:29 p.m. UTC and 3:35 p.m. UTC. The incident caused service disruption to cloud customers worldwide. Customers experienced delayed or failed user searches when using the following Atlassian cloud products:

* Atlas
* Bitbucket Cloud
* Compass
* Confluence Cloud
* Jira Service Management
* Jira Software
* Jira Work Management
* Jira Product Discovery

### **ROOT CAUSE**

The incident stemmed from Atlassian's user search service receiving commands to process multiple computationally intensive operations in rapid succession. These operations were directed at the same customer data set, and therefore overloaded resources within a clustered database system, leading to memory exhaustion and subsequent unresponsiveness to user search requests.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

To prevent a recurrence of such incidents, we are implementing the following measures:

* Implement a mechanism to queue computationally intensive operations in order to avoid overloading the resources within the systems and process them without impact on customer experience.
* Fine-tune our clustered database settings to mitigate the impact of resource exhaustion on the overall system.

We apologize to customers whose services were affected during this incident; we are taking immediate steps to improve the service’s resiliency.

Thanks,
Atlassian Customer Support
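The first remedial action above, queueing computationally intensive operations, can be pictured with a small admission queue. This is a minimal sketch under assumptions: the class name `HeavyOpQueue`, the `max_running` limit, and the operation identifiers are invented for illustration, and the review does not describe the actual mechanism.

```python
from collections import deque


class HeavyOpQueue:
    """Admit at most `max_running` heavy operations at once; park the rest FIFO."""

    def __init__(self, max_running=1):
        self.max_running = max_running
        self.running = set()    # operations currently allowed to execute
        self.waiting = deque()  # operations parked until a slot frees up

    def submit(self, op_id):
        """Return True if op_id may start now, False if it was queued."""
        if len(self.running) < self.max_running:
            self.running.add(op_id)
            return True
        self.waiting.append(op_id)
        return False

    def finish(self, op_id):
        """Mark op_id as done and return the next admitted operation, if any."""
        self.running.discard(op_id)
        if self.waiting and len(self.running) < self.max_running:
            next_op = self.waiting.popleft()
            self.running.add(next_op)
            return next_op
        return None
```

With `max_running=1`, several requests arriving in rapid succession against the same data set run one after another instead of competing for memory at the same time.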
It has been resolved. Atlassian's cross product user search is working.
Atlassian's cross product user search service is currently healthy. Searches for users within Atlassian products are working as expected. We are in the process of investigating the root cause of this incident.
Atlassian's cross product user search service is recovering. Searches for users within Atlassian products are returning to normal.
We are investigating reports of intermittent errors for <SOME/ALL> Atlassian, Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Jira Align, Jira Product Discovery, Atlas, and Compass Cloud customers. We will provide more details once we identify the root cause.
Report: "Forge Function Invocations outage impacting Smartlinks"
Last update: Forge Invocations had an 8-minute outage between 2023-11-29 03:05:13 UTC and 2023-11-29 03:13:27 UTC, resulting in Smart Links failing. The service has been healthy since then.
Report: "Delayed email and Slack notifications"
Last update: Between 7 November, 9:30pm UTC and 8 November, 7:30am UTC, we experienced degraded delivery of email messages for Atlas. The issue has been resolved and the service is operating normally.
We are investigating cases of delayed email and Slack notifications for some Atlas customers. We will provide more details within the next 24 hours.
Report: "HOT-106056 Atlassian Intelligence functionally completely down"
Last update: Between 13:40 UTC and 15:30 UTC, we experienced an outage in all Atlassian Intelligence features for Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas. The issue has been resolved and the service is operating normally.
We continue to work on resolving the outage affecting Atlassian Intelligence related features for Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas. We have identified the root cause and expect recovery shortly.
Report: "Degraded experience for Atlassian Intelligence features"
Last update: Between 23:51 19th Oct, 2023 UTC and 03:30 20th Oct, 2023 UTC, we experienced a service degradation of Atlassian Intelligence capabilities in Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas. The issue has been resolved and the service is operating normally.
OpenAI has mitigated the problem and we are currently not seeing any errors in Atlassian Intelligence capabilities. We are now monitoring closely.
The OpenAI team has applied a fix and we are seeing a reduction in the failure rate for Atlassian Intelligence capabilities in Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas. We continue to closely monitor our systems and are coordinating with OpenAI for a faster recovery of Atlassian Intelligence capabilities.
We continue to work on resolving the failures in Atlassian Intelligence capabilities for Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas. The cause of the incident is an increased API failure rate from OpenAI; we are in touch with them to understand the time to recover.
We are investigating reports of intermittent errors in Atlassian Intelligence capabilities for some Confluence, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, and Atlas Cloud customers. We will provide more details once we identify the root cause.
Report: "Atlas is unavailable"
Last update: Between 5:07am UTC and 6:14, we experienced an outage for Atlas. The issue has been resolved and the service is operating normally.
We are investigating an issue with Atlas that is impacting all Atlas Cloud customers. We will provide more details within the next hour.
Report: "Atlassian Account login issues"
Last update:

### **SUMMARY**

On Sep 13, 2023, between 12:00 PM UTC and 03:30 PM UTC, some Atlassian users were unable to sign in to their accounts and use multiple Atlassian cloud products. The event was triggered by a misconfiguration of rate limits in an internal service which caused a cascading failure in sign-in and signup-related APIs. The incident was quickly detected by multiple automated monitoring systems. The incident was mitigated on Sep 13, 2023, 03:30 PM UTC by the rollback of a feature and additional scaling of services which put Atlassian systems into a known good state. The total time to resolution was about 3 hours and 30 minutes.

### **IMPACT**

The overall impact was between Sep 13, 2023, 12:00 PM UTC and Sep 13, 2023, 03:30 PM UTC on multiple products. The incident caused intermittent service disruption across all regions. Some users were unable to sign in for sessions. Other scenarios that temporarily failed were new user signups, profile retrieval, and password reset. During the incident we had a peak of 90% of requests failing across authentication, user profile retrieval, and password reset use cases.

### **ROOT CAUSE**

The issue was caused by a misconfiguration of a rate limit in an internal core service. As a result, some sign-in requests over the limit received HTTP 429 errors. However, retry behavior for requests caused a multiplication of load which led to higher service degradation. As many internal services depend on each other, the call graph complexity led to a longer time to detect the actual faulty service.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We are continuously improving our system's resiliency. We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Audit and improve service rate limits and client retry and backoff behavior.
* Improve scale and load test automation for complex service interactions.
* Audit cross-service dependencies related to sign-in flows and minimize them where possible.

Due to the unavailability of sign-in, some customers were unable to create support tickets. We are making additional process improvements to:

* Enable our unauthenticated support contact form and notify users that it should be used when standard channels are not available.
* Create status page notifications more quickly and ensure that for severe incidents, notifications to all subscribers are enabled.

We apologize to users who were impacted during this incident; we are taking immediate steps to improve the platform’s reliability and availability.

Thanks,
Atlassian Customer Support
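The root cause above describes immediate retries of HTTP 429 responses multiplying load. One common way to keep retries from amplifying load is a bounded retry budget with capped exponential backoff and random jitter. The sketch below illustrates that general technique only. `RateLimited`, `backoff_delays`, `call_with_backoff`, and the default values are hypothetical names and numbers for this example, not the behavior of any Atlassian client.

```python
import random
import time


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response (hypothetical name)."""


def backoff_delays(max_retries, base=0.1, cap=5.0, rng=random.random):
    """Yield one capped, randomly jittered delay (in seconds) per retry."""
    for attempt in range(max_retries):
        # The ceiling doubles each attempt up to `cap`; jitter spreads clients out.
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_backoff(request, max_retries=4, sleep=time.sleep):
    """Call request(); on RateLimited, wait and retry a bounded number of times."""
    delays = backoff_delays(max_retries)
    while True:
        try:
            return request()
        except RateLimited:
            delay = next(delays, None)
            if delay is None:
                raise  # retry budget exhausted: surface the error to the caller
            sleep(delay)
```

Bounding the number of retries matters as much as the delay: a client that retries forever keeps adding load to a service that is already rejecting requests.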
Between 12:45 UTC and 15:30 UTC, we experienced login and signup issues for Atlassian Accounts. The issue has been resolved and the service is operating normally. We will publish a post-incident review with the details of the incident and the actions we are taking to prevent similar problems in the future.
We are no longer seeing occurrences of the Atlassian Accounts login errors; all clients should now be able to log in successfully. We will continue to monitor.
We can see a reduction in the Atlassian Accounts login issues after the mitigation actions were taken. We are still monitoring closely and will continue to provide updates.
We have identified the root cause of the Atlassian Accounts login issues impacting Cloud Customers and have mitigated the problem. We are now monitoring this closely.
We are investigating an issue with Atlassian Accounts login that is impacting some Cloud customers. We will provide more details within the next hour.
Report: "Atlassian services unable to scale"
Last update: We have resolved a case of degraded performance for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Jira Align, Jira Product Discovery, Atlas, and Compass Cloud customers. We will provide more details within the next hour.
Report: "Sign-ups, Product Activation, and Billing not working"
Last update: We mitigated the issue with Sign-ups, Product Activation, and Billing. The systems are back to BAU and all functionality is restored.
We have identified the root cause of the issue with Sign-ups, Product Activation, and Billing and have mitigated the problem. We are now monitoring closely.
We are investigating an issue with Sign-ups, Product Activation, and Billing that is impacting all of our Cloud Customers. We will provide more details within the next hour.
Report: "Performance issues and outages with Cloud products"
Last update: We experienced performance issues and outages for several Atlassian Cloud Products. The issue has been resolved and the service is operating normally.
We have identified the root cause of an issue with an internal infrastructure component that has been impacting multiple Cloud products, including Jira Software, Jira Service Management and Confluence, and customers. This issue has led to a performance impact and, in some cases, outages. We have implemented a fix to resolve the issue and recovery is in progress.
We are investigating an issue with an internal infrastructure component that is impacting multiple Cloud products, including Jira Software, Bitbucket, Jira Service Management and Confluence, and customers. These issues include performance impact and, in some cases, outages. Users may experience slow loading and uploading of attachments, login issues or inability for new customers to sign up. We have identified the root cause and are actively working on the service recovery.
Report: "Media capabilities degraded"
Last update:

### **SUMMARY**

On May 15, 2023, between 02:36 and 04:08 UTC, Atlassian customers using Bitbucket, Confluence, Jira Align, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, and Atlas products with services hosted in the us-west-1 region were impacted by an incident related to the storing and retrieval of data assets, including media, attachments and build artifacts. The event was triggered by a network migration of an internal service as part of an initiative to increase security by hardening partitions between network segments. The incident was detected within three minutes by automated monitoring and mitigated by a rollback of the change which put Atlassian systems into a known good state. The total time to resolution was about one hour and 32 minutes.

### **IMPACT**

The impact across products was:

* Bitbucket - Bitbucket Pipelines self-hosted builds were failing, access to Git LFS failed and cloud-hosted builds were delayed.
* Confluence, Jira Align, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, and Atlas - media capabilities (images, videos, documents, audio) were affected and it was not possible to upload, download or view existing media attachments or files.

The service disruption lasted for one hour and 32 minutes between May 15, 2023, 02:36 and May 15, 2023, 04:08 UTC and caused service disruption to customers with services hosted in the us-west-1 region.

### **ROOT CAUSE**

The issue was caused by an attempted migration of a service to a new network segment. As part of this migration, a DNS record pointing to the old network segment was not updated, which resulted in failure when the old network stack was removed. While we have a number of testing and preventative processes in place, this specific issue wasn’t identified as moving services across network segments is not a regular activity and is difficult to accurately replicate in a test environment.

To mitigate against these types of issues, we made this change using blue/green deployment practices but failed to run adequate verification steps before decommissioning the old stack.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Reviewing our systems that decommission service stacks and implementing checks that customer traffic is no longer being served prior to decommissioning the stacks; and
* As part of our service network migration process, we are adding steps to identify when there are associated DNS records that require attention.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support
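The second remedial action above, identifying DNS records that still need attention before an old stack is removed, could look roughly like the following pre-decommission check. It is a sketch under assumptions: the record names, addresses, and CIDR ranges are made up, and the records are passed in as a plain dictionary rather than read from any real DNS provider.

```python
import ipaddress


def records_in_segment(records, segment_cidr):
    """Return the names of DNS records whose address lies inside segment_cidr.

    `records` maps a record name to an IP address string (hypothetical data).
    """
    segment = ipaddress.ip_network(segment_cidr)
    return sorted(
        name
        for name, address in records.items()
        if ipaddress.ip_address(address) in segment
    )


def safe_to_decommission(records, old_segment_cidr):
    """Return (ok, stale_names); ok is False while any record targets the old segment."""
    stale = records_in_segment(records, old_segment_cidr)
    return (not stale, stale)
```

Running a check like this as a gate before tearing down the old stack turns the missed DNS update into a blocked step instead of an outage.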
Between 2023-05-15 02:40 UTC and 2023-05-15 04:09 UTC, we experienced a partial outage for Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, and Atlas. The issue has been resolved and the service is operating normally.
We have identified the root cause of the outage and have mitigated the problem. We are now monitoring closely.
We are investigating reports of intermittent errors for <SOME/ALL> Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, and Atlas Cloud customers. We will provide more details once we identify the root cause.
Report: "Atlassian cloud product signup timeout"
Last update: Between 08:20 UTC and 09:30 UTC, we experienced degraded performance on new account signups for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Atlas, and Compass. The issue has been resolved and the service is operating normally.
We are investigating reports of intermittent errors for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Atlas, and Compass Cloud customers. We will provide more details once we identify the root cause.