Historical record of incidents for Octopus Deploy
Report: "octopus.com/downloads degraded"
Last update: We have fixed the issue with the navigation bar on octopus.com/downloads
We have shipped a change to fix the navigation bar issue and are monitoring to ensure it works in all regions.
We have identified the issue with the top navigation bar loading and are working on a fix.
octopus.com/downloads top navigation bar is failing to load correctly. We are investigating the cause. Links to downloads from this website are still correct. Please reload the page if you cannot see the content.
Report: "octopus.com/downloads degraded"
Last updateWe have fixed the issue with the navigation bar on octopus.com/downloads
We have shipped a change to fix the navigation bar issue and are monitoring to ensure it works in all regions.
We have identified the issue with the top navigation bar loading and are working on a fix.
octopus.com/downloads top navigation bar is failing to load correctly. We are investigating the cause. Links to downloads from this website are still correct. Please reload the page if you cannot see the content.
Report: "ECS Update steps failing to execute"
Last update: This incident has been resolved.
We have uploaded a new version of the ECS Update step. Affected customers can manually force a re-acquisition of the updated packages via the following steps:
1. Go to Tasks.
2. Click Show Advanced filters.
3. Check "Include System Tasks".
4. Select "Acquire new steps and targets" from the Task Type filter.
5. Choose the most recently run task.
6. Select "Re-run" from the context menu (3 dots).
We are aware of an issue where a recent update has caused ECS Update steps to fail deployment.
Report: "Performance issues identified with Octopus Cloud"
Last update: Azure has advised us that this issue has been mitigated. Octopus Cloud instances in the West Europe region were affected on 5 March from 6 AM to 5 PM UTC. Azure will be providing us with a Root Cause Analysis, after which we can share more information.
We continue to see issues with our upstream provider affecting instances within the WestEU region. We are awaiting further information from Azure.
Octopus Deploy is aware of performance issues for some customers on Octopus Cloud. Our engineers are investigating and we will provide an update soon.
Report: "ADO 'Use Octopus CLI Tool' step Certificate Issue"
Last update: We've had no further reports of this issue following the certificate update, and customers have indicated that they are back up and deploying. If you encounter any further issues, please email support@octopus.com
We have updated the certificate chain on our host, which has resolved the verification issues. We are continuing to monitor the outcome of this change.
We're investigating an issue with v5 and earlier versions of the 'Use Octopus CLI Tool' step on build servers, where customers are seeing the error 'unable to verify the first certificate'. This is currently affecting customer deployments because the deprecated version of the OctoCLI cannot be installed. The v6 versions of our ADO extension tasks do not require the OctoCLI, so upgrading to our v6 steps will resolve this issue. Find out more information about the migration to the new step version here: https://octopus.com/blog/azure-devops-octopus-v6
Report: "Some Octopus Cloud customers affected by missing variable substitution on Helm template sources"
Last update:

# Report and learnings: Missing variable substitution on Helm template sources

###### Author: Kevin Tchang

# Summary

The Octopus Server `2025.1.6849` update introduced a bug in the _Deploy a Helm Chart_ step, preventing variable substitution from working correctly in certain Helm template value sources. This led to failures in Helm deployments for Cloud customers that had previously succeeded without issue.

# Background

[Calamari](https://octopus.com/docs/octopus-rest-api/calamari) is a deployment tool used by Octopus to execute deployment tasks on target machines, such as extracting packages and running scripts. It supports the deployment of many built-in Octopus steps, including the _Deploy a Helm Chart_ step, which allows users to deploy Helm charts to Kubernetes clusters.

When deploying Helm charts, Octopus allows users to pass values into the Helm release via **Helm Template Value Sources** (Helm TVS). These values can come from various sources, including charts, packages, Git repositories, key-values, or inline YAML. The Helm TVS are sent to Calamari during deployment to configure the Helm chart correctly.

**Octopus variable substitution** allows dynamic replacement of placeholder variables within deployment configurations. This enables users to store sensitive data or environment-specific values as variables in Octopus and substitute them during deployment.

## Incident timeline

_(All dates and times below are shown in UTC)_

##### 23/1/2025 – 18:49 (5:49 AEDT)
We received the first reports of Helm deployments failing due to missing variable substitutions.

##### 23/1/2025 – 23:39 (10:39 AEDT)
Our support team escalated the issue to our engineering teams.

##### 24/1/2025 – 1:35 (12:35 AEDT)
Our internal incident response process was initiated.

##### 24/1/2025 – 8:58 (19:58 AEDT)
The fix for the bug was merged in `2025.1.7389`, and our Status Page was updated to _Monitoring_. However, due to the timing of a recent Octopus Server upgrade to .NET 9 and concerns about stability, the fix was not immediately rolled out to all Cloud customers. In the meantime, our support team provided assistance with manual upgrades.

##### 27/1/2025 – 22:21 (9:21 AEDT)
After the long weekend (due to a public holiday), the Status Page was updated to _Resolved_ once the fix was confirmed by customers.

## Technical details

Helm TVS are transmitted to Calamari as strings within a JSON array structure. Upon receiving these, Calamari parses the JSON array into the respective TVS types (such as package, chart, inline YAML, etc.), which are then used by the `helm upgrade` command.

The bug was inadvertently introduced during a recent change aimed at eliminating the need for escaping quotes when using Octopus variable values in Helm TVS. Prior to this change, variable values with unescaped quotes caused parsing errors because the quotes were misinterpreted when processed from the JSON array structure. The change modified the process so that variable substitution occurs after the JSON array is parsed into its respective TVS types, rather than before. However, this substitution was initially applied only to the **inline YAML TVS type**, and not to the other four TVS types, which could also contain variables to substitute. This resulted in a regression, and the fix involved applying variable substitution to all TVS types and ensuring it was handled correctly wherever a variable could appear.

## Remediation and next steps

At Octopus, ensuring deployment reliability is a top priority. Following this incident, we conducted a comprehensive review to identify areas for improvement. Given the flexibility and complexity of our product features, designing a robust testing process for all scenarios can be challenging. However, we’re taking proactive steps to improve how we test variable evaluation features.
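To make the class of fix described above concrete, here is a minimal, hypothetical C# sketch. The type and method names (`HelmTemplateValueSource`, `InlineYamlSource`, `TemplateValueSourceProcessor`, and so on) are invented for illustration and are not Octopus or Calamari code; the point is only that substitution runs after parsing and is applied to every source type, not just inline YAML.

```csharp
using System;
using System.Collections.Generic;

// Tiny demo: every parsed source type gets its variables substituted, not only inline YAML.
Func<string, string> substitute = s => s.Replace("#{Tenant}", "acme");
HelmTemplateValueSource parsed = new GitRepositorySource("https://example.com/values.git", "values/#{Tenant}.yaml");
Console.WriteLine(TemplateValueSourceProcessor.Substitute(parsed, substitute));

// Illustrative stand-ins for the five Helm template value source (TVS) types named in the report.
abstract record HelmTemplateValueSource;
record ChartSource(string ValuesPath) : HelmTemplateValueSource;
record PackageSource(string PackageId, string ValuesPath) : HelmTemplateValueSource;
record GitRepositorySource(string Uri, string ValuesPath) : HelmTemplateValueSource;
record KeyValuesSource(Dictionary<string, string> Values) : HelmTemplateValueSource;
record InlineYamlSource(string Yaml) : HelmTemplateValueSource;

static class TemplateValueSourceProcessor
{
    // Substitution runs AFTER the JSON array has been parsed into typed sources,
    // and is applied to every field that can contain an Octopus variable.
    public static HelmTemplateValueSource Substitute(
        HelmTemplateValueSource source,
        Func<string, string> substituteVariables) =>
        source switch
        {
            ChartSource c => c with { ValuesPath = substituteVariables(c.ValuesPath) },
            PackageSource p => p with
            {
                PackageId = substituteVariables(p.PackageId),
                ValuesPath = substituteVariables(p.ValuesPath)
            },
            GitRepositorySource g => g with
            {
                Uri = substituteVariables(g.Uri),
                ValuesPath = substituteVariables(g.ValuesPath)
            },
            KeyValuesSource kv => new KeyValuesSource(Substituted(kv.Values, substituteVariables)),
            InlineYamlSource y => y with { Yaml = substituteVariables(y.Yaml) },
            _ => source
        };

    static Dictionary<string, string> Substituted(
        Dictionary<string, string> values, Func<string, string> substitute)
    {
        var result = new Dictionary<string, string>();
        foreach (var (key, value) in values)
            result[key] = substitute(value);
        return result;
    }
}
```

The regression described in the report corresponds to handling only the inline YAML arm of a switch like this one.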
A fix for this issue has been released and will be rolling out to Octopus Cloud customers over the next couple of days. If you are affected by this issue, please get in touch with support@octopus.com
We have fixed the issue, but due to timing and other factors we will not be rolling this out immediately. If you are affected by this issue, please get in touch with support@octopus.com, who can manually update your instance.
We have received reports of, and identified, an issue with variable substitution in Helm template sources. We are tracking the fix here: https://github.com/OctopusDeploy/Issues/issues/9224
Report: "Some Octopus Cloud customers affected by parsing issue with ConfigMap on Kubernetes deployments"
Last update:

# Report and learnings: Configure and apply Kubernetes resources step error parsing configmap.yml

###### Author: Kevin Tchang

## Summary

In Octopus Server `2025.1.5751`, a bug caused the deployment of Kubernetes config maps containing multi-line variables, when created through the _Configure and apply Kubernetes resources_ step (the built-in Kubernetes step for deploying containers), to fail. Config maps created using the dedicated Kubernetes config map step, as well as those generated with Raw YAML or Helm steps, were unaffected. This issue impacted Cloud customers, who experienced failures in deployments that had previously been successful.

The bug was a regression caused by a change supporting manifest reporting for Kubernetes deployment steps, part of an upcoming feature. This change mistakenly caused line breaks in multi-line Octopus variable values to not be properly escaped when substituted into the config map's key-value pairs. The problem became apparent when customers had PEM certificates or JSON blobs that needed to be inserted into the config map. These were replaced verbatim in Calamari, leading to YAML formatting issues due to unescaped line breaks.

## Background

The [_Configure and apply Kubernetes resources_](https://octopus.com/docs/kubernetes/steps/kubernetes-resources) step deploys a combination of Kubernetes Deployment, Service, and Ingress resources. It also allows the optional configuration and deployment of an associated Kubernetes ConfigMap and Secret for reference by the Deployment.

To support Rolling Update and Blue/Green deployment strategies, ConfigMap and Secret resources must have unique names for each Deployment version. These resources are assigned [computed names](https://octopus.com/docs/kubernetes/steps/kubernetes-resources?q=configmap#configmap-and-secret), which, by default, combine the resource name with the Octopus deployment ID, and are determined only at deployment time.

## Incident timeline

_(All dates and times below are shown in UTC)_

##### **22/1/2025 – 7:31 (18:31 AEDT)**
We began receiving customer reports of an increase in failing Kubernetes deployments. These failures were observed across various projects, with similar errors related to parsing config maps. Our support team worked with our customers to troubleshoot the reasons for the failures.

##### **22/1/2025 – 10:58 (21:58 AEDT)**
Our support team escalated the issue to our engineering teams.

##### **22/1/2025 – 15:51 (2:51 AEDT)**
Our internal incident response process was initiated.

##### **22/1/2025 – 21:45 (8:45 AEDT)**
Our engineers logged on and began to identify the cause of the incident.

##### **23/1/2025 – 1:42 (12:42 AEDT)**
The fix for the bug was merged, and our Status Page was updated to _Identified_.

##### **23/1/2025 – 2:47 (13:47 AEDT)**
Our Status Page was updated to _Monitoring_ as we began the process to expedite the release `2025.1.7128` of the fix to our affected Cloud customers.

##### **23/1/2025 – 7:49 (18:49 AEDT)**
Status Page updated to _Resolved_.

## Technical details

Before the change to support manifest reporting, the Kubernetes container deployment step created associated Kubernetes config maps (and secrets) using the `kubectl create` command with the `--from-file` flag, where each config map key-value pair was sent to Calamari as an individual file. This process was updated to use the more standard `kubectl apply -f` method, where Octopus now sends a single YAML manifest to Calamari representing the config map.

The YAML is generated from a config map resource that we build as an in-memory C# object. The bug was introduced when the argument for the config map object used raw, unevaluated Octopus variable values. The issue wasn't identified during testing because the deployment step involves two stages of variable substitution: the first on Octopus Server, and the second inside Calamari during deployment. The two substitution passes are necessary to support the use of computed names, ensuring that each deployment version has its own unique resources.

The change didn't account for multi-line strings as potential variables, causing newline characters to not be properly escaped before serialization. This issue occurred because encoding needs to happen on Octopus Server before the object is serialized into YAML; the second substitution in Calamari is applied directly to the YAML file. The bug was a regression, and the fix involved evaluating the values before serialization to ensure newline characters were handled correctly.

## Remediation and next steps

At Octopus, we take deployment reliability very seriously. After this incident, we conducted a thorough review to identify areas where we can improve our processes, in light of the lessons learned. We've identified a complex and unconventional area of the code (specifically script-based Kubernetes deployments) that requires further attention. Given the distinctive challenges these deployments present, we are committed to enhancing this area with additional tests to ensure better reliability.
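As a rough sketch of the ordering issue described above (this is not the actual Octopus Server or Calamari code; the `evaluateVariables` delegate, the sample names, and the use of YamlDotNet as the serializer are all assumptions for illustration): evaluating multi-line values before building and serializing the in-memory object lets the YAML serializer escape or block-format the newlines, whereas pasting raw multi-line text into an already-serialized manifest breaks the YAML.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Serialization; // assumed YAML library; any serializer that escapes strings works

// Demo: a multi-line value (e.g. a PEM certificate) stays valid because it is evaluated
// BEFORE serialization, so the serializer handles the newlines for us.
var variables = new Dictionary<string, string>
{
    ["#{TlsCert}"] = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----",
};

string Evaluate(string raw)
{
    foreach (var (name, value) in variables)
        raw = raw.Replace(name, value);
    return raw;
}

var manifest = ConfigMapBuilder.BuildManifest(
    name: "app-config-deployments-123", // computed name, only known at deployment time
    rawData: new Dictionary<string, string> { ["tls.crt"] = "#{TlsCert}" },
    evaluateVariables: Evaluate);

Console.WriteLine(manifest);

static class ConfigMapBuilder
{
    // Build the config map as an in-memory object from *evaluated* values, then serialize.
    public static string BuildManifest(
        string name,
        Dictionary<string, string> rawData,
        Func<string, string> evaluateVariables)
    {
        var data = new Dictionary<string, string>();
        foreach (var (key, rawValue) in rawData)
            data[key] = evaluateVariables(rawValue); // evaluate BEFORE serialization

        var configMap = new Dictionary<string, object>
        {
            ["apiVersion"] = "v1",
            ["kind"] = "ConfigMap",
            ["metadata"] = new Dictionary<string, object> { ["name"] = name },
            ["data"] = data,
        };

        return new SerializerBuilder().Build().Serialize(configMap);
    }
}
```

Substituting multi-line values into the YAML string after serialization, rather than into the object before it, is what left the newlines unescaped in the incident above.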
We are currently rolling out the fix to Cloud customers. Instances will be upgraded in their next maintenance window. If you are affected by this issue and want to expedite the upgrade, please contact support@octopus.com
A fix has been implemented and will be rolled out to affected cloud customers in their next maintenance window.
We have identified an issue with a version of Octopus Server recently released to cloud customers. A recent change has caused config maps with multi-line variables, created via the Kubernetes containers deployment step, to fail. Config maps created by the dedicated Kubernetes config map step are not affected, nor are config maps created by Raw YAML or Helm steps. We are in the process of fixing this and will update the public issue with the fixed version: https://github.com/OctopusDeploy/Issues/issues/9221.
We are currently investigating the issue
Report: "Intermittent service disruption for a small number of customers"
Last update: Two separate recent changes were identified as causes of the excess memory use. These changes have been rolled back, and we have confirmed through monitoring that memory use has returned to normal on the affected instances. No further crashes or terminations have been observed.
Internal monitoring notified us that a small number of Octopus Cloud instances were terminated for excess memory use up to 12 hours after a recent software upgrade. These instances were automatically restarted and resumed functioning correctly. At this stage we are investigating to determine the cause of the excess memory use. If you believe you are affected by this issue, please contact support@octopus.com
Report: "Config as Code projects might be affected by Github incident"
Last updateThe Github incident has been resolved, we have confirmed that Config-as-Code project loading is working correctly
For customers using Github to store Config As Code projects, an ongoing incident (https://www.githubstatus.com/incidents/qd96yfgvmcf9) with Github is affecting the loading of those projects. There is no impact to projects that are using database storage.
Report: "Octopus Cloud service may be unavailable for some customers"
Last update: All affected Instances are now up and healthy.
Root cause has been identified and mitigated. All Instances are now running. Further updates will be provided shortly.
14 Instances are down at this time. We've identified the issue and are working to restore services as a matter of priority.
We are currently investigating the issue.
Report: "Package Acquisition step sometimes fails with exception"
Last update: The fix has been released to Cloud - if you are still encountering issues, please ask the support team to upgrade your instance to 2025.1.3365
The fix is now rolling out to Cloud. If this issue is affecting you, please reach out to the support team to ask them to upgrade your instance to the latest version.
We've identified the underlying code change that caused this issue to start appearing. We have a fix for it that will be rolling out to Cloud customers as quickly as possible. Self-hosted customers will not be affected.
We are continuing to actively work on resolving this issue. Our first potential mitigation didn't work as expected. We have a second potential mitigation that is performing better in testing. We will update here if/when we ship it.
We are continuing to actively work on this issue. We have identified the issue and are testing a potential mitigation/fix. We are also continuing to investigate the underlying causes of the issue to build confidence our mitigation will work as expected.
Some Octopus Cloud customers are having intermittent issues with package acquisition steps throwing exceptions. We are currently investigating.
Report: "Octopus Cloud Trial Signup - Service Disruption"
Last update: This issue has now been resolved and all systems are working as expected
The cause of the issue has been identified and engineers are working on a fix
The cause has been identified and engineers are working on a fix
We are currently experiencing an issue with Octopus Cloud Trial signups. New Trial instances are not being provisioned as expected. Our team is actively investigating the root cause and working to resolve the issue as quickly as possible. We will provide updates here as soon as more information becomes available. Users attempting to sign up for new Octopus Cloud Trials are unable to provision new trial instances at this time.
Report: "Octopus ID and a subset of Octopus.com unavailable"
Last update: Between 12:15 PM and 12:25 PM AEST, a brief outage affected Octopus ID and a subset of Octopus.com services, including the Profile, Control Centre v1, Blogs, and License Purchasing. During this time, customers were unable to log in to cloud instances using Octopus ID and access certain areas of the site. The issue was caused by an unintended outage during a system upgrade. The issue was identified and resolved promptly, restoring all services within 10 minutes. All impacted systems are fully operational, and no further issues are expected.
Report: "Unable to sign into Octopus ID"
Last update: This incident has been resolved.
A recent configuration change to the Octopus ID Web Application Firewall (WAF) caused some customers to be unable to sign into their Octopus Cloud instance using Octopus ID. The change has been rolled back and customers should be able to sign into Octopus ID and their Octopus Cloud instances again. We are monitoring, but if you are having issues, please get in touch at support@octopus.com
Report: "Octopus.com/signin Unavaliable"
Last update: octopus.com/signin has been up for the last hour; we will continue to monitor it throughout the day. Customers that notified us have confirmed they can now log on to their Octopus Cloud instances.
octopus.com/signin should be available again and we are monitoring at the moment.
We're investigating an issue with Octopus.com. This also affects all customers attempting to sign in to their Octopus Cloud instances.
Report: "Octopus ID and a subset of Octopus.com URLS unavailable"
Last update: Services have been operational and stable since recovery 6 days ago. The incident was caused by an abnormally high frequency of requests, resulting in degraded service. We have relieved the bottleneck and are investigating why the compensating controls we had in place didn't compensate as expected.
Services are currently operational. Our team is continuing to investigate the root cause.
Users are reporting 502 and 504 timeouts when attempting to log into their Octopus.com accounts, limiting access to Octopus Cloud and Control Center. A small subset of Octopus.com URLs are also impacted: Octopus.com/blog and Octopus.com/start.
Report: "Connectivity issues with Octopus.com"
Last update: Azure has resolved the underlying issue causing these connectivity problems.
Azure is continuing the global rollout of their mitigation measures. We continue to monitor this situation, and await the completion of this rollout by Azure engineers.
Azure has reported improved service availability. We will continue to monitor this situation.
Due to an upstream issue with Azure, some users are reporting connectivity issues to Octopus.com, billing.octopus.com, and potentially other Octopus-hosted services.
Report: "Windows2022 Dynamic Workers crashing and failing to lease"
Last update: This incident has been resolved.
Incident has been mitigated and Windows Dynamic Workers are operating normally.
Our upstream provider, CrowdStrike, is currently experiencing an issue which is affecting our ability to lease Windows2022 Dynamic Workers. We will resolve this incident once we have received confirmation from our upstream provider that the issue has been resolved.
Report: "Octopus Cloud - intermittent deployment failures when using Git Credentials - MultipleActiveResultSets error"
Last update: Octopus Cloud customers have reported intermittent failures when deploying using Git Credentials. Deployments fail, showing an error message similar to "There is already an open DataReader associated with this connection which must be closed first. The connection does not support MultipleActiveResultSets". We have identified this problem was introduced by a recent code change, affecting Octopus Deploy versions 2024.3.2940 or higher. All known instances of this issue have been resolved.
Customers have reported intermittent failures when deploying using Git Credentials. Deployments fail, showing an error message similar to "There is already an open DataReader associated with this connection which must be closed first. The connection does not support MultipleActiveResultSets". We have identified this problem was introduced by a recent code change, affecting Octopus Deploy versions 2024.3.2940 or higher. Our engineering team is investigating further to determine a fix. Potential Workaround: Retry the deployment. The problem is intermittent in nature.
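For readers unfamiliar with this error message, the standalone C# sketch below (not Octopus code; the connection string and table names are placeholders) reproduces the general failure mode: issuing a second command on a SqlConnection while a DataReader is still open throws unless MultipleActiveResultSets (MARS) is enabled on the connection string.

```csharp
using System;
using Microsoft.Data.SqlClient; // or System.Data.SqlClient on older stacks

// Placeholder connection string; note that MARS is NOT enabled here.
const string connectionString =
    "Server=localhost;Database=Example;Integrated Security=true;TrustServerCertificate=true";

using var connection = new SqlConnection(connectionString);
connection.Open();

using var firstCommand = new SqlCommand("SELECT Id FROM SomeTable", connection);
using var reader = firstCommand.ExecuteReader();

while (reader.Read())
{
    // Running a second query on the SAME connection while 'reader' is still open throws:
    // "There is already an open DataReader associated with this connection which must be closed first."
    // Adding "MultipleActiveResultSets=True" to the connection string, or closing the first
    // reader before issuing the second query, avoids the error.
    using var secondCommand = new SqlCommand("SELECT COUNT(*) FROM OtherTable", connection);
    var count = (int)secondCommand.ExecuteScalar();
    Console.WriteLine($"{reader.GetInt32(0)}: {count}");
}
```

Enabling MARS, or ensuring the first reader is closed before the connection is reused, are the two standard ways out of this class of error.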
Report: "Dynamic Worker Leases failing in East AU"
Last update: This incident has been resolved.
We have resumed provisioning Dynamic Workers in East AU (from Australia Southeast). We are continuing to monitor for any degradation of service.
Our service provider, Microsoft Azure, has confirmed that the underlying issue has been resolved. We are continuing to monitor for any degradation of service.
We are continuing to see periods of service degradation in the Australia East region. New Dynamic Workers will be provisioned in the Australia Southeast region until we have confirmation from our service provider that the issue has been resolved.
Dynamic Workers are now provisioning successfully in East AU. We are continuing to monitor for any degradation of service.
Our upstream provider, Microsoft Azure, is currently experiencing an issue which is affecting our ability to provision worker virtual machines for customers in East AU. We are working with Azure to resolve the issue. If you experience this problem, deployments utilizing dynamic workers will fail with an error specifying that Octopus Deploy could not obtain a dynamic worker lease. We will resolve this incident once we have received confirmation from our upstream provider that the issue has been resolved.
Octopus Cloud is experiencing issues leasing Dynamic Workers in Australia East. The issue is currently under investigation and this incident will be updated when further details are known.
Report: "Octopus.com/signin unavailable"
Last update: octopus.com/signin has been up and available for the last hour; we'll continue to monitor it, but all is operational again.
octopus.com/signin should be available again and we are monitoring at the moment.
We're investigating an issue with Octopus.com. This also affects all customers attempting to sign in to their Octopus Cloud instances.
Report: "Dynamic Worker Leases failing in WestEU"
Last update: This incident has been resolved.
Dynamic Workers are now provisioning successfully. We are continuing to monitor for any degradation of service.
We continue to work with Azure to investigate the issue. We are also investigating a potential workaround. Another update will be provided within the next hour.
Our upstream provider, Microsoft Azure, is currently experiencing an issue which is affecting our ability to provision worker virtual machines for a subset of customers in WestEU. We are working with Azure to resolve the issue. If you experience this problem, deployments utilising dynamic workers will fail with an error specifying that Octopus Deploy could not obtain a dynamic worker lease. We will provide an update in an hour, at approximately 20:00 UTC. We will resolve this incident once we have received confirmation from our upstream provider that the issue has been resolved.
Report: "New signups to Octopus Cloud are unavailable"
Last update: Signups are available again. We are sorry for any inconvenience.
We have identified an issue preventing new signups from completing. Engineers are currently investigating.
Report: "AWS deployments failing when using service role for EC2 instance"
Last update: This incident has been resolved.
We have implemented a fix in v2024.1.6809 which will be rolling out to cloud customers over the next two days. If you need the fix in your cloud instance sooner, please contact https://octopus.com/support. We will continue to monitor for any further issues.
We have found a likely cause of this issue and are working on a fix. In the meantime, please refer to the Github issue for more updates - https://github.com/OctopusDeploy/Issues/issues/8551
We're investigating AWS deployment failures for customers who have upgraded to 2024.1.6245 and are using the "AWS service role for an EC2 instance" option, which throws the error "An account identifier was not found". For more details of the issue and a possible workaround, see the GitHub issue https://github.com/OctopusDeploy/Issues/issues/8551
Report: "API requests for projects associated with a lifecycle not responding"
Last update: This issue was resolved in 2023.4.8185 and 2024.1.5399.
A fix for this issue has been released in version 2023.4.8185 which is now available for download, and 2024.1.5399 which will be rolling out to cloud customers early in the new year. If you need the fix in your cloud instance sooner, please contact https://octopus.com/support. We will continue to monitor for any further issues.
We are currently investigating an issue where API requests for the projects associated with a lifecycle may hang indefinitely, never responding. We believe we have identified the cause and are working on rolling out a version containing a fix. For more details see this GitHub issue: https://github.com/OctopusDeploy/Issues/issues/8533
Report: "Self-hosted customers experiencing high CPU usage after upgrading to 2023.4.8xxx"
Last update: This issue was resolved in 2023.4.8166.
With the release of 2023.4.8166, we believe we have addressed the memory and CPU usage spikes associated with the variables endpoints. We continue to monitor to ensure these endpoints operate smoothly. For more details see this GitHub issue: https://github.com/OctopusDeploy/Issues/issues/8529
We have implemented a fix in v2023.4.8166. Should you encounter any issues after upgrading, please do not hesitate to reach out to our support team.
We are actively continuing our investigation into the reported issue. Thank you for your ongoing patience as we work towards a resolution.
We are currently investigating this reported issue as a high priority.
Report: "Dynamic Worker Leases failing in WestEU"
Last update: Azure has advised that this issue has been resolved. A preliminary root cause has been published here: https://azure.status.microsoft/en-us/status/history/
Dynamic Workers are now provisioning successfully. We are continuing to monitor for any degradation of service.
We continue to see issues with our upstream provider. We are investigating a workaround. Another update will be provided within the next hour.
Our upstream provider, Microsoft Azure, is currently experiencing an issue which is affecting our ability to provision worker virtual machines for a subset of customers in WestEU. See https://azure.status.microsoft/en-gb/status for more details. If you experience this problem, deployments utilising dynamic workers will fail with an error specifying that Octopus Deploy could not obtain a dynamic worker lease. We will provide an update in an hour, at approximately 07:30 UTC. We will resolve this incident once we have received confirmation from our upstream provider that the issue has been resolved.
Report: "Cloud instances may be missing task log lines or show incorrect status for deployment steps"
Last update: This issue has been resolved and we have not seen any further errors.
A fix has been implemented on all Octopus Cloud instances and we are monitoring for any further errors.
We are currently rolling out a fix for this issue. We will provide an update once all Octopus Cloud instances have the fix applied.
We have identified the cause of the issue and are testing mitigations. We are working on this issue as a priority.
We have identified a workaround. The issue occurs when a deployment's task log is viewed while the deployment is running. If users navigate away from viewing a running deployment, all task log lines should be successfully saved. This will also avoid the issue where deployment steps may show the wrong status. We are working with our upstream provider to resolve this issue.
We have identified an issue where some Octopus Cloud instances have been missing log lines from their task logs since 20 November 2023. Deployment steps that encounter this problem could also show the wrong status (e.g. showing as still running) even though the step has completed successfully. The deployment task itself will have the correct status. We are investigating this issue as a priority.
Report: "Cloud instances unable to communicate with listening and polling tentacles"
Last update: All Octopus Cloud customers have been upgraded to fixed software versions, resolving this incident. Self-hosted customers were not affected.
We have identified the customers that were most likely affected and prioritized applying the fix to these customers on the 9th and 10th of December. The rollout continues to all remaining customers, and we anticipate it will be complete by the 12th of December.
A fix has been created and is being rolled out to affected customers. This issue affects instances of Tentacle that use SHA-1 (an old, unsupported hashing algorithm), and prevents them from connecting to Octopus Cloud. This was a side effect of the Octopus Server .NET 8 upgrade and the resulting change to the underlying OS from Debian 11/OpenSSL 1.x to Debian 12/OpenSSL 3.x. If you continue to be affected by this issue, please try the workaround detailed at https://github.com/OctopusDeploy/Issues/issues/8523, or contact support at https://octopus.com/support.
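If you want to check whether a certificate is SHA-1 signed before trying the workaround, a small generic .NET check along these lines works (the file path is a placeholder and this is not an Octopus tool; it simply inspects the certificate's signature algorithm):

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

// Placeholder path: point this at an exported copy of the certificate you want to inspect.
var certificate = new X509Certificate2(@"C:\temp\tentacle.cer");

Console.WriteLine($"Subject:             {certificate.Subject}");
Console.WriteLine($"Signature algorithm: {certificate.SignatureAlgorithm.FriendlyName}");

// The incident above traced the failures to the move to Debian 12/OpenSSL 3.x,
// which no longer accepts these SHA-1 signatures.
if (certificate.SignatureAlgorithm.FriendlyName?.Contains("sha1", StringComparison.OrdinalIgnoreCase) == true)
{
    Console.WriteLine("This certificate is SHA-1 signed and is likely affected; see the workaround in the linked issue.");
}
```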
The underlying cause has been identified. The issue is currently isolated to cloud customers. A fix is actively being worked on.
We have identified a suspected root cause and workaround. We suspect this is limited to certificates using a SHA1 algorithm but are investigating further. If affected please try the workaround detailed in this issue: https://github.com/OctopusDeploy/Issues/issues/8523
Since the morning of Friday 8th December, cloud customers have been reporting issues communicating with listening and polling tentacles. We are investigating this as a priority incident and will work on a resolution as soon as possible.
Report: "Manual intervention steps on Configuration-as-code projects may throw errors"
Last update: This incident has been resolved. A public incident report will be made available to affected customers. If you would like a copy, please get in touch with us at support@octopus.com.
We are now in the process of rolling out a version containing a fix to resolve the issue. This will be available to affected customers and we will be monitoring for any further issues.
We have identified the root cause of this issue and are working on a hotfix build.
We are investigating reports of Octopus Cloud users on 2024.1.746 having issues in the following scenario:
* Configuration-as-code projects with a Manual Intervention step, with the "Responsible team" set to a System-level team.

More details, including a possible workaround, can be found in the following GitHub issue: https://github.com/OctopusDeploy/Issues/issues/8473
Report: "library.octopus.com is inaccessible"
Last update: library.octopus.com has been restored and Community Step Template Update sync is working correctly again.
We are currently investigating this issue. Due to the nature of how Community Library Templates work, this outage will not block deployments. Customers will still be able to deploy and utilise steps already cached on their instance via the regular Community Library Template Sync process, if they have this feature enabled. Customers who have the feature enabled may see the Sync task failing, but their cached templates will not be affected.
Report: "Investigating bug reports related to step execution logic"
Last update:

# Incident Report - Deployments run more than once in High Availability (HA) clusters

# Summary

Octopus Server 2023.3 contained a bug causing self-hosted Octopus Server High Availability (HA) clusters to run some deployments more than once, often concurrently. This resulted in incorrect task statuses and confusing task logs.

The bug was caused by a feature-flagged change to our internal TaskQueue. That change removed a database write lock that stops multiple Octopus Server nodes in a High Availability (HA) configuration from picking up the same task. The write lock removal was accidentally left un-flagged. Without the write lock in place, multiple Octopus Server nodes could execute the same task concurrently. This primarily presented as incorrect or out-of-order logging of tasks in a deployment. The issue could affect any self-hosted customers running HA mode on 2023.3 releases below 2023.3.13026.

Once we received a report that the issue was impacting the correctness of task execution and not only its display, we escalated immediately and resolved the issue as quickly as possible. We know how critical deployments are for our customers, and we take the trust they have in us to execute those deployments correctly very seriously. We apologise to our customers for not meeting our own standards of correct deployment execution.

## Timings

* Time to detection: 14 days (from GA of 2023.3 to first report)
* Time to incident declaration:
  * 3 days (from initial report of log ordering issue)
  * 27 minutes (from first report of incorrect execution)
* **Time to resolution: 27 hours 25 minutes**

# What happened?

From Monday 18 September 2023, we received customer reports that task statuses, outputs and ordering were displaying incorrectly. Our Support team worked with our customers to troubleshoot common reasons for incorrect task display, and escalated to our Engineering team when they couldn't resolve the issue. On Thursday 21 September 2023, a customer reported that tasks were executing out of order. On Friday 22 September 2023, we identified that a change to our task queue, which caused the same task to execute on multiple Octopus Server nodes in HA mode, had been released in 2023.3. We fixed the issue immediately and contacted affected customers.

We received reports from four affected customers, and identified a total of 24 customers who were using the impacted versions. We have contacted all 24 customers.

(All times in AEST)

# Technical details of the problem

Octopus Server can either be run as a managed instance in Octopus Cloud, or hosted by our customers on their platform of choice. Octopus Cloud gets changes continuously, and for self-hosted customers, Octopus Server has major releases [four times a year](https://octopus.com/blog/long-term-support#introducing-the-octopus-server-lts-program), with each release rolling up all the changes from the last three months. Some complex or early access features will only target the next major version and not be backported to previous supported LTS versions.

[Octopus Server High Availability](https://octopus.com/docs/administration/high-availability) (HA) mode is only used by self-hosted customers. In HA, multiple nodes of Octopus Server run concurrently and distribute tasks between them. Octopus Server uses the task queue persisted in the shared database to manage task execution across nodes.

Octopus Deploy has been working on a fix for an issue where deployments would "hang", getting stuck in a `Cancelling` state and not progressing. Under the hood, deployments and other work are represented as a `ServerTask`, and they are added to a `TaskQueue`. The first iteration of a fix changed how the database handled conflicting updates to the `ServerTask` entity, and required flow-on changes to the `TaskQueue`. It was added to the 2023.3 release behind a feature flag which defaulted to off. One of the changes was removing a write lock that Octopus Server nodes used to indicate they were executing a specific `ServerTask` on the queue. The write lock removal should have been behind the feature flag, but was mistakenly shipped as a universal change.

The Pull Request containing the problem was merged in June 2023 and has since been running in CI environments and on the Cloud platform. The issue didn't show up in those environments because they don't use HA mode, and only HA mode has multiple Octopus Server nodes contending to execute tasks. When 2023.3 was released in September, the problem started appearing, and only for self-hosted customers.

The fix was to put the write lock back in place on the `TaskQueue`. Replacing the lock was a small change that was quick to test and ship. The work to reduce hung deployments isn't used in Production environments yet, so there was no concern about interactions between the fix and the feature flag.

# Remediation and next steps

We have removed all affected releases from public availability. The fixed version of 2023.3 is available on our [downloads page](https://octopus.com/downloads). We have also reached out to all potentially affected self-hosted customers.

Our next step will be running an incident review to understand where our processes allowed us to ship a critical bug. We have identified that we need to improve our automated testing of HA and our process around how we manage changes to those tests, and will be addressing these as a priority.
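A minimal sketch of the kind of database-backed "write lock" the report describes (the table and column names here are hypothetical, and this is not the actual TaskQueue implementation): each HA node claims a queued task with a conditional UPDATE, and only the node whose update affects a row goes on to run the task.

```csharp
using Microsoft.Data.SqlClient;

static class TaskQueue
{
    // Atomically claim a queued task for this node. If another node has already claimed it,
    // the WHERE clause matches no rows and this node skips the task, so the same
    // ServerTask cannot run on two HA nodes at once.
    public static bool TryClaimTask(SqlConnection connection, string taskId, string nodeName)
    {
        const string claimSql = @"
            UPDATE ServerTask
               SET State = 'Executing',
                   ExecutingNode = @nodeName
             WHERE Id = @taskId
               AND State = 'Queued'
               AND ExecutingNode IS NULL";

        using var command = new SqlCommand(claimSql, connection);
        command.Parameters.AddWithValue("@taskId", taskId);
        command.Parameters.AddWithValue("@nodeName", nodeName);

        return command.ExecuteNonQuery() == 1; // exactly one node wins the claim
    }
}
```

In the incident above, shipping the removal of this kind of guard outside its feature flag is what allowed more than one node to pick up the same task.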
We have published a fix: https://octopus.com/downloads/2023.3.13026. A public incident report will be shared with affected customers. If you would like a copy, please get in touch with us at support@octopus.com.
We've identified a very likely root cause. We made some changes to our task queues that should have been behind a feature flag, but a change to remove a write lock on the task queue table was accidentally left un-flagged. This means that multiple nodes could pick up the same task accidentally. This confirms that the incident will only affect self-hosted customers using High-Availability mode. We're working on a fix now and should have it available later on today. There are two potential workarounds, although we know that they are not good ones. Moving to a single node instead of HA will be safe as it removes task queue contention. You could also drain all of the nodes, then turn one of them on at a time. Allow each node to pick up some tasks, then drain it, and turn on the next. This approach would be extremely manual and we don't recommend it.
We continue to investigate isolated reports from a limited number of self-hosted customers on Octopus Server 2023.3 of this bug: https://github.com/OctopusDeploy/Issues/issues/8356. Out of an abundance of caution, we have temporarily removed the 2023.3 release from our downloads page while we continue to investigate.
We are currently investigating reports from a limited number of self-hosted customers on Octopus Server 2023.3 of this bug: https://github.com/OctopusDeploy/Issues/issues/8356. We will update that bug as our investigation progresses.
Report: "Octopus cloud AU region outage"
Last update:

# Octopus Cloud Australia East outage - report and learnings

Between 10:48am UTC (8:48pm AEST) and 6:55pm UTC on Wednesday, August 30, 2023 (4:55am AEST on Thursday, August 31, 2023), Octopus Cloud customers in the Australia East region experienced an outage of their Cloud instance. Additionally, between 12:17pm UTC (10:17pm AEST) and 3:19pm UTC (1:19am AEST + 1d), the remaining customers in the Australia East region whose instances were up would have been unable to perform deployments that used Dynamic Workers. This disruption was caused by a cooling issue in one of Microsoft Azure's Australia East datacenters.

## **Key timings**

| **Event** | **Time period** |
| --- | --- |
| Time to detection | 31 mins |
| Time to incident declaration | 40 mins |
| Time to resolution | 8 hrs 7 mins |

## **Incident timeline**

_(All dates and times below are shown in UTC)_

### Wednesday, August 30, 2023

_**10:48 (20:48 AEST)**_ 50% of Cloud Instances in Australia East went down.

_**11:19 (21:19 AEST)**_ A support engineer acknowledged an automated alert and began investigating.

_**11:31 (21:31 AEST)**_ Our internal incident response process was initiated.

_**11:55 (21:55 AEST)**_ Status Page updated: An incident was declared.

_**12:17 (22:17 AEST)**_ Dynamic Workers in Australia East went down.

_**15:06 (01:06 AEST + 1d)**_ Status Page updated: We are still monitoring.

_**15:19 (01:19 AEST + 1d)**_ Dynamic Workers in Australia East came online.

_**17:54 (03:54 AEST + 1d)**_ Service was restored to 97% of Cloud Instances in Australia East.

_**18:11 (04:11 AEST + 1d)**_ On-call engineer commenced remediation efforts on the remaining instances that were not online.

_**18:55 (04:55 AEST + 1d)**_ All Cloud Instances up.

_**19:00 (05:00 AEST + 1d)**_ Status Page updated: All cloud instances are back online.

_**21:01 (07:01 AEST + 1d)**_ Status Page updated: Incident resolved.

## **Technical details**

As designed, our services automatically came back online as Microsoft's Azure services were restored. A handful of Cloud Instances required manual intervention; this was expected, as these instances were undergoing scheduled maintenance until they were interrupted by the outage.

### Microsoft Azure's technical details

> Starting at approximately 08:30 UTC on 30 August 2023, a utility power surge in the Australia East region tripped a subset of the cooling units offline in one datacenter, within one of the Availability Zones. While working to restore cooling, temperatures in the datacenter increased so we proactively powered down a small subset of selected compute and storage scale units, to avoid damage to hardware.

Source: [https://azure.status.microsoft/en-us/status/history/](https://azure.status.microsoft/en-us/status/history/) (Incident Tracking ID: VVTQ-J98), retrieved on Thursday, August 31, 2023.

## **Remediation**

Octopus takes service availability seriously. Despite the difficulty with upstream cloud provider outages, we fully review and remediate any outages that occur. We do this so that we're continuously improving and maintaining the best possible service we can. We are aiming to reduce the time between a Cloud Instance going down and a human being notified, and to reduce the time to publish a Status Page notification to better inform our customers.

## **Conclusion**

We deeply value the trust you place in our services, and we understand the importance of maintaining that trust. The recent service disruption was a significant event for us, and it has highlighted areas where we can enhance our processes. We are taking active steps to improve our notification and response mechanisms, ensuring that you are informed promptly and accurately. We appreciate your patience and are committed to delivering the consistent and reliable service you expect from us.
This incident has been resolved.
Our upstream provider has mitigated this incident. All cloud instances are back online. Our team will continue monitoring the situation for any issues from the upstream outage.
Our upstream provider has not yet provided an ETA for resolution for the AU region outage affecting a number of Octopus Cloud customers. We are still monitoring the situation and will continue to provide periodic updates.
We are aware of an outage affecting our Australian-hosted Octopus Cloud customers. Unfortunately, this outage is with our provider in this region. We will continue to monitor the situation and update the status page as more information becomes available.
Report: "Unable to view pages after upgrade"
Last update: Upgrading to 2023.3.10333 resolves the issue. Cloud instances will be upgraded over the coming days. If you wish to be upgraded sooner, please contact our support team.
We are currently investigating. If you are affected, clearing the browser cache and reloading the page should resolve the issue. More details can be found here: https://github.com/OctopusDeploy/Issues/issues/8277
We are currently investigating. If you are affected, clearing the browser cache should resolve the issue.
Report: "Octopus Cloud Connectivity Issue"
Last update: This incident has been resolved.
We're investigating a network connectivity issue with Octopus Cloud. This may cause issues:
- when accessing Octopus Cloud instances, including API requests
- for polling tentacles over standard ports (443)
Report: "Error viewing Projects after upgrading to 2023.2.X"
Last update: Incident resolved. The fix is available in the GA release 2023.2.12331.
A fix has been implemented and is being rolled out to customers.
We have identified an issue with viewing projects, with an error "Object reference not set to an instance of an object". An issue has been raised and a workaround is available. More details can be found here: https://github.com/OctopusDeploy/Issues/issues/8200. Please use the workaround until a patch is released. Contact support@octopus.com with any questions and we'll be happy to help.
Report: "SQL Timeouts and Errors Impacting Cloud Customers in the West US Region"
Last update: This incident has been resolved.
This issue looks to have stemmed from an upstream issue with Azure. All impacted instances have recovered, and we will continue to monitor this situation.
We are currently investigating some instances experiencing SQL timeouts. This looks constrained to the US West region.
Report: "AWS authentication issue for customers using the "Run an AWS CLI Script" step"
Last update: We are marking this issue as resolved. A fix has rolled out to customers.
A fix is being deployed to our Cloud instances during their maintenance windows. If you require the fix immediately, please contact support@octopus.com to arrange this.
We have identified an issue with AWS credentials not being passed while using the AWS CLI Script step. An issue has been raised and a workaround is available; more details can be found here: https://github.com/OctopusDeploy/Issues/issues/8177. Please use the workaround until a patch is released. Contact support@octopus.com with any questions and we'll be happy to help.
We are currently investigating an AWS authentication issue affecting Cloud customers using the "Run an AWS CLI Script" step
Report: "Incorrect Resource Usage values displayed in Control Center for Octopus Cloud instances"
Last update: This incident has been resolved.
We have identified and are in the process of fixing an issue where the Resource Usage values displayed for Octopus Cloud instances in Control Center are incorrect.
Report: "Issue preventing login with Microsoft/AzureAD on Octopus.com and Octopus Cloud"
Last update: The fix previously applied caused a mismatch to occur with AzureAD credentials for some users. Steps to resolve:
* Log in with username/password via the "forgot password" function
* Go to the Profile page: https://octopus.com/profile
* "Remove" the "Organization account", then re-add it.

Please reach out to support@octopus.com for further assistance if needed.
Following the resolution of the previous AzureAD sign-in issue, we have had reports of some users receiving the following error when attempting to sign in: "Server Error - We're sorry, an unexpected error occurred whilst processing this request. Please try again later or contact support". Our engineers are investigating, and we will provide updates. Workaround: If you use this mechanism to log in, you can fall back to a username (your email) and a new password. You can follow the "forgot password" mechanism to set up a new password.
Report: "Issue identified with Octopus.com and Octopus Cloud Microsoft/AzureAD sign in."
Last update: We have re-enabled Azure AD login and verified the system is operating as expected. If you used Azure AD to sign into Octopus ID, you may have been logged out as part of this resolution. You should be able to sign in again smoothly, but if you are having issues or have questions, please reach out to our support team.
As promised, we have re-enabled the AzureAD login and are close to marking this as resolved. Do not hesitate to contact us via support if you need help / have questions/hit issues. We'll monitor for a touch longer to ensure there aren't issues as we're all heading into the weekend. Thank you all for your patience!
We're ready to move this into "identified" as we had some extended investigation to complete. If all goes well, we can hopefully mark this resolved approx 24 hours from now. If not, we'll keep our wonderful customers updated. Once again, we thank you for your patience and for using the username and password workaround if impacted by our work on the AzureAD/Microsoft auth flow.
Thanks for your patience. We're moving forward with changes, but they are taking time as it involves 2+ internal teams coordinating changes. Please continue using the workaround, and rest assured we will restore the Microsoft/Azure login mechanism as soon as possible.
Octopus Deploy is aware of an issue with logging into Octopus.com and Octopus Cloud instances for customers who use Microsoft accounts and AzureAD sign-in. Our engineers are investigating and we will provide updates. Workaround: If you use this mechanism to log in, you can fall back to a username (your email) and a new password. You can follow the "forgot password" mechanism to set up a new password.
Report: "Intermittent errors in West Europe"
Last update:

# Dynamic Worker Outage in West Europe - report and learnings

From 3:03am UTC our Octopus Cloud infrastructure in West Europe was unable to provision new Dynamic Workers. Customers were impacted between 5:15am and 6:51am UTC on Thursday, March 23, 2023. Twenty-three Octopus Cloud customers in West Europe were affected during this time period and could not lease Dynamic Workers to run deployments and runbooks. _We're sorry, and we're taking steps to minimize the occurrence and impacts of similar events in the future._

## Key timings

## Background

Octopus Cloud uses [Dynamic Workers](https://octopus.com/docs/infrastructure/workers/dynamic-worker-pools) to execute workloads. During this incident, Dynamic Workers were unavailable for 23 customers, who were therefore unable to execute any of their Deployments and Runbooks that relied on Dynamic Workers.

## Incident timeline

_(All dates and times below are shown in UTC)_

### Thursday, March 23, 2023

02:41 One of our upstream dependencies, Azure Resource Manager (ARM), started returning 503 responses ([Incident Tracking ID: RNQ2-NC8](https://azure.status.microsoft/en-us/status/history/))

**03:03 The first Dynamic Worker provisioning failure occurred. At this time, our pre-provisioned pool of Dynamic Workers continued to operate and serve all customer workloads**

04:01 Internal monitoring alerted us about anomalous provisioning failures

04:13 We initiated our incident response process

04:14 We confirmed a sharp rise in 503 responses from ARM

04:17 We disabled automated internal infrastructure functions to limit the number of customers impacted by this issue

**04:31 Alerted customers to the incident via [status.octopus.com](http://status.octopus.com)**

04:38 We created a ticket with Azure (Sev A)

**05:15 Our pooled resources were exhausted, leading to the first customer impact**

05:39 As a potential mitigation, we decided to start provisioning additional infrastructure in an alternate region within Europe

06:04 Azure confirmed the outage

06:51 We observed that Dynamic Workers were beginning to recover

**06:59 Alerted customers that the incident was mitigated via [status.octopus.com](http://status.octopus.com)**

07:10 Azure incident resolved

07:10 We confirmed alternate infrastructure was available for failover if the issue recurred

## Technical details

Dynamic Workers make heavy use of ARM to provision Workers for customer workloads. An outage with ARM meant that we could not provision new Workers in the West Europe region. We maintain a pre-provisioned pool of Workers, but they were depleted after around two and a half hours.

## Remediation and next steps

We have identified improvements to our alerting to reduce the time it takes for us to detect similar incidents. We're prioritizing these improvements using our Risk Treatment Policy. Currently, we rely heavily on single-region availability in Azure. We are evaluating our options to diversify the regions that we use, to mitigate regional availability issues.
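As a rough illustration of the pooling behaviour described in the report (the interface and type names here are invented for the sketch and are not the Dynamic Worker codebase): leases are served from a pre-provisioned pool first, and on-demand provisioning through the cloud provider mainly tops the pool back up, which is why customers were unaffected until the pool drained roughly two and a half hours into the ARM outage.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Invented types for illustration only.
record Worker(Guid Id);

interface IWorkerProvisioner
{
    Task<Worker> ProvisionAsync(); // calls the cloud provider (e.g. ARM) and may fail during an outage
}

class DynamicWorkerPool
{
    private readonly ConcurrentQueue<Worker> pooledWorkers = new();
    private readonly IWorkerProvisioner provisioner;

    public DynamicWorkerPool(IWorkerProvisioner provisioner) => this.provisioner = provisioner;

    public async Task<Worker> LeaseAsync()
    {
        // Serve leases from the pre-provisioned pool; customers only see failures
        // once the pool is exhausted AND on-demand provisioning is down.
        if (pooledWorkers.TryDequeue(out var worker))
        {
            _ = ReplenishAsync(); // top the pool back up in the background
            return worker;
        }

        return await provisioner.ProvisionAsync(); // last resort once the pool is empty
    }

    private async Task ReplenishAsync()
    {
        try
        {
            pooledWorkers.Enqueue(await provisioner.ProvisionAsync());
        }
        catch (Exception)
        {
            // Provisioning outage (e.g. ARM 503s): the pool slowly drains instead of leases failing immediately.
        }
    }
}
```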
Azure has advised that this issue has been resolved. A preliminary root cause has been published here: https://azure.status.microsoft/en-us/status/history/ (03/23/2023 - Azure Resource Manager Operations Failures - Mitigated, Tracking ID: RNQ2-NC8)
Dynamic workers are now provisioning successfully. We are continuing to monitor for any degradation of service.
Azure are aware of this issue and are actively investigating. See the Azure status page for ongoing updates: https://azure.status.microsoft/en-us/status
We are experiencing issues provisioning dynamic workers in West Europe. This may affect deployments or runbooks relying on dynamic workers. We are working with Azure to have this operational as soon as possible. If you have urgent tasks relying on dynamic workers please contact support@octopus.com.
We are investigating an issue with our cloud vendor that may affect customers in the West Europe region.
Report: "Octopus.com sign-in, docs, blogs and downloads unavailable"
Last update:

Between 8:00am and 10:30am UTC, February 9, 2023, sections of [octopus.com](http://octopus.com) intermittently returned 503 responses. The affected routes were /signin, /blogs, and /docs.

## Background

Octopus Deploy recently migrated our DNS management to a new provider to centralize our infrastructure. During the migration, we set the web application firewall (WAF) in front of [octopus.com](http://octopus.com) to detection mode. At the same time, we tuned the ruleset to prevent false positives from blocking legitimate customer access to Octopus systems.

## Key timings

## Timeline

_(All dates and times below are shown in UTC.)_

### Thursday, February 9, 2023

_08:05_ Our automated systems detected decreased availability in sections of the [octopus.com](http://octopus.com) website.

_08:35_ Engineers on call were notified.

_08:56_ Status Page updated: An incident was declared.

_10:30_ We updated the WAF to block malicious traffic.

_10:48_ Status Page updated: Incident status changed to `Monitoring`.

_12:24_ Status Page updated: Incident status changed to `Resolved`.

## What happened?

An attacker ran a fuzzing application across our public-facing website during the time the WAF was in "detection" mode. This caused excessive load that would normally have been prevented by the WAF, in turn reducing availability of [octopus.com](http://octopus.com). Engineers mitigated the outage by applying a cut-down implementation of the WAF that protected the website from single-origin attacks.

## Remediation and next steps

Since this incident, we've completed the migration to our new DNS provider, and the WAF is fully enabled. During our incident review process, we identified and corrected gaps in our defense to reduce the time from detection to mitigation. We identified the internal oversight in risk management that led to this situation: by mitigating one risk, we became susceptible to another risk. We have since updated our project risk assessment process to include more formal internal reviews of our planned changes to core systems.

## Conclusion

Octopus Deploy takes service availability seriously. In the past month, we've had multiple incidents affecting sign-in infrastructure, which is below our desired standard. We apologize for the disruption to our customers and are working to reduce the likelihood and severity of future disruptions.
This incident has been resolved.
We have applied a mitigation that will improve the availability of the affected URLs and are monitoring its effects.
We are aware of issues affecting parts of Octopus.com including /signin, /docs, /blog, and /downloads. Engineers are investigating.
Report: "Authentication errors when attempting to sign into https://billing.octopus.com"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating issues signing into https://billing.octopus.com and requests returning HTTP 500s.
Report: "Octopus.com pages (signin/docs/downloads etc.) are returning 404 errors for some users"
Last update: **Postmortem** - [Read details](https://status.octopus.com/incidents/1zrm6sb0ngc4).
This incident has been resolved.
Our upstream provider has applied a fix that should resolve this issue. We are continuing to monitor this on our side. If you are continuing to see 404 errors navigating to parts of Octopus.com (sign-in, documentation, blog etc.), please contact support@octopus.com
Mitigations didn't hold. We're back to investigating new options. Signin / docs and blog are still impacted.
**Advice**: go to www.octopus.com (your browser cache will be in your way), and try incognito/private browsing. We are 302 redirecting from octopus.com to www.octopus.com and are seeing signs of improved availability for the impacted systems (/signin, /docs, /blog). Northcentral US may still experience a partial outage depending on how your traffic is routed. Most regions should be healthy or just experience some degraded performance. The only impact to Octopus Cloud is to sign-in capabilities.
We have identified a cause of this incident and are working to remediate.
We have escalated with our Cloud provider and are working with them to resolve. Further updates will be posted when they are available.
We are continuing to investigate and have escalated internally to specialist engineers for additional support. Further updates will be posted when they are available.
We currently believe that this is only affecting certain geographical locations and their related CDN endpoints. We are continuing to investigate and have engaged our CDN provider for further assistance/investigation. Further updates will be posted when they are available.
We are continuing to investigate this issue
We are currently investigating this issue.
Report: "Octopus Cloud and Octopus.com log in issues"
Last update: We have published an Incident Report to our blog - read it [here](https://www.octopus.com/blog/cloud-connectivity-disruption-report-learnings).
Resolved, no longer seeing impacts to customers.
Azure has resolved their major network outage.
Azure status is reporting widespread networking impact and issues. We are impacted by Azure's availability now.
We are continuing to investigate this issue.
We have detected DNS and Azure issues, which are impacting our ability to monitor systems, and impacting customers using Octopus systems.
Report: "Octopus Control Center (planned outage)"
Last update: This has been resolved.
To facilitate DNS changes, Octopus Control Center will be partially or completely unavailable for up to 30 minutes. We apologise for any inconvenience this may cause.
Report: "Some Cloud Instances in West Europe are showing the "Undergoing Maintenance" page"
Last update: The affected Cloud Instances are now operational.
Some Cloud Instances in the West Europe region are showing the "Octopus Server is Undergoing Maintenance" page. We are currently investigating this issue.
Report: "Brief interruption in the West US 2 region"
Last update: Some instances in the West US 2 region experienced a brief database connectivity issue between 05:37 and 05:43 UTC. Individual Octopus Cloud instances experienced issues for ~60 seconds. We are investigating the root cause.
Report: "Intermittent Dynamic Worker leasing failures"
Last update: This incident has been resolved.
A fix has been implemented for a vendor issue and we are monitoring the results.
We are currently investigating an issue affecting provisioning of new Dynamic Workers. Leasing of new Dynamic Workers may intermittently fail in all regions.
Report: "billing.octopus.com control center access control section is unavailable"
Last update: Control Center functionality is back to normal.
Engineers are investigating this partial outage. All customers with monthly or annual cloud subscriptions are impacted. The impact is isolated to the ability to manage your access control list. Customers cannot add or remove access grants until this is resolved. Access to Octopus Deploy cloud instances is not impacted.