Historical record of incidents for amazee.io
Report: "Partially available workloads after maintenance"
Last update: After the maintenance on us2, some workloads are only partially available. We are investigating the issue and are working on mitigations to make these workloads available again.
Report: "Regular Maintenance - EMEA"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
We are conducting regular maintenance on our systems.
Report: "Regular Maintenance - APAC"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
We are conducting regular maintenance on our systems.
Report: "Regular Maintenance - AMERICAS"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
We are conducting regular maintenance on our systems.
Report: "MySQL 8 Update for Switzerland (CH4) Environments"
Last update: The scheduled maintenance has been completed.
New environments created in our Switzerland (CH4) region will be provisioned with MySQL 8 by default. What this means for you:
- Existing environments will remain on MySQL 5.7 (no action required)
- Only applies to newly created environments in CH4
- Most applications will experience no issues with MySQL 8
This is phase 1 of our MySQL upgrade plan. We'll provide separate communication about upgrading existing environments in the future. If you experience any compatibility issues with your application on newly created environments, please contact our support team.
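To check which MySQL version a given environment is running, you can query the database from the environment's cli pod. This is a minimal sketch assuming the standard Lagoon MARIADB_* service variables are present; verify the variable names for your project before relying on them.

```sh
# Minimal sketch: print the MySQL server version from inside the cli pod.
# The MARIADB_* variables are the standard Lagoon service variables and
# are an assumption here; adjust to your project's configuration.
mysql -h "$MARIADB_HOST" -u "$MARIADB_USERNAME" -p"$MARIADB_PASSWORD" \
  -e "SELECT VERSION();"
```

A newly created CH4 environment should report an 8.x version, while existing environments continue to report 5.7.x.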
Report: "Varying support coverage during holiday season"
Last update: This incident has been resolved.
From April 18th to April 21st our support coverage varies due to the holiday season. While we monitor the platform as usual, you might experience slower response times on support cases. For critical issues that affect production services, please open a support ticket and call the emergency number noted in your contract.
We are continuing to monitor for any further issues.
From April 18th to April 21st our support coverage varies due to the holiday season. While we monitor the platform as usual, you might experience slower response times on support cases. For critical issues that affect production services, please open a support ticket and call the emergency number noted in your contract.
Report: "Fastly Certificate Error"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are currently investigating some discrepancies in our Fastly certificate automation. While existing certificates remain unaffected, you may encounter errors during deployment.
Report: "Global docker image registry errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Logging partially available"
Last update: This incident has been resolved.
We identified a higher than usual load on the logging system and assigned it more resources. The logging system is partially available while it is restarting.
Report: "Volumes in read-only mode"
Last update: With the latest applied patch, the issue has been resolved.
We applied a patch that resolves the read-only issue which some workloads experienced.
We are investigating multiple cases of volumes that were only mounted in read-only mode leading to failed writes for applications. While we are working on a permanent solution, please do reach out to our support should you see your application being impacted by this.
Report: "Volumes in read-only mode"
Last update: All potentially affected workloads have been restarted.
We have updated the impacted regions to include au2.lagoon.
We have rolled out a patch and will restart possibly affected workloads.
We are investigating multiple cases of volumes that were only mounted in read-only mode leading to failed writes for applications. While we are working on a permanent solution, please do reach out to our support should you see your application being impacted by this.
Report: "Volumes in read-only mode"
Last update: Since we rolled back some components, we have not seen any further issues with volumes being mounted read-only.
Since the rollback we have not seen any volumes falsely mounted as read-only. We will keep monitoring the situation.
As a temporary workaround we are immediately reverting some components to a previous version. The impact on workloads is comparable to a regular maintenance window.
We are investigating multiple cases of volumes that were only mounted in read-only mode leading to failed writes for applications. While we are working on a permanent solution, please do reach out to our support should you see your application being impacted by this.
Report: "GCP MySQL 8 Migration Update"
Last update: MySQL 8 Upgrade Schedule:
- Development environments: February 26, 2025
- Production environments: March 12, 2025

Impact:
- Expected database connection interruption: <10 minutes per environment
- Affected clusters: CH4, FI2, and US3

What to expect:
- No action required from customers
- Our team will monitor upgrades and handle any issues
- Status updates will be provided during maintenance windows
We are actively working on implementing a seamless migration to MySQL 8 on our Google Cloud Platform (GCP) environments. Our engineering team is in the final stages of the testing phase, ensuring a smooth transition with minimal impact on our services. We are committed to maintaining system stability throughout this upgrade process. We will announce the specific migration date and detailed timeline next week, along with any relevant instructions for our users.
Report: "Postgresql US2 upgrade to 14.10"
Last update: This incident has been resolved.
We are performing the PostgreSQL upgrade.
We will perform an upgrade of PostgreSQL on the US2 cluster to version 14.10.
Report: "GCP MySQL 8 Upgrade"
Last update: The planned MySQL 8 upgrade for GCP development environments is currently paused while we investigate implementation challenges. This delay affects only test/development environments and does not impact current production systems. We are actively working on resolving these issues and will announce a new upgrade schedule through the status page once our investigation is complete. Customers planning to test their applications with MySQL 8 will be given advance notice before the upgrade resumes.
During a pre-upgrade check for the upgrade of the development environments to MySQL 8 we identified a technical requirement that we need to test further before we do the upgrade. Therefore we will not be upgrading the development environments during today's maintenance windows.
Report: "Intermittent SSH access issues"
Last update: The issue has been resolved. It was caused by a communication fault between the remotes and the core.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We're investigating intermittent SSH access issues on cloud clusters.
Report: "JSM Assist sync issues"
Last update: The issue has been resolved by Atlassian.
We are monitoring updates from Atlassian for JSM Cloud customers concerning a sync issue with Assist. You might experience delays awaiting responses from our Support team.
Report: "Failing image builds"
Last update: The caching issue has been resolved and image builds are fully operational.
Some image builds on DE3 are timing out due to an issue with the build cache. We are working on a solution.
Report: "Delayed logs"
Last update: The backlog of logs was processed completely and logs are showing up as usual in the logging system.
Logs might not appear in the logging system as quickly as usual due to a larger backlog that is currently being processed. Real-time logs through the Lagoon CLI (https://docs.amazee.io/cloud/logging/#real-time-logs-via-lagoon-cli) are not affected by this.
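For reference, tailing logs live with the Lagoon CLI sidesteps the delayed logging system entirely. This is a sketch only; the flag names below are our reading of the linked documentation and should be verified there.

```sh
# Tail live logs for a single service of an environment.
# Project, environment, and service names are placeholders.
lagoon logs -p my-project -e main -s nginx -f
```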
Report: "Let's Encrypt Certificate creation issues"
Last update: New certificates are again issued without any delays.
A fix has been implemented and we are monitoring the results.
Customers with valid certificates are not impacted; only newly issued certificates seem to take longer than usual to be loaded onto the route. If you see immediate issues, please get in touch with Support.
Report: "DE3 production MySQL 8 upgrade"
Last update: For DE3 the upgrade of the production database to MySQL 8 was completed successfully.
A fix has been implemented and we are monitoring the results.
The DE3 cluster is running on MySQL 8.
Report: "MySQL 8 upgrade"
Last update: This incident has been resolved.
Hello Team! We are upgrading from MySQL 5.7 to MySQL 8. We expect limited downtime during the maintenance window.
Report: "Degraded performance on UK3 MySQL production databases"
Last update: This incident has been resolved.
A database failover was executed to resolve the performance issue and we are monitoring the situation.
We are currently investigating this issue.
Report: "Absent router logs"
Last update: This incident has been resolved.
The router logs from ch4 were not shipped to the logging infrastructure from 2024-07-24 15:53 to 2024-08-12 06:16 UTC due to a misconfiguration in the logging system. Application and container logs were not impacted by this misconfiguration and are available as usual.
Report: "Status update delays of builds and tasks, and webhooks not being processed"
Last update: The status and webhook delays are now resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring. Webhooks may be delayed while the received webhook queue is processed.
The issue has been identified and a fix is being implemented.
Report: "Changes in un-idling behavior"
Last update: This incident has been resolved.
Workloads in non-production environments will only be un-idled when a client accessing them can run JavaScript. This change will prevent most cases of undesired un-idling triggered by automated requests. More information on environment idling can be found in the Lagoon documentation https://docs.lagoon.sh/concepts-advanced/environment-idling/.
Report: "Degraded database performance on FI2 MySQL production"
Last update: This incident has been resolved.
We are continuing to work on a fix for this issue.
In order to stabilize the performance of the database we will trigger a failover. This will lead to short interruptions for applications connecting to the database.
Report: "Fastly API Issues"
Last update: The issue has been resolved upstream.
Disruptions in access to manage.fastly.com, configuration propagation, and access to the Fastly API have been fixed. We are monitoring the current API state.
We are currently investigating problems caused by an upstream issue with Fastly: https://www.fastlystatus.com/incident/376458
Report: "Intermittent Workload Restarts"
Last update: This incident has been resolved.
We're still investigating certain workload restarts. Some workloads seem to trigger conditions on the compute nodes that lead to all workloads on the affected node being rescheduled.
A fix has been implemented and we are monitoring the results.
Following up from the earlier Incident regarding the intermittent workload restarts: We'll run an additional maintenance window after 21:00 UTC today to move workloads onto a new set of compute nodes. This action should stabilize the intermittent workload restarts we are seeing.
Report: "Lagoon tasks error out"
Last update: We've resolved the issue now for the majority of users. A previous update identified a workaround in the unlikely event that you may still experience the issue. Reach out to support if you do encounter the error and aren't quite sure how to resolve it.
A fix has been implemented and we are monitoring the results.
After the release of Lagoon 2.18, triggering tasks that require the cli pod from the UI will result in this error: `Environment <id> has no service cli`. A short-term fix is to trigger a deployment, OR to run this API mutation for the environment the task is broken on:

  mutation {
    addOrUpdateEnvironmentService(input: {
      environment: <environment-id>
      name: "cli"
      type: "cli"
    }) {
      id
      name
      type
    }
  }

The Lagoon team is working on a permanent fix.
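For illustration, a mutation like the one above can be submitted with curl. This sketch assumes the amazee.io Lagoon API endpoint and the SSH-based token retrieval described in the amazee.io documentation; verify both against the current docs before use.

```sh
# Fetch a short-lived API token via the Lagoon SSH service
# (host and port are the documented amazee.io defaults, assumed here).
TOKEN=$(ssh -p 32222 -t lagoon@ssh.lagoon.amazeeio.cloud token)

# Re-create the missing cli service for a hypothetical environment ID 1234.
curl -s https://api.lagoon.amazeeio.cloud/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"query":"mutation { addOrUpdateEnvironmentService(input: { environment: 1234, name: \"cli\", type: \"cli\" }) { id name type } }"}'
```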
Report: "Client support availability during Easter Holidays"
Last update: This incident has been resolved.
During the upcoming Easter holidays (Mar 29th 2024 - Apr 1st 2024), amazee.io will continue to offer support, albeit at reduced availability. Our on-call engineers will continue to monitor the platform and the ticketing system. As a reminder: should you need support, you can create a support ticket (via email, Slack if available, the Support portal, or the chat widget within the amazee.io Lagoon dashboard). For critical or high-severity issues that require more immediate attention, please call the emergency number written in your contract. Full support services will resume as of Tuesday, Apr 2nd, 2024. From all of us at amazee.io, we wish you a safe and happy holiday break.
Report: "Timeouts during log retrieval"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We're working on getting the log storage back to operational. A lot of data is currently being loaded in the background, which leads to slower response times while retrieving logs.
Some queries to retrieve logs are currently failing due to timeouts.
Report: "Login to Logs Backend fails with redirect error"
Last update: The issue has been solved; login to the Logs Backend should work again without issues.
We're seeing reports from users that login to logs.amazeeio.cloud is currently failing. We're looking into this at the moment.
Report: "Lagoon API Outage"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the situation.
We are continuing to work on a fix for this issue.
We've identified the issue and are working on restoration.
We are currently investigating this issue.
Report: "Timeouts on Lagoon Logs"
Last update: The logging system is fully operational again.
After rebalancing the data, some parts of the logging system failed, and we are working on making it fully operational again.
We have rolled out some improvements and the logging system is currently rebalancing data.
We have seen an increase in timeouts for log queries and are working on identifying the root cause of this.
Report: "Emergency Maintenance Window"
Last update: This incident has been resolved.
We'll be running an emergency maintenance window today within the usual maintenance window times for clusters on AWS infrastructure.
Report: "Intermittent Workload Restarts"
Last update: We've implemented a fix that should lower the impact of workload restarts. Our team is monitoring the situation and taking action where necessary.
We're investigating workloads being rescheduled intermittently. This only affects a small subset of projects on amazeeio-ch4. This can lead to availability issues on standard availability projects. We've found a cause of this behavior and are working on rolling out a fix for this issue during the maintenance window.
Report: "Scaling activities"
Last update: The situation is stable. We will resolve this incident here and follow up with a post incident review in the coming days.
The original database cluster can only be started in read mode. In accordance with our backup and recovery processes, we promoted the new database cluster with the state of 2024-01-11 03:05 UTC as the new production cluster. We updated all workloads to use this new database cluster. Please note that this does not contain data between 2024-01-11 03:05 UTC and the moment the database cluster went offline (~ 2024-01-11 07:22 UTC). Dumps of the original database with the latest data can be exported and shared on request. A summary of the incident will be shared in the upcoming days. We are sorry for the inconvenience this caused you and your clients. If you have any questions regarding this, please reach out to us.
Recovering the database was interrupted due to an unforeseen issue. We are working with the AWS RDS team to bring the database back online. As an alternative option for recovery we can point single environments to a new database cluster, containing data up until 2024-01-11 03:05 UTC. Please be aware that this option would lead to data loss. If you would like to pursue this route, please contact us through our support channels.
We're making good progress on recovering the database cluster. We're expecting the database cluster to be back online within the next 2 hours.
Recovery is still underway. We're evaluating additional ways to recover from the current situation quicker and restore services.
We're making progress in recovery, but we can't give a firm ETA as the recovery speed hasn't fully settled yet. We're still in discussions with the AWS RDS team on timings and additional recovery options.
We've identified the issue in the meantime and are working on recovering from the outage. We can't give an ETA for now and are evaluating several options.
We're still working with the AWS RDS team to investigate what is causing the connectivity issues.
We're seeing connectivity issues to the database cluster after the scaling operation. We'll involve our upstream provider to look into this issue as well.
We're seeing issues with the database cluster and are investigating.
We observed an increase in resource usage on the shared MySQL cluster on UK3. To account for the increase we are scaling the cluster which will lead to one failover.
Report: "FI2 - Database Load"
Last update: This incident has been resolved.
We've identified the issue and are limiting the impact on customers. We're continuing to monitor the situation.
We're currently investigating issues on amazeeio-fi2 related to increased DB load.
Report: "Cluster Scaling Operations"
Last update: The scaling operations have finished. We're monitoring the situation, but everything looks clear now.
Some clusters had an increase in node count. In order to lower the compute node footprint, we've enabled downscaling on all clusters. This can have an intermittent impact on sites that are not highly available.
Report: "Image registry issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We're seeing some issues with the image registry after yesterday's maintenance. Our team is working on resolving this and is re-running the maintenance tasks to do so.
Report: "AU2 - Database load issues"
Last update: This incident has been resolved.
To handle the increased load, we've scaled the database infrastructure. We're continuing to monitor the situation closely.
A subset of customers are seeing slow database queries. We're looking into the situation and will take action where needed.
Report: "SSH Connectivity Issues"
Last update: This incident has been resolved.
A fix has been implemented and rolled out to all clusters. We're monitoring the situation.
We've identified an issue where SSH connections might fail, and we're working on rolling out a fix for it.
Report: "Increased workload rescheduling"
Last update: The changes have been effective and rescheduling activities are back to a normal level.
The changes have been rolled out during the last maintenance window. We will monitor the workloads closely during the next few hours to verify that the rescheduling activity stays within an expected range.
We're working on a mitigation for the issue at hand. The changes will be rolled out in the upcoming maintenance window and should improve scheduling speed as well as lower the possibility of unplanned workload rescheduling - We'll monitor the situation as soon as the change has been rolled out.
We observed an increase in workload rescheduling and are currently exploring possible fixes for the root cause.
Report: "Site unavailability on DE3"
Last update: The incident has been resolved.
We've identified the issue and added a workaround - affected sites should have recovered. We are monitoring the situation closely.
We're seeing reports of sites being unavailable and getting timeouts on amazeeio-de3. Our team is currently investigating based on those reports. This seems to impact only a subset of sites.
Report: "Development Database Scaling - Finland"
Last update: The development database instance has been scaled successfully.
We've identified that there are workloads impacting the development database performance. We'll scale up the resources, which might lead to a temporary unavailability of the development environments for the FI region during the scaling operation.
Report: "Drupal build failures"
Last update: This incident has been resolved.
Additional resources have been provisioned and recently failing builds are no longer blocked.
Some Drupal builds on US2 are currently failing due to networking issues with an upstream provider. We are provisioning additional resources to prevent further build failures.
Report: "Fastly API Issues"
Last update: The upstream issue has been resolved.
A fix has been implemented and we are monitoring the results.
We're currently investigating issues caused by an upstream API issue with Fastly (https://www.fastlystatus.com/incident/376081). Live traffic is not affected; we mostly see this incident causing issues with actions where we integrate with Fastly, e.g. certificate updates, domain updates, or changes to Fastly services.
Report: "Deployments not starting"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We're working on a permanent solution for this issue. Customers whose deployments on US2 are stuck in "New" can contact support to get the stuck deployment fixed.
We have identified the issue; it only blocks a small subset of deployments from progressing. Our engineers are looking into solving the problem.
Deployments on us2.lagoon are blocked from starting and stay in "New" status. We are looking into resolving this issue. There is no impact on site availability.
Report: "Logging Infrastructure not available"
Last update: The logging infrastructure is fully operational again.
A fix has been implemented and we are monitoring the results.
We are working on fully restoring the logging service. Currently, responses might be slow while data is being recovered. Recent logging data will become available in the next few hours.
The issue has been identified and a fix is being implemented.
We're currently investigating an issue with the logging infrastructure
Report: "Lagoon API unavailability and slowness"
Last update: We're closing the incident. The changes mentioned earlier show that API stability is back to normal levels.
We have identified the most likely root cause of the slowness and stability issues over the last couple of weeks. We have rewritten and deployed the relevant code, and are monitoring closely. All signs are currently positive, and services are running normally.
We're seeing the issue returning, leading to SSH and API timeouts. We will monitor and work on short term improvements, as required.
Performance of the API and Dashboard has improved; the cause was a high volume of messages and requests to be handled by the API. We are continuing to monitor and work on improvements to be able to handle the additional load.
Unfortunately we're seeing problems with the performance of the API and Dashboard again; we're working on identifying and fixing the problem.
We have implemented a fix and are continuing to monitor the situation.
We are continuing to investigate this issue.
Issues with the API have started again; we are investigating.
We've put measures in place to stabilize API and the Lagoon Dashboard. There might be slow responses, and our team is working on getting everything back to speed. As we focus on fully resolving this issue, the updates regarding this incident may become less frequent.
We're seeing the issue returning, leading to SSH and API timeouts. Our team is investigating.
The issue has been identified and a fix is being implemented. Some customers might see intermittent SSH connectivity issues.
The limitations that were put in place were successful, and we were able to scale the API back up to standard capacity. The API and Lagoon Dashboard are operating normally, although customers may encounter intermittent delays in API response times. We continue to monitor the situation and will take appropriate action if needed.
The issue has been identified. A few limitations are currently in place while we watch how the situation stabilizes, and we're working to open the API, Dashboard, and SSH connections up to full capacity again.
We continue to investigate the issue; there might be temporary problems with SSH connections. An unusually high volume of API requests appears to be causing issues with the Lagoon API, Dashboard, and SSH connections. We are looking into isolating the problem and putting limits in place.
The API is stable, but may be slow as things recover.
We are continuing to investigate this issue.
We are currently experiencing degraded API performance.
Report: "Isolated connectivity issues"
Last update: This incident has been resolved.
The workloads have been evacuated from the faulty compute host and we are monitoring the connectivity between hosts.
We identified connectivity issues originating from one of the compute hosts and will evacuate workloads running on this host.
Report: "Partial Request Failures"
Last update: This incident has been resolved.
Due to load spikes, some requests on uk3 failed. This was mitigated by automated scale-ups and we are monitoring the situation.
Report: "Deployment Failures on New Environments Containing Special Characters"
Last update: This incident has been resolved.
We are continuing to work on a fix for this issue.
## Impact
Currently, new environments with consecutive and trailing special characters such as dashes cannot be deployed. We have identified the issue and are working on a permanent solution.

## Workaround
Remove consecutive and trailing special characters from the environment/branch name; a small sketch follows the example below.

## Example
Invalid: feature--new-ui-
Valid: feature-new-ui
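As a purely illustrative sketch of the workaround, the following collapses consecutive dashes and strips trailing ones from a branch name before the environment is created; any equivalent cleanup works just as well.

```sh
# Collapse runs of dashes, then strip any trailing dash.
branch="feature--new-ui-"
clean=$(printf '%s' "$branch" | sed -e 's/--*/-/g' -e 's/-*$//')
echo "$clean"   # -> feature-new-ui
```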
Report: "Intermittent connection issues between CDN and AWS Clusters"
Last update: After many hours of work together with Fastly and AWS, the root cause has been found and resolved. The workaround was removed in April 2023. Active monitoring over the last weeks shows that the connection issues have been permanently resolved.
We are continuing to monitor and to search for the root cause of this issue; as there are many different engineering teams involved, this takes time. We are, however, very confident that the currently implemented workaround solves the issue, and therefore there should be no impact on customer websites. We will keep this issue open and update it as soon as we have found the root cause and a permanent resolution.
Over the last 7 days, environments that use the amazee.io CDN (Fastly) and are hosted on AWS clusters have experienced elevated connection issues. While this affected only a very small share (less than 0.01%) of requests, we started to analyze and investigate the issue together with the teams at Fastly and AWS. While we have not found the exact root cause yet, we have found a workaround on the AWS load balancers that reduces the connection issues to the background level expected of ordinary internet connectivity. We are continuing to monitor this issue and to search for the root cause together with Fastly and AWS. We will continue to provide updates here.