Historical record of incidents for Confluent Cloud
Report: "Google Cloud Platform Outage Affecting Multiple Confluent Cloud Services"
Last update: We are currently experiencing outages across all regions on Google Cloud Platform, affecting several Confluent Cloud services. As of now, customers may experience issues provisioning new resources, creating cluster links across privately networked clusters, using connectors hosted on GCP, and reading historical data from Kafka clusters. At this time, we are investigating the scope and impact of the outage. For more information on the GCP outage, please refer to https://status.cloud.google.com/.
Report: "Cluster connectivity issues being observed in AWS US-east-1"
Last update: We are observing a spike in connectivity issues in AWS US-east-1 and are currently investigating this issue.
Report: "Customers will encounter issues when creating new Kafka clusters within existing Confluent Cloud Networks on Azure that use Privatelink as the access method"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
We are currently investigating the issue.
Report: "Customers will encounter issues when creating new Kafka clusters within existing Confluent Cloud Networks on Azure that use Privatelink as the access method"
Last update: We are continuing to investigate this issue.
We are currently investigating the issue.
Report: "Customers will experience issues creating and accessing new Kafka clusters using PrivateLink networking in Azure"
Last update: We are currently investigating the issue.
Report: "Confluent Cloud Metrics may return incorrect or no data for a few customers"
Last update: The incident has been resolved. The time of impact was from 12:41 UTC to 12:46 UTC.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
The issue was identified and mitigated at 12:50 UTC.
We are continuing to investigate this issue.
Starting 12:40 UTC, Confluent Cloud Metrics are delayed. Queries for metrics may return incomplete, incorrect or no data for a few customers.
Report: "Confluent Cloud Metrics may return incorrect or no data for a few customers"
Last update: We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
The issue was identified and mitigated at 12:50 UTC.
We are continuing to investigate this issue.
Starting 12:40 UTC, Confluent Cloud Metrics are delayed. Queries for metrics may return incomplete, incorrect or no data for a few customers.
Report: "Confluent Cloud Metrics returned incorrect or no data"
Last update: The issue has been identified and mitigated.
We are continuing to investigate this issue.
Starting 12:40 UTC, Confluent Cloud Metrics were delayed. Queries for metrics may have returned incomplete, incorrect or no data.
Report: "Confluent Cloud Flink - Customers may experience increased rate of degraded statements"
Last update: The service has been stable since the last update. The incident is now fully resolved.
This issue has been mitigated and the team is monitoring the fix.
The issue is identified and we are working on its mitigation.
Starting 5/14 10:21 AM UTC, customers may be experiencing an increased rate of degraded statements and a temporary spike in CFU consumption on the Confluent Cloud Flink service. We are currently investigating the issue.
Report: "Confluent Cloud Flink - Customers may experience failures with DROP TABLE statements"
Last update: The rollout completed and no further issues have been observed. The incident is now resolved as of 23:30 UTC on May 14th, 2025.
The team is working on deploying updates for additional components. The ETA for the rollout has been adjusted to May 16 12:00 AM UTC.
The team is working on deploying updates for additional components. The ETA for the rollout has been adjusted to May 15 12:00 AM UTC.
The team identified additional components that need to be updated. ETA for rollout: 05/13 1 AM UTC.
The fix has been rolled out to all regions.
The rollout is expected to finish by 05/10 1 AM UTC.
After a detailed investigation and given the issue's limited impact, the team decided to proceed with the regular release process for the fix instead of manual intervention.
Investigation has shown that the issue is localized to a subset of regions in GCP and Azure. The team is working on mitigation.
Starting 5/7 8:39 PM UTC, customers may be experiencing errors while executing DROP TABLE statements on the Confluent Cloud Flink service. We are currently investigating the issue.
Report: "Confluent Cloud Flink - Customers may experience increased rate of degraded statements"
Last update: Starting 5/14 10:21 AM UTC, customers may be experiencing an increased rate of degraded statements and a temporary spike in CFU consumption on the Confluent Cloud Flink service. We are currently investigating the issue.
Report: "Cloud UI Homepage Status is degraded"
Last update: This incident has now been fully resolved.
The Confluent Cloud UI homepage should now be visible, showing the summary of environments/clusters for most customers. Newer organizations created in the last three days may still not be showing correct summaries. The Confluent team expects to have the issue fully resolved for these remaining organizations by May 12 11:00 PM UTC.
The Confluent Cloud UI homepage should now be visible, showing the summary of environments/clusters for most customers. Newer organizations created in the last three days may still not be showing correct summaries. The Confluent team is aware, monitoring the change and the situation.
The Confluent Cloud UI homepage is in a degraded state and is currently unable to properly show the summary of environments/clusters. The underlying health of environments/clusters is unaffected. The team is working on a fix.
Report: "Cloud UI Homepage Status is degraded"
Last update: The Confluent Cloud UI homepage is in a degraded state and is currently unable to properly show the summary of environments/clusters. The underlying health of environments/clusters is unaffected. The team is working on a fix.
Report: "Degraded experience with Flink queries"
Last update: This incident has been resolved.
We are observing issues with Flink queries in some cloud regions. Impacted queries could be stuck in Resuming or Pending state. The team is actively investigating.
Report: "Degraded experience with Flink queries"
Last update: We are observing issues with Flink queries in some cloud regions. Impacted queries could be stuck in Resuming or Pending state. The team is actively investigating.
Report: "Confluent Cloud Flink - Customers may experience failures with DROP TABLE statements"
Last update: Starting 5/7 8:39 PM UTC, customers may be experiencing errors while executing DROP TABLE statements on the Confluent Cloud Flink service. We are currently investigating the issue.
Report: "Confluent Cloud Metrics API Delay"
Last update: On May 1st, 2025 between 17:00 and 17:45 UTC, the Confluent Cloud Metrics API experienced ingestion delay for the "io.confluent.kafka.server/retained_bytes" metric for some customers. As a result, queries for this specific metric may have yielded incomplete, undercounted, or missing data.
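Customers who want to check whether their own data for this metric was affected can re-query the impact window above once ingestion has caught up. The following is a minimal sketch, assuming the Metrics API v2 query endpoint and payload shape described in Confluent's public documentation; the cluster ID, API key, and secret are placeholders, not values from this report.

    import requests

    # Hypothetical placeholders: substitute a real Cloud API key/secret and Kafka cluster ID.
    API_KEY = "<CLOUD_API_KEY>"
    API_SECRET = "<CLOUD_API_SECRET>"
    CLUSTER_ID = "lkc-xxxxx"

    # Assumed Metrics API v2 query endpoint; confirm against the current Confluent documentation.
    URL = "https://api.telemetry.confluent.cloud/v2/metrics/cloud/query"

    query = {
        "aggregations": [{"metric": "io.confluent.kafka.server/retained_bytes"}],
        "filter": {"op": "EQ", "field": "resource.kafka.id", "value": CLUSTER_ID},
        "granularity": "PT1M",
        # Re-query the impact window from the report above (17:00-17:45 UTC on May 1st, 2025).
        "intervals": ["2025-05-01T17:00:00Z/2025-05-01T17:45:00Z"],
        "limit": 100,
    }

    resp = requests.post(URL, json=query, auth=(API_KEY, API_SECRET), timeout=30)
    resp.raise_for_status()
    for point in resp.json().get("data", []):
        print(point.get("timestamp"), point.get("value"))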
Report: "Confluent Cloud Metrics API Delay"
Last update: On May 1st, 2025 between 17:00 and 17:45 UTC, the Confluent Cloud Metrics API experienced ingestion delay for the "io.confluent.kafka.server/retained_bytes" metric for some customers. As a result, queries for this specific metric may have yielded incomplete, undercounted, or missing data.
Report: "Flink - metrics API and billing are experiencing delays or failures"
Last update: Since the fix was deployed, no further failures or delays have been observed. The incident is considered resolved.
A fix has been deployed, successfully mitigating the issues.
The root cause has been identified, and the team is working to mitigate the impact on affected resources.
Flink metrics and billing within us-west-2 may be delayed or incomplete; triaging is ongoing to identify the nature of the issue.
Report: "Flink - metrics API and billing are experiencing delays or failures"
Last update: Flink metrics and billing within us-west-2 may be delayed or incomplete; triaging is ongoing to identify the nature of the issue.
Report: "Clusters not visible in UI, Terraform, and CLI"
Last update: This problem has been mitigated as of 2025-04-21 15:52 UTC. No further issues have been observed.
A fix has been implemented and we are continuing to monitor APIs for errors.
We have identified a root cause and have put an initial mitigation in place. We are still investigating some additional errors in the API.
We are currently experiencing an issue where multiple customers are unable to view any clusters on the Cloud UI console, Terraform, or API, with Networking Services returning HTTP 500 errors. We are continuing to investigate this issue.
We are continuing to investigate HTTP 500 errors in production networking environments causing clusters not to be visible in the Confluent UI.
We are continuing to investigate this issue.
We are currently investigating HTTP 500 errors in production networking environments.
Report: "Clusters not visible in UI, Terraform, and CLI"
Last update: We are currently experiencing an issue where multiple customers are unable to view any clusters on the Cloud UI console, Terraform, or API, with Networking Services returning HTTP 500 errors. We are continuing to investigate this issue.
We are continuing to investigate HTTP 500 errors in production networking environments causing clusters not to be visible in the Confluent UI.
We are continuing to investigate this issue.
We are currently investigating HTTP 500 errors in production networking environments.
Report: "Clusters not visible in UI"
Last update: We are continuing to investigate HTTP 500 errors in production networking environments causing clusters not to be visible in the Confluent UI.
We are continuing to investigate this issue.
We are currently investigating HTTP 500 errors in production networking environments.
Report: "Networking Service 500 Errors"
Last update: We are currently investigating HTTP 500 errors in production networking environments.
Report: "Confluent Cloud incident affecting Control Plane Authorization"
Last update: From 21:59 UTC to 22:21 UTC, customers observed errors in the Confluent Cloud UI, control plane APIs, CLI, Terraform and Metrics API due to an Authorization Service outage affecting control plane services. The issue has been resolved and there was no impact on data plane services.
Report: "Confluent Cloud incident affecting Control Plane Authorization"
Last update: From 21:59 UTC to 22:21 UTC, customers observed errors in the Confluent Cloud UI, CLI and Terraform when making calls to control plane API endpoints. The issue has been resolved and there was no impact on data plane services or to other regions.
Report: "Confluent Cloud incident in AWS us-west-2"
Last update: From 21:59 UTC to 22:21 UTC, customers observed errors in the Confluent Cloud UI, CLI and Terraform when making calls to control plane endpoints. The issue has been resolved and there was no impact on data plane services or to other regions.
Report: "Errors connecting to Kafka in Azure/canadacentral"
Last update: From approximately 02:23 - 02:43 UTC on 4/2/25, customers connecting to Kafka clusters in Azure/canadacentral may have experienced errors.
Report: "Errors connecting to Kafka in Azure/canadacentral"
Last update: From approximately 02:23 - 02:43 UTC on 4/2/25, customers connecting to Kafka clusters in Azure/canadacentral may have experienced errors.
Report: "Cluster Unavailability in Azure North Europe"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified as being caused by the ongoing Microsoft Azure North Europe outage. The Microsoft team is actively working on recovery.
We are currently investigating this issue.
Report: "Cluster Unavailability in Azure North Europe"
Last update: We are currently investigating this issue.
Report: "Kafka cluster and network provisioning delays"
Last update: On March 28, 2025, between 14:23 and 16:40 UTC, the provisioning of Kafka clusters and networks was delayed. The root cause has been resolved and Kafka clusters and networks are being deployed. There is no customer action required.
Report: "Kafka cluster and network provisioning delays"
Last update: On March 28, 2025, between 14:23 and 16:40 UTC, the provisioning of Kafka clusters and networks was delayed. The root cause has been resolved and Kafka clusters and networks are being deployed. There is no customer action required.
Report: "Delays in provisioning Kafka clusters in Confluent Cloud"
Last update: This incident has been resolved.
Customers may experience delays in provisioning new Kafka clusters in Confluent Cloud. Confluent engineering has identified the root cause and is working on the fix.
Report: "Delays in provisioning Kafka clusters in Confluent Cloud"
Last update: This incident has been resolved.
Customers may experience delays in provisioning new Kafka clusters in Confluent Cloud. Confluent engineering has identified the root cause and is working on the fix.
Report: "Confluent Cloud scheduled maintenance"
Last update: The scheduled maintenance has been completed.
Scheduled maintenance is still in progress. We will provide updates as necessary.
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Confluent Cloud is currently undergoing a scheduled maintenance operation from 1:00 A.M. to 3:00 A.M. UTC on March 21, 2025. During this time, managing Stream Shares and topic Access Requests will be disabled. All other Control Plane APIs and Data Plane APIs are not affected. Regular operations will resume shortly after. Learn more here: https://support.confluent.io/hc/en-us/articles/34941970895636-March-21st-2025-1-00-3-00-AM-UTC-Scheduled-Maintenance-of-Confluent-Cloud
Report: "Azure eastus Degradation"
Last update: Azure has mitigated the networking issues and all errors on Confluent's end have cleared. See https://azure.status.microsoft/en-us/status for more details.
Azure is working to resolve the networking issues; see https://azure.status.microsoft/en-us/status for more details. We are seeing recovery on our end and actively monitoring the situation.
Azure has identified a networking issue in eastus; see https://azure.status.microsoft/en-us/status for more details. Customers may also run into issues provisioning new clusters in eastus at this time.
Confluent engineering is investigating availability degradation on Azure eastus clusters. Customers using this region may experience unavailability and elevated latencies. In addition, customers may have issues using Schema Registry in this region.
Report: "Azure eastus Degradation"
Last update: Azure has mitigated the networking issues and all errors on Confluent's end have cleared. See https://azure.status.microsoft/en-us/status for more details.
Azure is working to resolve the networking issues; see https://azure.status.microsoft/en-us/status for more details. We are seeing recovery on our end and actively monitoring the situation.
Azure has identified a networking issue in eastus; see https://azure.status.microsoft/en-us/status for more details. Customers may also run into issues provisioning new clusters in eastus at this time.
Confluent engineering is investigating availability degradation on Azure eastus clusters. Customers using this region may experience unavailability and elevated latencies. In addition, customers may have issues using Schema Registry in this region.
Report: "Flink Compute Pool provisioning degraded for GCP us-east1"
Last update: This incident has been resolved.
A hotfix is ready to be deployed across the clusters and we are monitoring the progress.
All the affected workloads have been mitigated and compute pools are provisioned.
We are experiencing degraded performance in provisioning for Flink customers in us-east1. Existing workloads should be fine. Engineers have identified the root cause and are working on the fix.
Report: "Flink Compute Pool provisioning degraded for GCP us-east1"
Last update: This incident has been resolved.
A hotfix is ready to be deployed across the clusters and we are monitoring the progress.
All the affected workloads have been mitigated and compute pools are provisioned.
We are experiencing degraded performance in provisioning for Flink customers in us-east1. Existing workloads should be fine. Engineers have identified the root cause and are working on the fix.
Report: "Flink workspace management degraded"
Last update: A fix has been rolled out and this issue is fully resolved.
A fix has been implemented and we are monitoring the results.
Users may experience issues managing Flink workspaces, and/or running statements from within a workspace. The `Run` button on a statement in a workspace may be greyed out. Confluent engineering is investigating.
Report: "KSQL and Schema Registry creation/deletion degraded performance"
Last update: This incident has been resolved.
The issue has been mitigated and we are actively monitoring it.
The issue has been identified and a fix is being implemented.
Creation/deletion of Schema Registry and KSQL, as well as API key creation/deletion for new or existing KSQL and Schema Registry clusters, is currently impacted. We are currently investigating the issue.
Report: "Azure southcentralus availability degredation"
Last update: This incident has been resolved.
Confluent engineering is investigating availability degradation on Azure southcentralus clusters.
Report: "Egress IPs are not discoverable."
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Egress IP addresses are not currently being displayed in the Confluent Cloud UI as described in https://docs.confluent.io/cloud/current/connectors/static-egress-ip.html. They are also not being returned by the equivalent APIs, CLI, or Terraform. We are currently investigating this issue.
Report: "Experiencing partial outage in AZURE in US EAST 2 region"
Last update: This incident has been resolved.
The issue has been mitigated, and we are actively monitoring the clusters.
Azure cloud confirmed that the outages have resulted from unexpected VM reboots. "Impact Statement: Starting at approximately 01:40 UTC on 25 Feb 2025, Azure customers in East US 2 may have experienced VM reboots and/or increased response latencies in the region. Current Status: We are aware of the issue and actively investigating. Initial findings indicate that a subset of VMs in East US 2 may have rebooted. The next update will be provided in 60 minutes or sooner if there are significant developments. This message was last updated at 04:53 UTC on 25 February 2025" We are continuing to monitor the system and mitigate the issue while working with Azure.
Recovery is in progress, we are continuing to monitor.
We are currently investigating the issue and attempting to mitigate.
Report: "Apache Flink incident in AWS us-west-2"
Last update: This incident has been resolved.
We are experiencing availability problems with Flink in AWS us-west-2. The symptoms started at 11:30 UTC. Customers may observe availability issues. We are currently investigating, and will update as we know more.
Report: "Confluent Cloud incident affecting Kafka clusters in Azure norwayeast"
Last update: This incident has been resolved.
We are experiencing connectivity issues with Kafka in Azure norwayeast. The problems started at 09:45 UTC. We are currently investigating, and will update as we know more.
Report: "Cluster Provisioning - degraded performance"
Last update: Provisioning performance is back to normal.
We have deployed a fix for this issue and are monitoring the affected metrics.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We have noticed degraded performance when provisioning dedicated Kafka clusters and other products in a few of the busiest regions. We are investigating the root cause and will provide another update at 10 AM UTC.
Report: "connectivity issues in AWS eu-north-1. Customers may experience availability issues in the region."
Last update: We have fully recovered and are operational now.
We have mostly recovered at this time except for 3 clusters that still have some partial impact. Update from the underlying cloud service provider side: We are continuing to see recovery for error rates and latencies for multiple AWS Services in the EU-NORTH-1 Region. We are monitoring the network and as we work towards full recovery, some requests may continue to timeout or be throttled. We recommend customers retry failed requests where possible. We will continue to provide additional information as we have it, or within the next 60 minutes.
We are currently experiencing partial unavailability in the AWS EU-NORTH-1 region, affecting a single Availability Zone (AZ). Some customer workloads may be impacted partially in terms of availability and performance. Some clusters remain operational and can process client workloads on certain brokers. However, external network connectivity issues have been detected. Our automation systems are demoting affected brokers in the impacted AZ to mitigate the issue. Further updates will be provided as we work towards resolution.
We have limited the impact to a single availability zone (AZ) that is partially unavailable. Clusters in this AZ are able to process client workloads on some Kafka brokers.
We are continuing to investigate this issue.
We are having connectivity issues in AWS eu-north-1. Customers may experience availability issues.
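The provider's guidance above recommends retrying failed requests where possible. The sketch below is a minimal, generic example of retrying an idempotent HTTP request with exponential backoff and jitter; it is illustrative only, and the URL, retry limits, and delays are hypothetical placeholders rather than values from this report.

    import random
    import time

    import requests

    def get_with_backoff(url, max_attempts=5, base_delay=1.0, timeout=10):
        """Retry an idempotent GET on timeouts and 5xx responses, with exponential backoff and jitter."""
        for attempt in range(1, max_attempts + 1):
            try:
                resp = requests.get(url, timeout=timeout)
                if resp.status_code < 500:
                    return resp  # success, or a client error that a retry will not fix
            except requests.RequestException:
                pass  # network error or timeout; fall through to the backoff below
            if attempt == max_attempts:
                raise RuntimeError(f"request to {url} failed after {max_attempts} attempts")
            # Jittered exponential backoff avoids synchronized retries during a regional incident.
            time.sleep(base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.5))

    # Example with a placeholder URL: poll a health endpoint during an incident.
    # response = get_with_backoff("https://example.invalid/health")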
Report: "Kafka Rest API unreachable for certain dedicated networks"
Last update: This incident has been resolved.
The fix has been rolled out to all affected sites, and we are currently monitoring the incident.
We have identified an issue due to an incompatible software rollout affecting AWS us-east-1 customers. A fix is being rolled-out in waves and is estimated to take a few hours. We will update this incident as we know more.
We are experiencing an issue where some customers are getting errors while hitting Kafka Rest endpoints. We are currently investigating, and will update as we know more.
Report: "Experiencing connectors provisioning delays"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Customers experiencing HTTP 401 errors on Schema Registry API requests"
Last update: This incident has been resolved.
This incident has been resolved.
The fix has been rolled out and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Some customers are experiencing HTTP 401 errors on Schema Registry API requests. Engineers are engaged on the issue and are troubleshooting the cause.
Report: "Metrics API is Experiencing Delays"
Last update: The Metrics API is now operating normally. The root cause of the issue was identified and a permanent fix applied.
The Metrics API is experiencing higher latency and error rates starting January 11, 12:09 UTC. We are continuing to investigate this issue.
The Metrics API system has been operating with normal behavior since around 21:40 UTC.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
The Metrics API is experiencing higher latency and error rates starting at 15:15 UTC.
Report: "Some clusters seeing problems provisioning nodes in Azure East US2 deployments"
Last update: This incident has been resolved. The affected clusters have recovered.
We are currently investigating the issue. It was reported at 16:56 UTC.
Report: "Metrics API is Experiencing Delays"
Last update: This incident has been resolved.
The incident has been mitigated.
We are continuing to investigate this issue.
The Metrics API experienced higher than usual latencies and error rates from 10:45 to 11:15 UTC, and has been experiencing higher than usual latencies and error rates again since 12:10 UTC.
Report: "Experiencing network issues within Azure East US2 deployments"
Last update: This incident has been resolved.
Impact has been identified as solely due to an Azure network outage; more can be found at https://azure.status.microsoft/en-us/status. We have taken all the actions necessary to mitigate affected customers' issues while Azure restores their service availability.
Due to an ongoing Azure network failure, customers may experience connectivity issues within the Azure East US2 region. We are investigating possible mitigation procedures for affected customers while working with the cloud provider to find a resolution.
Report: "Confluent Cloud - Degraded Performance - Azure-South-Central-US"
Last update: All known issues directly caused by this incident have been mitigated. Any customers still experiencing issues should contact Confluent support. There will be no more updates for this issue after this.
Customer impact is mostly mitigated at this point. We are still validating that all workloads are working correctly, and will provide the next update no later than 4 PM PST.
We are starting to see partial recovery in the region. We are continuing efforts to mitigate impact, and will provide another update no later than 3 PM PST.
Impact appears to be caused by an Azure outage that started at approximately the same time. Confluent engineers are actively working to mitigate impact ahead of the restoration of Azure systems. We will provide an update by 2 PM PST. More details on the issue can be found at: https://azure.status.microsoft/en-us/status
We are continuing to investigate the issue, and will provide another update by 2 PM PST. Impact remains limited to Azure South Central US at this time.
Some customers may experience degraded performance in the Azure cloud South-Central-US region. The problem started at 11:04 AM PST. We have identified the issue and are working with the cloud provider for a resolution.
Report: "We are experiencing problems with some Flink jobs."
Last update: The situation is resolved and we are proactively reaching out to any customers that may have been impacted.
We are experiencing problems with some Flink jobs. We are currently investigating, and will update as we know more.
Report: "Single sign-On (SSO) authentication and authorization flows are not working as expected"
Last update: This incident has been resolved.
The fix has been rolled out and we are monitoring the results.
The fix is being rolled out to production. We will provide the next status update at 1 pm Pacific Time.
This issue has been identified as impacting Schema Registry operations when group mappings are used as part of Single sign-on. The root cause has been identified and engineers are working on a mitigation.
Confluent is aware of SSO related authentication and authorization issues related to multiple parts of the Confluent Cloud platform. Engineers are actively investigating the issue.
Report: "Some KSQL clusters may experience high saturation"
Last update: This incident has been resolved.
The fix has been rolled out and we are monitoring the results.
The fix is being rolled out to production. We will provide the next status update at 6am Pacific Time.
The issue has been identified and a fix is being implemented.
We have identified the cause of the issue and are in the process of rolling out the solution.
Some KSQL clusters may be experiencing node lag that may cause increased consumer lag. The impact started on Dec 13, 2024 and affects all regions. We are currently investigating and will update as we know more.
Report: "Confluent Cloud API requests may return HTTP 401 errors"
Last update: The issue is mitigated.
Confluent engineers have applied a mitigation and are monitoring systems. Mitigation was applied at 01:31 UTC.
Users of Confluent Cloud may experience API requests returning HTTP 401 errors. Engineers are engaged and are in the process of remediating the issue.
Report: "Confluent Cloud - Degraded Performance - AWS ap-northeast-1"
Last update: This incident has been resolved.
The issue has been mitigated.
Some customers may experience degraded performance in the AWS cloud ap-northeast-1 region. The problem started at 10 AM UTC. We identified the root cause and are working on mitigating the issue. We will update once the issue is mitigated.
Report: "Control Plane resources unavailable using UI/CLI"
Last update: This incident is resolved as of Dec 6th, 1:00 AM UTC.
We have fixed the issue, and customers should see the UI and CLI functioning. We believe the problem should be mitigated by Dec 6th, 1:00 AM UTC.
We are continuing to investigate this issue.
We are currently experiencing UI and CLI issues. These issues impact workflows, such as new cluster provisioning through UI/CLI. The problems started on Dec 5th, 10:20 PM UTC. We are investigating and will update you as we learn more.
Report: "Cluster provisioning and expansion issues in GCP me-central-2 region."
Last update: GCP has confirmed that all storage capacity-related issues were resolved by November 22, 2024.
Starting on October 25, 10:30 PM UTC, customers may experience cluster expansion issues in the GCP me-central-2 region. Provisioning of new single-zone clusters in the me-central-2 region may also fail intermittently, and new multi-zone clusters in the same region will have reduced availability due to the underlying cloud provider's limited storage capacity. We have identified the issue and are working with the cloud provider for a resolution.
Report: "Provisioning of clusters and network resources is blocked in all cloud regions"
Last update: No further issues have been observed. The incident is now resolved as of 14:00 UTC on November 25, 2024.
We have identified the cause of the problem and have implemented mitigation steps. Provisioning was unblocked as of 12:16 UTC on November 25, 2024. We will continue to monitor for any issues and plan to resolve this incident in 1 hour.
We are experiencing issues with provisioning clusters and network resources in all cloud regions. The problem started at 11:25 UTC on November 25, 2024. We are currently investigating, and will update as we know more.
Report: "Confluent Cloud UI is unavailable"
Last update: The issue has been resolved and the UI is fully functional now.
We are continuing to monitor for any further issues.
The issue has now been fixed. We will continue to monitor the UI availability.
The new fix is being deployed; we will provide an update shortly.
An additional fix is needed to resolve this issue, we are working on deploying it.
The fix is being deployed, we will provide an update in around 30 minutes.
The issue has been identified and a fix is being implemented.
We are investigating the issue and will provide an update shortly.
Report: "Confluent Cloud Control Plane API/UI unavailability"
Last update: This incident has been resolved.
The Confluent Gateway service returned 5xx errors for 5-10 minutes. Status is back to normal and we are monitoring.
Report: "Confluent Cloud - Metrics API is currently experiencing elevated latency and error rate"
Last update: This incident has been resolved.
We are currently investigating this issue.
A fix has been implemented and the systems are recovering.
Confluent Cloud Metrics API is currently experiencing elevated latency and error rate.
Report: "Some Clusters are incorrectly showing `Provisioning` status."
Last update: This incident has been resolved. All clusters should show their correct status.
Some clusters in an `Up` state were incorrectly showing as in a `Provisioning` state. This issue has been mitigated and we are continuing to monitor it.
Report: "Cluster Creation Delayed - AWS, GCP, Azure"
Last update: Tracked in RCCA.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Some customers may experience delayed cluster provisioning.
Report: "docs.confluent.io is unavailable"
Last update: This incident has been resolved.
We are currently experiencing a service disruption on our Confluent documentation site docs.confluent.io. Affected users might not be able to access the online Confluent documentation due to this issue.
Report: "Authentication Issues"
Last update: This incident has been resolved.
We have identified a mitigation and are applying a fix; clusters should expect to see recovery over the next 30 minutes.
Some customers may experience SSL issues when trying to connect to Kafka clusters in the following clouds and regions: Azure: westeurope, eastus2, centralus; AWS: us-west-2; GCP: australia-southeast1, us-west2. We are currently investigating the issue and attempting to mitigate or provide a workaround.
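For customers in the listed regions, a basic TLS/SASL connectivity check is often the quickest way to confirm whether a given client is affected. Below is a minimal sketch using the confluent-kafka Python client with standard SASL_SSL settings; the bootstrap endpoint, topic name, and API key/secret are placeholders, and this is generic client configuration, not a workaround for the incident.

    from confluent_kafka import Producer

    # Placeholders: substitute your cluster's bootstrap endpoint and a Kafka API key/secret.
    conf = {
        "bootstrap.servers": "<BOOTSTRAP_ENDPOINT>:9092",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": "<KAFKA_API_KEY>",
        "sasl.password": "<KAFKA_API_SECRET>",
    }

    producer = Producer(conf)

    def on_delivery(err, msg):
        # A delivery error here (for example, an SSL handshake failure) suggests the client is affected.
        print("delivery failed:" if err else "delivered to:", err or msg.topic())

    producer.produce("<TEST_TOPIC>", b"connectivity check", callback=on_delivery)
    producer.flush(10)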
Report: "Some Confluent Cloud users might experience issues with Kafka read/write unavailability in AWS us-east-1 region"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We have fixed the issue and are monitoring it to ensure it does not recur. As of September 28, 2024, 12:07 AM UTC, customers should see normal operations with Confluent systems.
We have identified the issue and are actively working on the mitigation.
Some Confluent Cloud users might experience issues with Kafka read/write unavailability in the AWS us-east-1 region. The problem started at 20:11 UTC. We are currently investigating and will update you as we learn more.
Report: "CCloud UI is down"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating the cause and working on a fix to restore functionality. Other services should not be affected.