Replicated

Is Replicated Down Right Now? Check whether an outage is currently ongoing.

Replicated is currently Operational

Last checked from Replicated's official status page

Historical record of incidents for Replicated

Report: "High error rate from upstream cloud provider"

Last update
monitoring

We are aware of upstream issues with Google Cloud Platform. This is presently impacting GKE components of our Compatibility Matrix product, as well as upstream GCR registries used with our Replicated Registry. We are monitoring our cloud providers for updates.

Report: "CMX: Network Issues when creating RKE2 clusters"

Last update
investigating

We're aware of network issues when creating an RKE2 cluster and are investigating

Report: "CMX: Network Issues when creating RKE2 clusters"

Last update
investigating

We're aware of network issues when creating an RKE2 cluster and are investigating

Report: "Compatibility Matrix: GKE 1.32 clusters failing to create"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have rolled out a fix, pinning GKE 1.32 to version 1.32.2, and are monitoring the results now.

identified

We have identified an issue with pulling images from Google's registry, and are working with Google to find a resolution.

investigating

We have noticed that GKE 1.32 clusters are not reaching a 'running' state. We are currently investigating this issue.

Report: "Email deliverability spam issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Check your spam folder. Our transactional email provider is seeing deliverability problems with Gmail. https://status.postmarkapp.com/notices/bt3ky3r8zlaapqlo-increased-gmail-spam-reports

Report: "Issues provisioning CMX multi-node vm clusters"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Issues provisioning k3s and kind clusters in CMX"

Last update
resolved

This incident has been resolved.

monitoring

We are seeing k3s and kind CMX clusters spinning up successfully at this time without timing out. We are continuing to monitor this situation.

investigating

This issue is likely impacting kind cluster creation as well

investigating

We are currently investigating issues with provisioning k3s CMX clusters leading to timeouts

Report: "CMX: Openshift connectivity issues"

Last update
resolved

We have run diagnostics across the OpenShift cluster and found the source of the report. However, the limitation appears to be built into OpenShift by design and is working as intended.

investigating

We are currently investigating issues with name resolution in some containers in OpenShift clusters in Compatibility Matrix.

Report: "Degraded Performance for Embedded Cluster and kURL in CMX"

Last update
resolved

Our monitoring confirms that remediation efforts are now resulting in successful k3s, kind, kURL, and Embedded Cluster creation in CMX.

monitoring

We have deployed changes to help mitigate issues creating k3s, kind, kURL, and Embedded Clusters on CMX. We are currently monitoring these changes.

identified

We believe we have identified the cause of these issues and are working to remediate them.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue

Report: "Increased 5xx rate for Vendor API"

Last update
resolved

This issue should now be resolved

monitoring

A long-term fix for this issue has been released and we are currently monitoring.

identified

We have identified the issues causing a higher-than-expected 5xx rate for the Vendor API. We have mitigated this by rolling back some recent changes and are working on a long-term solution.

Report: "Vendor Portal functionality may be affected by high rate of HTTP 429 errors for some users"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Errors when creating, updating, and downloading Native scheduler licenses."

Last update
resolved

This incident has been resolved.

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

This issue only impacts EOA distribution of Native applications. KOTS, Helm, kURL, and EC distributed applications are not impacted. We are working to address this for impacted vendors.

identified

The issue has been identified and a fix is being implemented.

Report: "CMX Cluster History page erroring"

Last update
resolved

This incident has been resolved.

monitoring

We have confirmed that the CMX Cluster History page is functional once again and are monitoring.

identified

We have identified what we believe are the causes of these issues and are working to remediate them.

investigating

We've observed errors on our CMX Cluster History page and are currently investigating these issues

Report: "Delays with Customer Activation Code Emails"

Last update
resolved

Our email provider has stated the issue is resolved on their platform. We are also seeing successful email delivery at this time.

monitoring

Our email provider has implemented a fix to resolve email deliverability issues. We continue to monitor for resolution of their incident.

monitoring

We are continuing to monitor our email provider for resolution to this issue

monitoring

Our email provider is indicating that they are mitigating the cause of this issue and that emails are beginning to process and send. We continue to monitor for resolution on this issue.

monitoring

We are continuing to monitor our email provider for improvements to this issue

monitoring

Our email provider has updated their incident with information that they have identified the cause of delays and are actively working to resolve the issue.

monitoring

We believe these delays are related to an ongoing incident with our email provider. We are monitoring for resolution on their end at this time.

investigating

Customers are experiencing delays receiving activation code emails

Report: "Issues creating cluster for compatibility-matrix through Vendor Portal UI"

Last update
resolved

This incident has been resolved.

monitoring

We have rolled out a fix for the issue and are monitoring to ensure it is fully resolved.

identified

We have identified the source of the issue and are working to implement a fix now.

investigating

We are investigating an issue that prevents clusters from being created through the Vendor Portal for the compatibility-matrix service.

Report: "Download portal page is degraded"

Last update
postmortem

Beta Feature Functionality Degraded: Embedded Cluster downloads were not available from the Download Portal.

resolved

The fix has been rolled out and tested. Thanks for your patience.

monitoring

The fix has been released and we are monitoring the results.

identified

The issue has been identified and we are working on rolling out a fix.

investigating

Some applications are unable to access the Embedded Cluster download page. We are currently investigating this issue.

Report: "Issues with Custom Hostnames"

Last update
resolved

This incident has been resolved.

monitoring

We have restored this functionality for customers and are monitoring at this time

identified

We are investigating reports that custom hostnames may not be working for some customers. We are currently working to restore this functionality for those affected.

Report: "Issues creating Kots releases in Vendor Portal"

Last update
resolved

This incident has been resolved.

identified

We have rolled back changes that contributed to this issue. We can confirm that creating KOTS releases in Vendor Portal is working as expected at this time. Users affected by this will need to log out and back in.

identified

We are working to roll back changes that may be contributing to this issue

investigating

We are currently investigating reports of issues creating KOTS releases in the Vendor Portal.

Report: "Registry Proxy Image Pulls Failing"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Usage of Docker Registry is impacted by ongoing incident with Docker"

Last update
resolved

Docker is reporting that the incident affecting the Docker Registry has been resolved

monitoring

We are monitoring progress on the ongoing incident with Docker Registry as this is impacting customer workloads

Report: "System outage"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently experiencing a system wide outage. Our team is looking into resolving it as soon as possible.

Report: "Newly created custom domains for replicated.app not working for Embedded Cluster installs"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Delay in processing audit log events"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

An unrelated internal data migration is taking longer and consuming more resources than anticipated. We’re actively monitoring the progress as the backlog of audit log events continues to catch up to current activity.

investigating

We are currently experiencing a delay in processing audit log events.

Report: "Delay in processing reporting data"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented. This does not impact existing installations or new application installs.

Report: "We are experiencing an outage provisioning AKS clusters at this time"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Compatibility Matrix EKS clusters failing to provision"

Last update
resolved

Compatibility Matrix EKS is operational once again

monitoring

The underlying AWS Service incident has been declared as resolved. We are verifying that Compatibility Matrix EKS is fully operational at this time.

identified

Compatibility Matrix EKS is currently experiencing a service disruption as a result of an ongoing incident with AWS Services

Report: "CMX Cluster History Page does not load"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Unable to create and manage CMX clusters"

Last update
resolved

Unable to create and manage CMX clusters. The CMX API is down and returning 500 errors.

Report: "Compatibility Matrix AKS Clusters partial outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "New events are missing from Audit Log"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are currently experiencing an issue with Audit Log search not returning latest events.

Report: "Users in MFA-required teams are not able to accept invites"

Last update
resolved

This incident has been resolved.

investigating

Users invited to teams with MFA required are not able to accept invites and create accounts.

Report: "Periodic failures in Audit Log search"

Last update
resolved

This incident has been resolved.

identified

We are still in the process of restoring old events

identified

Audit Log is currently returning recent events only. We are in the process of restoring older events.

identified

The issue has been identified and a fix is being implemented.

investigating

We are investigating periodic errors in Audit Log search.

Report: "New teams may be unable to use the Customer search feature in Vendor Portal"

Last update
resolved

This incident has been resolved.

monitoring

We are now observing normal operations in the search cluster.

investigating

We are currently experiencing errors when updating customer search indices for some teams. This may result in delayed or missing data when using the Customer search feature. All up-to-date data is available on the individual customer pages.

Report: "Delayed event processing affecting customer search functionality"

Last update
resolved

The backlog of events has been processed.

identified

We have identified the issue and the backlog of events is being processed.

investigating

We are observing a large number of messages waiting to be processed. This causes Customer updates to be delayed, affecting the search functionality.

Report: "Elevated error rates for registry proxy"

Last update
resolved

This incident has been resolved.

monitoring

We have rolled back a change and are monitoring to ensure stability

investigating

We are observing elevated error rates in proxy.replicated.com. We are currently investigating the cause of the errors.

Report: "VM Based distributions in Compatibility Matrix are queueing"

Last update
resolved

We are seeing clusters provision correctly following our updates.

monitoring

We have implemented a fix and are monitoring to ensure clusters are provisioning correctly

identified

We have identified the cause of these issues and are releasing changes to mitigate them.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating the causes of this issue

Report: "OpenShift clusters in CMX have incorrect DNS"

Last update
resolved

We have verified that this issue is now resolved and OpenShift is available

monitoring

We have finished rolling back these changes and are monitoring at this time

identified

We have identified the cause of this issue and are rolling back a new feature.

investigating

We are investigating an issue where OpenShift clusters in CMX have incorrect DNS records.

Report: "Instances are not appearing on customer pages in Vendor Portal"

Last update
resolved

We have released a fix for this issue and have confirmed that instances are appearing on customer pages in Vendor Portal again.

identified

We have identified the root cause of this and are releasing a fix.

investigating

We are currently investigating this issue.

Report: "Timeouts occurring with lint.replicated.com"

Last update
resolved

We have thoroughly monitored this and are confident that timeouts for lint.replicated.com are no longer occurring at this time.

monitoring

We believe this is related to a transient networking issue with an upstream provider. We are continuing to monitor this.

monitoring

We are continuing to monitor this issue

monitoring

We have determined that this issue is not widespread. We are continuing to monitor this issue.

investigating

We are currently investigating this issue

Report: "Vendor Portal is showing blank screen"

Last update
resolved

This incident has been resolved.

monitoring

We have identified the changes that caused Vendor Portal to fail to load properly and have rolled them back. We are currently monitoring the availability of the portal.

investigating

We are currently investigating an incident where the Replicated Vendor Portal is showing a blank screen.

Report: "Issues provisioning OpenShift 4.14 clusters in Compatibility Matrix"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

OpenShift 4.14 clusters in Compatibility Matrix are currently under maintenance and will not be available to provision

investigating

OpenShift 4.14 clusters in Compatibility Matrix are failing the verification step and not provisioning at this time. We are investigating.

Report: "Vendor UI customers page showing no active instances"

Last update
resolved

This incident has been resolved.

identified

We have identified a recent change that affects the list of active instances on the top-level Customers page. We are reviewing a potential fix and will post an update shortly.

investigating

We are currently investigating this issue.

Report: "Intermittent issues with Vendor Portal, Vendor API, and CMX"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We experienced an issue with resource contention in our primary database. The database cluster has been restarted and is actively responding to requests. We continue to monitor and will provide further updates.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating intermittent service issues with the Vendor Portal and Vendor API. The Compatibility Matrix has dependencies on these services and is also experiencing intermittent issues as a result.

Report: "Replicated Registry Error Rate"

Last update
resolved

The fix has been deployed and metrics indicate increased error rates have subsided.

monitoring

We have deployed a fix for this issue and are monitoring error rates to ensure stability.

identified

Customers may be experiencing an increased error rate when interacting with the registry. The cause has been identified and we are introducing a fix for it.

Report: "Vendor Portal API currently experiencing slow queries, affecting the Vendor Portal UI, CMX, and other products"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

The Vendor Portal API is currently experiencing slow queries, affecting the Vendor Portal UI, API services, CMX, and other products that consume the API. We are actively investigating this issue.

Report: "Vendor Portal and API Experiencing Issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating issues with the Vendor Portal API

Report: "Compatibility Matrix OpenShift Clusters Failing to Provision"

Last update
resolved

This incident has been resolved.

monitoring

The incident has been marked as resolved by Cloudflare.

identified

Degraded performance in the Cloudflare DNS service and delayed propagation is causing Compatibility Matrix OpenShift Clusters to fail to provision with timeouts.

investigating

We are currently investigating this issue.

Report: "Support bundles cannot be uploaded in Vendor Portal"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

This is affecting Troubleshoot and Support pages. Support issues can still be submitted without support bundles.

Report: "Compatibility Matrix Partial Outage due to ongoing Cloudflare Incident"

Last update
resolved

All services have been restored to operational status. We're continuing to monitor stability and edge cases.

identified

Replicated's Compatibility Matrix product is experiencing ongoing issues for VM-based distros. This is related to an ongoing Cloudflare incident. We are monitoring this and will provide updates.

Report: "Cloudflare Incident affecting several Replicated services"

Last update
resolved

This incident has been resolved.

identified

Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident

identified

Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident

identified

Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident

identified

Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident

identified

Cloudflare has partially restored services and is continuing to investigate further remediation of the incident

investigating

Cloudflare is continuing to investigate this issue

investigating

Cloudflare is continuing to investigate this issue

investigating

Cloudflare is continuing to investigate this issue

investigating

Cloudflare is continuing to investigate this issue

investigating

Cloudflare is continuing to investigate this issue

investigating

An ongoing Cloudflare incident is affecting several Replicated services. We are monitoring this situation closely.

Report: "Customer pages are inaccessible for some customers."

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Proxy image pulls from quay.io are not working"

Last update
resolved

This incident has been resolved.

identified

There is currently an ongoing issue with quay.io registry that is preventing proxied images from being pulled.