Historical record of incidents for Replicated
Report: "High error rate from upstream cloud provider"
Last update: We are aware of upstream issues with Google Cloud Platform. This is presently impacting GKE components of our Compatibility Matrix product, and upstream GCR registries with our Replicated Registry. We are monitoring our cloud providers for updates.
Report: "CMX: Network Issues when creating RKE2 clusters"
Last update: We're aware of network issues when creating an RKE2 cluster and are investigating.
Report: "CMX: Network Issues when creating RKE2 clusters"
Last update: We're aware of network issues when creating an RKE2 cluster and are investigating.
Report: "Compatibility Matrix: GKE 1.32 clusters failing to create"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We have rolled out a fix, pinning GKE 1.32 to version 1.32.2, and are monitoring the results now.
We have identified an issue with pulling images from Google's registry, and are working with Google to find a resolution.
We have noticed that GKE 1.32 clusters are not reaching a 'running' state. We are currently investigating this issue.
Report: "Email deliverability spam issue"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Check your spam folder. Our transactional email provider is seeing deliverability problems with Gmail. https://status.postmarkapp.com/notices/bt3ky3r8zlaapqlo-increased-gmail-spam-reports
Report: "Issues provisioning CMX multi-node vm clusters"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Issues provisioning k3s and kind clusters in CMX"
Last update: This incident has been resolved.
We are seeing k3s and kind CMX clusters spinning up successfully at this time without timing out. We are continuing to monitor this situation.
This issue is likely impacting kind cluster creation as well
We are currently investigating issues with provisioning k3s CMX clusters leading to timeouts
Report: "CMX: Openshift connectivity issues"
Last update: We have run diagnostics across the OpenShift cluster and found the source of the report. However, the limitation appears to be built into OpenShift by design and is working as intended.
We are currently investigating issues with name resolution in some containers in OpenShift clusters in Compatibility Matrix.
Report: "Degraded Performance for Embedded Cluster and kURL in CMX"
Last update: Our monitoring confirms that, following our remediation efforts, k3s, kind, kURL, and Embedded Cluster are now being created successfully in CMX.
We have deployed changes to help mitigate issues creating k3s, kind, kURL, and Embedded Clusters on CMX. We are currently monitoring these changes.
We believe we have identified the cause of these issues and are working to remediate them.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating this issue
Report: "Increased 5xx rate for Vendor API"
Last update: This issue should now be resolved.
A long-term fix for this issue has been released and we are currently monitoring.
We have identified the issues causing a higher than expected 5xx rate for the Vendor API. We have mitigated this by rolling back some recent changes and are working on a long-term solution.
Report: "Vendor Portal functionality may be affected by high rate of HTTP 429 errors for some users"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Report: "Errors when creating, updating, and downloading Native scheduler licenses."
Last update: This incident has been resolved.
We are continuing to work on a fix for this issue.
We are continuing to work on a fix for this issue.
This issue only impacts EOA distribution of Native applications; KOTS, Helm, kURL, and EC distributed applications are not impacted. We are working to address this for impacted vendors.
The issue has been identified and a fix is being implemented.
Report: "CMX Cluster History page erroring"
Last update: This incident has been resolved.
We have confirmed that the CMX Cluster History page is functional once again and are monitoring.
We have identified what we believe are the causes of these issues and are working to remediate them.
We've observed errors on our CMX Cluster History page and are currently investigating these issues
Report: "Delays with Customer Activation Code Emails"
Last update: Our email provider has stated the issue is resolved on their platform. We are also seeing successful email delivery at this time.
Our email provider has implemented a fix to resolve email deliverability issues. We continue to monitor for resolution of their incident.
We are continuing to monitor our email provider for resolution to this issue
Our email provider is indicating that they are mitigating the cause of this issue and that emails are beginning to process and send. We continue to monitor for resolution on this issue.
We are continuing to monitor our email provider for improvements to this issue
Our email provider has updated their incident with information that they have identified the cause of delays and are actively working to resolve the issue.
We believe these delays are related to an ongoing incident with our email provider. We are monitoring for resolution on their end at this time.
Customers are experiencing delays receiving activation code emails
Report: "Issues creating cluster for compatibility-matrix through Vendor Portal UI"
Last update: This incident has been resolved.
We have rolled out a fix for the issue and are monitoring to ensure it is fully resolved.
We have identified the source of the issue and are working to implement a fix now.
We are investigating an issue preventing users from creating clusters for the compatibility-matrix service through the Vendor Portal.
Report: "Download portal page is degraded"
Last update: Beta Feature Functionality Degraded: Embedded Cluster downloads were not available from the Download Portal.
The fix has been rolled out and tested. Thanks for your patience.
The fix has been released and we are monitoring the results.
The issue has been identified and we are working on rolling out a fix.
Some applications are unable to access the Embedded Cluster download page. We are currently investigating this issue.
Report: "Issues with Custom Hostnames"
Last update: This incident has been resolved.
We have restored this functionality for customers and are monitoring at this time
We are investigating reports that custom hostnames may not be working for some customers. We are currently working to restore this functionality for those affected.
Report: "Issues creating Kots releases in Vendor Portal"
Last update: This incident has been resolved.
We have rolled back changes that contributed to this issue. We can confirm that creating KOTS releases in Vendor Portal is working as expected at this time. Users affected by this will need to log out and back in.
We are working to roll back changes that may be contributing to this issue.
We are currently investigating reports of issues creating KOTS releases in the Vendor Portal.
Report: "Registry Proxy Image Pulls Failing"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Usage of Docker Registry is impacted by ongoing incident with Docker"
Last update: Docker is reporting that the incident affecting the Docker Registry has been resolved.
We are monitoring progress on the ongoing incident with Docker Registry, as this is impacting customer workloads.
Report: "System outage"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently experiencing a system wide outage. Our team is looking into resolving it as soon as possible.
Report: "Newly created custom domains for replicated.app not working for Embedded Cluster installs"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Delay in processing audit log events"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
An unrelated internal data migration is taking longer and consuming more resources than anticipated. We’re actively monitoring the progress as the backlog of audit log events continues to catch up to current activity.
We are currently experiencing a delay in processing audit log events.
Report: "Delay in processing reporting data"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented. This does not impact existing installations or new application installs.
Report: "We are experiencing an outage provisioning AKS clusters at this time"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Compatibility Matrix EKS clusters failing to provision"
Last update: Compatibility Matrix EKS is operational once again.
The underlying AWS Service incident has been declared as resolved. We are verifying that Compatibility Matrix EKS is fully operational at this time.
Compatibility Matrix EKS is currently experiencing a service disruption as a result of an ongoing incident with AWS Services.
Report: "CMX Cluster History Page does not load"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Unable to create and manage CMX clusters"
Last update: Unable to create and manage CMX clusters. The CMX API is down and returning 500 errors.
Report: "Compatibility Matrix AKS Clusters partial outage"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "New events are missing from Audit Log"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently experiencing an issue with Audit Log search not returning latest events.
Report: "Users in MFA-required teams are not able to accept invites"
Last update: This incident has been resolved.
Users invited to teams with MFA required are not able to accept invites and create accounts.
Report: "Periodic failures in Audit Log search"
Last update: This incident has been resolved.
We are still in the process of restoring old events.
Audit Log is currently returning recent events only. We are in the process of restoring older events.
The issue has been identified and a fix is being implemented.
We are investigating periodic errors in Audit Log search.
Report: "New teams may be unable to use the Customer search feature in Vendor Portal"
Last update: This incident has been resolved.
We are now observing normal operations in the search cluster.
We are currently experiencing errors when updating customer search indices for some teams. This may result in delayed or missing data when using the Customer search feature. All up-to-date data is available on the individual customer pages.
Report: "Delayed event processing affecting customer search functionality"
Last update: The backlog of events has been processed.
We have identified the issue and the backlog of events is being processed.
We are observing a large number of messages waiting to be processed. This causes Customer updates to be delayed, affecting the search functionality.
Report: "Elevated error rates for registry proxy"
Last update: This incident has been resolved.
We have rolled back a change and are monitoring to ensure stability.
We are observing elevated error rates in proxy.replicated.com. We are currently investigating the cause of the errors.
Report: "VM Based distributions in Compatibility Matrix are queueing"
Last update: We are seeing clusters provision correctly following our updates.
We have implemented a fix and are monitoring to ensure clusters are provisioning correctly.
We have identified the cause of these issues and are releasing changes to mitigate them.
We are continuing to investigate this issue.
We are currently investigating the causes of this issue
Report: "OpenShift clusters in CMX have incorrect DNS"
Last update: We have verified that this issue is now resolved and OpenShift is available.
We have finished rolling back these changes and are monitoring at this time.
We have identified the cause of this issue and are rolling back a new feature.
We are investigating an issue where OpenShift clusters in CMX have incorrect DNS records.
Report: "Instances are not appearing on customer pages in Vendor Portal"
Last update: We have released a fix for this issue and have confirmed that instances are appearing on customer pages in Vendor Portal again.
We have identified the root cause of this and are releasing a fix.
We are currently investigating this issue.
Report: "Timeouts occurring with lint.replicated.com"
Last update: We have thoroughly monitored this and are confident that timeouts for lint.replicated.com are no longer occurring at this time.
We believe this is related to a transient networking issue with an upstream provider. We are continuing to monitor this.
We are continuing to monitor this issue
We have determined that this issue is not widespread. We are continuing to monitor this issue.
We are currently investigating this issue
Report: "Vendor Portal is showing blank screen"
Last update: This incident has been resolved.
We have identified the changes that caused Vendor Portal to fail to load properly and have rolled them back. We are currently monitoring the availability of the portal.
We are currently investigating an incident where the Replicated Vendor Portal is showing a blank screen.
Report: "Issues provisioning OpenShift 4.14 clusters in Compatibility Matrix"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
OpenShift 4.14 clusters in Compatibility Matrix are currently under maintenance and will not be available to provision.
OpenShift 4.14 clusters in Compatibility Matrix are failing the verification step and not provisioning at this time. We are investigating.
Report: "Vendor UI customers page showing no active instances"
Last update: This incident has been resolved.
We have identified a recent change that affects the list of active instances on the top-level Customers page. We are reviewing a potential fix and will post an update shortly.
We are currently investigating this issue.
Report: "Intermittent issues with Vendor Portal, Vendor API, and CMX"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We experienced an issue with resource contention in our primary database. The database cluster has been restarted and is actively responding to requests. We continue to monitor and will provide further updates.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating intermittent service issues around the Vendor Portal and Vendor API. The Compatibility Matrix has dependencies on these services and is also experiencing intermittent issues as a result.
Report: "Replicated Registry Error Rate"
Last update: The fix has been deployed and metrics indicate increased error rates have subsided.
We have deployed a fix for this issue and are monitoring error rates to ensure stability.
Customers may be experiencing an increased error rate when interacting with the registry. The cause has been identified and we are introducing a fix for it.
Report: "Vendor Portal API currently experiencing slow queries, affecting the Vendor Portal UI, CMX, and other products"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
The Vendor Portal API is currently experiencing slow queries, affecting the Vendor Portal UI, API services, CMX, and other products that consume the API. We are actively investigating this issue.
Report: "Vendor Portal and API Experiencing Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating issues with the Vendor Portal API.
Report: "Compatibility Matrix OpenShift Clusters Failing to Provision"
Last update: This incident has been resolved.
The incident has been marked as resolved by Cloudflare.
Degraded performance in the Cloudflare DNS service and delayed propagation is causing Compatibility Matrix OpenShift Clusters to fail to provision with timeouts.
We are currently investigating this issue.
Report: "Support bundles cannot be uploaded in Vendor Portal"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
This is affecting Troubleshoot and Support pages. Support issues can still be submitted without support bundles.
Report: "Compatibility Matrix Partial Outage due to ongoing Cloudflare Incident"
Last update: All services have been restored to operational status. We're continuing to monitor stability and edge cases.
Replicated's Compatibility Matrix product is experiencing ongoing issues for VM-based distros. This is related to an ongoing Cloudflare incident. We are monitoring this and will provide updates.
Report: "Cloudflare Incident affecting several Replicated services"
Last update: This incident has been resolved.
Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident
Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident
Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident
Cloudflare is seeing gradual improvement to services and is continuing to investigate further remediation of the incident
Cloudflare has partially restored services and is continuing to investigate further remediation of the incident
Cloudflare is continuing to investigate this issue
Cloudflare is continuing to investigate this issue
Cloudflare is continuing to investigate this issue
Cloudflare is continuing to investigate this issue
Cloudflare is continuing to investigate this issue
An ongoing Cloudflare incident is affecting several Replicated services. We are monitoring this situation closely.
Report: "Customer pages are inaccessible for some customers."
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Proxy image pulls from quay.io are not working"
Last update: This incident has been resolved.
There is currently an ongoing issue with the quay.io registry that is preventing proxied images from being pulled.