Codefresh

Is Codefresh Down Right Now? Discover if there is an ongoing service outage.

Codefresh is currently Operational

Last checked from Codefresh's official status page

Historical record of incidents for Codefresh

Report: "GCP Incident"

Last update
monitoring

We are aware of the Google Cloud Platform (GCP) Incident (https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1SsW). The analytics portion of Codefresh is currently experiencing degradation. Build Logs are still operational but rely on GCP services. Builds that utilize GCP integrations may be impacted by this incident.

Report: "The issue with running promotions & git push operations from the codefresh UI"

Last update
postmortem

**Impact:** Git commit operations via GitOps-Runtime were temporarily non-functional.

**Detection:** The issue was identified by our internal development team.

**Root Cause:** A change in the GitHub API altered a response relied upon by the underlying Git library, causing push actions to fail.

**Resolution:** We implemented a temporary fix and provided an upgrade path. GitHub has since resolved the issue on their end, and no action is currently required from users.

GitHub's Status Page - [Retroactive] Incident with Git Operations: [https://www.githubstatus.com/incidents/tyjjp463pg91](https://www.githubstatus.com/incidents/tyjjp463pg91)

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

The problem occurs during the commit phase. Our team is actively working to identify the root cause and resolve the issue. We will provide updates as soon as more information is available.

Report: "The issue with running promotions & git push operations from the codefresh UI"

Last update
Postmortem
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Identified

The issue has been identified and a fix is being implemented.

Investigating

The problem occurs during the commit phase.Our team is actively working to identify the root cause and resolve the issue. We will provide updates as soon as more information is available.

Report: "Degraded performance on some UI pages and builds"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Degraded performance on some UI pages and builds"

Last update
Resolved

This incident has been resolved.

Monitoring

A fix has been implemented and we are monitoring the results.

Investigating

We are currently investigating this issue.

Report: "g.codefresh.io not available for North America Region"

Last update
resolved

Related services should be restored as per our WAF's most recent status update. https://status.imperva.com/incidents/3ffjxyln9pjt

monitoring

Our WAF provider had a brief disruption, and their services are returning to normal across the North America region. https://status.imperva.com/incidents/3ffjxyln9pjt

identified

We are continuing to work on a fix for this issue.

identified

We have identified the issue: there is a problem with our WAF provider for the North America region. https://status.imperva.com/incidents/3ffjxyln9pjt

investigating

Some geographical locations are unable to connect to g.codefresh.io.

Report: "Several hosted GitOps runtimes are unavailable or missing from the UI"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

The team has discovered that some hosted GitOps runtimes are missing or no longer visible in the UI. We are currently investigating this issue and will provide an update shortly.

Report: "Codefresh marketing site is down"

Last update
postmortem

**Impact:** The marketing website for Codefresh, `https://codefresh.io/`, was unavailable globally for around 6 hours.

**Root Cause:** A configuration issue led to a traffic routing problem. Because of the nature of intermediary responses, the routing loop was not immediately diagnosed.

**Resolution:** The routing configuration was restored. The site is now fully accessible.

**Detection:** The issue was identified internally through monitoring and also reported by our team.

resolved

This incident has been resolved.

identified

We have identified the issue with an external provider, and are working with them on a resolution.

investigating

We are continuing to investigate this issue. All Codefresh platform functionalities remain unaffected and fully functional, and the UI console is fully accessible at g.codefresh.io.

investigating

We are currently investigating the issue. Please note that the problem is not affecting the product itself, which remains fully functional. You can access the platform at g.codefresh.io

Report: "g.codefresh.io is down"

Last update
postmortem

**Impact:** The platform was unavailable for approximately 15 minutes.

**Root Cause:** Human error during the execution of a deployment pipeline.

**Trigger:** The pipeline was executed with incorrect parameters.

**Resolution:** The error was fixed and the platform was restored.

**Detection:** The issue was identified internally through monitoring and also reported by our team.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating the issue.

Report: "Some builds fail to start on Codefresh SaaS environment"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Builds stuck on Validating Connection to Docker Daemon"

Last update
resolved

This incident has been resolved.

monitoring

We have applied a fix, and builds should resume normal operation. Please allow some time for the retry mechanism to initiate the builds.

identified

Hybrid Runtimes are now working normally for builds. We are still working on resolving SaaS builds not starting.

identified

The workaround of using the Hybrid Runtime is no longer valid. All builds are now stuck on "Validating Connection to Docker Daemon". We are continuing to work on resolving this issue.

identified

SaaS builds are currently stuck on "Validating Connection to Docker Daemon". We have identified the issue and are currently working on a solution. As a workaround, you can use the Hybrid Runtime for these builds.

Report: "Increasing number of delayed builds in some accounts"

Last update
postmortem

**Impact**: The issue affected customers on the CUSTOM plan.

**Detection**: Internal monitoring system.

**Root Cause**: Following the release of a new pricing model, a miscommunication between our systems caused the payment plan for some Custom plan accounts to reset to default.

**Resolution**: Changes were reverted. Corrupted data was restored from the backup.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are still continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Some accounts are experiencing issues with viewing Home Dashboard data"

Last update
resolved

This incident has been resolved.

investigating

The Home Dashboard data related to CI pipelines is only inaccessible for accounts with active GitOps runtimes. CI pipeline-only accounts are not affected.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Build Start Delay"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating the issue.

Report: "Pipeline Variable Loss After Search Filtering"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we're monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating the issue. Workaround: Do not use the search filtering option to update pipeline variables. Instead, use the scroll feature to locate and update variables.

Report: "Some Codefresh Classic builds failed to start or terminated incorrectly"

Last update
resolved

This incident has been resolved.

monitoring

We have detected some issues with connections to Firebase from our classic build manager from 01:22 to 01:33 AM UTC, Nov 14th. Some builds during this time may have failed to start, or terminated incorrectly. The issue subsided after this time period, and all systems are currently operational. Our team is monitoring and investigating.

Report: "General UI Slowness"

Last update
postmortem

**Impact:** Following changes made during recent scheduled maintenance, some customers experienced slower load times in the Codefresh SAAS platform UI. Builds were unaffected.

**Detection:** Our monitoring systems alerted us to unusual activity, and our team quickly initiated an investigation. Customer reports of slower performance further confirmed the issue, allowing us to prioritize and address it promptly.

**Root Cause:** The traffic imbalance was due to a combination of technical configurations that resulted in uneven resource allocation across our zones immediately after the maintenance. This led to one zone handling a disproportionate amount of traffic, impacting website responsiveness for some users.

**Resolution:** Our team implemented several measures to redistribute traffic evenly across all zones. These adjustments restored balanced performance, with monitoring systems ensuring stability.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating the issue.

Report: "GitOps Pages Not Loading"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

The Classic UI is still accessible. If you need to access Pipelines and Projects, please navigate to https://g.codefresh.io/projects/ directly.

investigating

We are currently looking into the issue where https://g.codefresh.io/2.0/ is not loading and showing a white screen.

Report: "Some Classic builds are stuck in Pending state"

Last update
postmortem

**Impact**: Some accounts sporadically experienced longer pending times than usual on a portion of their builds for a day.

**Detection**: The issue was reported by a customer, and shortly after confirmed by Codefresh's platform monitoring alerts.

**Root Cause**: This issue was caused by a bug in the MongoDB driver. The MongoDB driver was upgraded in Codefresh services as part of our efforts to improve performance, but this version contained a bug that caused Mongoose queries to hang when under heavy load without returning or throwing errors. This resulted in the Codefresh build manager randomly getting stuck when enough queries were hanging under certain conditions.

**Resolution**: A temporary solution to improve build queries queue behavior was initially implemented to alleviate the issue for affected customers. The actual root cause was identified the following week, and the issue was resolved by downgrading the MongoDB driver to a version that did not contain the bug.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Codefresh UI doesn't open for some users"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Codefresh GitOps performance degradation"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

A fix has been implemented and we're monitoring the results.

investigating

Users may see delays in processing events from runtimes and increased load times for GitOps pages. We are currently investigating the issue.

Report: "Partial Outage: Pipeline builds are stuck in pending due to expired certificate's"

Last update
postmortem

**Impact**: We had a small number of hybrid runners (no more than 10) that were unable to communicate with our API for a day, and therefore were unable to fetch and run pipelines.

**Detection**: We were informed of this issue by customers.

**Root Cause**: We identified an issue with our certificate rotation which failed to generate new certificates as required for this subset of runners.

**Resolution**: We were able to resolve the issue by manually recreating the certificates required, which were then updated to the runners on the next build, restoring the service for all impacted customers. Further mitigation was done to ensure the issue with certificate rotation was also rectified. We are working on monitoring improvements in this area.

resolved

We had a small number of hybrid runners (no more than 10) that were unable to communicate with our API for a day, and therefore were unable to fetch and run pipelines. We identified an issue with our certificate rotation which failed to generate new certificates as required for this subset of runners. We were able to resolve the issue by manually recreating the certificates required, which were then updated to the runners on the next build, restoring the service for all impacted customers.

Report: "Long loading times at GitOps applications dashboard with Hosted GitOps Runtimes"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Some accounts are experiencing issues with viewing desired/live state of GitOps platform objects"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Some accounts are experiencing issues with classic builds"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Support Portal is unavailable for some users"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating the issue. You can still open a ticket on this page using the 'Submit request' button: https://support.codefresh.io/hc/en-us/ or by sending an email to support@codefresh.io.

Report: "Some accounts have an issue with access to CFCR Helm Registry"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "GitOps UI Degraded Performance"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating the issue.

Report: "Hosted runtimes are unavailable for some customers"

Last update
resolved

This incident has been resolved.

monitoring

We have resolved this issue. Users will need to add their Personal Git Tokens at https://g.codefresh.io/2.0/git-personal-access-token for the Hosted Runtime. Please reach out to support if you have any additional questions.

identified

We are continuing to work on a fix for this issue. We estimate this will be resolved in 2-3 hours.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

Report: "Degraded performance in Codefresh Classic builds for some customers"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Degraded performance for Codefresh Classic"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

Some users might notice degraded performance for Codefresh Classic. We're investigating the issue at the moment.

Report: "Slow SAAS Builds due to AWS Issue"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently seeing an impact on builds and services on our SAAS platform due to an API error impacting the provisioning of new nodes in US-EAST-1. Some builds and services are experiencing delays due to the resulting resource constraint. We will update this incident here as we see measurable changes, and the AWS incident can also be followed via the link below: https://health.aws.amazon.com/health/status.

Report: "Partial UI outage on GitOps pages"

Last update
resolved

No new complaints have been identified during the monitoring period.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating an issue with GitOps UI pages that rely on Runtime information. Please bear with us as we investigate this issue.

Report: "Classic build and pipelines pages for some customers are experiencing long load times"

Last update
resolved

Work to improve performance in these cases has been implemented. Further improvements are planned in the next few days. Although we do not anticipate this reoccurring, if you do encounter any issues with extremely slow page load times, please contact Support.

monitoring

A fix has been implemented to address this issue for all accounts, and we are monitoring the results.

monitoring

If you are experiencing long page load times (10 sec+) and have a high number of builds per day, please contact Support, who can apply a patch to your account. Our engineers have found the cause, and we are working to resolve the root issue.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor UI performance for selected customers with a high number of builds.

monitoring

A fix has been implemented for the affected accounts, and we are monitoring the results. This bug is only present in accounts with a very high number of daily builds, and support is able to apply the same fix on an account-by-account basis for any additional customers as needed.

investigating

We are currently investigating an issue for some customers where the build and pipelines pages can take an inconsistent time to load, at times with significant delays. We have identified one issue and are currently implementing changes to resolve it and improve performance. This only impacts UI loading on some pages. Pipeline execution is unaffected.

Report: "Build pages intermittently not loading"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue where some build pages may be having trouble loading.

Report: "We are experiencing intermittent issues with loading git-ops pages"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "The steps catalog is not available"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are working on fixing the issue.

Report: "Slow Response Times"

Last update
resolved

We have identified the database-related cause of the API slowdown, and have now resolved the issue.

investigating

This incident is only impacting our GitOps API. Classic Pipelines are expected to continue to work with no interruption.

investigating

We are currently investigating slow response times in some parts of our platform. If you are impacted, please subscribe to this page for updates.

Report: "General UI Slowness"

Last update
resolved

This incident has been resolved.

monitoring

We have applied a fix that resolved the issue.

identified

We have identified an issue that is causing UI slowness. We are currently working on resolving this issue.

Report: "We are experiencing issues with viewing GitOps-related pages in UI"

Last update
postmortem

We have completed our RCA for this incident; the summary is below:

**Impact:** We had significant disruption to any UI page that relied on displaying runtime-related information, leading to incomplete or unavailable data for users.

**Detection:** This issue was reported to us by customers.

**Root Cause:** An unexpected side effect of an API change caused the event handler to not recognize runtime events as runtimes and instead treat them as generic-entities. When the change was reverted, the entries in the generic-entities collection were no longer updated, and an automatic cleaning function then resulted in some UI data queries returning incorrect data.

**Resolution:** After resolving the root cause, we rebuilt the required data and reinitialized the runtime information. We have identified improvements to our E2E testing process and monitoring systems as a result of this incident that we will be implementing.

resolved

This incident has been resolved.

monitoring

We have implemented additional fixes to restore UI functionality, and we are monitoring the results.

identified

We have identified an additional issue and are working on a fix.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Quay.io incident is impacting some image pulls"

Last update
resolved

Quay.io is operating correctly for pushes and pulls. This issue is resolved for all Codefresh related operations.

monitoring

Quay.io has implemented a fix for the issue and image pulls are operational. Codefresh default image pulls are functional at this time. We are continuing to monitor Codefresh builds.

identified

We are seeing an improved success rate at this time in image pulls from Quay for Codefresh build images. We are continuing to monitor the incident status from Quay.io.

investigating

As a partial workaround to help alleviate the issue, users can set the default marketplace registry to pull from Docker Hub. This setting will make all public typed-steps (excluding direct Freestyle steps) pull images from the specified registry.

To configure this, you will first need a registry integration in your Codefresh account for Docker Hub. Then go to account settings -> Pipeline Settings -> Advanced Options -> Public Marketplace Registry, and select your Docker Hub integration in the dropdown.

Note that this setting only affects public typed-step image pulls such as codefresh-run, and will not resolve Quay image pull issues for other cases. Image pulls will also be subject to any Docker Hub rate limits associated with your credentials.

investigating

An incident at Quay.io (https://status.quay.io/incidents/z7sbjqmb34p1) is impacting some pipeline builds, causing failures when the required images are unable to be obtained.

Report: "Quay.io is under maintenance"

Last update
resolved

Quay.io: The scheduled maintenance has been completed

investigating

Impact: It is sometimes impossible to pull images from quay.io. More details: https://status.quay.io/incidents/10b6w5v1w7ql

Report: "Sporadic connection timeouts while operating with cm://h.cfcr.io"

Last update
resolved

This incident has been resolved.

investigating

Currently some users might encounter issues while operating with the default Helm registry cm://h.cfcr.io. We're investigating this issue.

Some errors that might be seen in builds:

Error: Get https://h.cfcr.io/account/default/index.yaml: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Error: looks like "cm://h.cfcr.io/account/default/" is not a valid chart repository or cannot be reached: plugin "bin/helmpush" exited with error

or

Error: 500: unknown error
Error: looks like "cm://g.codefresh.io/api/helm/repos/account/default/" is not a valid chart repository or cannot be reached: plugin "bin/helm-cm-push" exited with error

Workaround: As a temporary workaround, we recommend adding retry options to your helm-related steps as described here: https://codefresh.io/docs/docs/pipelines/what-is-the-codefresh-yaml/#retrying-a-step
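For illustration, the retry mechanism linked in the workaround above is configured directly on the step in the pipeline YAML. Below is a minimal sketch assuming a typed Helm step; the step name, chart, release, Helm version, and cluster context are placeholders rather than values from this incident, and the `retry` fields follow the documentation linked above.

```yaml
deploy_chart:
  type: helm
  arguments:
    action: install
    chart_name: mychart          # placeholder chart
    release_name: myrelease      # placeholder release
    helm_version: 3.2.4          # placeholder Helm version
    kube_context: my-cluster     # placeholder cluster context
  retry:                         # retry options from the linked documentation
    maxAttempts: 3               # re-run the step up to 3 times on failure
    delay: 5                     # wait 5 seconds between attempts
    exponentialFactor: 2         # grow the delay after each failed attempt
```

The same `retry` block can be attached to other step types, including freestyle steps that run Helm commands against the registry.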

Report: "We're experiencing an issue with Codefresh pipelines"

Last update
postmortem

**Impact**: We had a 64 minute window where the CI pipeline view had no data in our SAAS platform.

**Detection**: Our monitoring systems immediately alerted us to the issue.

**Root Cause**: One of our database collections went offline.

**Resolution**: We were able to sync and restore connectivity to this database collection. Some parts of this process took longer than expected, and we have implemented improvements to a number of processes to avoid a similar incident in the future.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we're currently monitoring results. All systems are operational now.

identified

In order to speed up the issue resolution, Codefresh platform went into maintenance mode. Expected resolution time is 30 minutes.

investigating

In order to speed up the issue resolution, Codefresh platform went into maintenance mode. Expected resolution time is 15 minutes.

investigating

The Codefresh platform is currently under maintenance; expected downtime is up to one hour.

investigating

We are continuing to investigate this issue.

investigating

Pipelines are missing from the Pipelines list. We are currently investigating this issue.

Report: "Pending builds on SaaS and Hybrid"

Last update
postmortem

**Impact**: We had a partial outage (some requests could not access the platform at all) and some builds were stuck in pending for 30 mins.

**Detection**: We manually detected this issue before our automated check (every 10 minutes) alerted us.

**Root Cause**: We had a parallel issue with Firebase logging, and the combination of a number of small issues as a result caused some pods to become unresponsive.

**Resolution**: We reverted our last push to production to test if this was code related. Once the revert triggered services to restart, the issue was then resolved.

resolved

This has now been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Classic Pipeline Logging Delays"

Last update
postmortem

**Impact**: Some builds had significant delays in logs appearing in the UI. The build completion state and build time were not affected.

**Detection**: Customers reported the impact to us.

**Root Cause**: We had saturated the incoming capacity of our Firebase instances, causing an inability to write new data into them. Because of this, customer builds were not able to report all logs and in many cases had delayed logs. The increased load spiked due to an increase in logging in production from another issue that caused builds to stay in pending. This caused an additional surge and therefore extended delays in logging.

**Resolution**: We have doubled our Firebase instances to better handle spikes in demand. We will also be implementing some targeted monitoring of our Firebase instances and have improved monitoring of our overall platform state.

resolved

This incident has been resolved.

monitoring

A fix has been implemented to address this issue and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating some occurrences where Classic Pipeline steps are experiencing significant delays in the logs of the step showing in the UI. Pipelines are continuing to operate as expected.

Report: "GraphQL API outage for GitOps Platform"

Last update
postmortem

**Impact:** Our GitOps Platform's API was unresponsive for an hour.

**Detection:** We detected the issue via our automated monitoring.

**Root Cause:** Some customer requests to our API triggered an error, which then caused a crashloop in some API pods.

**Resolution:** We have updated our error handling to avoid this error in future.

resolved

This incident has been resolved.

monitoring

We have identified and resolved the issue with the API for the GitOps Platform. We are continuing to monitor this issue.

investigating

We are continuing to investigate this issue.

investigating

Our testing has detected an issue with one of the APIs used for our GitOps platform. We are currently investigating and will update this incident as work develops. Our Classic platform is unaffected.

Report: "Potential AWS Outage"

Last update
postmortem

This was confirmed as an AWS outage, and during our debugging and monitoring investigations we were able to see and confirm that builds had resumed without issue.

resolved

This incident has been resolved.

monitoring

We are continuing to see builds successfully resume on our SAAS clusters. We are actively monitoring as our runtimes work through the build backlog bottleneck created by this incident, and making adjustments as necessary to help expedite recovery.

identified

Volumes for customer builds are now able to be provisioned. Pending builds should be slowly returning to normal. We are continuing to monitor the behavior for this incident.

identified

We have confirmed that AWS endpoints are timing out and we cannot provision volumes used for builds. We are continuing to investigate what appears to be a widespread AWS issue with us-east-1.

investigating

We are currently investigating an issue with AWS (in particular us-east-1) which is impacting volume provisioning for some customers.

Report: "Builds are failing on some accounts"

Last update
postmortem

**Impact**: We had 40 minutes where API calls relying on context objects were failing.

**Detection**: Our internal monitoring detected this issue.

**Root Cause**: We introduced a new security model to our context API and, as a result, caused a callback loop between that API and another service.

**Resolution**: We reverted the code within 40 minutes of our internal monitoring raising an alert and have reimplemented the new security model in a way that doesn't cause the callback loop.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Codefresh API documentation doesn't work"

Last update
resolved

This incident has been resolved. If you still see a blank screen when accessing https://g.codefresh.io/api/, please hard-refresh the page to clear the cache (Shift + Reload button in Chrome).

monitoring

A fix has been implemented and we are monitoring the results.

investigating

The Codefresh API documentation (https://g.codefresh.io/api/) doesn't work. We're currently investigating this issue. 💡 Only the documentation is affected; the API itself is fully functional.

Report: "h.cfcr.io is not reachable"

Last update
postmortem

**Impact**: The Codefresh default Helm repository was unreachable at [h.cfcr.io](http://h.cfcr.io). Classic users relying on this were unable to run their builds until they switched to our backup endpoint.

**Detection**: This was discovered by a user.

**Root Cause**: There were issues with our DNS provider. The workaround that was provided to users during this outage can be considered a permanent fix.

resolved

At this time, all DNS issues have been resolved for cfcr.io. To reiterate, if you have implemented the workaround which uses the https://g.codefresh.io/api/helm/repos endpoint, you can safely leave this workaround in place, or revert back to https://h.cfcr.io if you wish to do so. If you are seeing any issues with cfcr.io please reach out to Codefresh Support for assistance.

monitoring

DNS issues for cfcr.io are resolved as DNS propagation has been completed. All customers (both SaaS and Hybrid) should now be able to access the Codefresh ChartMuseum using cfcr.io. Please note: If you have implemented the workaround which uses the https://g.codefresh.io/api/helm/repos endpoint, you can safely leave this workaround in place, or revert back to https://h.cfcr.io if you wish to do so. We are going to continue to monitor the situation prior to marking this incident as resolved.

identified

DNS issues for cfcr.io are now resolved for all non-hybrid customers. Hybrid customers may still experience some issues, therefore we are continuing to monitor the situation.

identified

The issue was identified with our DNS provider. The current timeline for a resolution could be up to three days. In the meantime, you can change all references to the helm registry (https://h.cfcr.io) to the direct endpoint: https://g.codefresh.io/api/helm/repos

These changes should be made in the following places:

1. Codefresh Helm integration (https://g.codefresh.io/account-admin/account-conf/integration/helmNew)
2. Pipelines with freestyle steps (see the sketch below)
3. External systems referencing the Codefresh helm registry

Please feel free to reach out to support with any questions.
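To illustrate item 2 above, here is a minimal sketch of a freestyle step switched to the direct endpoint. The account name (myaccount), repo name (default), repo alias, image tag, and API-key variable are placeholders, and the authentication details are an assumption, not part of the original update; adjust them to your own setup.

```yaml
use_direct_helm_endpoint:
  title: Add the Helm repo via the direct endpoint instead of h.cfcr.io
  image: alpine/helm:3.9.0   # placeholder image providing the helm CLI
  commands:
    # Previously: helm repo add codefresh https://h.cfcr.io/myaccount/default
    - helm repo add codefresh https://g.codefresh.io/api/helm/repos/myaccount/default --username myaccount --password ${{CF_API_KEY}}   # assumed credentials
    - helm repo update
```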

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Quay IO APAC CDN Slow Downloads"

Last update
postmortem

This was a 3rd party incident.

resolved

Download speeds in APAC have been restored. https://status.quay.io/incidents/h993c5nwlnj1

monitoring

Download speeds are restored. Quay.io are continuing to monitor https://status.quay.io/incidents/h993c5nwlnj1

identified

Users in the APAC region may be impacted by Quay.io's current CDN issue in the region. Immediate updates and history can be seen on Quay.io's status page here: https://status.quay.io/ This is a regional issue for APAC users only and is likely to impact the startup time of pipelines as images used internally are impacted.

Report: "We are experiencing issues with loading Pipelines view page"

Last update
postmortem

**Impact**: Codefresh Classic pipelines were inaccessible for around 15 minutes.

**Detection**: The issue was immediately reported to Codefresh by customers.

**Root Cause**: Changes to Codefresh platform components were tested and validated prior to deployment; however, an inconsistency during deployment caused an issue with the platform, which resulted in a brief outage that was quickly resolved.

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.