Historical record of incidents for Chef
Report: "Issue with habitat builder database"
Last update: The issue is resolved; all Habitat services are functional now.
We have an isolated incident with our habitat database and we are investigating the issue. We will update when we are back online. Please check our Slack channel for updates. https://chefcommunity.slack.com/archives/CAW6DQBV2
Report: "Interim issue with hosted chef"
Last update: There was an incident with Hosted Chef this morning at 6:00 AM UTC: the Hosted Chef application (manage.chef.io) was down for 15 minutes. We identified the root cause and fixed the issue immediately. The systems are now healthy and running.
Report: "Authentication broken for Supermarket"
Last update: We have identified and fixed the issue; sign-in is working as expected now. We will continue to monitor the system.
We have identified that sign-in for Supermarket is not working. We are currently investigating the issue and will keep you posted.
Report: "Hosted Chef Extended Maintenance"
Last update: We have completed the database upgrade and all systems are operational now. Thank you for your patience.
We have completed the DB optimization successfully and have opened Hosted Chef for traffic. We are monitoring the system performance now. Consider the system operational.
We are continuing to work on the DB optimization and have seen significant progress in the last hour. We will keep you updated with our progress.
We are continuing to work on the DB optimization; it is taking longer than we anticipated. We will keep you posted as soon as we have an update to share. Thanks for your patience.
We are seeing some issues with DB performance after today's upgrade and are working to optimize it. We will keep you posted on our progress toward completing the upgrade.
Report: "Hosted Chef Increased error rate"
Last update: This incident is resolved and services are back to normal. Thanks for your patience.
The issue is fixed and both the Hosted Chef web console and the Hosted Chef API are performing normally now. We will continue to monitor for a while.
We have identified our internal Search Service as the problem; this is caused by the ongoing AWS us-east-1 incident. Our engineers are working to restore the service to normal performance.
We are seeing increased error rates in Hosted Chef; this seems to be caused by an ongoing AWS network issue in the us-east-1 region. Our engineers are working on this and we will keep you posted.
Report: "Timeouts and failed requests for package distribution infrastructure"
Last update: The vendor has resolved their incident.
Error rates have improved for package distribution infrastructure. We will continue to monitor the upstream outage.
We have implemented a workaround which should alleviate timeouts and error rates. We are continuing to monitor the upstream vendor outage.
We are experiencing network performance degradation due to a vendor.
Report: "Chef Package Distribution Outage"
Last update: This incident has been resolved.
We are facing issues with our package distribution infrastructure. We are working to identify the issue and a resolution.
Report: "Increased Error rate 5xx on Hosted Chef (api.chef.io/manage.chef.io)"
Last update: We have fixed the issue and the error rate is close to 0. We will continue to monitor for a while. Thank you for your patience.
We have not been able to fix the issue yet and are continuing our work to resolve it. Thanks for being patient.
We are facing an increased 5xx error rate on Hosted Chef. We have identified the problem and are working to resolve the issue. We will keep you updated.
Report: "Increased Error rate 5xx on Hosted Chef (api.chef.io)"
Last update: The issue has been resolved. We were facing errors with our indexing cluster, which led to increased 5xx responses for API calls.
We are currently investigating an increased error rate in Hosted Chef. We will keep you posted.
Report: "Hosted Chef - Manage UI is not saving/loading attributes"
Last update: We have deployed a previous release of Manage to Hosted Chef to work around the UI bug until we can fix it in the latest version of Manage.
We are investigating an issue with Hosted Chef - Manage UI functionality. Please note this does not affect the Hosted Chef API and Knife functionality is still available.
Report: "Elevated connection rate and 500's"
Last update: This incident has been resolved.
Traffic patterns and service have normalized to regular levels observed prior to the maintenance window. We will conduct an incident analysis and write up a blog post for this next week. Thank you for your patience and I'm sorry that this impacted your workflows.
We've implemented a short term workaround to restore service. We're monitoring the service.
We're isolating an issue with the authz service's database queries, which are taking an abnormally long time to complete.
We're investigating an unexpected elevation in fetches by the authz service from the database.
After upgrading PostgreSQL we are seeing database connections and CPU spikes. We're resizing the database to get more system resources and will provide additional updates as we have them.
Report: "Increased supermarket errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The load balancers sporadically removing hosts was a symptom. We're continuing to investigate what appear to be regularly occurring spikes in cache gets at the :00 and :30 minute marks each hour.
We're investigating what is causing the load balancer to remove hosts sporadically.
We are investigating increased error rates and timeouts with supermarket.chef.io
Report: "Increased error rates"
Last update: This incident has been resolved.
Backend services have stabilized; engineers are monitoring.
We are investigating increased error rates for Hosted Chef services.
Report: "Hosted Chef Incident"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating problems in the Hosted Chef backend. Unfortunately, we expect chef-client communications to be severely impacted at this time. We will post further info regularly, as we find it.
Report: "omnitruck services high error rate"
Last update: https://omnitruck.chef.io/install.sh serves up bootstrap scripts for use by unattended installs such as autoscaling and `knife bootstrap ...` initiated bootstraps. We have become aware of an omnitruck.chef.io issue that also affects downloads of the bootstrap install script from opscode.com/chef/install.sh. Please upgrade if you are running older chef-client/knife code that points at opscode.com. We expect the issue to have been cleared at 2019-10-11 8am Pacific US.
Report: "Customers may be experiencing a disruption to their Chef services"
Last update: We've received independent verification from customers that remediation steps have resolved the issue for their use case(s) and are calling this incident resolved. Please reach out to Chef Support if you experience further problems. The affected gems (chef-sugar, chef-api, stove) have been restored to the rubygems.org site under their original namespaces.
We have identified a fix and released new versions of relevant cookbooks and libraries. Remediation steps will be shared to customers via our customer success teams.
We will shortly have remediation steps for repairing environments affected by this issue.
We have isolated the issue to be failures in Chef client runs whose cookbooks rely on libraries which are not currently available. Mirrors that sync these libraries will also encounter issues. These issues do not impact customer data, privacy or data integrity. We are working to resolve these issues ASAP and will provide a further update shortly.
Please note that some customers may be experiencing a disruption to their Chef services. We are actively investigating this issue and working on a resolution. We apologize for the disruption and will regularly update this page with a status and resolution timeline.
Report: "Package Distribution maintenance window extended"
Last update: We ran into an issue and will resume this maintenance tomorrow.
Our maintenance is running long, but still no impact is expected.
Report: "Intermittent Omnitruck Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating intermittent issues with omnitruck.chef.io.
Report: "Hosted Chef Increased Error Rate"
Last update: This incident has been resolved.
The error rate with Hosted Chef has returned to normal after an issue with the search backend. We will continue to monitor for further issues.
We are seeing increased error rates with Hosted Chef and are investigating.
Report: "Hosted Chef Increased Error Rate"
Last update: We have identified the issue and have implemented a fix. Hosted Chef errors have returned to normal levels.
We are seeing increased error rates with Hosted Chef and are investigating.
Report: "Hosted Chef increased error rate"
Last update: This incident has been resolved.
Error rates appear to be returning to normal. We will continue to monitor for further issues.
We have identified an issue with the search backend and are working on a fix.
We are seeing increased error rates with Hosted Chef and are investigating.
Report: "Hosted Chef Issues"
Last update: This incident has been resolved.
We have implemented the fix and things appear to be returning to normal. We will continue to monitor for further issues.
We have identified the issue and are working on a fix.
We are investigating an issue with Hosted Chef availability
Report: "downloads.chef.io increased errors"
Last update: The problem has been identified and resolved.
Engineers have identified an issue leading to increased error rates on downloads.chef.io and are working to get a fix in place.
Report: "Hosted Chef email notification issue"
Last update: Emails are being sent out correctly again.
We have identified an issue where some email notifications are not being sent out correctly from hosted chef. We are deploying a fix for the issue now.
Report: "Hosted chef service issues"
Last update: This incident has been resolved.
The maintenance is complete and we are seeing error rates/latency return to normal. We are monitoring to verify that the maintenance has fixed the issue.
We are performing emergency maintenance. There will be a brief outage to Hosted Chef while we do so.
We are seeing increased hosted chef latency and error rates, and are working to apply a fix.
Report: "Elevated hosted chef error rates"
Last update: This incident has been resolved.
A fix has been applied and error rates are back to normal. We will continue to monitor for issues.
We have identified a networking issue that is causing slow/hung connections and are working with a provider on a fix as soon as possible
We have been alerted to elevated error rates with hosted chef. Engineers are investigating.
Report: "Omnitruck SSL"
Last update: The correct certificate has been put in place.
We have identified an issue with an expired SSL certificate that is affecting the omnitruck service. We are working to restore service. This will impact many processes that install the chef-client.
Report: "AWS us-west-2 Network Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
packages.chef.io and downloads.chef.io are currently experiencing intermittent connectivity issues because of network connectivity issues in AWS us-west-2.
Report: "Hosted Chef API down"
Last update: This incident has been resolved.
The API service is now available and being monitored.
Hosted Chef's API is currently unavailable. Engineers are investigating the issue.
Report: "Increased Hosted Chef Error Rates"
Last update: This incident has been resolved.
Engineers identified and remediated a database issue, and will continue monitoring for any other issues.
We've been alerted to increased error rates for some customers using Hosted Chef; engineers are investigating.
Report: "Increased hosted errors"
Last update: This incident is resolved.
We have identified a cause of the errors and hosted chef is stabilizing. We will continue to monitor.
We are investigating an increase in error rates with hosted chef
Report: "Errors with manage signup"
Last update: This incident has been resolved.
Engineers have identified an issue affecting hosted chef new org/user signups and are working to prepare a fix.
Report: "Increased Hosted Chef Error Rates"
Last update: Engineers investigated and found a brief increase in error rates related to database vacuum. Error rates returned to normal within 15 minutes of 20:00 UTC. Sorry for the delay in resolution!
Engineers are investigating increased error rates when using Hosted Chef APIs
Report: "Hosted chef outage"
Last update: This incident has been resolved. An upstream provider experienced an issue with the underlying service our search functionality uses. As a precaution, application servers were removed from service due to failing health checks, to prevent partial writes.
The Hosted Chef API is available once more, we are monitoring the situation.
We are experiencing a service outage with hosted chef's search backend.
Report: "Hosted chef performance issues"
Last update: Hosted Chef is operating normally.
We have brought additional instances online to address the issue and latency/errors are now normal. We will continue to monitor the situation.
We are investigating increased latency and errors with hosted chef.
Report: "Issue resetting chef keys via id.chef.io, and issues logging into supermarket"
Last update: We have deployed a fix for the issue, and both key resets and Supermarket access for newly logged-in users should be working normally.
A fix for the issue has been identified and we are in the process of deploying it.
We have identified an issue where users are unable to reset their chef keys on id.chef.io. Users attempting to do so will obtain an empty key instead. We are working on deploying a fix ASAP, but in the meantime you can reset your key from inside chef manage at manage.chef.io. In addition to this, users logging into supermarket are currently receiving errors when viewing their profile page and will be unable to publish to supermarket. Users who are already logged in should be unaffected.
Report: "Bootstrapping Issues"
Last update: At this time, we believe all previous issues with certificates to be cleared up. Bootstrapping and other activities depending on website certificates should now be successful. We'll go ahead and resolve the incident now, but will continue to monitor and have folks ready to respond to any recurrence.
We're experiencing intermittent SSL certificate issues for Chef package downloads. Engineers are currently investigating.
We have identified the problem and believe we have a solution. We'll continue monitoring the situation for a short while to make sure things are back to normal. We apologize for the hold up in your bootstrapping activities.
The following are currently affected by a configuration issue in our caching infrastructure. We have folks on the scene investigating now and will provide updates.
* Bootstraps that refer to https://www.chef.io/chef/install.sh
* Test-Kitchen bootstraps on Windows
* Usage of omnitruck.chef.io
For more detail: https://github.com/chef/omnitruck/issues/249
Report: "Emergency Search Maintenance"
Last update: Maintenance is complete.
Engineers have identified a database situation that requires immediate maintenance. Only minor chef search delays are expected while this occurs. This should take under an hour to complete.
Report: "Elevated hosted chef errors"
Last update: Hosted Chef is available again. We apologize for the outage and will continue to monitor.
Hosted chef is currently unavailable. We are working to bring it back up as soon as possible.
We are investigating elevated errors in hosted chef.
Report: "Hosted Chef Increased Error Count"
Last update: This incident has been resolved.
Hosted Chef's error rate has fallen to normal levels. We are continuing to monitor the situation.
We are investigating an increase in errors for hosted chef.
Report: "Increased hosted chef errors"
Last update: A fix has been applied and the error rate has returned to normal levels. We will continue to monitor the situation.
We have identified the cause of the increased errors and are working on putting a fix in place.
We are investigating an increase in error rates with Hosted Chef.
Report: "S3 issues"
Last update: All services should now be operating normally.
Services are beginning to recover. Websites are now available, and Hosted Chef cookbook downloads are recovering. However, users may still experience errors with cookbook uploads.
We are seeing issues with Amazon S3 that are affecting a variety of services, including Hosted Chef cookbook uploads and downloads. Our engineers are investigating.
Report: "Hosted Chef Issues"
Last update: Issues were identified and have been resolved. Engineers will continue to monitor.
We are investigating issues with the Hosted Chef API.
Report: "Issues with hosted chef search"
Last update: This incident has been resolved.
The emergency maintenance is now complete and Hosted Chef services are available once more. We are monitoring for any further issues.
The emergency maintenance has now started. Hosted Chef will be unavailable while the maintenance is in progress.
We are seeing issues with hosted chef search and will be performing some emergency maintenance.
Report: "Elevated Hosted Chef Errors"
Last update: This incident has been resolved.
We have identified an issue with the search backend and have applied a fix. We are monitoring the situation and Hosted Chef is operating normally.
We are investigating an increased error rate with Hosted Chef
Report: "Hosted Chef Service Interruption"
Last update: Hosted Chef's APIs and services are all operating normally.
Hosted Chef functionality has been restored, and we're currently monitoring to ensure that things have fully resolved. - Fri Oct 21 14:44:53 UTC 2016
Investigating - We are investigating issues with our Hosted Chef service and are working to get it back online as soon as possible. - Fri Oct 21 13:57:32 UTC 2016
Report: "Supermarket API Errors"
Last update: The problem has been solved and Supermarket is fully back in business.
We are investigating Internal Server Error (500) responses from the Supermarket API.
Report: "Hosted Chef search outage"
Last update: The Hosted API is now operating as expected.
An issue with Chef Search has been identified on Hosted Chef.
Report: "Hosted Chef Search"
Last update: An issue with Hosted Search was identified and has been resolved.
Investigating increased error rates with Hosted Chef Search.
Report: "Hosted Chef Search"
Last update: The issue with search appears to be resolved. We will continue to monitor for issues.
We are investigating slowness with hosted chef search.