Historical record of incidents for packagecloud.io
Report: "Website unavailable"
Last update: The website continues to be available, and we're confident the root cause has been addressed.
Website has been restored, but we're keeping an eye on things.
Our marketing website is currently unavailable and we are investigating. The product is unaffected — the management dashboard and all package management functions are fully available.
Report: "Intermittent 50X Errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have seen some intermittent 50X errors, which we think may be related to the change in traffic profile that we're seeing. We will update once we have more details. Thank you for your understanding.
Report: "Unable to upload and delete packages via the user interface."
Last update: The incident has been resolved.
We are currently facing issues with uploading and deleting packages via the user interface. We are actively investigating this issue.
Report: "Extended indexing times"
Last update: This incident has been resolved.
We are aware of extended indexing time of repos and are actively investigating the issue.
Report: "Package uploaders not available to handle uploads"
Last update: This incident has been resolved.
The base image required by our uploaders has been rebuilt and redeployed. All package uploads should be working. We're monitoring the situation.
We've identified the issue: the uploader services were unable to load the Docker images they need to start.
Report: "Duplicate entry for certain package uploads"
Last update: The workaround has resolved the 'Duplicate Error' issue and the repo install script issue that affected private repositories. We will work on expanding the namespace limits for private read tokens to ensure future scalability.
We've released a fix to address the 'Duplicate Error, Aborting' issue encountered by users trying to upload packages. We're now monitoring customer reports and feedback.
We have released a fix to work around the issue of repo install scripts not working as expected, and we're monitoring customer feedback and reports. We have started working on a workaround for the duplicate entry issue on package upload, with the ETA still expected to be 1400 UTC.
We have identified the issue as being related to the identifier limits in our database. We're in the process of releasing a fix within the next hour (by 1400 UTC).
Some customers have reported issues uploading packages that were previously uploaded and then deleted. Some customers have also reported repo install scripts not working as expected: the scripts cannot pull down the repo configuration for the system where packages are to be installed.
Report: "AWS us-west-1 connectivity issues may impact packagecloud.io"
Last update: AWS us-west-1 has fully recovered connectivity, and connectivity to packagecloud.io services is fully restored. For more details please also refer to: https://status.aws.amazon.com/
AWS us-west-1 is having connectivity issues that may cause intermittent connectivity issues to packagecloud.io services. For more details please also refer to: https://status.aws.amazon.com/
Report: "Intermittent 50X Errors"
Last update: There have been no 50X alerts or customer reports in the last 10 hours. As a precaution, we removed one statistic showing total downloads, which is somewhat compute-intensive.
We have seen some intermittent 50X errors, which we think may be related to the change in traffic profile that we're seeing. We will update once we have more details. Thank you for your understanding.
Report: "Fallout from Maintenance Window"
Last update: Still monitoring site health as usual, but marking this particular incident as resolved.
The site appears to be fully restored; however, we will be closely monitoring the situation to make sure it remains operational before resolving this incident.
We have started an alternative contingency plan in order to restore website functionality; it should finish soon. Thanks again for your patience while we work to resolve this issue.
We are still dealing with some fallout from our scheduled maintenance. The issue has been identified and all required parties are working on it.
Report: "Increased Error Rates"
Last update: The issue has been resolved and the service is operating normally.
We're experiencing increased error rates for Amazon S3 requests in the US-WEST-1 Region.
Report: "DNS Outage"
Last update:

## TL;DR

This past Monday, our DNS provider, DNSimple, experienced a distributed denial of service attack which took down their DNS resolution service. You can find more information about the DNS outage at our provider here. Our monitoring alerted us that there was a problem with domain resolution and we began investigating. DNSimple is both our registrar and our DNS provider, so there was, unfortunately, nothing we could do during the outage. During this time, some customers were unable to resolve our domain name, packagecloud.io. Customers who had our DNS cached or who added an entry to their /etc/hosts file were unaffected by the outage. We've made some changes to help mitigate a future outage at our DNS provider.

## More info

We were alerted by our monitoring services at 19:21 UTC on December 1, 2014 that DNS resolution was failing. We immediately began investigating and found that DNSimple was experiencing a distributed denial of service attack. You can find more information about the DNS outage at our provider here. DNSimple is both our registrar and our DNS provider. Their service was down in its entirety, so we were unable to log in and switch our nameserver settings to an alternate provider during the outage. Customers with our DNS cached on their systems were unaffected, and we saw several customers downloading and uploading packages during the DNS outage. Once service at our DNS provider was restored, we made some changes to help mitigate potential outages like this in the future.

## Changes

It's possible to configure your DNS settings to use more than one provider, protecting against any single DNS provider having an outage. To do this, you need two DNS providers that support DNS zone transfers. We researched our options, selected two providers that support DNS zone transfers, migrated our DNS zones to the new providers, and updated our nameservers at our DNS registrar.
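In BIND terms, the primary side of a multi-provider setup like this might look roughly like the following. This is a minimal sketch, not our actual configuration: the zone name, file path, and IP addresses are placeholders, and managed DNS providers typically expose these same settings (transfer ACLs and notify targets) through their dashboards rather than raw BIND config.

```
# Hypothetical primary-provider zone config (BIND 9 syntax).
# IPs below are placeholders for the secondary provider's transfer servers.
zone "example.io" {
    type primary;
    file "zones/example.io.db";
    # Allow the secondary provider to pull the zone via AXFR.
    allow-transfer { 192.0.2.10; 192.0.2.11; };
    # Notify the secondary provider when the zone changes.
    also-notify { 192.0.2.10; };
};
```

With both providers' nameservers listed at the registrar, resolvers can fall back to the second provider if the first is unreachable.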
We sincerely apologize for the outage our customers experienced and hope that the changes we made to our infrastructure help protect customers against future outages of this nature. If you have any questions, please feel free to email us at support@packagecloud.io.
Denial of Service on DNSimple causing the packagecloud.io domain to intermittently resolve.
Report: "Database Incident"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being deployed.
Report: "Degraded Performance"
Last update: Increased capacity to help with load issues/degraded performance.
Report: "Increased Load/Timeouts"
Last update: Capacity has been increased; resolved.
Increased load is causing issues
Report: "Corrupted Deployment"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Host Down/Unresponsive"
Last update: Everything is back to normal.
Fully failed over, catching up on unprocessed jobs
Failing over
Report: "Degraded S3 Uploads"
Last update: All affected repositories have been reindexed and uploads appear to be working normally.
AWS has rolled out a fix and all affected repositories are being reindexed.
AWS has confirmed that the issue is internal to AWS and they are working on deploying a fix.
In contact with AWS support to identify and resolve issue.
Amazon S3 is throwing intermittent 500 Internal Server Errors on upload.
Report: "Hardware Failure"
Last update: Everything is up and running normally.
Failover complete.
Currently failing over.
Report: "Looking into increased connection timeouts"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Bad deploy has triggered some service failures"
Last update: Service has been restored.
A bad deploy has triggered some service failures, the code is being reverted and redeployed now.
Report: "Queues Backed Up"
Last update: Everything is now running smoothly, sorry for any delays!
Queue pressure has been relieved, queues are starting to move along
Package uploads result in a reindex job being run to regenerate repository metadata. The index queues are currently backed up because we have sustained a large number of package uploads in a very short period of time. We are working on some fixes to relieve pressure on our indexer processes.
Queues are slow/backing up
Report: "Performance issues and response times"
Last update: This incident has been resolved.
Performance issues and response times.
Report: "Increased Response Times"
Last update: This incident has been resolved.
Increased response times.
Report: "Potential hardware failure"
Last update: Broken hardware was disabled and services have been restored. We are still monitoring the situation closely.
We are currently investigating a potential hardware failure and taking steps to fail over.