Historical record of incidents for Kraken.io
Report: "Callback Service Outage"
Last updateThis incident has been resolved.
We are continuing to investigate this issue.
All systems operational with the exception of our Callback Service (Webhooks)
We are currently investigating a major connectivity issue with our datacenter. We will keep you posted.
Report: "Networking Issue"
Last updateThe networking issue was ISP-related, and has been resolved.
We are experiencing some strange network flapping with our main datacenter. All hands are on deck, and we hope to get this resolved shortly. Please bear with us.
Report: "DDoS ongoing"
Last updateFresh HTTPS load balancers have been deployed and everything is back to normal.
We are shipping and deploying further load balancers now.
Currently the API is subjected to an ongoing DDoS attack, which is causing delayed response times on our SSL/TLS load balancers. Our engineers are looking for a solution.
Report: "Elevated API Errors"
Last updateThis incident has been resolved.
We have resolved the issue and implemented a fix. Back to optimizing images.
The issue has been identified and a fix is being implemented.
We're experiencing an elevated level of API errors and are currently looking into the issue.
Report: "API workers degraded"
Last updateThis incident has been resolved.
A new switch has been deployed, allowing API workers to communicate with you. We will keep monitoring for a bit.
Defective ports on a network switch caused API workers to "flap". We switched over to new switch hardware, and the API is fully functional.
We are seeing issues with some API workers having trouble rejoining the pool to accept images. Investigating.
Report: "Service unavailable"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "API unavailability"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Website unavailable temporarily"
Last updateWe have been resolving an issue with Kraken.io that interrupted website availability. Issue has been resolved, and Kraken.io is functional again.
Report: "API upload issues"
Last updateClosing this, as monitoring did not reveal any further issues.
A fix has been deployed and API usage is fully functional. We'll keep monitoring the situation.
Problem has been identified. We'll issue rolling restarts on our processing pools.
We are seeing issues with uploads to the API. URL fetching is not impacted. Investigating the issue.
Report: "API access for older accounts"
Last updateOld accounts that had not gone through verification have now been properly grandfathered. The API is now friendly again to senior accounts.
An issue has been identified affecting old user accounts where email validation had not been in place. We're working on grandfathering those accounts.
Report: "Kraken.io database connectivity"
Last updateAn automatic upgrade of database connectors broke connectivity to the internal database for the web frontend. A fix has been deployed, functionality is restored.
Report: "Account management issues"
Last updateA fix has been deployed, account management is functional again. Sorry for the inconvenience.
We have identified an issue affecting account management, and are implementing a fix for it. To avoid damage to the account data we have temporarily disabled account changes.
Report: "Network hardware failure"
Last updateNetworking issues have been resolved, no further issues detected during monitoring.
Operations have been restored, we'll keep on monitoring all systems.
We shipped and installed replacement hardware for a bunch of systems in our processing pipeline. API is back up, restoring web interface next.
We have deployed a fix to the networking hardware, moving on to restoring other services.
We are continuing to work on a fix for this issue.
Due to a failure in our networking hardware, kraken.io is unavailable. An engineer has been paged and is driving to the data center for repairs. ETA: 1h
Report: "Web Interface Pro"
Last updateA fix has been deployed. Images, archives, page crunches and URL lists are happy again.
We are working on resolving reports of limited functionality within the Web Interface Pro.
Report: "Callback service unavailable"
Last updateThis incident has been resolved.
We rolled out a bunch of platform updates to resolve the issue. Post-mortem to follow after some sleep.
A fix has been implemented and we are monitoring the results.
We are provisioning replacement services for the Callback Service.
Due to a faulty machine the Kraken.io Callback Service used to deliver image processing notifications is down.
Report: "Maintenance extended"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
Due to some steps of the network maintenance requiring more time, we need to extend maintenance by an hour.
Report: "Webhooks timeouts due to maintenance issue"
Last updateThis incident has been resolved.
A fix has been implemented and we are currently monitoring for potential issues. If you are still experiencing any problems you can reach us on chat here: https://lc.chat/now/4912161/
We are aware that some users might still experience delays with webhooks. Our engineers are working to resolve the issue.
A fix has been implemented, and since then we have received no failure logs. We are continuing to monitor for potential issues.
Some users might experience some delays with webhooks. We are currently investigating the issue. Stay tuned!
Report: "Webhooks Delivery Issue"
Last updateI can confirm the problem with Webhooks Delivery is now resolved and we would like to apologize for the troubles caused by this. You can reach us on chat in case of any questions. https://lc.chat/now/4912161/
Report: "Investigating API issue"
Last updateThis incident has been resolved.
We are experiencing intermittent problems with hanging upload requests and are working to resolve this issue ASAP. Please stay tuned.
Report: "API back up"
Last updateThis incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Update 1: Resolved an issue on the storage servers where NodeJS was locked to an incompatible version during deployment, which broke file-system storage for anything using our short-term storage.
We're back to fixing a failed deployment.
Report: "Maintenance on API"
Last updateThis incident has been resolved.
We're currently deploying fixes to the API
Report: "API - Degraded Performance"
Last updateThis incident has been resolved.
Everything is running normally once again. We are continuing to monitor the situation and will publish a postmortem in the next 24 hours.
Due to unusually high demand, our cluster's performance has been degraded, with some requests failing. We are in the process of provisioning new machines which should be completed within the next two hours.
Report: "DDoS attack - Servers up again"
Last updateSaturday night around 11PM, someone used a hacking tool to probe our API for attack vectors, using a spoofed IP address (see https://en.wikipedia.org/wiki/IP_address_spoofing for more context) from a range usually assigned to the African IP address registry. The effect of this was limited, but it kicked off a fatal chain of events.

In our processing queue for the API, there is a pool of NodeJS-based workers sitting behind nginx load balancers. The load balancers tried to keep up with the incoming requests and at some point simply could not, due to hard limits on the number of requests a load balancer is allowed to serve. In the end the load balancers stopped serving requests, since they were simply not allowed to open further connections to the API workers. So we set the workers free and allowed each of them to handle more connections, monitored the situation for a while, and decided all was fine in Kraken.io Land.

Spoilers: it was not. An hour later requests were piling up again and the API was refusing to service anything. It turned out requests were hugely delayed because of unanswered DNS requests. Since we use a private DNS infrastructure, this was rather weird. Internal requests were still being answered within acceptable times, but external requests were no longer answered. At this time, most of our systems started reaching the end of life of their cached DNS responses. For the outside world, this culminated in kraken.io no longer being available, while for our private infrastructure the rest of the world ceased to exist.

Further investigation showed that our own DNS servers were no longer receiving DNS zones from upstream servers, e.g. our requests for s3.amazon.com to our upstream servers were denied. A few cables further, the issue was finally spotted: the DNS systems within the data center, to which we turn when requesting information about public DNS zones, were no longer available.

Mitigation efforts undertaken:
* load balancer scaling has been changed to automatically adapt to incoming load instead of hard defaults
* our pool of DNS servers for external requests has been expanded
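The second mitigation listed in this postmortem, a larger pool of DNS servers for external requests, can be illustrated with a minimal failover sketch. This is not Kraken.io's actual implementation: the function `resolve_with_failover`, the `query` callable, the `fake_query` stand-in and the resolver addresses are all hypothetical, chosen only to show the idea of trying each upstream resolver in turn rather than depending on a single one.

```python
from typing import Callable, Iterable, Optional


class ResolutionError(Exception):
    """Raised when no resolver in the pool could answer."""


def resolve_with_failover(
    hostname: str,
    resolvers: Iterable[str],
    query: Callable[[str, str], Optional[str]],
) -> str:
    """Ask each resolver in turn and return the first answer.

    `query(resolver, hostname)` is a hypothetical callable (an assumption of
    this sketch) that returns an IP address string, returns None when the
    resolver gives no answer, or raises OSError on timeouts or refusals.
    """
    failures = []
    for resolver in resolvers:
        try:
            answer = query(resolver, hostname)
        except OSError as exc:
            # This resolver is unreachable or refused us; try the next one.
            failures.append(f"{resolver}: {exc}")
            continue
        if answer is not None:
            return answer
        failures.append(f"{resolver}: no answer")
    raise ResolutionError(f"{hostname}: all resolvers failed ({'; '.join(failures)})")


def fake_query(resolver: str, hostname: str) -> Optional[str]:
    """Stand-in for a real DNS query: the first resolver is 'down'."""
    if resolver == "10.0.0.1":
        raise TimeoutError("timed out")  # TimeoutError is a subclass of OSError
    return "203.0.113.7"


if __name__ == "__main__":
    print(resolve_with_failover("example.com", ["10.0.0.1", "10.0.0.2"], fake_query))
```

With a single resolver in the pool, the failure of that one upstream makes every external lookup fail once cached answers expire, which matches the failure described above. With several resolvers, a lookup only fails when all of them are unavailable.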
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Our Backend is back up - DDoS attack investigation"
Last updateThis incident has been resolved.
DDoS attack originating from spoofed IPs in Africa
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are working diligently to find the root cause.
Report: "Emergency maintenance - Server downtime to be expected"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
Internal DNS failure caused an outage.
We are continuing to investigate this issue.
We are currently investigating.
Report: "Our API Backend is currently down. We are working diligently to find the root cause."
Last updateThis issue has been solved. We will update this space with a post-mortem within the next 24 hours.
A fix has been implemented and we are monitoring the results.
Our API is back up, meanwhile we continue to build a picture of what happened.
We are currently investigating this issue.
Report: "[Billing] Some customers are unable to update their existing credit card information"
Last updateThis incident has been resolved.
We are continuing to work on a fix for this issue.
We have identified an interim workaround while Recurly and our card processor are still analysing the root cause. Customers whose valid cards are rejected should contact support@kraken.io for a resolution.
We can reproduce the issue and have traced it to the payment gateway service. We are working with their support to locate the root cause.
Some customers have reported that perfectly valid credit cards are being rejected. This matter has been escalated to our payment billing provider, @recurly, who admit that the problem is on their end and are working on a solution.
Report: "Delays in Webhooks delivery"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
Customers are experiencing delays in webhooks delivery. We're working on a fix.
Report: "Web Interface PRO not responding"
Last updateAn internal component unexpectedly locked up and disconnected the service from load balancing. Normal service has been restored. Kraken.io API services have not been affected.
We are currently investigating this issue.
Report: "Webhooks: processing delays in emptying the queue"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are observing occasional large delays (up to 30 minutes in some cases) in processing callbacks and are currently investigating.
Report: "Platform outage"
Last updateWe're back online.
Our datacenter in Frankfurt is having a major outage. Updates will follow.
Report: "Platform outage"
Last updateWe're back online
Our datacenter in Frankfurt is having a major outage (yes, again). Updates will follow.
Report: "Platform outage"
Last updateWe're back online
Our datacenter in Frankfurt is having a major outage. Updates will follow.
Report: "API Storage Outage"
Last updateAll issues with Storage machines have now been resolved.
We're currently investigating intermittent connectivity issues with API Storage machines.
Report: "Recurly transit outage"
Last updateAll issues with our billing provider have been resolved. Please contact support@kraken.io with any questions.
Our payment provider is currently facing issues, you may have problems accessing your Kraken.io account at this time. We'll keep you posted.
Report: "Kraken.io API - Degraded Performance"
Last updateAll DB-related issues have now been resolved.
We are currently investigating an issue with DB connections timing out.
Report: "WP plugin - degraded performance"
Last updateThe issue with the API has been fixed now.
We're currently investigating an issue with our WP plugin being unstable. Details will follow.
Report: "API Slow Response Times"
Last updateAll the issues with Kraken API are now resolved.
We're investigating an issue with long response times from the API
Report: "Partial outage of Kraken Account"
Last updateAll services are back online.
Our payment service provider (Recurly) is experiencing issues with the API. Kraken Account which relies on Recurly's API will be unavailable for now.
Report: "API instability issues"
Last updateAll the issues with the API machines have been resolved.
We're investigating API instability issues due to faulty hardware.
Report: "Network issues when connecting to Recurly"
Last updateRecurly's networking issues appear to be resolved.
Recurly (our payments provider) has identified the routing issue in their Data Center and is working towards a solution.
Report: "We're experiencing connectivity issues with our worker and API machines."
Last updateThis incident has been resolved.
We are currently investigating this issue.
Report: "Problems with our database machines"
Last updateSituation is back to normal. Failing machine has been removed from the pool.
We're experiencing problems with our database machines. API might be unstable at the moment.
Report: "Worker machines are becoming unresponsive"
Last updateOur DC came under a network attack which has been taken care of. All services are operational at the moment.
We are currently experiencing and investigating network problems in our datacenter. API and Web Interface might not be reachable at the moment.
Report: "API and Web Interface issues"
Last updateAll issues are now resolved. Sorry for the inconvenience.
Contacted our hosting provider. They have a power malfunction and are currently working on resolving the issue.
We're experiencing a networking issue with API and Web Interface machines.