Historical record of incidents for Fort Awesome
Report: "Kits Package Building"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are investigating an issue with Kits npm packages failing to build.
Report: "Kits infra down"
Last update: This incident has been resolved.
We discovered an issue with one of our cloud providers. We are working to get it rectified.
Our kits infrastructure is currently down. We are investigating the issue and hope to have this resolved as quickly as possible.
Report: "Brief Site and API Outage"
Last update: The fontawesome.com website and API service were offline for around 10 minutes due to a DDoS. Service was restored after the issue was mitigated by our CDN provider.
Report: "Slow response times from fontawesome.com"
Last update: The root fix has been applied and metered billing is also available again. Things look to be stable and humming along fine now.
A partial fix has been implemented now and the site is fully operational. There is one final fix we need to make that will bring our new metered billing feature back online.
We've identified the issue and are working on a root cause fix. In the meantime, the site is in maintenance mode to keep pressure off our database server.
We are currently investigating this issue.
Report: "Unable to issue access tokens on api.fontawesome.com"
Last update: We've got this all fixed up.
We've missed a detail in our deployment of some new features and token management is not currently functional. We are working on the fix now.
Report: "Investigating timeouts and performance issues with Kits service"
Last update: This issue has been resolved. A load balancer with a rapidly changing IP address caused other systems to fall out of sync with the Kit origin servers. We'll be working to address the root cause using a different infrastructure solution.
We are currently investigating this issue.
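One general way to keep dependent systems in sync with an address that changes often is to re-resolve it on a short interval instead of holding it indefinitely. The following is a minimal sketch of that idea, not the fix that was actually deployed: the class name, the injected `resolve` callable, and the TTL value are all hypothetical.

```python
class RefreshingResolver:
    """Cache a hostname's address for a short TTL, then resolve it again."""

    def __init__(self, resolve, ttl_seconds):
        # resolve: hypothetical callable(hostname) -> IP address string.
        self._resolve = resolve
        self._ttl = ttl_seconds
        self._cache = {}  # hostname -> (ip, time the ip was fetched)

    def lookup(self, hostname, now):
        cached = self._cache.get(hostname)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # still fresh, reuse the cached address
        ip = self._resolve(hostname)  # expired or never seen, resolve again
        self._cache[hostname] = (ip, now)
        return ip
```

Passing `now` explicitly keeps the sketch easy to test; a real caller would pass something like `time.monotonic()`.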
Report: "530 DNS Error (or 1016 Origin DNS Error) from Cloudflare integrated services"
Last update: No further issues have been found over the last 24 hours. We're going to consider this one resolved.
Cloudflare has a fix in place and they are monitoring.
We have a small number of requests (less than 1%) that are returning either a 530 DNS Error or a 1016 Origin DNS Error. We've spoken with Cloudflare, and they have acknowledged the issue and are investigating (see https://www.cloudflarestatus.com/). We'll keep this issue up to date as we learn more.
Report: "Unable to save or modify a Kit through fontawesome.com/kits"
Last update: A database that stores information for serving Kits through our CDN had a failover event that elected a new primary. During this time there was no interruption in serving Kits through the CDN, but unfortunately the failover prevented Kits from being saved through our web management at fontawesome.com/kits. We'll begin investigating why this issue affected the service in this way and plan the changes necessary to fix the root cause.
Report: "Service interruption for kit.fontawesome.com"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Partial outage"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Kits service down"
Last update: On May 30th we attempted to deploy some new features to our Kits service, which required a multi-step deployment procedure we've been using for over 3 years.
Our Kits service runs on Amazon EC2 instances. We have the service distributed globally in 7 regions. Those instances function as origin servers for our CDN service. We are using Cloudflare's Load Balancer product in order to serve traffic at the edge. When we upgrade the software we systematically take regions offline, upgrade them, and then bring them back online. For our customers this normally results in zero downtime and the process is unnoticeable and seamless.
Up until recently our deployment procedure has been stable and we haven't had any major downtime related to the deployment process itself. However, we saw different behavior today that led to a significant degradation of the service.
During the last phases of the deployment, which usually takes about an hour, we noticed that a region became overloaded during the transition from out-of-service to in-service. The load on the now in-service region jumped to unexpected levels well above the normal traffic patterns seen. This load caused the individual servers to become unresponsive, which then led to load shedding to other regions. Unfortunately, this only compounded the issue as the pattern repeated in the fallback region. With the increased and surging loads in various regions, a cycle of in-service, surge load, and instance failure continued until the entire Kits service was unstable and no viable origin servers were available to service requests.
During the Kits service failure we also began to see load increase on one of the database servers that is used by fontawesome.com. The reason for this is unknown right now, but we suspect there is some indirect tie that needs to be found and corrected. The additional load on the database caused the fontawesome.com website to stop responding to most requests that require database connectivity.
At this point our mitigation strategy was two-fold:
1. Increase the number of origin servers to support the Kits service
2. Upgrade the database server to a larger instance
Our team began performing the steps necessary to scale up to handle the surge load. After these steps were complete both the Kits service and the fontawesome.com site began functioning as normal.
We are still investigating the link between Kits service load and the fontawesome.com database server. We will also be working with Cloudflare to understand what changes might have contributed to this issue. Over the next few days and weeks (however long it takes) we will look at this issue as a team and determine what steps we need to take in order to prevent this type of failure in the future.
We understand that our customers rely on us and on high availability, especially for the Kits service. When we are down we lose trust and fail to provide the level of service that we've pledged to you. That’s not acceptable to us and we know it’s not acceptable to our customers either.
If you have any questions please feel free to email us at hello@fontawesome.com.
This incident has been resolved.
This service has been restored to operational and we are now investigating the cause.
We are currently investigating this issue.
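The failure cycle described in the write-up above (a region comes into service, takes a surge of load, fails, and sheds its load to the next region) can be illustrated with a toy model. This is only a sketch under stated assumptions, not a model of the real Kits infrastructure: the region names, the capacities, and the rule that load is split evenly across healthy regions are all hypothetical.

```python
def simulate_cascade(capacities, total_load):
    """Return (healthy_regions, failure_order) once the system settles.

    capacities: dict mapping a hypothetical region name to the load it can absorb.
    total_load: total request load, split evenly across healthy regions
                (an assumed routing rule, chosen only for simplicity).
    """
    healthy = dict(capacities)
    failed = []
    while healthy:
        share = total_load / len(healthy)
        # Regions whose even share exceeds their capacity become unresponsive.
        overloaded = sorted(r for r, cap in healthy.items() if share > cap)
        if not overloaded:
            break  # every remaining region can carry its share
        for region in overloaded:
            del healthy[region]
            failed.append(region)  # its load is shed to the survivors
    return sorted(healthy), failed
```

With capacities of 100, 150, and 150 and a total load of 330, the smallest region fails first and the shed load then takes down the other two. Raising the two larger capacities to 200 stops the cascade after the first failure, which mirrors the "add origin servers" mitigation described above.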
Report: "fontawesome.com site partially unavailable"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Kits service degraded"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Downtime for fontawesome.com"
Last update: This incident has been resolved.
A distributed attack was launched against the site that has been mitigated. We will continue to monitor the situation.
We are currently investigating this issue.
Report: "fontawesome.com inaccessible in some regions with an "Error 1016""
Last update: This incident has been resolved. Our website and API service are once again fully operational worldwide.
This issue with 1016 errors applies primarily to traffic routed through Cloudflare's São Paulo and Johannesburg datacenters. We're continuing to work with Cloudflare to find a resolution to the issue. Thank you for your patience.
We've had a handful of reports that attempts to visit fontawesome.com fail with a Cloudflare error about DNS resolution to the origin server. We're looking into this but think this is just a temporary condition.
Report: "Slow searches from api.fontawesome.com"
Last update: We've deployed the fix for this and searches should be speedy fast once again for all queries.
We've identified the root cause of this and are working on a fix. The impact of this issue is low enough that we're going to deploy the fix next week (Sep 6th or after). We'll be keeping an eye on it through the weekend to make sure that the situation doesn't degrade.
We've released version 4.3.1 of the WordPress plugin, increasing the network request timeout to allow those longer-running searches to complete. Continuing to work on the root cause to improve performance for these slower searches.
We've seen increased response times from the GraphQL API server for some queries. This impacts primarily the Font Awesome WordPress plugin and the icon chooser user interface. Search may be very slow or may actually be failing altogether. At this time the only impact is for WordPress site admins who are using the icon chooser to select icons as they build or manage content. It does not affect the performance of the public-facing sites. We have ideas on the root cause but still need to investigate to come up with a fix.
Report: "Slow responses from the API"
Last update: We've got a patch in place and the API is now operational again. However, if you are using WordPress and seeing 429 Too Many Requests in your WP admin, reach out to us at hello@fontawesome.com and let us know.
We are continuing to investigate this issue.
We are currently investigating this issue.
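For API clients that run into 429 Too Many Requests, a common client-side pattern is to retry with exponential backoff. The sketch below shows that pattern in general terms. It is not how the WordPress plugin (which is written in PHP) behaves, and the `send` callable, the attempt count, and the delays are hypothetical.

```python
import time


def fetch_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send: hypothetical callable returning (status_code, body); it stands in
          for whatever HTTP request the client actually makes.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... before trying again.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate-limited after the final attempt
```

The `sleep` parameter is injectable so the delays can be observed without actually waiting.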
Report: "Intermittent failures when trying to sign up for a free account on fontawesome.com/start"
Last update: This appears to be specific to some edge cases for certain customers. We don't really consider this a system-wide issue and are going to handle them one at a time as they come in. That being said: reach out to us at hello@fontawesome.com if you are experiencing issues.
We are currently investigating this issue.
Report: "Small number of Kit icon uploads failing to load"
Last update: This incident has been resolved.
This fix has been deployed and we are going to continue monitoring it for a few hours.
We think we've identified a fix and are in the process of testing.
We've had a few reports of uploaded icons failing to load. We're investigating the issue.
Report: "Icon search unavailable"
Last update: We got in contact with Algolia and have resolved the issue.
We are currently investigating this issue.
Report: "Slow responses and 525/522 errors on fontawesome.com"
Last update: It looks like we've got this resolved but we'll continue to monitor.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Slow response times on fontawesome.com"
Last update: This incident has been resolved.
Performance issues seem to be resolved for now. We'll continue to monitor the situation.
We are currently investigating this issue.
Report: "Intermittent 403 errors in some areas of Europe downloading packages from npm.fontawesome.com"
Last update: Fixes have been implemented for this issue. If you're still experiencing problems let us know at hello@fontawesome.com.
Our provider, Cloudsmith, is dealing with an issue that is affecting some users in Europe. We've had reports of 403 errors when attempting to install packages. You can follow along here: https://status.cloudsmith.io/incidents/8jb9c2rmrng7
Report: "Issues connecting to kit.fontawesome.com"
Last update: A misconfiguration in a database that securely stored site certificates caused requests to kit.fontawesome.com to fail with an old certificate. This has been fixed and all errors resolved.
We are currently investigating this issue.
Report: "Errors downloading packages"
Last update: This incident has been resolved.
The service has been operational for a while now but we're going to keep an eye on it for a bit longer before we consider it resolved.
Our provider has put another fix in place and is keeping their eyes on it. We'll continue to watch this during the day.
We've had some reports of 502 Bad Gateway errors when downloading packages. We are working with Cloudsmith to investigate.
Report: "Issues downloading package from npm.fontawesome.com"
Last update: This incident has been resolved.
Looks like they have a fix in place. Cross your fingers everyone.
Cloudsmith has identified the issue and is working on the fix now.
We see a report from our package service provider, Cloudsmith, that they are investigating some internal API errors. We have some failed status checks that indicate this affects package downloads from npm.fontawesome.com. We'll report back when we have more information. Follow the issue at https://status.cloudsmith.io/incidents/nqskfqgmvyyr
Report: "Database upgrade taking a bit longer than expected"
Last update: All done! Everything should be back to normal (and a bit faster now too).
We are getting close to finishing the scheduled database migration but we have exceeded our estimate. Things are still going according to plan, just taking a bit longer.
Report: "Slow response times from fontawesome.com"
Last update: This mitigation is looking good for the moment. We'll continue to watch this and work with Cloudflare to keep these types of attacks proactively shut down before they gather steam in the future.
We've gone ahead and added Cloudflare as the CDN in front of the site. They should be able to catch and mitigate another DDoS attack. As this change was a bit ahead of schedule, if you run into any issues accessing the site, send us an email at hello@fontawesome.com.
From our monitoring, the site is under a DDoS attack. We're working on some steps to mitigate now.
We are currently investigating this issue.
Report: "500 errors trying to install NPM packages from npm.fontawesome.com"
Last update: This incident has been resolved.
Cloudsmith reports that a fix has been implemented and they are monitoring results.
Our provider, Cloudsmith, is currently investigating issues installing packages. You can follow their progress: https://status.cloudsmith.io. We're going to keep an eye on this as the incident progresses.
Report: "Failure to serve some kits in Australia"
Last update: Cloudflare identified the root cause and has shared the following incident report with us:
> Cloudflare was performing scheduled maintenance in our Sydney facility. Before Cloudflare can perform maintenance, engineers must remove the datacenter from production by disabling anycast. As part of disabling anycast, a series of flags are set which notify all products and services that the facility is going into maintenance and the datacenter should not receive traffic.
>
> Due to a bug in the automation to disable anycast, the flags to notify our Tiered Cache and Argo products were not set accordingly. As a result, when Sydney was removed from service, any request that had Sydney as an upper tier would continue to send traffic to Sydney despite the fact that the facility was not supposed to receive traffic. This caused increases in origin connectivity failures for surrounding locations.
We’re confident that they will take some steps to mitigate this same problem in the future.
We heard back from Cloudflare that they did see some network connectivity problems with origin servers in Amazon Web Services ap-southeast-2 (this is where our origin server exists). They have mitigated the issue and we'll follow up with them on proactive steps we could take to avoid this in the future.
At approximately 4:00 PM Eastern Time some customers in Australia were experiencing service interruption for our Kits service. The most common error seen was a 522 (origin server timeout). After investigating, it appears that dynamic load balancing shifted origin servers which caused the timeouts to occur. We are following up with Cloudflare, our provider, to identify if this is correct and figure out ways to mitigate this in the future. At this time, the problem has been corrected and we will continue to monitor the situation and follow up with Cloudflare. Throw any questions you have to hello@fontawesome.com.
Report: "Integrity check errors when downloading packages from npm.fontawesome.com"
Last update: The fix has now been applied for this and it looks like it's resolved. Please let us know via hello@fontawesome.com if you are still experiencing issues.
Cloudsmith believes they've identified the issue and are working to get a fix in place as soon as possible.
We are getting reports that both yarn and npm package tools are either downloading the incorrect package or getting integrity hash errors when installing from our private NPM repos. We are in contact with our package service, Cloudsmith, and they are investigating the issue.
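For anyone who wants to check a downloaded tarball by hand, the `integrity` values that npm records in a lockfile use the Subresource Integrity format: an algorithm prefix followed by the base64-encoded digest. The following is a minimal sketch that handles only a single `sha512-` entry. The function names are hypothetical, and integrity strings that list several hashes or other algorithms are out of scope.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Return the SRI-style string ("sha512-<base64 digest>") for data."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(data: bytes, integrity: str) -> bool:
    """Compare downloaded bytes against a single sha512 integrity string."""
    return sri_sha512(data) == integrity
```

A mismatch means the bytes on disk are not the bytes the lockfile was generated from, which is the condition the package tools were reporting.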
Report: "Slow responses from our GraphQL API"
Last update: At this point we are in good shape. The API server is now behind Cloudflare's edge servers with an upgraded caching strategy. As a bonus, performance for cacheable queries will be a lot faster for anyone outside the U.S.
We have put in place some caching infrastructure. We will continue to monitor the API but at this point we are seeing all associated metrics behaving normally.
The GraphQL API is experiencing increased load and we are having difficulty keeping response times down. We are fully focused on getting this service back up just as soon as we can.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Partial service interruption"
Last update: We've identified the root cause and the service is now fully operational again. We'll have a small bug fix that we will deploy later today to deal with the underlying issue. Have any questions? Send us an email at hello@fontawesome.com
Our Kits service (kit.fontawesome.com) is currently experiencing some partial downtime. We are investigating the cause.
Report: "Downtime for kit.fontawesome.com"
Last update: At approximately 1:05 a.m. Central, the primary Redis database, which is hosted by Sendgrid.io, reported a failover event. This event indicated that the primary Redis instance was no longer serviceable and that the secondary instance was being promoted.
The Kits service is deployed in 7 different geographical regions around the globe. Each region has multiple application servers and each of those has a replica of the primary Redis database. While the service was designed to handle intermittent disconnection from the primary Redis database, it looks like the promotion after a failover event caused the replicas to go offline.
Once the Redis replica went offline for a particular region, our monitoring and disaster recovery tools began trying to work around this situation. We use Nomad for scheduling jobs, and after health checks started failing that tool restarted the job. Without any intervention from our operations team the service came back online 6 minutes after it went offline. During the downtime some requests would succeed as the Cloudflare cache still held valid cache records for some resources.
Our team has identified that Redis failover events are not handled in the most ideal way. Optimally, the distributed Redis replicas would continue operating until the new primary Redis database is elected and takes over for the old one.
If you have any questions please feel free to email us at hello@fontawesome.com.
This incident has been resolved.
At approximately 1:05 Central Time we began seeing alerts from our monitoring systems indicating downtime for kit.fontawesome.com. Upon investigating we saw that our Redis database cluster was experiencing a failover event which led to about 6 minutes of downtime for the service as a replica was being promoted to primary. Systems self-healed around this issue and came back online without operator intervention. We'll be investigating the root cause of this over the next few days.
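The behavior the write-up calls optimal, where regional copies keep serving while a new primary is elected, can be sketched as a reader that falls back to its last known value. This is an illustration under assumptions rather than the Kits implementation: it does not use a real Redis client, and `PrimaryUnavailable`, `StaleTolerantReader`, and the `fetch_from_primary` callable are hypothetical names.

```python
class PrimaryUnavailable(Exception):
    """Raised by the (hypothetical) primary client during a failover."""


class StaleTolerantReader:
    """Serve the last known value when the primary cannot be reached."""

    def __init__(self, fetch_from_primary):
        # fetch_from_primary: callable(key) -> value; may raise PrimaryUnavailable.
        self._fetch = fetch_from_primary
        self._local = {}  # stands in for a regional replica's copy of the data

    def get(self, key):
        """Return (value, is_stale)."""
        try:
            value = self._fetch(key)
        except PrimaryUnavailable:
            # Failover in progress: fall back to the local copy if we have one.
            if key in self._local:
                return self._local[key], True
            raise
        self._local[key] = value
        return value, False
```

The trade-off is that reads may be slightly out of date during the election window, which is usually preferable to failing health checks and restarting the job.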
Report: "500 errors on fontawesome.com"
Last update: A temporary database connectivity issue was causing 500 errors. This has been resolved and all services appear to be operating normally again.
We are currently investigating this issue.
Report: "Degraded performance for our GraphQL API server"
Last update: We have resolved the slow requests at this time.
We are investigating issues with our API server. Users have reported extremely long request times, mostly when using our Font Awesome WordPress plugin.
Report: "Slow connections to fontawesome.com and kit.fontawesome.com"
Last update: This incident has been resolved.
How are things looking? We've posted a poll on Twitter to get your feedback on this issue: https://twitter.com/fontawesome/status/1313516318784716801
Thank you all for the mtr records. StackPath is asking for help to grab some response header information. To do this:
1. Load your site or https://fontawesome.com
2. Look in the "Network" tab of your browser developer tools
3. Find the request to either fontawesome.com or kit.fontawesome.com
4. Locate the "x-hw" header (it will look something like "x-hw: 1601997698.cds132.ch4.hn,1601997698.cds023.ch4.sc,1601997698.cds023.ch4.p")
Send this to hello@fontawesome.com so we can forward it on to StackPath.
With StackPath's help we've checked several networks in France and haven't located the problem yet. So here is where we need your help, folks. If you are experiencing connection issues please gather up the following:
1. Run "mtr" against "fontawesome.com" and send us the report, for example "mtr -rw fontawesome.com". You can find docs on using mtr: https://www.linode.com/docs/networking/diagnostics/diagnosing-network-issues-with-mtr/
2. Tell us who your Internet Service Provider is (if you know)
Send these goodies to hello@fontawesome.com.
It looks like these issues are affecting folks who are located in France. We are continuing to investigate with our CDN provider.
We are investigating reports of slow connections to fontawesome.com and kit.fontawesome.com.
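If you collect several "x-hw" values, splitting them into their parts can make them easier to compare. The sketch below relies only on the visible shape of the example value above (comma-separated entries, each with four dot-separated parts). The field names `timestamp`, `node`, `site`, and `role` are guesses made for illustration, not StackPath's documented meaning.

```python
def parse_x_hw(header_value):
    """Split an x-hw header value into one dict per comma-separated entry."""
    value = header_value.strip()
    if value.lower().startswith("x-hw:"):
        value = value[len("x-hw:"):]  # tolerate a pasted "x-hw: " prefix
    hops = []
    for entry in value.split(","):
        parts = entry.strip().split(".")
        if len(parts) != 4:
            raise ValueError(f"unexpected x-hw entry: {entry!r}")
        timestamp, node, site, role = parts  # assumed field meanings
        hops.append(
            {"timestamp": int(timestamp), "node": node, "site": site, "role": role}
        )
    return hops
```

Run against the example value above, this yields three entries, the first with node "cds132" and site "ch4".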
Report: "Email notifications and new account confirmations not being sent"
Last update: This incident has been resolved.
We have replaced Mailgun with SendGrid. We will continue to monitor this service transition for a bit.
We are currently having issues with our Mailgun integration, the service we use to send emails of all sorts. We are actively working with them to get this remedied but during this time no automated emails will arrive in your inbox.
Report: "Duotone icons broken with 5.15.0"
Last update: Our fix has been deployed in the form of 5.15.1. Please let us know if you have any further issues.
A regression in 5.15.0 of Font Awesome caused duotone icons to stop rendering one of their layers. This may affect any kit using the "latest" version or if you've explicitly chosen "5.15.0" as the version. The current workaround is to choose "5.14.0" as the version and re-save your kit. We already have a fix for this that will be released in 5.15.1 on Monday, October 5th.
Report: "Investigating degraded performance for NPM package downloads"
Last update: This incident has been resolved.
Cloudsmith is reporting things are all fixed up. They are continuing to monitor.
The Cloudsmith team has identified the issue: a database availability event in one of their regions led to the degraded performance. A fix has been deployed and they are currently monitoring.
We are getting reports that our npm.fontawesome.com package service is having trouble. Our provider, Cloudsmith, is investigating. Follow updates directly: https://status.cloudsmith.io/incidents/k0q5t8wlrtrn.
Report: "Degraded performance for NPM downloads"
Last update: From Cloudsmith: "The infrastructure for the US regions has been restored; if you're located in or near the US and experience an outage, thank you for your patience, and apologies for any interruption to your build/delivery services this Friday afternoon!"
The issue with the load spike has been identified, and Cloudsmith is working on a fix; traffic for the US has been re-routed to other regions for now.
Our package provider, Cloudsmith, is reporting degraded performance in U.S. regions. They are currently investigating.
Report: "Slow response times due to increased load"
Last update: This incident has been resolved.
We are continuing to get sporadic reports of extremely slow responses from our services that are powered by StackPath. We've been in contact with their support engineers this week and they've let us know that they are aware of the situation and have diagnosed higher than usual load in certain regions (including Europe). Their network team is working around the clock to optimize the performance and deal with the elevated traffic rates due to the COVID-19 virus. As we get more details we'll continue to pass them along. If you have any specific questions feel free to send them to hello@fontawesome.com.
Report: "Intermittent problems saving existing Kits"
Last update: At this time the service interruptions from StackPath seem to be resolved. Normal Kit functionality should be restored.
Our CDN provider is currently having some service interruption that affects our Kits product. Saving an existing kit may result in an error as we try and purge the CDN cache. StackPath is aware of the issues and is actively working on a solution. https://status.stackpath.com/incidents/6ss8qc459m9p
Report: "Font Awesome Pro Subsetter fails to create archive file"
Last update: This has been resolved and builds from the subsetter should be working as expected.
Our Subsetter relies on the Font Awesome API (api.fontawesome.com) to create Zip files when you click "Build". Our API relies on the Quay.io Docker registry, which is currently having a service interruption. You can follow updates at Quay.io's status page (https://status.quay.io/incidents/qdn3wz80kzww). We are investigating ways to work around this issue, but for the moment the Subsetter is down.
Report: "Connection errors (502, 504 HTTP status codes) for our NPM package repository service"
Last update: We just got the alert from Cloudsmith that the outage has been resolved. The service should be fully operational now.
Cloudsmith is reporting some trouble with their backend processor. They are working on it and you can follow their progress here: https://status.cloudsmith.io/incidents/5bgvc2yzxy8s
Report: "Investigating slow responses to fontawesome.com"
Last update: A fix is now in place and normal login functionality has resumed. We've beefed up our rate-limiting on some of the critical paths of the site.
We've identified the issue and are working on a fix. In the meantime the site is back up but attempts to log in may be rate-limited.
We are continuing to work on getting fontawesome.com back up.
Our automated monitoring has alerted us that fontawesome.com is responding slowly. We're investigating.
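The updates above do not say how the rate-limiting on critical paths is implemented. One widely used technique is a token bucket, sketched below purely as an illustration. The class name, the capacity, and the refill rate are hypothetical and are not taken from the site's configuration.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Add tokens for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice there would be one bucket per client (for example per IP address or account) on endpoints such as login, so a burst from one source does not affect everyone else.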
Report: "SSL errors"
Last update: This incident has been resolved.
Updates are being rolled out now to address the SSL errors. You should start to see services coming back online shortly.
StackPath is reporting they have identified the root cause of the SSL errors. https://status.stackpath.com/incidents/22c999cbl1nh
We are continuing to investigate this issue.
We are seeing SSL errors on these services. We are investigating and trying to reach StackPath now.
Report: "Slow responses from kit.fontawesome.com and fontawesome.com"
Last update: No further reports have come in but we'll continue to watch this.
Just had a chat with StackPath and they've checked some of the European PoPs and they are performing well. So the current thought is that it might be some network conditions in the region. We'll keep investigating. If anyone has additional information please send us an email to hello@fontawesome.com.
We've had several reports that Kits and fontawesome.com are slow or inaccessible. We're looking into it now. If you can send trace routes and/or HAR files (HTTP archives) to hello@fontawesome.com that will help us narrow down the issue.
Report: "Transitioning fontawesome.com to a new anycast IP address"
Last update: Things have gone pretty smoothly so far with the migration. We'll continue to monitor but at this time we consider it resolved.
We are currently monitoring the migration of fontawesome.com's IP address to a new StackPath anycast IP address. While we do not anticipate any problems we'll keep an eye on things over the next few days. If we do not see any issues over the weekend, next week we'll begin transitioning the other StackPath-backed CDN services (Kits and Pro CDN) to the newer IP.
Report: "Errors from Amazon's CloudFront CDN service"
Last update: Amazon is reporting that they have this all fixed up.
Our package provider, Cloudsmith, is currently experiencing some degraded performance due to an underlying issue with AWS CloudFront. Amazon is working on the issue. Status pages: Amazon (https://status.aws.amazon.com), Cloudsmith (https://status.cloudsmith.io).