
Fly.io Status
Fly.io is currently operational with all systems functioning normally.
Incident History
Showing incidents from the last 15 days
Report: "Elevated Sprites error rates in SIN"
Last update: This incident has been resolved.
A fix has been implemented and we are seeing error rates for Sprites in SIN normalize. We are continuing to monitor to ensure full recovery.
We are investigating elevated 500 / internal server error rates with Sprites in the SIN region. Users may see increased errors when accessing Sprites located in this region, or for requests to the Sprites API originating from SIN.
Report: "Increased network latency in North America"
Last update: This incident has been resolved.
Our upstream paths have been fixed. We are monitoring the results.
We are working with our upstream network provider to address periodic loss of connectivity over transit in ORD.
We are seeing a recurrence of elevated latency across some hosts in ORD impacting the MPG control plane and a subset of clusters there. We are working to address this.
The network between our regions has been performing well for the majority of traffic. We're still continuing to monitor a few impacted routes in North America that may be seeing elevated latency and packet loss.
Impacted NA backbones have been sidestepped where possible, and we're continuing to monitor network health.
Managed Postgres in ORD has returned to normal operation. We continue to see slightly elevated latencies and loss over transits in North America. We are working with our upstream network providers to improve performance.
We are investigating elevated network instability across some hosts in ORD. Apps and Managed Postgres clusters on impacted hosts may see elevated latency or networking errors at this time.
Report: "Ingress Traffic issues in GRU"
Last update: This incident has been resolved.
Some of our edge nodes in GRU have suffered an error that crashed some critical services. We're currently working to bring them back online. Some traffic entering through GRU (i.e. users connecting from around GRU) may be temporarily affected: connections may see increased latency or be occasionally dropped.
Report: "Emergency maintenance of Petsem causing some control plane errors"
Last update: The maintenance has been completed and control plane functions should return to normal. We're monitoring for any further complications.
We're performing emergency maintenance on Petsem, our secrets management service. Some control plane write operations, such as creating new apps or secrets, may temporarily fail. Existing apps and machines should keep functioning without issues.
Report: "Managed Postgres Control Plane Issues in IAD"
Last update: An initial fix has been implemented and connectivity to all impacted clusters has been restored. We are continuing to monitor to ensure stable recovery.
We are continuing to address this issue. Some clusters in IAD are unavailable at this time, and some users may have seen unexpected cluster restarts. We are working on restoring normal performance for all clusters in IAD.
We are investigating MPG control plane instability in a subset of the IAD region. A small number of clusters in the region may have seen unexpected failovers or connection issues over the past 30 minutes.
Report: "egress ips are broken in ORD"
Last update: A fix has been implemented and we are monitoring the results.
Egress IPs are broken in most of ORD. We are currently investigating this issue.
Report: "Capacity issues in ARN region"
Last update: The ARN region is low on available host capacity. Creating new machines, or starting currently stopped/suspended machines, may fail at this time. We are working on provisioning new host capacity in the region. Please consider using nearby regions if possible.
Report: "Consul cluster degradation"
Last update: One of our Consul clusters is in a degraded state due to a failed node. This can cause issues with LiteFS primary node selection, Unmanaged Postgres (14.x and older *only*), and creation of new Unmanaged Postgres clusters. Impact is limited to these legacy products and does not affect deployments, running Fly applications in general, or Managed Postgres clusters.
Report: "Issues with flyctl ssh console and Machines OIDC"
Last update: A fix has been implemented and we are monitoring the results.
We're currently investigating an issue affecting flyctl ssh console functionality and Machines' OIDC tokens.
Report: "IPv6 outage for some machines in ORD"
Last update: We're working with our upstream providers to investigate an IPv6 networking failure in ORD.
Report: "Private networking issues in SYD"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Due to an upstream provider issue, Private Networking (6PN) is currently degraded in the SYD region. Communication between Machines in the SYD region and Machines in other regions may fail at this time. Newly created Machines in SYD may fail to sync to other regions (they may not show up in the Machines API List endpoint, or their state may be incorrect). Additionally, TLS certificate resolution and Machines API authentication may currently be degraded in the SYD region. We are working with our upstream providers to resolve this issue.
Report: "Elevated deployment errors"
Last update: We're investigating an increase in deployment errors affecting some users. At this time, creating or updating Machines may erroneously fail with the message: "We require your billing information, please add it at https://fly.io/dashboard/<org>/billing".
Report: "Networking issues in ORD"
Last update: We are aware of increased latency and connection drops for clients located near Chicago (ORD) and are currently working on a fix.
Report: "Networking issues in ORD"
Last update: We are currently investigating increased latency and dropped connections in ORD (Chicago).