Historical record of incidents for Render
Report: "Degraded performance for some users in Oregon"
Last update: Engineers have been alerted to and are investigating an issue causing performance degradation for some users in Oregon.
Report: "Connectivity issues for free Key Value services in Frankfurt"
Last update: This incident has been resolved.
We've implemented a fix and are monitoring.
We are currently investigating this issue.
Report: "Connectivity issues for free Key Value services in Frankfurt"
Last updateWe are currently investigating this issue.
Report: "Increased restarts in all regions"
Last update: Between 2025-05-07 and 2025-05-13, services may have experienced an increase in instances being evicted and restarting. This was due to a routine cleanup task failing in a way that did not trigger our monitoring. We have fixed that task and improved monitoring and alerting to prevent this from recurring.
Report: "Service unavailability in the Oregon region"
Last update: Due to an issue with our infrastructure provider, some services may have experienced downtime between 21:58 and 22:04 on 2025-05-04. Builds and deploys may have been impacted during this time.
Report: "Issues accessing the Render dashboard"
Last update: This incident has been resolved.
Our team has mitigated this issue and is monitoring the situation. The impact window for this incident was 2025-05-09 02:35 to 02:51 UTC. During this time, the following were impacted:
- Dashboard
- REST API
- Builds
- Deployments
We are currently investigating issues accessing the Render dashboard. This should only impact access to the dashboard; Render services should not be affected.
Report: "Issues accessing the Render dashboard"
Last updateWe are currently investigating issues accessing the Render dashboard. Our team is investigating. This should only be impacting access to the dashboard. Render services should not be impacted.
Report: "Logging instability and slow builds in the Oregon region"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified an infrastructure issue that may result in logs being delayed in the Oregon region. We are working on a resolution.
Report: "Logging instability and slow builds in the Oregon region"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified an infrastructure issue that may result in logs being delayed in the Oregon region. We are working on a resolution.
Report: "Build failures in Oregon and Virginia"
Last update: This incident has been resolved. Impact to any services was mitigated as of 20:36.
We've identified that builds for services other than static sites were also affected. We continue to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are investigating static sites failing to build in the Oregon and Virginia regions.
Report: "Build failures in Oregon and Virginia"
Last updateThis incident has been resolved. Impact to any services was mitigated as of 20:36.
We've identified that builds for services other than static sites were also affected. We continue to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are investigating static sites failing to build in the Oregon and Virginia regions.
Report: "Intermittent 502s have been reported on some services"
Last update: This incident has been resolved.
We’ve rolled out a mitigation that’s successfully cleared up all the 502s tied to this specific incident. Things are looking stable so far, but we’re still monitoring and working on a more permanent fix.
We’re still working on a fix and starting to see some improvements from the mitigation steps we’ve taken. We're now focusing on putting a more solid, permanent solution in place and will update this status page as soon as we’ve got more to share.
We have identified the issue and are currently working on mitigating and fixing it.
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
We’re still actively looking into it. So far, it seems like this is primarily affecting some newly created services in the Frankfurt region.
Some services — mostly in the Frankfurt region — have been reported to return 502s on certain requests. It’s not hitting all services, and we can’t confirm yet if it’s limited to just one region. We’re on it and investigating.
Report: "Intermittent 502s have been reported on some services"
Last updateThis incident has been resolved.
We’ve rolled out a mitigation that’s successfully cleared up all the 502s tied to this specific incident. Things are looking stable so far, but we’re still monitoring and working on a more permanent fix.
We’re still working on a fix and starting to see some improvements from the mitigation steps we’ve taken. We're now focusing on putting a more solid, permanent solution in place and will update this status page as soon as we’ve got more to share.
We have identified the issue and are currently working on mitigating and fixing it.
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
We’re still actively looking into it. So far, it seems like this is primarily affecting some newly created services in the Frankfurt region.
Some services — mostly in the Frankfurt region — have been reported to return 502s on certain requests. It’s not hitting all services, and we can’t confirm yet if it’s limited to just one region.We’re on it and investigating.
Report: "Increased 404s on services in Oregon"
Last update: This incident has been resolved.
This issue has been mitigated.
We are seeing increased rates of 404s for valid URLs on services hosted in Oregon, including Static Sites. We are working on resolving this issue.
Report: "Increased 404s on services in Oregon"
Last updateThis incident has been resolved.
This issue has been mitigated.
We are seeing increased rates of 404s for valid URLs on services hosted in Oregon, including Static Sites, we are working on resolving this issue.
Report: "Intermittent latency spikes and 520 errors for web services in Ohio"
Last update: We've seen no further symptoms of network congestion in this region.
Our upstream provider has allocated more network capacity in the congested region.
We've received confirmation from an upstream provider that they have been experiencing networking congestion in this region during these periods of impact. They are now working on a remediation.
Despite increased networking resource allocation, we've seen another instance of this issue. We're now collaborating with upstream providers to identify the source of these transient networking errors.
We have not seen a reoccurrence of the issue after making changes yesterday afternoon, but are continuing to monitor as the problem seems to be intermittent.
We are continuing to investigate while working on provisioning more networking resources and improving observability into the issue.
We are currently investigating this issue.
Report: "Intermittent latency spikes and 520 errors for web services in Ohio"
Last updateWe've seen no further symptoms of network congestion in this region.
Our upstream provider has allocated more network capacity in the congested region.
We've received confirmation from an upstream provider that they have been experiencing networking congestion in this region during these periods of impact. They are now working on a remediation.
Despite increased networking resource allocation, we've see another instance of this issue. We're now collaborating with upstream providers to identify the source of these transient networking errors
We have not seen a reoccurrence of the issue after making changes yesterday afternoon, but are continuing to monitor as the problem seems to be intermittent.
We are continuing to investigate while working on provisioning more networking resources and improving observability into the issue.
We are currently investigating this issue.
Report: "Elevated rates of 404s in Ohio and Oregon regions"
Last update: This incident has been resolved.
Engineers are fixing an issue causing elevated rates of 404s for services in the Ohio and Oregon regions.
Report: "Logs may be slow to load for some services in Oregon"
Last update: This incident has been resolved.
We have implemented a fix and continue to monitor the situation.
We are currently investigating this issue.
Report: "Slow deploys for some users in Frankfurt"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Dashboard logs unavailable"
Last update: We have resolved the issue and logs are now working for all customers.
We are currently investigating reports of customers unable to view logs in our dashboard. The message displayed will be "Something went wrong while loading your logs. Try searching again. Internal Server Error"
Report: "Slow builds in Frankfurt"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have made initial improvements but are continuing to work on a complete fix.
The previous mitigation did not fully address the issue, so degraded performance is still being observed. We are working on a fix to fully resolve the issue.
We will continue to monitor builds & deploys in Frankfurt for the next 16-18 hours.
A fix has been implemented and we are monitoring the results.
We are currently seeing degraded (slow) builds in the Frankfurt region.
Report: "Dashboard and Redis not responsive, builds and deploys are delayed"
Last update: This incident has been resolved.
Dashboard is mostly recovered, but Shell access is still impacted. Redis/KeyVal was also impacted, but is now recovered.
Builds and deploys are now fully operational.
We have identified the issue and are working on resolution.
We've identified an issue that caused Dashboard to be non-responsive for ~10 minutes (between 17:19 and 17:30 UTC). A fix has been put in place and we are monitoring results.
Report: "Free web services partially unavailable in Frankfurt"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Free web services are unavailable for some customers in Frankfurt. We are investigating the issue.
Report: "Render Maintenance Period"
Last update: This maintenance has been canceled. We were able to work around the need for a maintenance period.
We will be upgrading critical infrastructure on March 19th at 4:00 pm PDT (March 19th 11:00 pm UTC). For up to 30 minutes, you will be unable to view, edit, create or deploy services and databases. There will be no interruptions to deployed services and databases. If you need help, please get in touch at support@render.com or talk to us on our community forum, https://community.render.com
Report: "Builds and deploys affected in Frankfurt"
Last update: This incident has been resolved.
Between approximately 14:00 and 14:40 UTC, builds and deploys may have failed for some services located in the Frankfurt region. Services are no longer affected and engineers are investigating.
Report: "Logins requiring 2FA"
Last update: This incident has been resolved.
If 2FA is enabled, users may be unable to enter their one-time password. We are investigating.
Report: "SSH for services in all regions"
Last update: Between 19:57 and 21:08 UTC, users would have been unable to SSH into hosts. This has now been resolved.
Report: "Free Tier Services disrupted in all Regions"
Last update: All services have recovered. Resolving.
Render engineers have rolled out the fix and free tier has recovered in all regions except for Frankfurt.
Engineers are rolling out a fix now to the free tier in all regions.
We are continuing to investigate this issue.
Render engineers are fixing an issue disrupting Free Tier Services.
Report: "Degraded deploys in all regions"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We've implemented a mitigation for Builds and Deploys. We are continuing to investigate Free Tier scale-ups.
Builds and deploys are degraded for all services. Free tier services are also impacted when spinning up from idle.
We are currently investigating this issue.
Report: "Deploys failing with "Internal server error""
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Slow deploys for some Oregon services"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Increased HTTP 404 and 5xx errors"
Last update: We have rolled out our mitigation as of 4:25 PM PST. Resolving.
Engineers have identified a mitigation to prevent this from occurring in the future and we will leave this incident in a state of Monitoring until it has been fully rolled out. This is expected to be complete within a few hours.
We're investigating an increase in errors in our HTTP routing layer from 10:40 to 11:00 PST. The impact is over and we're working on a mitigation.
Report: "Dashboard logins failing"
Last update: As of 18:25 Pacific (Jan 31 02:25 GMT), this issue has been resolved.
Logins to dashboard.render.com are currently failing; attempting to log in returns you to the main login page. Engineering has already begun diagnosing the issue.
Report: "Dashboard GitHub Login Failures"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Some users are reporting an issue logging into the Render Dashboard with their GitHub login credentials.
Report: "Services relying on GitHub may fail to build"
Last update: This incident has been resolved.
Due to an ongoing outage on GitHub, services may fail to build and consequently deploy.
Report: "Missing metrics in Oregon region"
Last update: This incident has been resolved.
We've identified and resolved the issue and metrics should be appearing now as expected.
We are investigating an issue with metrics affecting some users in our Oregon region.
Report: "Deployments in Frankfurt not completing"
Last update: This is now resolved.
Deployments are now succeeding. Please manually trigger any stuck builds.
We are continuing to investigate the cause of deployments not succeeding in Frankfurt.
We are continuing to investigate the cause of deployments not succeeding in Frankfurt.
We're investigating reports of deployments in our Frankfurt region not completing and getting stuck at "Build Successful"
Report: "Partial service disruption for web services and static sites"
Last update:

# Summary

Beginning at 10:03 PST on December 3, 2024, Render's routing service was unable to reach newly deployed user services, resulting in 404 errors for end users. Some routing service instances also restarted automatically, which abruptly terminated HTTP connections and reduced capacity for all web traffic. The root cause was expiring TLS certificates on internal Render components, which created inconsistent internal state for Render's routing service. The affected certificates were refreshed and the routing service was restarted beginning at 10:24 PST and was fully recovered by 10:37 PST.

# Impact

* Impact 1. Starting at 10:03 PST, many services that deployed in this time period experienced full downtime. Clients to those services received 404 errors with the header no-server.
* Impact 2. Starting at 10:08 PST, the routing service started abruptly terminating connections, but was otherwise able to continue serving traffic normally.
* Recovery. By 10:37 PST, all routing service instances were reconnected to the metadata service and full service was restored.

# Timeline (PST)

* 10:03 - Certificates in some clusters begin to expire, resulting in Impact 1.
* 10:08 - Some routing service instances begin to restart, resulting in Impact 2.
* 10:15 - An internal Render web service becomes unavailable after a deploy and an internal incident is opened.
* 10:18 - Render engineers are paged because the routing service has stopped getting updates in some clusters.
* 10:20 - Render engineers identify the routing service is failing to connect to the metadata service.
* 10:24 - Render engineers restart the metadata service to refresh the mTLS certificate; the routing service begins to recover.
* 10:37 - Restarts are completed and routing services in all clusters are recovered.

# Root Cause

The Render HTTP routing service uses an in-memory metadata cache to route traffic to user services. It relies on the Render metadata service for updates to this cache when changes are made to user services. This incident was triggered when certificates for this metadata service expired. The certificates were previously refreshed on restarts, but as the metadata service has stabilized, we have been redeploying it less frequently. Although the system is designed to continue serving traffic when the metadata service is unavailable, it failed to account for partial connectivity failure. The expiring certificates caused a partial connectivity failure where updates for newly deployed services were only partially processed, reconciling to an inconsistent state that was unable to route traffic. In an attempt to fail fast, the routing service is designed to crash and restart to resolve any client-side connectivity issues after several minutes of stale data. These restarts did not solve the issue, and long-lived connections or in-flight requests to those instances were abruptly terminated.

# Mitigations

## Completed

* Restart all metadata services to refresh certificates.

## Planned

* Automatically refresh metadata service TLS certificates.
* Update our alert on missing metadata updates to fire sooner.
* Add an alert monitoring reachability of the metadata service.
* Increase the threshold to tolerate stale metadata before intentionally restarting the HTTP routing service.
* Update the routing service metadata cache logic to handle this mixed connectivity state correctly.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
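The root-cause section of the postmortem above describes a specific pattern: an HTTP routing layer that serves traffic from an in-memory metadata cache, keeps serving stale data when its metadata service is unreachable, and deliberately crashes and restarts once the data has been stale for too long. The following is a minimal TypeScript sketch of that general pattern, not Render's actual code; all names (RouteCache, applySnapshot, watchStaleness) and thresholds are invented for illustration.

```typescript
// Hypothetical sketch of a fail-fast routing cache; not Render's implementation.

type Routes = Map<string, string>; // hostname -> backend address

class RouteCache {
  private routes: Routes = new Map();
  private lastUpdate = Date.now();

  // Apply only complete snapshots, so the cache never reconciles to an
  // inconsistent state (a partially processed update was the failure mode
  // described in the root cause).
  applySnapshot(routes: Routes): void {
    this.routes = routes;
    this.lastUpdate = Date.now();
  }

  // Lookups keep working from cached data even while the metadata service
  // is unreachable.
  lookup(host: string): string | undefined {
    return this.routes.get(host);
  }

  // Fail fast: exit once metadata has been stale for too long and rely on a
  // supervisor to restart the process with a fresh connection. Raising the
  // threshold (one of the planned mitigations) trades staleness for fewer
  // restarts, which abruptly terminate long-lived connections.
  watchStaleness(maxStaleMs: number): void {
    setInterval(() => {
      if (Date.now() - this.lastUpdate > maxStaleMs) {
        console.error("routing metadata stale, exiting to force a reconnect");
        process.exit(1);
      }
    }, 30_000);
  }
}

// Example usage with hypothetical values.
const cache = new RouteCache();
cache.watchStaleness(10 * 60 * 1000); // tolerate up to 10 minutes of staleness
cache.applySnapshot(new Map([["example.onrender.com", "10.0.0.12:8080"]]));
console.log(cache.lookup("example.onrender.com")); // -> "10.0.0.12:8080"
```

Read this way, the planned mitigations become concrete: accepting only complete snapshots is one way to avoid the inconsistent cache state, and a higher staleness threshold reduces how often the fail-fast restart severs in-flight requests.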
Report: "Services using the latest Node 18 release fail to deploy"
Last update: In order to prevent further failures, engineering has temporarily published Node 18.20.5 using 18.20.4's resources for Render-hosted services. This will be undone as soon as 18.20.5 is normally available. This issue has been resolved.
Node v18.20.5 was released a bit over an hour ago, but its download directories do not contain the necessary data. Services using Node that specify this version, typically by specifying use of the latest release in the v18 series, will fail to deploy, or in the case of Cron Jobs, fail to execute even without a new deployment. Engineering is investigating alternatives to prevent services from failing in this manner.
Cron Job execution logs indicate the environment being set up, but the cron job's command is never executed. Engineering is actively investigating.
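As a note on the failure mode above: a service that requests the latest release in the v18 series resolves its Node version through a semver range, so it picks up a new patch release as soon as it is published, even a broken one, while an exact pin does not. The snippet below is an illustration only, using the open-source semver npm package rather than Render's build tooling, and the version list is simply the one implied by the incident text.

```typescript
// Illustration only: how a semver range vs. an exact pin resolves against the
// versions published in the Node 18 line (18.20.5 being the broken release).
import * as semver from "semver";

const published = ["18.20.3", "18.20.4", "18.20.5"];

// A range such as "18.x" resolves to the newest matching release, so builds
// started picking up 18.20.5 as soon as it was published.
console.log(semver.maxSatisfying(published, "18.x"));    // -> "18.20.5"

// An exact pin keeps resolving to a known-good version.
console.log(semver.maxSatisfying(published, "18.20.4")); // -> "18.20.4"
```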
Report: "DockerHub Image Deploys Failing"
Last update: Our mitigation is working and the errors have stopped.
We have applied another mitigation and are no longer seeing errors. We will continue monitoring error rates.
The issue has recurred. We are working on implementing a more permanent fix.
We have mitigated the issue and are monitoring failures to ensure it doesn't reoccur.
There is an issue pulling public images from DockerHub. This means that deploying a public DockerHub image that doesn't specify a registry credential may fail. We are working on a fix. In the meantime, specifying your own credentials should avoid the current disruption.
Report: "Networking outage, Render services unavailable in Ohio region"
Last update: There was a network outage during maintenance on internal networking infrastructure in the Ohio region. The outage lasted from 14:55 to 14:59 PT.
Report: "Outage for some freetier web traffic in Oregon region"
Last updateTraffic was disrupted for some freetier services in the Oregon region for approximately 8 minutes
Report: "Builds and deploys degraded in Ohio"
Last update: Between 12:27 PM and 12:52 PM PDT we saw elevated errors for builds and deploys in Ohio due to an incident with an upstream provider. The upstream incident has been resolved and we are no longer experiencing errors.
We have seen recovery from the upstream provider. We are continuing to monitor builds and deploys in Ohio.
Builds and deploys may be degraded due to an upstream provider outage. We are currently investigating.
Report: "Some databases are Unavailable in Ohio"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Increased response times for Virginia services"
Last update: There has been no further impact to response times. As previously stated, response times were elevated from approximately 12:45 to 13:20 EDT and have been stable ever since. This incident has been resolved.
Services hosted in Virginia began encountering increased response times at approximately 12:45 PM (Eastern Daylight Time). While response times seemed to return to normal around 13:20, our upstream routing provider has opened a status incident regarding routing performance in a Virginia facility, so this incident remains active as well.
Report: "Degraded Auto Deploys and Preview Deploys for some users"
Last update: Between 14:15 UTC and 17:52 UTC, some auto deploys and preview deploys were delayed.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
Auto deploys are delayed or not working for some users. We're currently investigating.
Report: "Some services using Node unable to deploy"
Last update: Services that used Node but did not specify an explicit version (e.g. a range was used) were unable to deploy due to an issue downloading Node. Issues with Node stemmed from an outage with an upstream provider. The issue has been resolved and services may deploy again.
Report: "Request logs and network metrics missing for public services in Singapore"
Last update: From 21:26 to 22:09 UTC, there was a configuration issue that prevented the system from processing a subset of logs in Singapore. This resulted in a gap in request logs and network metrics for affected public services. The underlying issue has been resolved.
Report: "Image-based services inaccessible in Dashboard"
Last update: For around an hour, image-based services in Dashboard were showing up as not existing. These services still existed and remained operational, but could not be accessed in Dashboard during this time. The issue is now resolved.
Report: "Free Tier Services disrupted in Singapore region"
Last update: Engineers were alerted and responded to an issue disrupting all services in our Free Tier in Singapore. Services were disrupted for approximately 12 minutes, from 22:50 UTC to 23:02 UTC.
Report: "Degraded network performance for 28 minutes"
Last update: Web services across all regions experienced intermittent request failures from 22:15-22:43 UTC.
Report: "Render Dashboard intermittently failing to load, some Oregon services affected for 5 minutes"
Last update: Between 1:05 and 1:10 PM PDT, the Render Dashboard was intermittently failing to load and some Oregon services were affected. Render engineers have identified & fixed the issue.
Report: "Missing requests logs/metrics"
Last update: This incident has been resolved.
We are aware of the issue and are currently investigating.
Report: "Some services unavailable in Virginia"
Last update: For 6 minutes, some services were unavailable in the Virginia region. Engineers responded and services were restored at 3:56 PM PDT.
Report: "Slow or Erroring Dashboard and APIs"
Last update: Normal Dashboard and REST API performance has resumed.
A change to the Render API system resulted in slow performance for our Dashboard and REST API. We have reverted this change and are monitoring performance, which appears to have returned to normal. Customer services hosted on Render should not have been affected by this incident unless they are also active users of our REST API.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The Render Dashboard ( https://dashboard.render.com/ ) is encountering performance issues causing slow page loads, or timeouts and failures to load pages. Engineering is actively addressing the issue. Customer services hosted on Render are not affected by this incident.
Report: "Some services not starting in Oregon"
Last update: The incident has been resolved.
We have identified the issue and have applied a fix. We are seeing services successfully starting.
We are investigating some services that are not starting in Oregon.
Report: "The Render dashboard intermittently failing to load and and some services in Oregon are affected"
Last update: This incident has been resolved.
We've implemented a fix and are continuing to monitor for elevated failure rates.
We've identified the issue and have started work on a mitigation.
We're still investigating and looking at ways to best mitigate this.
Follow-up from https://status.render.com/incidents/jw8wp2ss1566