Historical record of incidents for Uploadcare
Report: "Unavailability of uploading from 3rd Party Services"
Last update:

## Incident Summary

On November 25, 2024, from 16:01 to 19:44 UTC, the “Uploading from 3rd Party Services” feature of our platform was unavailable. This feature allows users to upload files from social networks and storage services like Dropbox, Facebook, Google Drive, and Instagram.

## Timeline

* **15:50 UTC** – A server configuration update was initiated to streamline deployments in our Kubernetes environment.
* **16:01 UTC** – The outage began as requests to the “Uploading from 3rd Party Services” feature failed to be processed.
* **19:00 UTC** – Investigation revealed that the application networking was incorrectly configured.
* **19:15 UTC** – A fix was implemented to re-configure the application networking.
* **19:44 UTC** – The incident was fully resolved, and service was restored.

## Root Cause

The outage was caused by a synchronization issue between repositories during a server configuration update. This misalignment between repositories, compounded by a lack of explicit configuration, led to the service outage.

## Impact

For approximately 3 hours and 43 minutes, users were unable to upload files from third-party services.

## Challenges During Resolution

This type of issue is difficult to catch in a staging environment because the traffic profile does not reflect production-level demand. Although the alert system worked as intended, delays in responding to the alert contributed to the resolution time.

## Resolution

The application configuration was updated and service functionality was fully restored at 19:44 UTC.

## Action Items

### Short-term

* Improve feature status monitoring and alerting to detect outages faster.

### Long-term

* Improve synchronization processes between repositories to avoid dependency misalignments.

We sincerely apologize for the disruption this caused to our users. We take this incident seriously and are committed to implementing the above action items to prevent similar occurrences in the future. Thank you for your patience and continued trust in our platform. If you have any questions or need further details, please don’t hesitate to reach out.
Issue has been resolved
Report: "Elevated URL API Errors"
Last update:

## Incident Summary

On 26 November 2024, an issue with our URL API was identified, causing a partial outage of the service from 14:31 to 15:00 UTC.

## Timeline

* **14:20 UTC** – A CDN configuration change was made to improve the Auto Format feature in the URL API.
* **14:31 UTC** – Service degradation began as increased traffic overwhelmed our origin servers due to cache key changes.
* **14:37 UTC** – Alert notifications were received by our team.
* **14:58 UTC** – The CDN configuration was rolled back.
* **15:00 UTC** – The incident was fully resolved, and service was restored.

## Root Cause

The issue was caused by a CDN configuration change that altered cache key behavior, significantly increasing traffic to our origin servers. This overwhelmed our autoscaling groups, which hit their scaling limits, resulting in a service outage.

## Impact

URL API degradation lasted for approximately 29 minutes.

## Challenges During Resolution

This type of issue is difficult to catch in a staging environment because the traffic profile does not reflect production-level demand. Although the alert system worked as intended, delays in responding to the alert contributed to the resolution time.

## Resolution

The configuration change was identified as the root cause and rolled back. The service stabilized immediately after the rollback, and full functionality was restored by 15:00 UTC.

## Action Items

### Short-term

* Review and improve escalation processes for alert notifications to reduce resolution time.
* Adjust staging environment testing to better simulate production traffic patterns for similar changes.

### Long-term

* Assess autoscaling limits to ensure sufficient capacity for handling unexpected traffic spikes.

We sincerely apologize for the disruption and are committed to learning from this incident to improve the reliability of our services. Thank you for your understanding and continued support.
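As an illustration of the cache-key mechanism described in the root cause above, the toy Python model below shows how adding one extra dimension to a CDN cache key fragments the cache and multiplies requests to the origin. It is only a sketch under an assumption: that the Auto Format change keyed responses on something like the request's Accept header; the report does not specify the actual key dimensions.

```python
# Toy model of cache-key fragmentation (illustrative only; the real CDN
# configuration and key dimensions are not described in the report).
def origin_requests(requests, key_fn):
    """Count how many requests would miss the cache and reach the origin."""
    cache, misses = set(), 0
    for path, accept in requests:
        key = key_fn(path, accept)
        if key not in cache:
            misses += 1          # cache miss: the CDN fetches from the origin
            cache.add(key)
    return misses

# 3,000 requests for the same image arriving with three different Accept headers.
requests = [("/image.jpg", accept)
            for accept in ("image/webp", "image/avif", "*/*")] * 1000

print("old key (URL only):    ", origin_requests(requests, lambda p, a: p))       # -> 1
print("new key (URL + Accept):", origin_requests(requests, lambda p, a: (p, a)))  # -> 3
```

Multiplied across every cached URL, even a small per-key factor like this can push origin traffic past what autoscaling groups are sized for, which is consistent with the scaling-limit failure described above.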
This incident has been resolved.
Systems are back to normal
We're experiencing an elevated level of URL API errors and are currently looking into the issue.
Report: "Webhook Service Degradation"
Last update:

## Incident Summary

On 25 September 2024, an issue with webhook delivery was identified, affecting clients between 10:07 and 12:19 UTC. The delay impacted webhook notifications, with no data loss but a significant delay in processing and delivery.

## Timeline

* 10:07 UTC – A system configuration change was made, which inadvertently disrupted webhook processing.
* 10:07 UTC – Webhook delivery issues began.
* 12:05 UTC – The problem was identified and resolved, with backlogged webhooks being processed.
* 12:12 UTC – The first webhook was successfully delivered after the fix.
* 12:19 UTC – All queued events were processed, with delivery confirmed for all affected users.

## Root Cause

The issue was caused by a configuration change that resulted in the webhook delivery system not processing events correctly. Despite initial signs of system health, the disruption went undetected due to gaps in the system’s monitoring tools.

## Impact

* Webhook delivery was delayed for approximately 2 hours.
* Customers experienced delays in receiving event notifications.
* No data was lost, but delivery delays were significant due to a backlog in event processing.

## Challenges During Resolution

* Monitoring systems indicated that components of the webhook system were healthy, which delayed identification of the underlying problem.

## Resolution

* Webhook processing was restarted, and we verified that all queued events were delivered without any data loss.
* The incident was fully resolved by 12:19 UTC, with all webhooks processed and delivered.

## Action Items

### Short-term

* Improve the system’s monitoring and alerting to better detect issues with webhook processing.

### Long-term

* Explore options to improve the resilience of our webhook delivery system, including scaling the infrastructure to better handle failures.
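Because the root cause here was that component-level health looked fine while end-to-end delivery had stalled, one way to read the short-term action item is to monitor delivery lag end to end. The sketch below is a hypothetical illustration, not Uploadcare's monitoring code; the queue interface (`peek_oldest`, `enqueued_at`) is assumed.

```python
import time

# Hypothetical end-to-end check: alert on the age of the oldest undelivered
# webhook event instead of relying on per-component health status.
DELIVERY_LAG_THRESHOLD_S = 120

def oldest_pending_age(queue) -> float:
    """Age in seconds of the oldest event still waiting to be delivered."""
    head = queue.peek_oldest()          # assumed queue API
    return 0.0 if head is None else time.time() - head.enqueued_at

def check_webhook_pipeline(queue, alert) -> None:
    lag = oldest_pending_age(queue)
    if lag > DELIVERY_LAG_THRESHOLD_S:
        # Fires even when every worker reports "healthy" but nothing is delivered.
        alert(f"Webhook delivery lag is {lag:.0f}s (threshold {DELIVERY_LAG_THRESHOLD_S}s)")
```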
This incident has been resolved. We apologize for any inconvenience this may have caused.
We're experiencing a slowdown in our Webhooks service.
Report: "REST API and Upload API outage"
Last update:

## REST and Upload API service degradation (incident #wvjpwt1qtpkn)

**Date**: 2024-07-08

**Authors**: Arsenii Baranov

**Status**: Complete

**Summary**: From 09:01:53 to 09:33:12 UTC we experienced higher latencies and, eventually, a complete outage of the Upload and REST APIs.

**Root Causes**: PostgreSQL performance degradation due to human error.

**Trigger**: Misconfiguration.

**Resolution**: Changed our DBA-related processes, fixed a monitoring misconfiguration.

**Detection**: Our Customer Success team detected the issue and escalated it to the Engineering team.

**Action Items**:

| Action Item | Type | Status |
| --- | --- | --- |
| Fix monitoring misconfiguration | mitigate | **DONE** |
| Improve DBA maintenance approach | prevent | **DONE** |

## Lessons Learned

**What went well**

* Due to the distributed nature of Uploadcare, this incident had no effect on most of our services. The degradation didn’t affect storage, processing, or serving of files already stored on the Uploadcare CDN.
* Our incident mitigation strategy was right and worked immediately.

**What went wrong**

* The incident was detected manually because of an alert misconfiguration.
* We failed to process API requests during the incident.

## Timeline

2024-07-08 _(all times UTC)_

* 08:00 Database maintenance started
* 09:01 **SERVICE DEGRADATION BEGINS**
* 09:23 Our Customer Success team escalates the issue to the Infrastructure team
* 09:29 Issue localised and fixed
* 09:33 **SERVICE DEGRADATION ENDS**
2024-07-08 09:01:53 UTC: The REST API and Upload API became unavailable due to a system misconfiguration. 2024-07-08 09:33:12 UTC: We've identified the source of the problem and deployed fixes. The service has recovered, and we are monitoring the situation.
Report: "Service degradation"
Last update:

## Upload API and video processing services degradation (incident #5r4zj8shr69c)

**Date**: 2023-10-02

**Authors**: Alyosha Gusev, Denis Bondar

**Status**: Complete

**Summary**: From 14:15 to 16:45 UTC we experienced higher latencies for the Upload API and video processing due to very high interest in these services.

**Root Causes**: Cascading failure caused by an exceptionally high volume of requests to the Upload API.

**Trigger**: Latent bug triggered by a sudden traffic spike.

**Resolution**: Changed our throttling policies, increased resources for processing.

**Detection**: Our Customer Success team detected the issue and escalated it to the Engineering team.

**Action Items**:

| Action Item | Type | Status |
| --- | --- | --- |
| Test corresponding alerts for correctness | mitigate | **DONE** |
| Improve our upload processing system to remove the bottleneck that we found | prevent | **DONE** |
| Fix service access issue for team members that form potential response teams | mitigate | **DONE** |

## Lessons Learned

**What went well**

* Due to the distributed nature of Uploadcare, this incident had no effect on most of our services. The degradation didn’t affect storage, processing, or serving of files already stored on the Uploadcare CDN.
* Our incident mitigation strategy was right and worked immediately.

**What went wrong**

* The incident was detected manually because of an alert misconfiguration.
* Due to hardened security standards in our organisation, not all incident responders had access to Statuspage to update our customers in a timely manner.

## Timeline

2023-10-02 _(all times UTC)_

* 14:15 Our upload processing queue starts filling
* 14:20 **SERVICE DEGRADATION BEGINS**
* 15:23 Our Customer Success team escalates the issue to the Infrastructure team
* 15:31 Issue localised
* 15:41 Incident response team is formed
* 15:51:13 Adjusted our throttling policies
* 15:51:38 Increased number of processing instances
* 16:40 **SERVICE DEGRADATION ENDS**. Processing queues clear
From 14:15 to 16:45 UTC we experienced higher latencies for from_url uploads and video processing. We’ve identified the source of the problem, eliminated it, and are monitoring the situation. These services are fully functional now.
Report: "Minor URL API degradation"
Last update:

# -/json/ and -/main_colors/ operations returning 400 status

**Date**: 2023-11-21

**Authors**: Alyosha Gusev

**Status**: Complete, action items in progress

**Summary**: From 2023-11-21 17:40 to 2023-11-22 18:40 the -/json/ and -/main_colors/ operations returned a 400 status.

**Root Causes**: HTTP rewrites misconfiguration.

**Trigger**: Deploy of new functionality that enables our customers to save more money on traffic and end users to experience lower load times (improvements to automatic image optimisation).

**Resolution**: Bugfix deploy.

**Detection**: Our automatic tests detected the issue.

**Action Items**:

| Action Item | Type | Status |
| --- | --- | --- |
| Fix rewrites | mitigate | **DONE** |
| Improve visibility of failing tests notification | prevent | **DONE** |

## Lessons Learned

**What went well**

* The problem was detected by automatic tests
* The bugfix was deployed immediately

**What went wrong**

* We didn’t notice immediately that the tests were failing

## Timeline

All times UTC

* 2023-11-21 17:40: **SERVICE DEGRADATION BEGINS**. Deploy of the new functionality
* 2023-11-22 16:18: Team notices failing tests
* 2023-11-22 16:19: Issue localised, incident response team is formed
* 2023-11-22 16:19: Team starts bugfix implementation
* 2023-11-22 17:40: Fix deployed to production
* 2023-11-22 18:40: Last cached error response expired
* 2023-11-22 18:40: **SERVICE DEGRADATION ENDS**
2023-11-21 17:40 -/json/ and -/main_colors/ operations returning 400 status
Report: "CDN"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We've observed heightened latency and error rates for uncached requests to our CDN and image processing API from 6:00 to 6:45 UTC. This was primarily due to a surge in server load and a malfunction in our capacity scaling mechanism. We are still experiencing heightened load, while the capacity scaling mechanism has been fixed.
Report: "Service degradation"
Last update: We experienced a service degradation from 05:08 to 05:10 UTC. The source of the problem has been identified, and we are working on a fix to prevent it in the future.
Report: "Higher latencies of from_url uploads"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We’re experiencing higher latencies of from_url uploads. Direct uploads are fully functional. We’ve identified the source of the problem and are working on the fix.
Report: "Upload degradation from cloud services"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We’re experiencing service degradation of uploads from cloud services (e.g. Facebook, Google Drive). We're currently investigating the issue. Direct uploads are fully functional.
Report: "Higher latencies of from_url uploads"
Last update: From 12:29 to 13:31 UTC we experienced higher latencies for from_url uploads. Direct uploads were fully functional. We’ve identified the source of the problem, eliminated it, and are monitoring the situation. The service is fully functional now.
Report: "Elevated CDN 5xx Errors"
Last update: Between 14:30 and 14:51 UTC, customers may have experienced elevated levels of errors.
Report: "Webhooks operational issue"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are experiencing elevated error rates for the Webhooks service. We have identified the root cause and are actively working towards recovery.
Report: "Issue with DNS resolving for *.ucr.io domains"
Last update: This incident has been resolved.
The problems with DNS resolution have been resolved. We are still monitoring.
We are continuing to investigate this issue.
We are currently investigating the issue with DNS resolving.
Report: "DNS issues"
Last update: No further issues with DNS resolution were detected.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are experiencing issues with DNS resolution for *.uploadcare.com.
Report: "Elevated CDN 5xx Errors"
Last update: Between 16:38 and 16:43 UTC, customers may have experienced elevated levels of errors.
Report: "Webhook issues for SVG files"
Last update: This incident has been resolved. All webhooks have been delivered as expected.
A fix has been implemented and we are monitoring the results.
We're working on the fix.
We're experiencing issues with webhook delivery for SVG uploads. The main Upload API remains fully functional.
Report: "Partial degradation of from_url uploads"
Last update: From 15:44 to 15:55 UTC, a bad configuration deployment resulted in partial degradation of our uploads service. The service is fully functional now.
Starting at 15:44 UTC we experienced partial degradation of our uploads. We’ve identified the source of the problem, implemented fixes, and are monitoring the situation.
We're experiencing issues with from_url uploads.
Report: "Partial degradation on uploads, api"
Last update: This incident has been resolved.
We experienced issues with our uploads and API systems from 7:51 to 7:56 UTC. We have identified the source of the problem, implemented the fix, and are monitoring the situation.
Report: "Partial degradation of image processing and file delivery"
Last update: From 8:20 to 9:20 UTC we experienced a partial outage of our image processing and file delivery due to degradation of our storage subsystem. The issue is resolved. All systems are working properly now.
The issue is resolved. We are monitoring the situation.
We have identified the issue.
We're experiencing issues with image processing and file delivery.
Report: "Video Processing System Maintenance"
Last update: This incident has been resolved.
Video processing vendor is performing database maintenance between 6 am and 11 am UTC. They're trying to minimize the impact on their service.
Report: "Website partial outage."
Last update: This incident has been resolved.
Still waiting for Netlify to fix things on their end. Core Uploadcare services are not impacted. Uploads, REST API, CDN, document/video conversion engines are working.
Due to vendor issues, the website is partially down.
Report: "Higher latency and increased error rate on CDN and upload API"
Last update: This incident has been resolved.
From 07:25 to 07:45 UTC we experienced higher latency and an increased error rate on AWS S3, resulting in partial degradation of our content delivery service and Upload API. We are monitoring the situation.
Report: "Expired certificate on ucarecdn.com"
Last update:

# What happened

On 19 July 2019 at 15:30 UTC the certificate used to serve traffic from [ucarecdn.com](http://ucarecdn.com) and [www.ucarecdn.com](http://www.ucarecdn.com) expired. All HTTPS traffic to these domains effectively stopped. We were able to quickly (within minutes) redirect [www.ucarecdn.com](http://www.ucarecdn.com) traffic to an alternative CDN provider with a proper certificate installed. It took us approximately 105 minutes from the start of the incident to fix the issue with [ucarecdn.com](http://ucarecdn.com).

# Why that happened

1. Our CDN partner's certificate management system failed to renew the certificate in question and needed input from our team.
2. It did send an email 5 days prior to expiration requesting manual input.
3. This email wasn't noticed because it was sent to a team member who was off the grid on vacation.

# What we should do to improve

1. Change notification system settings so that any certificate issues with our CDN partners are sent to a team, not a person [done]
2. Use a 3rd party service to monitor certificate expiration and other settings [done]
3. Use our CDN partners' APIs to actively and automatically monitor certificates [in progress]
4. Create a service that could be used to quickly redirect [ucarecdn.com](http://ucarecdn.com) traffic to backup CDNs in case of any issues with the main one (not only due to certificates)
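As a hedged illustration of improvement items 2 and 3 (watching certificate expiration ourselves rather than relying on the CDN partner's emails), the snippet below checks how many days remain on the certificates the affected domains are serving. It is a minimal sketch, not the monitoring we actually deployed; the 14-day threshold is an arbitrary example.

```python
import socket
import ssl
from datetime import datetime, timezone

def days_until_expiry(hostname: str, port: int = 443) -> float:
    """Return how many days remain before the TLS certificate for `hostname` expires."""
    ctx = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    # 'notAfter' looks like 'Jul 19 15:30:00 2019 GMT'
    expires = datetime.strptime(cert["notAfter"], "%b %d %H:%M:%S %Y %Z")
    expires = expires.replace(tzinfo=timezone.utc)
    return (expires - datetime.now(timezone.utc)).total_seconds() / 86400

for host in ("ucarecdn.com", "www.ucarecdn.com"):
    remaining = days_until_expiry(host)
    status = "ALERT" if remaining < 14 else "OK"
    print(f"{status}: certificate for {host} expires in {remaining:.1f} days")
```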
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented.
Report: "REST API partial outage"
Last update: This incident has been resolved.
Starting at 09:20 UTC we experienced a partial outage of our REST API. We've identified the source of the problem, deployed fixes, and are monitoring the situation.
Report: "Upload service degradation."
Last update:

On December 12, we had a degradation of our Upload API. Most users were unable to upload files for 3 hours 11 minutes, between Dec 12, 22:39 GMT and Dec 13, 01:50 GMT.

## What happened

Requests to the Upload API were either handled extremely slowly or (most of them) rejected by our web workers. Scaling up our Upload API fleet didn't help.

## What really happened

Further investigation revealed that:

* Slow requests were consuming all available web workers, and without available workers requests were rejected by nginx.
* Handled requests were slow due to constant database locks on one database table.
* The database locks were caused by a dramatic change in tracked usage statistics (a change of project settings by one of our largest customers).

We spent most of the time during the incident investigating and figuring out what was happening. The actual DB load was average, and the DB was wrongly dismissed as the source of issues at first. Once we found the root cause, the fix was trivial and took minutes to implement and deploy.

## What we have done

We turned off usage tracking for that particular customer.

## What we will do

* Refactor statistics tracking so it does not affect our core service.
* Add more specific monitors to our DB so we can identify problems of a similar nature much faster.
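The first follow-up item (refactoring statistics tracking so it cannot affect the core service) can be illustrated with a common pattern: keep the per-request counter update out of the request path and flush aggregated increments on an interval, so the hot statistics row is updated once per flush instead of once per upload. This is only a sketch under assumed names, not a description of how we implemented it.

```python
import threading
import time
from collections import defaultdict

class UsageAggregator:
    """Aggregate usage increments in memory and flush them periodically,
    so the request path never takes a lock on the statistics table."""

    def __init__(self, flush, interval: float = 5.0):
        self._pending = defaultdict(int)   # project_id -> bytes uploaded
        self._lock = threading.Lock()
        self._flush = flush                # e.g. one UPDATE per project per interval
        threading.Thread(target=self._loop, args=(interval,), daemon=True).start()

    def record(self, project_id: str, nbytes: int) -> None:
        # Called from the upload request path: O(1), no database round trip.
        with self._lock:
            self._pending[project_id] += nbytes

    def _loop(self, interval: float) -> None:
        while True:
            time.sleep(interval)
            with self._lock:
                batch, self._pending = self._pending, defaultdict(int)
            for project_id, nbytes in batch.items():
                self._flush(project_id, nbytes)

# Usage sketch: aggregator = UsageAggregator(flush=lambda project, n: print(project, n))
```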
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
We're experiencing issues with our Upload API.
Report: "Partial Upload and CDN outage"
Last update:

We've experienced a partial outage due to issues at AWS, our network infrastructure provider. From approximately 20:07 until 20:25 UTC there were increased error rates for the Upload API and on CDN origins.

Due to the nature of the issue, our error reporting systems didn't notice the Upload API problems:

* all reported metrics were OK
* there were no anomalies like a drop in upload success rates or peaks in upload error rates

We were only aware of the CDN origin problems until some of our customers reported issues with uploading. The underlying AWS issue is quoted below:

> Instance launch errors and connectivity issues
>
> 01:49 PM PDT Between 1:06 PM and 1:33 PM PDT we experienced increased error rates for new instance launches and intermittent network connectivity issues for some running instances in a single Availability Zone in the US-EAST-1 Region. The issue has been resolved and the service is operating normally
We've experienced a partial outage. From approximately 20:07 until 20:25 UTC there were increased error rates for the Upload API and on CDN origins.
Report: "Upload API downtime"
Last update: This incident has been resolved.
There was 8 minutes of downtime starting at 18:21 UTC. What happened: we're trying to mitigate Roscomnadzor's carpet bombing of our Russian clients and end users. As one of the measures, we changed our load balancing settings but missed one critical config setting, which caused the downtime during our regular code deployment. We've fixed the setting and are monitoring the situation.
Report: "Increased CDN error rates"
Last update: This incident has been resolved.
The issues were caused by network connectivity problems in the AWS east-1 region.
Our CDN origin fleet is experiencing increased error rate.
Report: "Increased REST API error rates"
Last update:

We've encountered increased error rates on our REST API endpoints. This resulted in reduced reported uptime. In fact, even though uptime suffered, it wasn't as bad as reported.

What happened:

- from February 25 22:00 UTC to February 26 05:50 UTC, error rates on REST API endpoints were increased

Why that happened:

- one of the machines in the REST API fleet ran out of memory
- due to OOM, the machine was unable to handle any incoming requests
- a misconfigured health check prevented the load balancer from removing the failing machine
- a portion of all requests, including the Pingdom requests that report our uptime, was sent to that failing machine

What we've done:

- tracked down and terminated the failing machine
- fixed the health check configuration
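One lesson from the misconfigured health check is that the check should exercise the application, so an OOM'd machine fails it and the load balancer takes the instance out of rotation. Below is a stdlib-only Python sketch of such an endpoint; the /health path and port are assumptions, not our actual configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def deep_check() -> bool:
    """Exercise the process a little instead of returning a static OK."""
    try:
        bytearray(1024 * 1024)   # a small allocation fails fast on an OOM'd machine
        # A real check would also run a trivial database query with a short timeout.
        return True
    except MemoryError:
        return False

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        healthy = self.path == "/health" and deep_check()
        self.send_response(200 if healthy else 503)   # 503 lets the LB drop the instance
        self.send_header("Content-Length", "0")
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Health).serve_forever()
```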
Report: "Increased error rate for Upload API"
Last update: This incident has been resolved.
Our diagnostic tools show that all subsystems are back to normal. We continue monitoring the situation.
Error rates are decreasing.
It appears that the increased error rate is connected with an increased error rate for S3.
We are currently investigating this issue.
Report: "Upload API was down from 03:04 to 05:16"
Last update: Due to an error in infrastructure configuration, the queue responsible for uploading files was not working properly.
Report: "File copying is now working"
Last update: We have found and fixed the issue with backing up stored files to S3. All files that were not backed up during the incident will be backed up eventually. Automatic copying of uploaded files to S3 buckets (Custom storage) was not affected by the issue.
A fix has been implemented and we are monitoring the results.
Automatic file copying is not working at the moment; we're investigating the cause.
Report: "Uploadcare services have degraded performance"
Last update: Currently, the Uploadcare service is fully working. We continue monitoring the situation closely.
Upload API is up again. We continue monitoring all components.
Upload API went down.
AWS Autoscaling is still down, we're deploying new instances manually.
AWS has reported that everything is green. We're working on recovering affected Uploadcare components. Currently, everything should work but higher latency and error rate are expected.
AWS has fixed working with existing S3 objects, so CDN and subset of REST API are working at the moment. More details on AWS status: https://status.aws.amazon.com/
AWS S3 is experiencing increased error rates that result in all of our buckets going AWOL.
Our CDN is down. Investigating.
Report: "Increased error rate on CDN"
Last update: The CDN provider switch resolved the issue and went smoothly.
CDN provider switch is complete. We continue monitoring the situation.
We're experiencing increased error rates on our CDN. Switching to another CDN provider.
Report: "REST API Increased latency."
Last update: The issue is resolved. REST API latency is back to normal.
Since 09:30 UTC we're experiencing an increased latency for our REST API endpoints. We've found the reason and are working on resolving the issue.
Report: "Upload API is down."
Last update: This incident has been resolved.
Upload API is not working. The problem is identified and the fix is being deployed.
Report: "Partial CDN service degradation for Russian users."
Last update: This incident has been resolved.
We've experienced CDN service degradation for some of our end users that get content from edge servers in Moscow, Russia. We've identified and resolved the issue. Currently, we're monitoring the situation. The incident took place between 19:00 and 22:00 UTC.
Report: "Increased error rate on REST API"
Last update: This incident has been resolved.
We've fixed the issue and continue monitoring the REST API servers.
We are currently investigating this issue.
Report: "Direct uploads to S3 buckets"
Last update: The fix is deployed to our upload servers.
Currently, direct uploads to clients' buckets located in regions other than 'standard' don't work. We've identified the cause of the issue and are working on a fix.
Report: "Increased error rate during uploads."
Last update: We've found and fixed the issue that was causing upload latency.
We're seeing increased error rate and latency during file upload. Investigating.
Report: "Increased error rate for group files on CDN"
Last update: This incident has been resolved.
Due to AWS service degradation, we're experiencing an increased error rate for "file in group" requests on the CDN.