Historical record of incidents for imgix
Report: "Elevated rendering errors"
Last update: We are investigating elevated render error rates for the service. Previously cached derivatives are not impacted.
Report: "Elevated rendering errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating elevated render error rates for uncached derivative images and the Management API in the NA region. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.
Report: "Elevated rendering errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating elevated render error rates for uncached derivative images and the Management API in the NA region. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.
Report: "Issues with logging in"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Some customers are experiencing issues with logging into the dashboard. We have identified the issue and are working on a fix.
Report: "Missing Images in Asset Manager"
Last update:

# What happened

Between **March 17th, 5:48 AM UTC** and **March 18th, 6:56 PM UTC**, the Asset Manager experienced an indexing outage. During this period, newly uploaded assets and a very small percentage of pre-existing assets were not appearing in the Asset Manager UI. These assets were still successfully uploaded to the Origin and were accessible via the Rendering API.

# How it happened

An unannounced maintenance from our upstream provider caused a node failure in our Asset Manager infrastructure, resulting in the temporary loss of indexed asset data.

# What went wrong

Several things went wrong during this incident:

* Our service provider failed to notify us of the maintenance window
* Our Asset Manager infrastructure was not adequately provisioned to tolerate failures during the maintenance
* A large amount of data had to be restored, resulting in a severely prolonged restoration process

# What we are doing to address this incident

* We have migrated and optimized infrastructure configurations to better tolerate node failures
* We created additional backups and introduced additional indexing layers to improve redundancy and resilience
* We are evaluating alternative upstream providers to reduce dependency risks
The root cause of the incident has been resolved. Some assets may still return an error. Please get in touch with support@imgix.com if you are experiencing issues.
We have identified an issue affecting the imgix Asset Manager. Users may experience difficulties searching for previously indexed assets and delays in indexing new assets. A fix is currently in progress. The rendering service remains unaffected.
Report: "Delay in Purge requests"
Last update: The purge queue and purge times have been stable since the last update.
We are continuing to monitor for any further issues.
A fix has been implemented and new purge requests should again complete within expected times. We are continuing to monitor the situation.
Purge requests for images that use any of the AI parameters below have been restored, but we are still observing longer purge times and a high queue.
- bg-replace
- compositions (any two or more AI features composed together, like bg-remove=true&upscale=true)
We are looking at different approaches and will update accordingly.
The issue has been identified and we are working on a fix. To speed up processing of queued purge requests, we will have to temporarily disable purge requests for images that use any of the following AI parameters:
- bg-replace
- compositions (any two or more AI features composed together, like bg-remove=true&upscale=true)
We have identified an issue with our purging service, causing new purge requests to be delayed in execution (10+ minutes). Rendering services remain unaffected.
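For context on the parameters named above: a "composition" is simply a render URL that combines two or more AI parameters on the same image. The Python sketch below builds such URLs; the source domain, asset path, and parameter values are placeholders for illustration, not taken from any real account.

```python
from urllib.parse import urlencode

# Hypothetical imgix source domain and asset path, for illustration only.
BASE = "https://assets.example.imgix.net"

def render_url(path: str, params: dict[str, str]) -> str:
    """Build an imgix render URL from an asset path and rendering parameters."""
    query = urlencode(params)
    return f"{BASE}{path}?{query}" if query else f"{BASE}{path}"

# A single AI parameter on its own:
print(render_url("/products/chair.jpg", {"bg-remove": "true"}))

# A "composition" in the sense used above: two or more AI features on one render,
# e.g. bg-remove=true&upscale=true. These were the requests affected by the delay.
print(render_url("/products/chair.jpg", {"bg-remove": "true", "upscale": "true"}))
```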
Report: "Quality degradation for a small portion of rendered images in the EU"
Last update:

# Timeframe

December 9th, 11:08 AM UTC - December 10th, 6:38 AM UTC

# Impact

A small percentage of image requests served by a single machine in the EU region experienced quality degradation.

# Root Cause

A hardware failure on one of the rendering machines resulted in images with artifacts being delivered to users.

# Contributing Factors

The absence of automated testing for faulty renders across active machines allowed affected assets to be cached undetected. The lapse between when the issue began and when it was detected meant that an unknown number of renders had to be purged from the cache. Along with inadequate cache tooling for purging affected images, these factors caused the issue to take longer to fully resolve than expected.

# Corrective Actions

To prevent this issue from happening in the future, we will:

* Investigate and implement automated image quality testing for active rendering machines
* Replace and refresh machines on a more frequent basis to minimize the impact of single-machine failures
* Upgrade cache tooling for targeted issue resolution
A small percentage of image requests served by a single machine in the EU region experienced quality degradation between December 9th, 11:08 AM UTC and December 10th, 6:38 AM UTC.
Report: "Elevated SSL/TLS expiration errors for custom domains"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We are investigating an issue with SSL certificate renewals for a small subset of our custom domains. Reports indicate that SSL certificates have expired and images might not be accessible over HTTPS when using custom domains.
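Customers who want to confirm whether a custom domain is affected can check the certificate that the domain currently serves. Below is a minimal, generic sketch using Python's standard ssl module (not an imgix tool); the hostname is a placeholder.

```python
import socket
import ssl
from datetime import datetime, timezone

def check_certificate(hostname: str, port: int = 443) -> None:
    """Report whether the TLS certificate served for hostname is valid or expired."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=hostname) as tls:
                cert = tls.getpeercert()
        # notAfter looks like 'Jun  1 12:00:00 2025 GMT'
        expires = datetime.strptime(cert["notAfter"], "%b %d %H:%M:%S %Y %Z")
        expires = expires.replace(tzinfo=timezone.utc)
        days_left = (expires - datetime.now(timezone.utc)).days
        print(f"{hostname}: certificate valid, expires {expires:%Y-%m-%d} ({days_left} days left)")
    except ssl.SSLCertVerificationError as exc:
        # An expired certificate fails verification during the TLS handshake.
        print(f"{hostname}: certificate verification failed: {exc.verify_message}")

# Placeholder hostname; substitute your own imgix custom domain.
check_certificate("images.example.com")
```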
Report: "Asset Manager delay"
Last update: This incident has been resolved.
We have identified an issue with newly uploaded images not appearing in the Asset Manager. Note that the upload will succeed and the asset can be served, but it may not appear in the Asset Manager. We are applying a fix. The rendering service is not affected.
Report: "Inconsistent results in Purge requests"
Last update: This incident has been resolved.
Our engineering team is currently investigating an issue with our Purge API, where requests to purge assets may not consistently result in the latest asset version being displayed. The rendering service remains unaffected, and we are working to identify the root cause to resolve this as quickly as possible.
Report: "Issue with receiving support requests"
Last update: We have fixed the support@imgix.com email address and are processing our backlog of support requests. If you sent an email and have not yet received a reply, please reach back out to us through our contact form: https://dashboard.imgix.com/contact
We have identified an issue with receiving support requests to support@imgix.com and are working on a fix. In the meantime, please use our contact form to reach support: https://dashboard.imgix.com/contact
Report: "Delayed analytics data"
Last update: This incident has been resolved.
We are investigating a delay in the availability of our analytics data. API services are not impacted.
Report: "Unable to connect to dashboard"
Last update: This incident is resolved.
A fix has been implemented, restoring access to the dashboard. We are monitoring the results.
We have identified the issue and are working on a fix for logging into the dashboard. The imgix APIs (Rendering, Video and the Management API) are all working normally.
We are investigating issues with connecting to the imgix dashboard (https://dashboard.imgix.com). The imgix APIs are not affected by this incident.
Report: "Contact form downtime"
Last update: This incident has been resolved.
We identified an issue with the support contact form not working. Please email support@imgix.com for any support request while we look into fixing the contact form.
Report: "Elevated rendering errors"
Last update:

## What happened?

On September 12, 2024, imgix deployed a change to optimize image delivery for Safari browsers. This update inadvertently impacted a small number of browsers by delivering image formats they could not display. imgix identified the issue on September 14 and rolled back the changes to prevent further impact.

## How were customers impacted?

Between September 12 and 14, 2024, customers using certain older browsers may have experienced issues displaying images. These users may have encountered image errors, while modern browsers, including Safari, continued to display images as expected.

## What went wrong during the incident?

An update meant to optimize image delivery for Safari browsers was applied too broadly. As a result, some browsers not fully compatible with the AVIF image format were served these files, causing display issues and affecting approximately 0.02% of renders.

## What will imgix do to prevent this in the future?

To prevent similar issues, imgix will implement the following measures:

* **More Precise Targeting**: Future updates will be more carefully targeted to specific browsers, ensuring only those that can handle advanced image formats like AVIF receive them.
* **Enhanced Compatibility Checks**: We are improving browser compatibility checks to ensure the correct image formats are delivered based on the browser's capabilities.
* **Continued Monitoring**: We will closely monitor image delivery performance to detect any compatibility issues early and ensure all users have a seamless experience.

These actions will help us deliver a more reliable and optimized experience for all of our customers.
We have resolved the issue where some browsers received unsupported image formats, affecting a small number of renders. The changes have been rolled back, and services are fully restored.
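The "enhanced compatibility checks" described above amount to content negotiation: only serve AVIF to clients that advertise support for it in the Accept request header. The sketch below illustrates the general idea and is not imgix's internal implementation.

```python
def pick_format(accept_header: str) -> str:
    """Choose an output format based on what the client says it can decode.

    Mirrors the compatibility-check idea above: only serve AVIF when the browser
    explicitly advertises support for it in the Accept header.
    """
    accepted = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    if "image/avif" in accepted:
        return "avif"
    if "image/webp" in accepted:
        return "webp"
    return "jpeg"  # safe fallback every browser can display

# Modern browsers advertise AVIF support in their Accept header; older ones do not.
print(pick_format("image/avif,image/webp,image/apng,image/*,*/*;q=0.8"))  # -> avif
print(pick_format("image/webp,image/*,*/*;q=0.8"))                        # -> webp
print(pick_format("*/*"))                                                  # -> jpeg
```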
Report: "Issues with imgix Web Tools"
Last update: Dashboard and documentation tools were affected by an upstream service provider outage. Rendering services were not impacted.
Report: "Intermittent Rendering Errors Related to EWR Node"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently seeing an issue with an upstream service provider. Images served from the Newark (EWR) node may intermittently return 502s. All other POP nodes are working as expected.
Report: "Elevated rendering errors"
Last update:

## What happened?

On June 17, 2024, at 00:00 UTC, imgix experienced an extreme spike in requests to our render stack. This unexpected surge caused a failure in our auto-scaling infrastructure, leading to an inability to manage all incoming traffic effectively. A fix was implemented at 00:38 UTC, and the issue was resolved by 01:06 UTC.

## How were customers impacted?

Between 00:00 and 01:06 UTC, customers may have experienced failures when requesting new renders. However, previously cached assets served successfully during this time.

## What went wrong during the incident?

The incident was triggered by a significant increase in requests, which our automated systems did not properly handle. Although the system started to auto-scale as expected, the unexpected surge caused issues with the health checks used for auto-scaling. The combination of extra traffic and health check failure led to an inability to render new images that required manual intervention to resolve.

## What will imgix do to prevent this in the future?

To avoid similar incidents in the future, imgix is taking the following actions:

1. **Health Check Enhancement:** We have investigated and implemented updated health checks to support increased traffic volumes.
2. **Rate Limiting:** Further rate limits will be applied to manage traffic spikes and minimize their impact.
3. **Traffic Routing:** Traffic will be rerouted as necessary to distribute the load and reduce the risk of system overloads.
4. **Automated Alerts Improvement:** We will enhance our automated alert systems to respond more effectively to traffic surges and potential issues, including health check failures.

By addressing these areas, we aim to further improve our system's resilience and ensure a smoother customer experience during periods of high demand.
This incident has been resolved.
A fix has been implemented and error rates are returning to normal. We are continuing to monitor the service.
We are continuing to work on a fix for this issue.
The issue has been identified and our engineering team is developing a fix.
We are currently investigating elevated render error rates for uncached derivative images. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.
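Of the corrective actions listed in the postmortem above, rate limiting is the easiest to illustrate in isolation. The token-bucket sketch below is a generic example of the technique, not imgix's actual traffic management, and the rates are arbitrary example numbers.

```python
import time

class TokenBucket:
    """A minimal token-bucket limiter of the kind used to absorb traffic spikes:
    short bursts are admitted up to the bucket size, sustained overload is rejected."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate          # tokens added per second (steady-state request rate)
        self.capacity = burst     # maximum burst size
        self.tokens = burst
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Example: allow 100 requests/second steady state with bursts of up to 200.
limiter = TokenBucket(rate=100, burst=200)
accepted = sum(limiter.allow() for _ in range(500))
print(f"{accepted} of 500 back-to-back requests admitted")  # roughly the burst size
```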
Report: "Elevated rendering errors"
Last update:

# Postmortem

# What happened?

On May 23, 2024, at 19:23 UTC, an increased load on the rendering infrastructure was detected. Actions were taken to scale out our system to handle the additional traffic. This incident was resolved at 19:36 UTC.

# How were customers impacted?

During the incident, customers experienced increased error rates for recent renders, intermittent errors increased in our system, and response times for requests increased.

# What went wrong during the incident?

During the incident, our team implemented a service change that led to assets being dropped. This led to an increase in requests to our system. The increased requests to our system led to `429` and `5XX` errors.

# What will imgix do to prevent this in the future?

To prevent similar incidents, we will:

* Improve procedures for pre-scaling instances during critical updates.
* Conduct impact assessments before issuing significant changes.
* Enhance monitoring and alerting systems to predict and manage load increases better.
This incident has been resolved.
On May 23rd, 19:19 UTC, we identified an issue affecting our rendering services due to a caching problem. This caused elevated rendering times and intermittent failures for some users. Our engineering team quickly diagnosed the issue and implemented a fix at 19:36 UTC. We are monitoring the system closely to ensure stability and confirm that the issue has been fully resolved. We appreciate your patience and understanding during this time.
Report: "Assets not displaying in Asset Manager"
Last update: This incident has been resolved.
We have identified an issue that is causing assets to not display in the Asset Manager within the imgix dashboard. A fix has been deployed which is gradually restoring the Asset Manager. We expect all assets to be visible again in the Asset Manager within a few hours. The imgix Rendering API is not affected.
Report: "Source Creation Error for Microsoft Azure Sources"
Last update: This incident has been resolved.
We are investigating issues with creating Microsoft Azure Sources. All images from existing Azure sources are unaffected.
Report: "Increase in Origin fetch requests"
Last update: We have identified an issue that was preventing some newly fetched Origin images from being inserted into our Origin image cache, causing the same images to be re-fetched multiple times for new render requests. The impact time (UTC) was 3/21 20:30 - 3/29 0:50. The issue is resolved.
Report: "Issues with imgix web tools"
Last update:

# Postmortem

# What happened?

On Feb 20, 2024, at 19:00 UTC, uploads using the `/api/v1/sources/upload/` API endpoint began to experience slowness and some timeouts. The issue had been completely resolved by Feb 21, 2024, 20:00 UTC.

# How were customers impacted?

Between Feb 20, 2024, 19:00 UTC, and Feb 21, 2024, 20:00 UTC, some requests to our upload API experienced slow responses, and a subset of requests resulted in timeout errors. Uploads using the `/api/v1/sources/<source_id>/upload-sessions/` endpoints were unaffected; uploads using the Asset Manager UI were unaffected.

# What went wrong during the incident?

Two compounding issues caused the slowdown. The first was a service update that created an issue with our `/api/v1/sources/upload/` API endpoint. The second was a simultaneous and separate slowdown affecting the cloud servers responsible for executing the upload actions. The service update issue increased the load on our upload function, which was already under strain due to the cloud servers experiencing a slowdown. These factors combined to cause the slow and timed-out responses.

# What will imgix do to prevent this in the future?

We have streamlined the upload process so that this cannot happen again.
We are investigating issues with imgix administration tools. There are reports of problems with calling our imgix upload API. The rendering service is not affected.
Report: "Encoding Error for Web folder"
Last update:

# Postmortem

# What happened?

On January 22, 2024, at 07:38 UTC, some Web Folder requests to the imgix service began to return a `403` response. By 11:56 UTC, the issue had been completely resolved.

# How were customers impacted?

Between 07:38 UTC and 11:36 UTC, several requests to web folder sources began to return a `403` error. This affected a small number of assets (<0.1%).

# What went wrong during the incident?

A service update caused an issue with web folders whose directories use double slashes (`//`) in the Origin URL. This update caused an encoding error, leading to `403` responses for Origins matching the double-slash pattern. A fix was pushed to resolve this URL pattern, allowing us to fetch images from affected Origins.

# What will imgix do to prevent this in the future?

We will update our tests to catch encoding issues in the future.
We have identified the issue and are currently applying a fix. The rendering service is not affected.
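As an illustration of the kind of encoding test mentioned above (hypothetical, not imgix's actual test suite), a regression test can encode origin paths segment by segment so that an empty segment produced by a double slash survives unchanged:

```python
import unittest
from urllib.parse import quote

def encode_origin_path(path: str) -> str:
    """Percent-encode an origin path segment by segment so that the slash
    structure (including empty segments from '//') survives unchanged."""
    return "/".join(quote(segment, safe="") for segment in path.split("/"))

class DoubleSlashTest(unittest.TestCase):
    def test_double_slash_is_preserved(self):
        # A web folder origin with an empty directory segment ('//') must not be
        # collapsed or re-encoded into something the origin rejects with a 403.
        path = "/assets//2024/photo 1.jpg"
        self.assertEqual(encode_origin_path(path), "/assets//2024/photo%201.jpg")

if __name__ == "__main__":
    unittest.main()
```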
Report: "Delay in purging"
Last update: This incident has been resolved.
Our engineering team is investigating a delay in purge requests. The rendering service is not impacted.
Report: "Issues with imgix Asset Manager"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
We are monitoring issues with the Management API, specifically with PATCH requests to an asset. The rendering service is not affected.
We are continuing to work on a fix for this issue.
We have identified the issue and are currently applying a fix. The rendering service is not affected.
Report: "Intermittent rendering errors"
Last update:

# What happened?

On October 23, 2023, between 21:43 UTC and 23:14 UTC, imgix experienced a partial outage affecting images served from the Rendering API. During this time, a small percentage (<0.45% on average) of non-cached requests returned a server error. A fix was implemented at 23:02 UTC, which allowed the service to fully recover by 23:14 UTC.

# How were customers impacted?

Between 21:46 UTC and 23:14 UTC, requests to the Rendering API returned a server error, with 0.65% of all requests to our CDN returning an error at the height of the incident. Additionally, Sources returned an unknown status between 21:06 UTC and 21:09 UTC. During this period, customers reported being unable to create Sources.

# What went wrong during the incident?

Our Rendering API experienced an unexpected interaction that caused a dramatic increase in server load. This caused error rates to increase as the network slowly became overloaded. The errors fluctuated between 0.07% and 0.65% until we resolved the issue. To restore the service, our engineers re-configured our network traffic to handle the unexpected Rendering behavior. During the incident, a separate issue (unrelated to rendering) impacted our Source data. This led to a delay in investigating the cause of the rendering errors.

# What will imgix do to prevent this in the future?

We have taken the following steps to prevent this issue from recurring:

* Fixed the misconfigured server interaction
* We will put an alert system in place to notify us when traffic congestion happens from a misconfigured source interaction.

We are in the process of implementing the following:

* Conducting a review of our current tooling to increase our traffic and network configuration capabilities.
* Reviewing our current configuration to limit the affected services should a similar incident happen.
This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating an issue affecting a small percentage of renders.
Report: "Issues with Management API"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating issues with PATCH requests to the Management API. The rendering service is not affected.
Report: "Issue with deploying new/updated Source configurations"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating issues with deploying new/updated Source configurations. The rendering service is not affected by this incident.
Report: "Elevated rendering errors"
Last update: This incident has been resolved.
The issue has been identified and our engineering team is developing a fix.
Report: "Issues with Management API"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are investigating issues with requests to the Management API. The rendering service is not affected.
Report: "Investigating Dashboard Availability"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating reports of Dashboard unavailability. The rendering service is not affected by this incident.
Report: "Intermittent rendering errors"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
A fix has been implemented and we are monitoring the results.
We are currently investigating reports of intermittent rendering errors affecting a small percentage of images.
Report: "Intermittent 5xx errors"
Last update:

# What happened

On May 1st, 2023, between the hours of 08:23 UTC and 15:08 UTC, imgix experienced intermittent errors affecting a small percentage of non-cached renders.

# How were customers impacted?

During the affected period, a small percentage of requests to the Rendering API returned a `502` or `503` error for non-cached requests. Errors slowly and gradually increased, with <0.5% of requests returning an error at the height of the incident.

# What went wrong during the incident?

Our upstream provider experienced communication issues between CDN POPs, causing intermittent `502`/`503` responses in a small percentage of requests to our Rendering API. The increase in errors was so minor that it did not meet our monitoring thresholds for triggering alerts. One of our engineers observed a slow increase in errors and alerted other team members to a potential issue with our service. After tracing the issue to our upstream provider, we pushed a patch to mitigate intermittent connectivity issues, resolving the incident.

# What will imgix do to prevent this in the future?

We have refined our alerting to better catch slowly increasing error rates. We have also ensured that the root cause of this incident has been fixed by our upstream provider. We are also updating our traffic routing in case the upstream issue occurs again.
This incident has been resolved.
A fix has been implemented, and we are monitoring the results.
The issue has been identified, and a fix is being implemented.
We are currently investigating reports of intermittent 5xx errors causing some images to initially return a 5xx error.
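One way to "better catch slowly increasing error rates," as described in the postmortem above, is to alert on the trend of the error rate rather than only on a fixed threshold. The sketch below is a hypothetical illustration of that idea, not imgix's monitoring stack; the window size and sensitivity are made-up values.

```python
from collections import deque

class TrendAlert:
    """Flag a sustained upward drift in error rate, even when every individual
    sample stays below a static alert threshold (the gap described above)."""

    def __init__(self, window: int = 30, min_rise: float = 0.002):
        self.samples = deque(maxlen=window)   # most recent per-minute error rates
        self.min_rise = min_rise              # required rise between window halves

    def observe(self, error_rate: float) -> bool:
        self.samples.append(error_rate)
        if len(self.samples) < self.samples.maxlen:
            return False
        half = len(self.samples) // 2
        older = sum(list(self.samples)[:half]) / half
        recent = sum(list(self.samples)[half:]) / (len(self.samples) - half)
        return recent - older >= self.min_rise

# Example: a rate creeping upward while staying under 0.5% trips the trend alert
# long before a higher fixed threshold ever would.
alert = TrendAlert(window=10, min_rise=0.001)
for rate in [0.0007, 0.0008, 0.0009, 0.0011, 0.0014, 0.0018, 0.0023, 0.0030, 0.0040, 0.0045]:
    if alert.observe(rate):
        print(f"trend alert at error rate {rate:.2%}")
```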
Report: "Elevated rendering errors"
Last update:

# What happened?

On April 13, 2023, between 17:09 UTC and 17:32 UTC, imgix experienced a partial outage affecting non-cached renders. During this time, requests to cached assets continued to serve a `200` response, while requests to non-cached assets returned a server error. A fix was implemented at 17:32 UTC, restoring service.

# How were customers impacted?

Between 17:09 UTC and 17:32 UTC, requests to the Rendering API for non-cached renders returned a server error, with 9% of all requests to the Rendering API returning an error at the height of the incident.

# What went wrong during the incident?

We identified an error in one of our connections to customer origins. This error led to a significant slowdown in the retrieval of new assets from customer origins. The errors grew rapidly in a short amount of time, causing our Rendering API to return 5xx errors. To restore the service, our engineers redirected some of our network traffic. The service was fully restored by 17:32 UTC, but some errors persisted and were being served from the cache until they were completely cleared at 17:35 UTC.

# What will imgix do to prevent this in the future?

We have taken the following steps to prevent this issue from re-occurring:

* Fixed the misconfigured alert so our monitoring and alerts will trigger and identify potential issues before they become critical.
* Removed the connection from our routing, replacing it with a new connection that will not experience the same errors.

We are in the process of implementing the following:

* Conducting a review of our current tooling to increase our traffic and network configuration capabilities.
* Reviewing our current configuration to limit the affected services should a similar incident happen in the future.
This incident has been resolved.
Our engineering team has applied a fix, restoring services to normal. We are currently monitoring the situation.
The issue has been identified and a fix is being implemented.
We are currently investigating elevated render error rates for uncached derivative images. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.
Report: "Dashboard errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are investigating reports of errors in the dashboard. The rendering service is not impacted.
Report: "Delayed analytics data"
Last update: This incident has been resolved.
The issue has been identified and a fix is being applied.
We are investigating a delay in the availability of our analytics data.
Report: "Investigating dashboard availability"
Last update: This incident has been resolved.
We are investigating reports of dashboard unavailability. The rendering service is not affected by this incident.
Report: "Intermittent Dashboard and Management API Issues"
Last update:

# What happened?

On February 1st, 2023, at 14:07 UTC, the imgix service experienced intermittent spikes in latency for web administration services, such as the imgix Dashboard and Management API. The incident was resolved later in the day at 20:03 UTC.

# How were customers impacted?

Customers may have experienced issues with using the Dashboard and the Management API. Actions such as logging in, loading pages, and making requests to the Management API resulted in intermittent timeouts. The Rendering API was not affected by this incident.

# What went wrong during the incident?

After our engineers identified the initial latency spike, we deployed a workaround that initially resolved the issue. After monitoring the results, we closed the incident, but latency spiked again shortly after. The spike was sustained, and requests to the web administration parts of our service started to show long response times. The identified issues were similar to a recent incident that had occurred due to upstream providers. Our engineers applied similar mitigation steps, though they were less effective for this incident. Upon further discussion, our engineering team identified a path to resolution by fast-tracking a planned future infrastructure change, which involved reducing connections between our internal services. This change immediately fixed the latency in our web administration services.

# What will imgix do to prevent this in the future?

Internal documentation and tooling allowed our team to easily apply configuration changes and quickly push the needed architecture updates. We have updated this documentation and tooling involving the communication between our internal services to further facilitate these deployments in the future. The diagnostic steps and active monitoring/alerting have been updated as well. Additionally, we have completed an infrastructure upgrade designed to prevent this issue from recurring. As we gather more data on the new and improved performance metrics, we will proactively continue tuning our configurations to ensure future stability.
This incident has been resolved.
We are currently monitoring Dashboard and Management API performance.
Report: "Intermittent Dashboard and Management API Issues"
Last update: This incident has been resolved.
The imgix Dashboard and the Management API are now performing at normal levels.
We are currently investigating reports of intermittent timeouts with the imgix Dashboard and the Management API. Rendering is not impacted.
Report: "Intermittent Dashboard and Management API Issues"
Last update: This incident has been resolved.
A fix has been implemented, and we are monitoring the results.
We are currently investigating reports of intermittent timeouts with the imgix Dashboard and the Management API. Rendering is not impacted.
Report: "Intermittent Dashboard and Management API Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating reports of intermittent timeouts with the imgix Dashboard and the Management API. Rendering is not impacted.
Report: "Dashboard currently unavailable"
Last update: This incident has been resolved.
The dashboard is operational. We are currently monitoring the issue.
We have identified an issue that has caused our Dashboard to go offline. The rendering service is unaffected.
Report: "Unable to process new assets in Asset Manager"
Last update: This incident has been resolved.
We are investigating an issue preventing new assets from being processed in Asset Manager. The rendering service is not impacted by this incident.
Report: "Management API is unavailable"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The Management API is currently down. We are investigating the issue. The Rendering API continues to be operational.
Report: "Users are unable to login"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We identified an issue that prevents users from logging in. We are working on a fix.
Report: "Billing display issue"
Last update: This incident has been resolved.
We are fixing a display issue affecting some of our users' billing overview dashboards. For correct billing information, please visit the invoices page: https://dashboard.imgix.com/invoices
Report: "Elevated rendering errors"
Last update:

# What happened?

On June 10, 2022, at 07:16 UTC, the imgix service experienced elevated rendering errors. A fix was implemented at 08:35 UTC, which restored the service to normal error levels.

# How were customers impacted?

Between 07:16 UTC and 08:32 UTC, some customers received errors when making requests through the Rendering API. Previously cached assets continued to serve a successful response, but some files that were not cached returned a 502 or 503 error. At its peak, the error rate reached 6% for requests to the Rendering API.

# What went wrong during the incident?

Erratic network behavior from our upstream network provider caused an increase in error rates in our backend services. As errors began to grow, one of the systems we designed to automatically remediate backend failures failed to trigger, allowing errors to surface through the Rendering API. Remediations were being identified, though we were delayed in posting a public status update. Eventually, a fix was pushed, immediately restoring service.

# What will imgix do to prevent this in the future?

We are investigating the network behavior detected at our upstream provider in order to update our configurations. We expect these changes to prevent a similar incident from occurring. We will also be fixing our automated tooling so that error rates are resolved before they impact the rendering service. Lastly, we will be revisiting our policies for status updates to ensure that incidents are communicated in a timely manner.
Service has been completely restored.
Our engineering team has applied a fix, restoring services to normal. We are currently monitoring the situation.
We are currently investigating elevated render error rates for uncached derivative images. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.
Report: "Elevated rendering errors"
Last update: We observed a short increase in rendering errors between 06:14 UTC and 06:23 UTC that was immediately resolved. While the incident is resolved, we are continuing to monitor the service and to investigate the proximate cause.
Report: "Intermittent Dashboard Errors"
Last update: This incident has been resolved.
We are currently investigating reports of intermittent errors on various pages of the imgix Dashboard. The rendering service is not impacted.
Report: "Issues with Management API"
Last update: There was an issue with requests to the Management API not completing. This incident started at 11:52 AM UTC and was resolved at 12:33 PM UTC. The rendering service was not impacted by this incident.
Report: "Elevated rendering errors"
Last update:

# What happened?

On December 17, 2021, at 05:06 UTC, some uncached requests to the imgix service began to return a `503` response. By 05:36 UTC, the issue had been completely resolved.

# How were customers impacted?

Between 05:01 UTC and 05:36 UTC, some requests for non-cached derivative images began to return a `503` error, with a peak error rate of 10% reached during parts of the incident. At 05:07 UTC, error rates began to decrease slowly, though a 5% error rate persisted until a fix was pushed at 05:36 UTC, which completely restored the service.

# What went wrong during the incident?

Large, unexpected traffic patterns triggered a problematic interaction with a newly built internal automation, causing the initial incident. Our team pushed mitigations early on in the incident, though the mitigations had further unexpected interactions with the newly built automation. While the service _did_ begin to recover, the rate of recovery was slower than expected due to these interactions. Once the interaction was identified, another manual change was made which completely restored the service.

# What will imgix do to prevent this in the future?

We will be adding additional tooling which will enable us to more quickly identify proximate causes during incidents. We will also internally document the interactions and behaviors of our existing automation and mitigation runbooks to ensure smoother recovery times in the future. We also identified improvement opportunities for some of our existing automation, which we have since fine-tuned.
This incident has been resolved.
A fix has been implemented and error rates have returned to normal. We are monitoring the situation.
We are currently investigating elevated render error rates for uncached derivative images. We will provide an update once we obtain more information. Previously cached derivatives are not impacted.