Shelf

Is Shelf Down Right Now? Check if there is a current outage ongoing.

Shelf is currently Operational

Last checked from Shelf's official status page

Historical record of incidents for Shelf

Report: "Gem Content Inaccessible via KMS Dashboard/Search"

Last update
postmortem

Between approximately 15:37 UTC and 16:03 UTC on April 29, 2025, some users experienced an issue where content pages ('Gems') would not load when accessed directly from the main Shelf dashboard or search results. This temporary issue stemmed from the premature activation of a configuration change during preparations for our planned user interface update, scheduled for April 30th.

**Impact & Workaround:** During this time, content remained accessible via the Gem Preview feature. Other platform areas like Agent Assist, Self-Service Portals, and Content Intelligence were unaffected.

**Resolution:** The issue was identified and fully resolved by 16:03 UTC. All systems are now operating normally. We apologize for any inconvenience caused.

resolved

The configuration issue has been corrected, and Gem content access via KMS is restored.

identified

We have identified the root cause related to a deployment configuration and are deploying a correction.

identified

We are investigating reports of users being unable to view Gem content.

Report: "Content Intelligence Service Degradation in US Region"

Last update
resolved

The disruption to Content Intelligence service in the US region has been fully resolved. Our engineering team successfully restored all services to their normal state. The situation has been closely monitored, and performance metrics have stabilized without further issues. We appreciate your patience and understanding during this time.

monitoring

The Content Intelligence service in the US region has been fully restored as of 20:00 UTC. Our engineering team successfully implemented a failover solution using a recent system backup taken moments before the disruption. All user data is intact and operations performed during the outage have been properly stored. Some users may experience brief periods of reduced performance as our systems complete the scaling process. We continue to monitor the service closely during this stabilization period and appreciate your patience throughout this event.

identified

We are currently experiencing a disruption to our Content Intelligence service in the US region that began at 17:20 UTC. Users may be unable to view analytics dashboards and charts. All data remains secure, and any changes made during this period will be properly processed once service is fully restored. Our engineering team has identified the root cause as an underlying infrastructure issue with our service provider and is actively implementing mitigation measures, including routing operations to redundant systems. We are in direct communication with AWS support engineers via a priority incident channel and collaboratively working toward resolution. This issue affects only the US region; EU and CA regions remain fully operational. We will provide updates as the situation progresses.

Report: "Search Functionality Degradation Affecting Multiple Features"

Last update
postmortem

## What Happened?

On October 24, 2024, customers reported difficulties using Shelf's search functionality. Our monitoring systems detected performance degradation in the search service, affecting users' ability to find and access their content. The issue was first identified through our automated monitoring systems and confirmed by customer reports.

## Why Did It Happen?

The search service experienced capacity constraints due to an unexpected pattern in request handling. When users performed searches, our system is designed to make a single attempt to retrieve results. However, a recent configuration change inadvertently allowed multiple retry attempts for each search request. This meant that a single user search could generate several duplicate requests, creating unnecessary load on the system.

Under normal conditions, our search infrastructure efficiently handles the typical volume of search requests. In this case, the combination of increased search traffic and the multiplicative effect of retry attempts exceeded the system's designed capacity. This overload caused the search service to respond slowly or fail to respond at all.

## Impact

Users in the US region experienced difficulties performing searches across the platform for approximately one hour. While other platform features remained functional, any operations requiring search capabilities were affected. Users could access their content directly through navigation, but could not search for specific items or use search-dependent features. The EU and Canada regions were not impacted by this incident.

The exact number of affected users during this period cannot be precisely determined, but the issue was limited to users actively attempting to use search functionality in the US region between 19:28 UTC and 20:21 UTC.

## Solution

Our team implemented an immediate fix by adjusting the request handling configuration to prevent excessive retries. We also increased the search service's capacity to ensure stable operation. After implementing these changes, we conducted comprehensive testing across all affected components and monitored the system for an extended period to confirm the solution's effectiveness.
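The multiplicative effect of retries described in this postmortem can be illustrated with a small model. This is only a sketch: the function names, the traffic figures, and the assumption that each attempt fails independently with a fixed probability are hypothetical and are not taken from Shelf's actual configuration.

```python
def expected_attempts(max_attempts: int, failure_rate: float) -> float:
    """Expected number of backend requests generated by one user search.

    Assumes (hypothetically) that each attempt fails independently with
    probability `failure_rate` and that a failed attempt is retried until
    `max_attempts` attempts have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    # Attempt k (0-based) is only made if the previous k attempts all failed.
    return sum(failure_rate ** k for k in range(max_attempts))


def backend_load(searches_per_second: float, max_attempts: int,
                 failure_rate: float) -> float:
    """Requests per second reaching the search backend under this model."""
    return searches_per_second * expected_attempts(max_attempts, failure_rate)
```

With a single attempt, 100 searches per second produce 100 backend requests per second regardless of the failure rate. With three attempts and a backend that is already failing every request, the same traffic produces 300 requests per second, which is why retries make an overload worse rather than better.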

resolved

Search functionality has been restored across all major features including Gems search. The system is now operating normally and we continue to monitor for stability. We appreciate your patience during this disruption.

monitoring

We have successfully restored search functionality for Announcements, Feedback lists, and CPW tasks. Gems search remains temporarily affected while we complete the final recovery steps. Our engineering team continues working on full service restoration. We will provide another update once all features are back online.

identified

Our engineering team has identified the source of the disruption affecting search functionality. We are actively implementing necessary adjustments to restore service. Search capabilities remain limited across affected features while we work on the resolution. We will provide another update once service is restored.

identified

We are experiencing a service disruption affecting our search-related functionality across multiple features including Gems search, Feedback list, and Announcement list. Our engineering team has been notified and is actively working to restore normal operations. While basic platform functionality remains available, search capabilities are currently limited. We will provide updates as the situation develops.

Report: "Chrome Update Causing UI Errors"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are aware of an issue where a recent update to Google Chrome is causing errors within the user interface of our platform. Our team is actively investigating the matter to identify the root cause and implement a fix. We will provide further updates as soon as more information is available. Thank you for your patience and understanding.

Report: "Content Publication Workflow: Service Interruption"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We have identified an issue with the task publishing process and are actively working to resolve it.

Report: "Gem Page Loading Issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

Report: "Canadian Region: Insights and Maintenance UI Downtime"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

Report: "Search Disruption"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Search Service Interruption"

Last update
resolved

We're pleased to report that the outage affecting our search capabilities has been resolved. Our team addressed a performance issue with our search cluster under heavy load. We've monitored the service and confirmed it is functioning normally. Thank you for your patience during this time. Happy searching!

identified

We are currently experiencing an unexpected outage with our search functionality that may affect your ability to locate content swiftly across our platform. Our engineering team has been notified and is actively working on diagnosing and resolving the issue as promptly as possible. We apologize for any inconvenience this may cause and appreciate your patience. Please stay tuned to this status page for real-time updates and estimated resolution times as we work to restore full service.

Report: "Incident with login page"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results. Google authorization is temporarily disabled.

identified

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Shelf Outage in the US Region"

Last update
resolved

As of 20:44 UTC, we have observed an absence of errors stemming from this outage for the past 20 minutes. Based on this observation, we deem the situation to have stabilized. Thus, we declare the incident resolved.
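The resolution criterion used here (no errors observed for the past 20 minutes) can be expressed as a simple quiet-window check. The sketch below is illustrative only: the function name, the date, and the error timestamps are hypothetical, and it does not describe Shelf's actual monitoring tooling.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def is_stable(error_times: Iterable[datetime], now: datetime,
              quiet_window: timedelta = timedelta(minutes=20)) -> bool:
    """Return True if no error occurred within `quiet_window` before `now`."""
    cutoff = now - quiet_window
    # An error counts as recent if it happened strictly after the cutoff.
    return not any(cutoff < t <= now for t in error_times)


# Hypothetical timeline: the date and error timestamps are placeholders.
now = datetime(2000, 1, 1, 20, 44, tzinfo=timezone.utc)
last_error = datetime(2000, 1, 1, 20, 20, tzinfo=timezone.utc)
stable = is_stable([last_error], now)
```

In this example the last error is 24 minutes old, so the check passes; an error at 20:30 would fall inside the window and the incident would stay open.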

monitoring

AWS has implemented an update to address the issue described above, and Shelf currently appears to be functioning properly. We will continue to monitor the situation closely and gather more data before confirming that it is stable.

identified

We are continuing to work on a fix for this issue.

identified

We have determined that the root cause of the current issue lies with our cloud service provider, Amazon Web Services (AWS), which is presently experiencing a multi-service outage in the us-east-1 region. Reference: https://health.aws.amazon.com/health/status#multipleservices-us-east-1_1686683337

investigating

We are currently investigating this issue.

Report: "Degraded Search Functionality"

Last update
postmortem

# What Happened?

On June 9, 2023, at approximately 22:59 UTC, the Elasticsearch cluster utilized by the Shelf KMS platform experienced an inaccessibility issue in the us-east-1 region. As a consequence, certain API operations that rely on this cluster encountered timeouts, resulting in a degradation of search functionality on the Shelf KMS platform.

### Impact on Customers

During the period of the incident, which lasted until June 10, 2023, at 00:18 UTC, clients using the Shelf KMS platform experienced difficulties in retrieving search results for gems. While the Gem Page remained functional and available for viewing, clients were unable to effectively locate gems via search results or navigate using the search function. The total duration of the degraded functionalities was approximately 1 hour and 19 minutes.

# Why Did it Happen?

The root cause of the incident was traced back to a networking issue that occurred within Elastic Cloud, a third-party service provider that Shelf KMS relies on for hosting the Elasticsearch cluster. Due to an underlying infrastructure outage outside of our direct control, Elastic Cloud experienced disruptions that adversely affected the performance and accessibility of the Elasticsearch cluster in the us-east-1 region.

For your reference, Elastic Cloud's incident summary can be found at the following link: [https://status.elastic.co/incidents/07bw653d2677](https://status.elastic.co/incidents/07bw653d2677)
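The stated duration can be checked from the two timestamps in the postmortem, which span midnight UTC. The following is a minimal sketch; the helper name `incident_duration` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Tuple


def incident_duration(start: datetime, end: datetime) -> Tuple[int, int]:
    """Return the elapsed time between two instants as (hours, minutes)."""
    total_minutes = int((end - start).total_seconds() // 60)
    return divmod(total_minutes, 60)


# Timestamps taken from the postmortem above.
start = datetime(2023, 6, 9, 22, 59, tzinfo=timezone.utc)
end = datetime(2023, 6, 10, 0, 18, tzinfo=timezone.utc)
hours, minutes = incident_duration(start, end)
```

Using timezone-aware datetimes with full dates avoids the off-by-a-day mistake that can occur when subtracting clock times that cross midnight.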

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The search functionality is currently experiencing performance issues as a result of an incident with our upstream cloud provider. For more information, please visit https://status.elastic.co/ In the meantime, we are actively working on resolving the issue through our own efforts.

investigating

We are currently investigating this issue.

Report: "Self-Service Portals Downtime in the US Region"

Last update
resolved

This incident has been resolved.

Report: "Some documents not visible in search results"

Last update
resolved

The incident has been resolved.

Report: "Elevated API errors"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

The issue has been identified and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "AWS Outage in the US region"

Last update
resolved

This incident has been resolved.

identified

We see positive signs of recovery, but occasional errors might still occur.

identified

One of the underlying core AWS services is experiencing a severe outage. See https://health.aws.amazon.com/health/status. Most Shelf functionality is impacted, and users may experience intermittent errors.

identified

The issue has been identified and a fix is being implemented.

Report: "Problems with Self-Service Portals"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Problems Accessing the Web Application"

Last update
resolved

This incident has been resolved.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

Report: "5XX Errors When Accessing the Web Application"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "5XX Errors When Accessing the Web Application"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Kustomer Integration Issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Updates to gems show up in search results with a delay"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Intermittent Search Problems"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Intermittent Search Problems"

Last update
resolved

We are currently investigating this issue.

Report: "Intermittent Search Problems"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Problems with loading a library tree in the web app"

Last update
resolved

This incident has been resolved.

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Missing English Recommendations in Answer Assist"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

This issue is caused by the global AWS outage. See details here: https://status.aws.amazon.com/

Report: "Intermittent Errors Signing in via SSO"

Last update
resolved

This incident has been resolved.

identified

Our underlying auth provider Auth0 has an issue with their infrastructure. We'll post more details as soon as we have them.

Report: "US Region Downtime"

Last update
resolved

This incident has been resolved.

identified

While we have observed some early signs of recovery, we do not have an ETA for a full recovery. We will continue to provide updates here as we have more information to share.

identified

We are beginning to see signs of recovery.

identified

We are continuing to work on a fix for this issue.

identified

The outage impacts the US region only. Shelf continues to operate in the EU region.

investigating

Our infrastructure provider, Amazon Web Services, is experiencing a global outage right now. Thousands of companies worldwide are impacted, including Shelf. We're monitoring the situation and will keep you updated.

Report: "Problems with Emails Delivery"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We're working on a fix to resume email delivery. We are collecting the backlog of emails, and all of them will be delivered as soon as the fix is implemented.

identified

The issue has been identified and a fix is being implemented.

Report: "Intermittent Errors Signing in via SSO"

Last update
resolved

This incident has been resolved.

identified

See https://status.auth0.com/?domain=shelfio.auth0.com for more updates on the Auth0 incident. The issue is present for the US region only.

investigating

Our underlying auth provider Auth0 has an issue with their infrastructure. We'll post more details as soon as we have them.

Report: "6-minute spike of API errors"

Last update
resolved

Between 15:17 and 15:23 UTC, users had problems using the Shelf application and saw "Something went wrong" error messages. The issue was quickly resolved.

Report: "Agent Assist Errors"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "DNS Issues"

Last update
resolved

Due to some DNS issues, customers could not access the Shelf platform for some time.

Report: "Problems With Updating Gems"

Last update
resolved

This incident has been resolved.

investigating

Logged-out users might not be able to log in using their passwords at the moment.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Web Application: Intermittent Long Loading Times (US Region)"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Problems when signing in using SSO providers"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We're impacted by an incident at our upstream provider Auth0. See more details here: https://status.auth0.com/incidents/89d7218yhxhv

Report: "CPW: Cannot save changes to the Decision Tree"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "CPW Tasks for Decision Trees Cannot Be Transitioned to the Next Stage"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "CPW: Wiki Editor has limited formatting functionality"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Some CPW Tasks for Decision Trees Cannot Be Transitioned to the Next Stage"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Shelf Web App: 3-minute downtime (503 errors)"

Last update
resolved

The downtime lasted for 3 minutes between 5:56 and 5:59 PM UTC. Only the primary web interface (search & view gems) was impacted. Self-service portals, Content Publication Workflow, Agent Assist, and other services remained intact.

Report: "Agent Assist: Incomplete Recommendations"

Last update
resolved

For a couple of hours, some gem types were ignored in Agent Assist recommendations even when they were relevant.

Report: "EU Region: Intermittent errors"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "Documents are not uploading within Content Publication Workflow"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Problems signing in using some SSO providers"

Last update
resolved

This incident has been resolved.

investigating

A subset of Shelf users who use SSO are experiencing issues signing in. The problem is caused by our upstream provider Auth0. More details on the incident here: https://status.auth0.com/

Report: "Web App"

Last update
resolved

This incident has been resolved.

investigating

Users might experience 503 errors and blank pages when loading the web app. We are currently investigating this issue.

Report: "Insights: degraded performance"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

Events might appear with a delay. Some reports or dashboards may load only partially.

Report: "Problems loading some parts of Insights"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

Some of the charts might not load at the moment.

Report: "Problems delivering Email Notifications about Gems"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.