Historical record of incidents for Hive
Report: "Degraded Performance across Hive Services"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Loading and performance issues"
Last update: The incident has been resolved.
We are monitoring infrastructure as the back-end has stabilized.
We are monitoring infrastructure as the back-end has stabilized.
The issue has been identified and a fix is being implemented.
Report: "Degraded Performance across Hive Services"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Degraded Performance across Hive Services"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Degraded performance trigger"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
Performance has stabilized; we're monitoring and investigating the root cause. We will continue monitoring until confirmed.
We are currently investigating this issue.
Report: "Degraded performance across subservices"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
A potential issue has been identified.
Report: "Hive Analytics not loading for some regions"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Proofing Service Issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Report: "Degraded database performance impacting load speed"
Last update: This incident has been resolved.
We are seeing improved (but still degraded) performance. Backed-up database operations are clearing, and we are monitoring and working to return to full performance.
We are continuing to work on a fix for this issue.
A fix is being deployed now. We expect performance to return to normal for most application areas shortly. We will update and monitor once deployed.
The issue has been identified and a fix is being implemented.
Report: "Performance Degradation across Hive Services"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Analytics Data Refresh Delays"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Partial forms outage when submitting specific forms"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "Desktop and web app timeouts"
Last update: This incident has been resolved.
We are monitoring for the moment and will provide full updates as they become available.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Delayed data refreshes in Analytics"
Last update: This incident has been resolved.
We have seen successful data refreshes and are monitoring for scenarios where failures occur. For the moment, most data should remain up to date.
An upstream update to Analytics has resulted in data refreshes being delayed. We are working to resolve this via rollback. Data is still accurate, but check the last-refresh timestamp in the Analytics UI to see when it was most recently synced.
Report: "Performance Degradation across Hive Services"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Degraded performance for some Desktop app views"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have confirmed new views are beginning to load.
We are continuing to work on a fix for this issue.
We have identified the issue and are in the process of resolving it.
Report: "Degraded performance when loading workspace with large set of projects"
Last update: This incident has been resolved.
A fix has been rolled out and we are monitoring as performance trends towards fully operational.
We are continuing to work on a fix for this issue.
The issue has been identified and a fix is being implemented; it will be rolled out shortly.
Report: "Degraded performance across Hive services"
Last update: This incident has been resolved.
This issue has been mitigated and we will continue to monitor performance across Hive services.
We have identified a root cause in problematic API requests. A fix is currently being implemented.
We are currently investigating this issue.
Report: "Increased Error Rates across Hive services"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
Error rates have zeroed out over the past 20 minutes. We will continue to monitor and assess next steps. All services are operational.
We are continuing to investigate this issue.
We've mitigated the spike in timeout errors, but continue to see lingering timeouts. We are continuing to investigate and will keep mitigations in place until the source issue is confirmed.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Hive API increased error rates"
Last update: A resource-heavy operation prevented a subset of servers from responding within a 60-second threshold, resulting in increased API 500-level responses on 0.5-1.5% of public API requests. The issue has been identified, and a hotfix that isolates it has been deployed. In addition, a configuration change that quarantines unhealthy servers more quickly has been deployed.
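The quarantine behaviour described in this update can be sketched roughly as follows. This is a minimal illustration and not Hive's actual configuration: the `HealthTracker` class, the `failure_limit` parameter, and the server names are hypothetical, and only the 60-second response threshold comes from the update itself.

```python
from dataclasses import dataclass, field

RESPONSE_THRESHOLD_S = 60.0  # threshold taken from the incident update


@dataclass
class HealthTracker:
    """Tracks consecutive slow health checks per server (hypothetical helper)."""

    failure_limit: int = 2  # assumed knob: lower values quarantine servers sooner
    _failures: dict = field(default_factory=dict)
    quarantined: set = field(default_factory=set)

    def record(self, server: str, response_time_s: float) -> None:
        """Record one health-check result and quarantine the server if needed."""
        if response_time_s > RESPONSE_THRESHOLD_S:
            self._failures[server] = self._failures.get(server, 0) + 1
        else:
            # A healthy response resets the streak and lifts any quarantine.
            self._failures[server] = 0
            self.quarantined.discard(server)
        if self._failures[server] >= self.failure_limit:
            self.quarantined.add(server)

    def routable(self, servers):
        """Return the servers that should still receive traffic."""
        return [s for s in servers if s not in self.quarantined]


tracker = HealthTracker(failure_limit=2)
checks = [("api-1", 0.4), ("api-2", 75.0), ("api-2", 90.0), ("api-1", 0.5)]
for server, seconds in checks:
    tracker.record(server, seconds)

print(tracker.routable(["api-1", "api-2"]))  # ['api-1']
```

Lowering `failure_limit` in this sketch corresponds to quarantining unhealthy servers more quickly, at the cost of reacting to a single slow check.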
Report: "Web and Desktop app loading issues"
Last update: The incident has been resolved. Initial loading is working, and performance impact from the switchover has stabilized.
The deployment has been fully reverted and both web and desktop should now load without issue. If you're experiencing issues, try force refreshing your browser/app (ctrl+R on Windows, cmd+R on Mac).
We have rolled back to a redundant instance of our application while the deployment changes are rolling back. The application should now load, but there may be partial impact in performance.
We've identified the issue and are in the process of rolling back changes which caused it.
We are currently investigating an issue where the Hive web and desktop apps fail to load on a page refresh.
Report: "Infrastructure issues causing loading errors resulting in blank screen for some users."
Last update: This incident has been resolved. All applications should be operational for all users; users may need to refresh or restart their application.
We are continuing to work on a fix for this issue.
We have identified the issue and have taken steps to return the system to full functionality.
Report: "Degraded performance in Hive Notes for some users"
Last update: This incident has been resolved.
We have identified the root cause of an issue causing Notes to load slowly or fail to load entirely. A fix has been implemented and we are monitoring closely.
Report: "Degraded performance in Hive Notes"
Last update: This incident has been resolved.
We have received reports of degraded performance in Hive Notes, with users experiencing slow loading times or notes failing to load completely. The root cause has been identified and we have taken steps to resolve this issue. We are currently monitoring the implemented fix.
Report: "Cloud Services Provider Outage impacting Hive Services"
Last update: This incident has been resolved.
We are seeing massively reduced error rates on impacted services and will continue to monitor towards full resolution.
Hive's cloud provider (AWS) identified the root cause of the issue and has been actively working towards full resolution. We are now observing sustained recovery of the error rates across our impacted services, but will need to monitor for additional time before we can confirm full recovery.
An outage with one of Hive's cloud services providers (Amazon Web Services) is causing a partial outage of Hive services. We can confirm that the following Hive services are impacted and will update the list as we confirm further: Hive University, Forms, Export services, thumbnail previews, certain auth services. We are working to establish alternative routes to recovery as these services are cloud provider dependent.
Report: "Issue with action loading in Kanban layouts, Proofing and Portfolio Views."
Last update: We have confirmed the fix is working as expected. No new instances of bad data are occurring. All instances of failed action view loading are now resolved.
The fix has been implemented and is rolling out to all instances. We are continuing to monitor to ensure no new instances of bad data occur.
We've mitigated the issue across all existing instances of bad data, and have stood up a temporary recurring script to patch any new instances of bad data. In most cases, loading should be resolved for anyone affected. We are currently confirming a root cause fix to be deployed to fully resolve the issue and prevent future instances of bad data which lead to these loading issues.
We are continuing to work on a fix for this issue.
We have identified an issue that results in failure of loading action cards in Kanban layouts, Proofing and Portfolio Views for some users. Steps to mitigate have been implemented and this update should remediate impacted data. An update will follow once the mitigation steps are completed.
Report: "Desktop application reloading frequently"
Last update: This incident has been resolved.
The desktop application is now available. A fix has been implemented and we will continue to monitor prior to resolving.
The Hive desktop app is currently unavailable. Please use the Hive browser app at app.hive.com as we continue to implement the fix.
The root cause has been identified and a fix is being implemented.
Report: "Users who utilize "Zscaler" software are experiencing loading issues"
Last update: This issue should be resolved with your Zscaler configuration - please refer to the previous update on steps to resolve.
We've confirmed the issue is limited to Zscaler customers. If you are using Zscaler and are having issues loading Hive, please do the following:
(1) If possible, add the following URLs / domains to your Zscaler allow list (see steps here: https://help.zscaler.com/zia/adding-urls-allowlist):
https://prod-gql.hive.com
https://prod-gql.hive.com/graphql
(2) Contact your IT Team or Zscaler admin to submit a ticket with Zscaler at the following URL: https://help.zscaler.com/submit-ticket
If the above steps do not lead to resolution, please contact us through https://help.hive.com live chat or email help@hive.com
We've identified the root cause as being "Zscaler" software. If you're experiencing issues loading Hive and use "Zscaler", contact your IT Team or Zscaler admin ASAP to ask them to assist in unblocking the following domain: prod-gql.hive.com Your IT admins can submit a ticket to Zscaler at this URL: https://help.zscaler.com/submit-ticket
We have identified the issue and can confirm that it is limited to users of Zscaler. We are taking steps to resolve the issue.
We are currently investigating this issue.
Report: "Performance Degradation across Hive Services"
Last update: This incident has been resolved.
The root cause has been identified and is being managed. We will continue to monitor for any further issues. Services are stable and we will continue to monitor and remain on high alert as we prepare next steps for future mitigation.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue and are reviewing the potential root cause. Updates will continue as we progress in identification and a fix.
Report: "Hive Analytics Loading Issues"
Last update: This incident has been resolved.
We are looking into root cause of loading issues now. At the moment, Analytics services are working but with degraded performance.
We are currently investigating an issue where Hive Analytics has issues loading.
Report: "Hive Analytics Dashboard Loading Issues"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are continuing to investigate. In the interim, we have rolled back certain Analytics dashboards to previous versions which are working and can be navigated to from the left side panel.
We are currently investigating this issue.
Report: "Degraded HTTP request processing performance"
Last update: This incident has been resolved. We will continue to monitor.
The deploy is nearing completion, and we are monitoring logs that indicate the degraded performance is being resolved. We will update shortly once we have full confirmation.
We have identified a recent functional change which aligns with the current suspected root cause. A fix has been implemented and is being deployed now. Update to follow once the deploy completes.
We are currently investigating an issue where HTTP request processing may lead to pile up and time out of subsequent requests. A subset of users may experience degraded performance across normal operations. We have prepared a suspected root cause fix and will update as we progress.
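To illustrate how a single slow request can cause later requests to pile up and time out, here is a small simulation. It is a sketch under stated assumptions, not a model of Hive's servers: it assumes one worker processing requests in arrival order, and the function name, the 30-second timeout, and the timings are all hypothetical.

```python
def simulate_queue(requests, timeout_s):
    """Simulate one worker serving requests in FIFO order.

    requests: list of (arrival_time_s, service_time_s) tuples, sorted by arrival.
    Returns one (wait_s, timed_out) tuple per request, where a request times
    out if it finishes more than timeout_s after it arrived.
    """
    results = []
    worker_free_at = 0.0
    for arrival, service in requests:
        start = max(arrival, worker_free_at)  # wait until the worker is idle
        finish = start + service
        worker_free_at = finish  # assumed: the worker still finishes timed-out work
        results.append((start - arrival, finish - arrival > timeout_s))
    return results


# One 40-second request followed by three quick ones arriving a second apart.
requests = [(0.0, 40.0), (1.0, 0.5), (2.0, 0.5), (3.0, 0.5)]
for wait, timed_out in simulate_queue(requests, timeout_s=30.0):
    print(f"waited {wait:.1f}s, timed out: {timed_out}")
```

In this example the three quick requests each need only half a second of work, yet all of them exceed the timeout because they queue behind the slow one.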
Report: "Degraded load speed due to database issues"
Last update: This incident has been resolved. We will continue to monitor and remain on high alert as we prepare next steps for future mitigation.
We are continuing to monitor for any further issues.
We're monitoring database services at the moment as things remain stable.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Issues with project loading"
Last update:
# Context and timeline
On behalf of the team here at Hive, we would like to apologize for the interruptions to services yesterday, and we appreciate your patience as we worked to resume service continuity. As posted in the incident status updates, the Hive web platform experienced service disruptions which impacted project loading from 8:01am through to 8:26am Eastern. The incident was left open with partial outage as we monitored failover from 8:26am through to 8:55am Eastern, and left in a monitoring state through to incident close out. A detailed timeline, including mitigation steps taken, is listed below (all times stated are Eastern timezone):
**8:01am -** Application monitoring alarms raised, notifying our team of issues following completion of an application deployment.
**8:15am -** Initial investigation confirms issues are widespread, impacting users who had been swapped over to the latest web application refresh.
**8:18am -** Application deployment reversion started. Failover to stable environment initiated.
**8:26am -** Confirmation of all users switched over to failover environment and project loading service disruption resolved.
**8:30am -** Upon review of logs after switching to the failover environment, the team confirmed that a specific scenario of project creation from templates with pre-configured table layout options failed to fully complete. This specific issue remained until the separate service redeployment that was initiated at 8:18am. The issue was due to application version mismatch and impacted just below 2% of the active user population.
# Root cause
In short, a web application deployment (which completed just before 8am) contained a cached version of a pre-production Hive build, leading to mismatched application versions and logic between services. Upon review of the deployment command logs, the team has confirmed that this cached version was previously deployed to a pre-production environment and not properly cleared out before the production deployment was built.
# Remediation plan
Our deployment scripts already ask for written confirmation before initiating a deployment, and the confirmation shows which version (branch/build) and target environment are involved. However, warnings about potential untracked or cached changes are not shown. To ensure that mismatched application versions are never deployed again, the team has taken steps to update deployment commands and contextual information such that a deployment will automatically fail in the event of untracked or cached changes.
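A deployment guard of the kind described in the remediation plan could look roughly like the sketch below. It is an illustration under assumptions rather than Hive's actual deployment script: the function names are hypothetical, and it treats "untracked or cached changes" as anything reported by `git status --porcelain`, which may be narrower than what the team checks (for example, it does not inspect build caches).

```python
import subprocess
import sys


def dirty_paths(porcelain_output: str) -> list:
    """Return the paths listed in `git status --porcelain` output.

    Each non-empty line is a two-character status code, a space, then the path;
    '??' marks untracked files and other codes mark staged or modified ones.
    """
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def ensure_clean_tree(repo_dir: str = ".") -> None:
    """Abort the deployment if the working tree has any uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    paths = dirty_paths(result.stdout)
    if paths:
        # Exit non-zero so the calling deployment script fails automatically.
        sys.exit("Refusing to deploy; uncommitted or untracked files: " + ", ".join(paths))


# Example with canned output, so the sketch runs without a repository.
sample = " M src/app.js\n?? build/cache.bundle\n"
print(dirty_paths(sample))  # ['src/app.js', 'build/cache.bundle']
```

A deployment script would call `ensure_clean_tree()` before building, so the failure happens before any artifact is produced.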
All systems have remained stable since our earlier updates at 8:26am and 8:55am Eastern. We have gone ahead and unified application states across all environments to ensure no users experience version mismatches. We'll continue to actively monitor stability throughout the day, and do not anticipate any further issues. A post-mortem has been underway since ~9:30am Eastern this morning and will be posted here once finalized.
We are continuing to investigate the issue, but systems have now remained stable.
We have failed over to a stable application version while we work to identify root cause. The application should be available and working now. We will leave this incident open while we investigate original cause, implement a fix, and monitor before resolving.
We are currently working to roll back changes that were deployed earlier and are related to the issue.
We are currently investigating this issue.
Report: "Issues with project loading"
Last update: We identified a step in our deployment process that was formerly automated and was therefore missed in our deployment earlier this morning. We’ve taken steps to ensure the step is added to a review checklist so it won’t be missed again in future deployments. Moving forward, we will aim to automate this step again to prevent future issues stemming from mismatched deployment versions.
After monitoring, we can confirm that the fix is working as expected and is rolled out to all users.
The fix has been deployed and project loading should now be working as expected. If you are still experiencing issues, you can force refresh your browser or application (CTRL+R on Windows / CMD+R on Mac) to ensure things are loaded from the deployed fix.
We've identified an issue with project loading due to a recent deployment. A fix is currently rolling out and expected to resolve loading issues shortly.
Report: "Performance Degradation across Hive Services"
Last update: This incident has been resolved. We will continue to monitor throughout the day and have set up enhanced alerting and a running conference bridge in the event that there are any warning signs of further issues while we work on the post-mortem for the incident.
Our services have stabilized. We will continue monitoring as we work to ensure continued stability through peak hours. The team will follow up with a post-mortem.
The bottleneck has been identified and we are working to ensure things remain stable. Full stability is expected to resume in the next 5-10 minutes. If your application is frozen or not loading, please try refreshing your browser tab or application (CTRL+R on Windows, CMD+R on Mac).
We are increasing load capacity across our services and expect stable state to resume shortly.
We are currently investigating this issue.
Report: "Degraded Performance across Hive services"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Hive Services Degraded Performance"
Last update: Services have reached steady state. We are taking next steps to mitigate future potential impacts on performance during peak hours.
To mitigate the degraded performance, resource capacity has been increased and is taking effect now so that application loading can re-enter a stable state.
We are currently investigating degraded performance across Hive sub-services which support data loading for several Hive apps and mobile.
Report: "Performance Degradation Detected"
Last update: All operating metrics have returned to, and maintained, normal values. We have follow-ups logged, but all operations are now considered normal.
We are monitoring following mitigation steps - services should be returning to normal.
We have identified an issue with a long-running query, and have applied mitigating steps. There will be residual impact that we are monitoring.
We are continuing to investigate this issue.
We are currently investigating slow performance affecting the application.
Report: "Database performance issues causing partial service outages across the platform"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have completed a manual scaling increase and are currently redeploying our services to re-sync and restore service.
We are continuing to work on mitigation efforts while we work with our DB provider to restore our scaling functionality.
Our DB PaaS provider is currently experiencing an outage that has been preventing our normal scale-up process (https://status.cloud.mongodb.com/incidents/bkzhxk9db0nr) and we are continuing to work on other mitigating options while we wait for them to fully restore service.
We have identified the likely source and are currently working to remediate and restore service.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Degraded Service Performance"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A resource adjustment has taken place and we're monitoring to ensure things return to normal.
The source of degradation has been identified and is being mitigated.
We are currently investigating this issue.
Report: "Production Service Degradation"
Last update: All services have returned to normal following challenges identified during standard change processes.
We have remediated identified issues and are currently monitoring.
We are currently investigating this issue.
Report: "Issues with new Project Creation"
Last update: The fix for this issue was completed at 17:25 GMT-04:00. After suitable monitoring, we have confirmed that this issue has been resolved.
The issue has been identified and a fix is being implemented.
We are aware of an issue creating and viewing new projects and have identified the cause. We are working to resolve now.
Report: "Production Services Issues Identified"
Last update: This incident has been resolved.
We have resolved the issue and are monitoring for any residual effects.
We are continuing to investigate this issue.
We have implemented remediating efforts and restored services. We are continuing triage efforts and additional preventative measures.
Issue with Production services impacting some users.
Report: "Possible Performance Degradation Detected"
Last update: After a period of monitoring, we have determined that there are no lingering impacts from our earlier monitoring alerts, and this incident has been resolved.
We are continuing to monitor and are also implementing additional precautionary measures. We will add updates here as they are available.
We are monitoring for any issues and/or user impact.
We have identified a possible performance degradation and are currently investigating.
Report: "DNS Replication Issues Causing Performance Issues"
Last update: This incident has been resolved.
Service has returned to normal and we are monitoring for any residual impact.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Performance Degradation Detected"
Last update: This incident has been resolved.
We have confirmed the cause, implemented a resolution and are monitoring for any residual impact.
We have identified a cause and have added resources to offset the impact while addressing the core issue.
We are investigating the root cause and scaling up to offset impact.
Report: "Minor Performance Degradation Detected"
Last update: The monitoring period confirmed that the original issue (degraded performance) has fully stabilized as of 11:43am EST.
The issue has automatically resolved and we will be monitoring for any residual effects.
We have determined that unusually long operations in our DB layer resulted in degraded performance for some users.
We are currently investigating and will provide updates shortly.
Report: "Some users experiencing delay in proofs loading"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
Report: "AWS infrastructure issue impacting Hive Proofing rendering"
Last update: Monitoring completed; confirmed issue is resolved.
The impact of the AWS issue on proofing has been resolved, and proofs are now loading. We continue to monitor this AWS issue.
A reported issue with Amazon Web Services (AWS) is impacting proof files rendering. We are continuing to monitor as partners at AWS resolve this issue.