Hex

Is Hex down right now? Check whether there is an ongoing outage.

Hex is currently Operational

Last checked from Hex's official status page

Historical record of incidents for Hex

Report: "Degraded Kernel Acquisition"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue acquiring kernels for some customers.

Report: "Degraded Kernel Acquisition"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating an issue acquiring kernels for some customers.

Report: "Degraded kernel acquisition"

Last update
postmortem

After fully analyzing the timeline of events, **we’ve confirmed that the root cause of the incident was an issue in the AWS EBS storage system backing our primary database replica**. The issue began on May 12 11:00 PM PDT according to AWS, and the effect was gradual degradation of performance on the replica, eventually leading to a cascade where even our primary database got backpressured on critical write operations, which triggered the instability causing the incident.

The AWS issue was resolved May 13 9:32 AM PDT, after we pushed a database configuration change that caused the EBS system software to reset. We then stabilized and cleaned up systems on our end before resolving the incident.

While the root cause of this incident was on the AWS side, we are not satisfied that our detection and response were fast enough. We are investing in the following mechanisms to improve:

* We have automated testing to monitor provisioning of kernels, but it is combined with a larger testing suite that makes the signal less immediately actionable. We will be extracting this out as its own monitor so we can more quickly identify and react.
* Our database monitoring around latencies will be made more comprehensive, including the replica database, so we do not miss this early signal.
* We are improving our incident runbook to cover some of these checks earlier in the process so we zero in on the root cause more quickly.
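As an illustration of the kind of latency check described in this postmortem, here is a minimal sketch of a standalone monitor that compares recent latency against a baseline for every database it is given, replica included. It is not Hex's actual monitoring: the `find_degraded` function, the `LatencyAlert` type, the `primary` and `replica` labels, the window size, the 2x threshold, and the sample readings are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class LatencyAlert:
    database: str       # hypothetical label, e.g. "primary" or "replica"
    baseline_ms: float  # median latency before the recent window
    recent_ms: float    # median latency inside the recent window


def find_degraded(samples, window=5, ratio=2.0):
    """Flag databases whose recent median latency exceeds ratio x baseline.

    `samples` maps a database label to latency readings in milliseconds,
    oldest first. The last `window` readings count as "recent"; everything
    before them forms the baseline.
    """
    alerts = []
    for name, readings in samples.items():
        if len(readings) < 2 * window:
            continue  # not enough history to compare against
        baseline = median(readings[:-window])
        recent = median(readings[-window:])
        if baseline > 0 and recent > ratio * baseline:
            alerts.append(LatencyAlert(name, baseline, recent))
    return alerts


# Made-up readings: the primary is steady while the replica slowly degrades.
example_samples = {
    "primary": [4, 5, 4, 5, 4, 5, 4, 5, 5, 4],
    "replica": [6, 6, 7, 6, 6, 14, 18, 25, 31, 40],
}
example_alerts = find_degraded(example_samples)
```

With these made-up readings only the replica is flagged, which is the early signal the postmortem says was missed. Comparing medians rather than single readings is one way to keep a lone slow query from paging anyone; a real monitor would likely also need alert deduplication and tuning of the window and threshold.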

resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Some users are still experiencing issues as we implement a fix.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Degraded Kernel Acquisition"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix has been implemented.

investigating

We are currently investigating this issue.

Report: "Issues with building custom kernels"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We've identified an issue with custom kernels failing to build and are actively working on a fix.

Report: "Inputs not reacting on some published apps"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We've identified an issue with published apps that use the "pre-run cells in the background" feature and are working on implementing a fix.

Report: "Copy to clipboard and download CSV issues"

Last update
resolved

This incident has been resolved.

identified

When users copy to clipboard or download a CSV, they may see HTML rather than their data. The issue has been identified and a fix is being implemented. In the meantime, refreshing the page should resolve the issue.

Report: "Viewers experiencing issues viewing projects"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

Report: "Slack outage impacting ability to send notifications from Hex"

Last update
resolved

This incident has been resolved.

monitoring

Slack is currently experiencing an outage, which impacts Hex's ability to send notifications to this channel. We are monitoring this outage at https://slack-status.com/ and encourage you to do the same if you rely on this functionality.

Report: "Instability in project loads"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

Report: "Projects not loading for some users"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Projects not loading"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Web app outage"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Projects not loading"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "HTML and styled dataframes not rendering"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Projects not loading"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Platform instability"

Last update
resolved

On November 12, 2024 from 8:06 AM to 10:21 PM PT, Hex customers experienced platform instability, including issues with kernel execution and site access. This disruption stemmed from an unintended effect of a recent database migration, which ultimately caused a backlog and impacted overall platform performance. Our engineering team promptly identified the root cause and took corrective actions by resetting kernels to alleviate database congestion. Once the database stabilized, we rolled back the migration change, fully restoring service by 10:21 PM PT. To prevent similar issues in the future, we are enhancing our automated tests to better validate database migrations before deployment, ensuring they don’t inadvertently impact platform reliability. Thank you for your patience and understanding.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Kernel Unavailability"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

Hex is experiencing degraded performance; some users are unable to acquire kernels and run projects. We are actively investigating the root cause and will update as soon as possible on a resolution.

investigating

We are currently investigating this issue.

Report: "Workspace tiers downgraded"

Last update
resolved

Some organizations have been erroneously downgraded to the community tier.

Report: "SQL Query Instability"

Last update
resolved

Non-dataframe SQL queries were intermittently failing for all users.

Report: "Application Instability"

Last update
resolved

Users may have seen general application-wide instability and failures. These were caused by partial database unavailability, which was triggered by likely malicious users performing large numbers of operations in quick succession.

Report: "Application Instability"

Last update
resolved

Users may have seen general application-wide instability and failures. These were caused by partial database unavailability, which was triggered by likely malicious users performing large numbers of operations in quick succession.

Report: "Hex Toolkit Outage"

Last update
resolved

Calls to hextoolkit.get_data_connection() in any code cells were failing.

Report: "R Project Outage"

Last update
resolved

All SQL queries in R projects were failing.

Report: "DNS Outage"

Last update
resolved

Hex was completely inaccessible to many users.

Report: "Kernel Assignment Failure"

Last update
resolved

Users experienced long delays waiting for kernels to start, both when doing analysis in Logic view and when running apps.

Report: "Main site service interruption 11/7/22"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We are continuing to work on a fix for this issue.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Some projects failing to acquire new kernels"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Login System Issue"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and we are implementing a fix.

investigating

We are currently investigating this issue.

Report: "Slow kernel start up times for some projects"

Last update
resolved

This incident has been resolved.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Projects not loading"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Intermittent platform access issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Performance degradation in projects"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Main site unavailable"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Platform availability issue"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Platform availability issue"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Table display filter inconsistency"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Some projects failing to acquire new kernels"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Platform availability issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are currently investigating this issue.

Report: "Platform availability issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Platform degradation due to AWS outage"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified as being caused by an ongoing AWS outage.

Report: "Kernel instability 08/29/23"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and we are working with our hosting provider to resolve the issue.

investigating

We are continuing to investigate the issue and are working with upstream providers to identify and resolve the outage.

investigating

We are currently investigating this issue.

Report: "Kernel Instability"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Degraded kernel execution"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Unable to load Hex"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating this issue.

Report: "Platform Instability"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Platform instability"

Last update
resolved

This incident has been resolved.

identified

We have observed some platform instability due to a scheduled change to our system.

Report: "Kernels unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Platform instability"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Platform instability"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Intermittent 502s when accessing Hex"

Last update
resolved

This incident has been resolved.

investigating

We are following a Cloudflare incident where intermittent connectivity issues are being reported: https://www.cloudflarestatus.com/