Historical record of incidents for Lightdash
Report: "Google Cloud Platform outage"
Last update: Google Cloud Platform outage. Core services used to power Lightdash queries are currently unavailable. We're investigating workarounds.
Report: "Users can't access all dashboards"
Last update: We've identified an issue with loading all dashboards on shared Lightdash instances.
Report: "Degraded performance"
Last update: This incident has been resolved.
We're currently experiencing downtime on some of our larger instances, including:
- app.lightdash.cloud
- eu1.lightdash.cloud
Our team is actively investigating the issue and working to restore access as quickly as possible. We’ll keep you updated as soon as we know more. Thanks for your patience!
We are currently investigating this issue.
Report: "Degraded performance in app.lightdash.cloud"
Last update: We can confirm that the fix was successful and that the issue has been resolved. Thank you for your patience.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Users may notice slower page load times and delayed query responses during this time. Our engineering team is actively investigating the root cause and working to restore normal performance levels.
Report: "Degraded performance"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
Users are reporting degraded performance and timeout errors. We are investigating the issue.
Report: "Google Sheets integration login issues"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Users are unable to complete Google Sheets exports because authentication fails during the sign-in process.
Report: "Email Delivery Issue"
Last update: The email provider has resolved the incident, and email delivery is back to normal.
We've identified the cause of the email delivery issue: it's related to our email provider. They’ve updated their status page with more information. We're closely monitoring the situation and will provide further updates as they become available.
We're currently experiencing an issue where emails are not being sent. Our team is actively investigating the problem.
Report: "Google-related services impacted"
Last update: Google has resolved the incident. More details can be found in their incident report: https://status.cloud.google.com/incidents/ETJGhvY9Xaktw7tgi8dF
We are impacted by a broader issue at Google affecting multiple Google services. The root cause is an outage within Google's infrastructure, which is beyond our control. You may experience issues with the following features: login with Google and Okta functionality, and Google Sheet Syncs. We are actively monitoring these services.
Report: "Email not sending"
Last update: This incident has been resolved.
Emails are processing again. We are monitoring our external provider.
Users are not receiving emails from Lightdash, including verification emails on signup and password resets. We've identified that our email provider is experiencing an outage: https://status.postmarkapp.com/notices/zno1dlxjdjmblc0d-service-issue-outbound-sending-and-inbound-processing-messages-are-being-accepted-and-queued We're monitoring the status of our external provider while we set up a backup email service.
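As an illustration of what a backup email service can look like in practice, here is a minimal sketch of ordered failover between providers. It is not Lightdash's implementation: `send_with_fallback` and the provider callables are hypothetical names, and each provider is assumed to be a plain function that raises on failure.

```python
def send_with_fallback(message, providers):
    """Try each (name, send) pair in order; return the name that succeeded.

    `providers` is a hypothetical list of (name, callable) pairs. Each
    callable is assumed to raise an exception when delivery fails.
    """
    errors = []
    for name, send in providers:
        try:
            send(message)
            return name
        except Exception as exc:  # any provider failure moves us to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all email providers failed: " + "; ".join(errors))


# Usage: the primary provider is down, so the backup delivers the message.
delivered = []

def primary(message):
    raise ConnectionError("provider outage")

def backup(message):
    delivered.append(message)

used = send_with_fallback("Verify your email", [("primary", primary), ("backup", backup)])
```

Ordering the list puts the preferred provider first, so the backup is only used when the primary raises.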
Report: "Degraded performance on US cloud instance"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Users are experiencing unexpected errors. We are currently investigating the issue.
Report: "Degraded performance on cloud instances"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating the issue.
Report: "Performance degraded on dashboards and scheduled deliveries"
Last update: This incident has been resolved.
We've deployed a fix to all customers and are monitoring performance closely. Users are reporting normal loading times.
We are continuing to work on a fix for this issue.
We're currently deploying a fix to all customers. Once it is deployed, we'll monitor it to ensure services have returned to normal.
Dashboards and scheduled deliveries are experiencing degraded performance (load times over 1 minute) due to a service outage at our feature flag provider (https://status.posthog.com). We're currently removing all calls to this service to return Lightdash to normal. Please subscribe for updates.
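The incident above was mitigated by removing calls to the feature flag provider. A different, commonly used safeguard is to bound each flag lookup with a timeout and fall back to a default value. The sketch below shows that alternative under stated assumptions: `flag_enabled`, `fetch_flag`, and the 0.2 second timeout are hypothetical and do not describe Lightdash's code or any provider's client API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool so a slow provider call does not block the caller.
_pool = ThreadPoolExecutor(max_workers=4)


def flag_enabled(fetch_flag, name, default=False, timeout_s=0.2):
    """Return the flag value, or `default` if the lookup is slow or fails.

    `fetch_flag` is a hypothetical callable standing in for a provider client.
    """
    future = _pool.submit(fetch_flag, name)
    try:
        return bool(future.result(timeout=timeout_s))
    except Exception:  # covers both the timeout and provider errors
        return default


# Usage: a provider that hangs no longer delays the page by more than timeout_s.
def slow_provider(name):
    time.sleep(0.3)  # simulated outage: the call eventually returns, but too late
    return True

started = time.monotonic()
result = flag_enabled(slow_provider, "new-dashboard", default=False, timeout_s=0.05)
elapsed = time.monotonic() - started
```

Note that the timed-out call keeps running in its worker thread, so a long outage can still occupy the pool; a real deployment would likely also cache the last known value.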
Report: "Issues with deploying and creating preview environments"
Last update: This incident has been resolved by applying a fix and releasing a new version of the app. Thank you for your patience during this time.
A fix has been implemented and we're currently monitoring all instances. We recommend a "hard refresh" of your browser tab/window to ensure you get the latest changes.
We've identified the issue and are currently building a new image: 0.965.6
We’ve identified an issue within the following release: 0.956.4, leading to empty Explores when deploying to a Lightdash project or creating Preview environments. We’ve reverted all Cloud instances. For self-hosted users, we recommend upgrading to version 0.965.6 in Lightdash.
Report: "Degraded performance on app.lightdash.cloud"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to work on a fix for this issue.
We identified the root cause: the database is currently running at full capacity. We are going to scale its resources and identify which queries should be optimized.
We are currently investigating this issue.
Report: "Degraded performance and Snowflake Query Failures"
Last update:
#### 2:11 PM
* **Issue Reported**: Users report that dashboards aren’t loading any charts.
#### 3:41 PM
* **Additional Reports**: Other users report that app is crashing
#### 3:14 PM - 4:30 PM
* **Observation**: High backend latency was detected.
* **Collaboration**: Engaged in a chat with other engineers to identify potential causes.
* **Root Cause Analysis**:
    * Reviewed changes released that morning.
    * Suspected the Node upgrade might be the cause.
    * Discovered a related GitHub issue with Snowflake and Node 20: [GitHub Issue](https://github.com/snowflakedb/snowflake-connector-nodejs/issues/588)
* **Decision**:
    * Immediate rollback.
    * Plan to create a PR to update the `snowflake-sdk` package.
#### 4:25 PM
* **Rollback**: Rolled back from release 0.797.0 to 0.795.0.
#### 4:47 PM
* **Feedback**: Users report that app is functioning correctly
#### 5:10 PM
* **Follow-up Action**: Began working on the PR to upgrade Snowflake's SDK.
We've successfully identified the root cause and have taken measures to resolve the issue. We apologize for any inconvenience this may have caused and appreciate your understanding and patience during this time.
Report: "Degraded performance on app.lightdash.cloud"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We've noticed degraded performance and disconnects for customers in the US region using app.lightdash.cloud. We're investigating the root cause.
Report: "Lightdash Cloud instability (slow response times)"
Last update: Today at 13:37 (UTC) we noticed that all API endpoints for https://app.lightdash.cloud were starting to run very slowly. This affected all users and led to API response times of over a minute.
Our first response was to greatly increase the resources available to the Lightdash server (both the number and size of servers). After this change all services remained stable.
We have investigated our logs to understand exactly which user actions were making the server so busy. In addition to the volume of users, we noticed a higher number of Databricks users than usual. We already have a fix in testing for improved Databricks performance, which can be expected in the coming hours. For further performance improvements you can follow this milestone in GitHub: https://github.com/lightdash/lightdash/milestone/91
This incident has been resolved.
We are monitoring performance.
The Lightdash API at https://app.lightdash.cloud is experiencing very slow response times for all users, leading to some actions taking minutes or not executing. We're currently investigating.
Report: "app.lightdash.cloud showing degraded performance"
Last update: Earlier today we received an automated alert that app.lightdash.cloud was unavailable and returning 502 errors. The cause was that Lightdash was slowing down under the amount of usage in Lightdash Cloud at the time. The slower response times triggered an automated process that restarts the Lightdash servers. This process should normally trigger only when a server has already crashed. In this incident it triggered by mistake: the server was simply running more slowly than expected. To resolve the issue, we've added much more resource to our Lightdash Cloud servers to prevent slow response times. We've also increased the threshold for automatically restarting the servers in the case of very slow response times.
This incident is resolved.
A fix has been implemented and we are monitoring the deployment.
We've identified the root cause and it's being fixed.
We have identified the failing component but are still finding the root cause. Services appear operational again.
We are currently investigating the root cause of the issue.
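The postmortem in this incident describes an automated restart that fired on a slow server rather than a crashed one, followed by a raised threshold. A minimal sketch of that kind of rule is shown below: restart only when several consecutive health probes either fail outright or exceed a generous latency limit. The names `Probe` and `should_restart` and the values 30 seconds and 3 probes are assumptions for illustration, not Lightdash's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    ok: bool          # did the server respond at all?
    latency_s: float  # response time in seconds (ignored when ok is False)


def should_restart(probes, max_latency_s=30.0, consecutive=3):
    """Restart only if the last `consecutive` probes all failed or were too slow.

    Both thresholds are hypothetical defaults chosen for this example.
    """
    if len(probes) < consecutive:
        return False
    recent = probes[-consecutive:]
    return all((not p.ok) or p.latency_s > max_latency_s for p in recent)


# Usage: a busy but responsive server is left alone; a crashed one is restarted.
slow_but_alive = [Probe(True, 8.0), Probe(True, 12.0), Probe(True, 9.5)]
crashed = [Probe(False, 0.0), Probe(False, 0.0), Probe(False, 0.0)]
restart_slow = should_restart(slow_but_alive)
restart_crashed = should_restart(crashed)
```

Requiring consecutive bad probes trades a slightly slower reaction to real crashes for fewer restarts during load spikes.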