Metabase Cloud

Is Metabase Cloud Down Right Now? Check whether there is an ongoing outage.

Metabase Cloud is currently Operational

Last checked from Metabase Cloud's official status page

Historical record of incidents for Metabase Cloud

Report: "Some Metabase Cloud customers hosted in the EU are reporting their site is down or slowness"

Last update
resolved

We have identified the issue that caused some EU-based instances to become unresponsive, and we are tracking it here: https://github.com/metabase/metabase/issues/56148. We have added additional monitoring to our cloud infrastructure to alert us if this issue appears again while we work on fixing the root cause.

monitoring

The issue has been resolved for impacted customers. We will be monitoring closely for the next several hours to ensure there is not a recurrence.

investigating

We are currently investigating this issue.

Report: "Some Metabase Cloud customers hosted in the EU are reporting their site is down or slowness"

Last update
Investigating

We are currently investigating this issue.

Report: "Some Metabase Cloud customers are reporting their site is down or slowness"

Last update
Investigating

We are currently investigating this issue.

Report: "Metabase Store unavailable"

Last update
resolved

The Metabase Store was intermittently unavailable for an hour due to a problem during the store update.

Report: "Issue with SSL certificates"

Last update
postmortem

**Summary**

On Thursday, Nov 9th, the default certificate for the external Nginx ingress was replaced with a dummy certificate. Users attempting to reach their hosted instances received an invalid certificate error. This was caused by a misconfiguration in the new CI for releasing our Helm charts, which used the internal Nginx ingress values files for the external Nginx ingress.

**Impact**

All hosted customers not using custom domains were affected.

**How was the root cause diagnosed?**

We confirmed that the values were properly set in the values files. We then investigated the CI steps to determine why the correct values were not being applied, and found that the wrong values files were being used for the external Nginx ingress.

**How will we prevent this from happening again?**

* Update staging and production values files to match.
* Add an alert to PagerDuty for non-custom domains (catch alerts faster).
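The root cause above (one ingress deployed with another ingress's values file) is the kind of mistake a small pre-deploy check in CI could catch. The following is a minimal sketch only, not the check Metabase actually uses: the release names, file paths, and the `find_mismatched_releases` helper are all hypothetical.

```python
# Hypothetical mapping of each ingress release to the values file it should use.
# These names and paths are illustrative assumptions, not Metabase's real layout.
EXPECTED_VALUES = {
    "nginx-ingress-external": "values/external-ingress.yaml",
    "nginx-ingress-internal": "values/internal-ingress.yaml",
}


def find_mismatched_releases(planned):
    """Return the sorted release names whose planned values file is not the expected one.

    `planned` maps a release name to the values file the CI job is about to use.
    Releases that are not listed in EXPECTED_VALUES are ignored.
    """
    mismatched = []
    for release, values_file in planned.items():
        expected = EXPECTED_VALUES.get(release)
        if expected is not None and values_file != expected:
            mismatched.append(release)
    return sorted(mismatched)
```

A CI step could call this with the planned deployment and fail the job when the returned list is non-empty, before any chart is released.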

resolved

Ingress controller certificates are now back to normal. You should now be able to get the correct certificate in the browser. We're very sorry for the inconvenience. We'll publish a retrospective about the issue in the next few days.

investigating

We're investigating an issue in our cloud with certificate management that's causing browsers to receive incorrect certificates.

Report: "EU region not available"

Last update
resolved

We had an issue in our EU cluster where instances were not being provisioned due to an autoscaler problem in that specific region. The issue is now resolved, and we'll continue monitoring the situation.

Report: "Metabase Store is unavailable due to maintenance"

Last update
resolved

We are continuing to investigate this issue.

investigating

We are continuing to investigate this issue.

investigating

We are performing routine maintenance on the Metabase Store infrastructure. We expect it to be fully operational within the hour.

Report: "Unable to send outbound emails"

Last update
resolved

This incident has been resolved.

investigating

Metabase Cloud customers are temporarily unable to send outbound emails.

Report: "Unable to create new Metabase cloud instances"

Last update
resolved

This incident has been resolved.

investigating

Metabase customers trying to create a new cloud trial or cloud instance are unable to do so.

Report: "Unable to create new database connections"

Last update
resolved

This incident has been resolved.

investigating

Metabase Cloud customers are currently unable to add new databases or edit database connections.

Report: "Google SSO bug affecting LATAM customers"

Last update
resolved

We've built and pushed new release artifacts for our Cloud and self-hosted users who are facing this issue. Please refer to our website, Docker Hub, or our GitHub releases page to pull the new artifact that resolves this issue. Our Cloud instances will be upgraded as soon as possible to bring the service back to normal.

investigating

We are aware of an issue in Google identity services that's affecting our customers in LATAM. If you're running Metabase self-hosted or in our Cloud and have Google SSO enabled, you'll see an error on the login screen that prevents you from accessing the product. We will keep monitoring the situation, as we expect it to be fixed soon.

Report: "Problems when creating instances"

Last update
resolved

AWS reports that the service should be back to normal now.

identified

There is a problem with our cloud service provider that doesn’t let us create new instances of Metabase. We have identified the problem and we are waiting for the cloud service provider to restore its services.

Report: "Cloud email quota exceeded"

Last update
resolved

During the day we hit the quota in our email gateway, which caused subscriptions and alerts for our Cloud-hosted customers to fail. The service has now been restored.
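A quota like the one described above can be watched before it is exhausted. The sketch below is a hypothetical illustration of a simple threshold check. The function name, the 80% warning ratio, and the level names are assumptions, not part of Metabase's or any email provider's tooling.

```python
def quota_alert_level(sent, quota, warn_ratio=0.8):
    """Classify email usage against a sending quota.

    Returns "exceeded" when the quota is used up, "warning" when usage has
    reached `warn_ratio` of the quota, and "ok" otherwise.
    The 0.8 default is an arbitrary illustrative threshold.
    """
    if quota <= 0:
        raise ValueError("quota must be positive")
    usage = sent / quota
    if usage >= 1.0:
        return "exceeded"
    if usage >= warn_ratio:
        return "warning"
    return "ok"
```

Paging on the "warning" level would leave time to request a limit increase before subscriptions and alerts start failing.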

Report: "Issue with Metabase Cloud"

Last update
postmortem

While provisioning new K8S clusters in our AWS environment, DNS and load balancer related configurations were accidentally overwritten with wrong parameters, preventing ingress network access to all Metabase Cloud instances. We have restored the correct configurations and added additional process controls and automated checks to prevent this from happening in the future.

resolved

We have restored the service. We will provide a description of the issue in the next few days.

investigating

We identified an issue with our Cloud. We're currently investigating.

Report: "Google SSO might not work in LATAM/other regions"

Last update
resolved

We have started rolling out version 44.6, which includes the fix for the SSO issue. All instances should be upgraded in the next few hours.

identified

We finished the release process to obtain a new Metabase version that includes the fix for the issue. We'll be starting our Cloud upgrade process in the next hour.

identified

On Oct 31st, Google pushed a change without notice to their OAuth2 flows in LATAM and other regions, changing the name of a parameter, which broke all SSO flows. We're actively working to push a new version of Metabase with a hotfix, along with upgrading our Cloud instances. In the meantime, please use username/password authentication if you cannot authenticate to your Metabase instance.

Report: "Issues on our Cloud"

Last update
resolved

Service has been restored. We will provide more details about the issue as soon as possible. If you have further questions, please reach out to our support email.

investigating

We are continuing to investigate this issue.

investigating

We're currently investigating an issue that is causing downtime on our Cloud.

Report: "Emails not sending"

Last update
resolved

We've identified that we ran over the quota on our Cloud email service, so we requested an increase to the limits. Emails on our instances are now back to normal.

investigating

We're aware that emails are not being sent on instances that run on Metabase Cloud. We're investigating the issue.

Report: "AWS Outage"

Last update
resolved

This incident has been resolved.

monitoring

AWS has restored most of their services and as a result, our customers who were experiencing downtime are now operational. We will continue to monitor closely. Please do not hesitate to contact support if you have any questions or experience any issues.

monitoring

Due to an ongoing issue with the AWS us-east-1 region, a few of our customers are experiencing temporary downtime. We’re monitoring the situation closely.

Report: "Degraded service in our Business tier"

Last update
postmortem

A failed upgrade of our Elastic Beanstalk deployment, intended to improve the timeout on reverse proxies, left instances unable to revert to the previous version and therefore in an inconsistent state.

resolved

This incident has been resolved.

monitoring

We've deployed a fix and we're monitoring the health of the instances.

identified

We've identified a fix for the outage, and we're in the process of issuing an update to resolve the situation.

investigating

We identified an issue affecting our Business tier customers that run on isolated environments. It appears that our load balancers are unable to flag instances as healthy, which results in application servers being recycled continuously.

Report: "Issues creating new instances in Metabase Cloud"

Last update
resolved

A customer alerted us that instances were not being created in our Cloud environment. After some investigation, our engineers detected that an endpoint in our Docker image registry was returning a 500 error status. As a result, customers were receiving the instance-creation emails even though no instance had been created. Systems should be back to normal now.
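The failure mode described above (a confirmation email sent even though provisioning failed) can be avoided by gating the notification on the provisioning result. This is only a sketch under stated assumptions: `fetch_image` and `send_email` are hypothetical callables standing in for the real registry call and mailer, which the report does not describe.

```python
def provision_and_notify(fetch_image, send_email):
    """Send the confirmation email only if the image fetch succeeded.

    `fetch_image` is a hypothetical callable returning an HTTP status code
    from the image registry; `send_email` is a hypothetical callable that
    sends the instance-creation email. Returns True when the email was sent.
    """
    status = fetch_image()
    if status != 200:
        # Provisioning failed (for example, the registry returned 500):
        # do not tell the customer that an instance exists.
        return False
    send_email()
    return True
```

With this ordering, a registry error surfaces as a failed provisioning step rather than as a misleading email.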

Report: "Database incident"

Last update
resolved

Around 16:30 PT we received a report from a customer who could not connect to its Metabase instance. Our engineering team identified an issue with the application database: the clusters weren't available in our cloud service provider (CSP). After further investigation, they found that a cluster resize hadn't been applied correctly because of a mismatch between the configuration scripts and the CSP offering, which resulted in downtime. The configuration script was modified to be compatible with the CSP, and the database instances came back online.