Galaxy 2.0

Is Galaxy 2.0 Down Right Now? Check if there is a current outage ongoing.

Galaxy 2.0 is currently Operational

Last checked from Galaxy 2.0's official status page

Historical record of incidents for Galaxy 2.0

Report: "Cloudflare Outage Affecting Builds"

Last update
investigating

We are currently experiencing issues due to a Cloudflare outage. The following URLs used in our infrastructure are impacted: https://warehouse.meteor.com and https://static.meteor.com. During builds, you may encounter messages like: Problem downloading Meteor binaries or issues when downloading Atmosphere packages. As a result, your build may fail intermittently or take longer than usual to complete. This is due to Cloudflare instability affecting the delivery of these resources. We are actively monitoring the situation and will provide updates as necessary.
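Because the failures described here are intermittent, a build script can often ride them out by retrying the download step with an increasing delay. The following is a minimal sketch, not part of Galaxy or the Meteor tooling: the function name `retry_with_backoff`, its parameters, and the choice to retry on `OSError` (which covers `urllib.error.URLError`) are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, doubling the wait after each failure.

    Hypothetical helper: `fetch` stands in for whatever step downloads a
    binary or package. Only OSError (which includes urllib's URLError) is
    treated as retryable; the last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable only so the waiting behavior can be exercised without real delays; in a build script the default `time.sleep` would be used.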

Report: "Access to the Galaxy dashboard may be unavailable"

Last update
investigating

We are currently investigating this issue.

Report: "Galaxy Cloud Dashboard Instability"

Last update
resolved

We experienced a period of instability with the Galaxy Cloud dashboard at beta.galaxycloud.app, which may have prevented users from accessing or managing their applications. Our engineering team identified and resolved the issue quickly. Full functionality has been restored, and we are continuing to monitor the system to ensure continued stability.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Metrics System for Webapps and MongoDB"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

Report: "Galaxy Cloud Dashboard Instability"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently experiencing instability with the Galaxy Cloud dashboard at beta.galaxycloud.app, which may prevent users from accessing or managing their applications. While we work to resolve the issue, you can continue managing your apps using the legacy regional UIs: Americas (https://galaxy.meteor.com/), Asia (https://ap-southeast-2.galaxy.meteor.com/), and Europe (https://eu-west-1.galaxy.meteor.com/). We appreciate your patience and are working to restore full access as soon as possible.

Report: "us-east-1 - September 1st 02:14 (UTC) - App scheduler not working properly"

Last update
resolved

Today (September 1st) around 02:14 (UTC), our App scheduler service on us-east-1 was not working properly. We roll out updates almost every week, and the last update didn't terminate all the old scheduler machines. Scheduler machines coordinate the start and stop actions of containers and host machines. As a result, some scheduler machines were still running with old configurations and not working as expected. This caused some containers to be replaced incorrectly, without respecting our policy of always having good containers running first and only then killing the old ones. The problem affected a few apps and in some cases caused about 6 minutes of downtime because all the containers were replaced. We are really sorry for the trouble we have caused. We have already changed our process to double-check all the resources that should be destroyed by Terraform at the end of every update. If you have any further questions, please send us an email (support@meteor.com).
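The replacement policy mentioned in this report (have good containers running first, then kill the old ones) can be sketched as follows. This is an illustration only, not Galaxy's scheduler code: `rolling_replace` and the `start_new`, `is_healthy`, and `stop` callbacks are hypothetical names standing in for whatever the real scheduler uses.

```python
from typing import Callable, List


def rolling_replace(
    old: List[str],
    start_new: Callable[[], str],
    is_healthy: Callable[[str], bool],
    stop: Callable[[str], None],
) -> List[str]:
    """Replace containers one at a time, never stopping an old container
    before its replacement is confirmed healthy.

    All callbacks are assumptions for this sketch: start_new() returns the
    id of a freshly started container, is_healthy(id) reports whether it is
    serving, and stop(id) terminates a container.
    """
    replacements: List[str] = []
    for old_id in old:
        new_id = start_new()
        if not is_healthy(new_id):
            # Clean up the failed replacement and abort. The old container
            # (and any not yet processed) keeps running, so capacity is kept.
            stop(new_id)
            raise RuntimeError(f"replacement for {old_id} is unhealthy")
        stop(old_id)  # safe: a healthy replacement is already running
        replacements.append(new_id)
    return replacements
```

The point of the ordering is that the number of good containers never drops below the starting count, which is the property the incident describes as having been violated.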

Report: "Old notifications being sent today"

Last update
postmortem

The issue was identified and fixed. The notification system was stuck because a wrong notification was breaking a loop. We no longer have this problem, as we removed the wrong notification and fixed the code that caused it in the first place. This loop was causing performance issues on Galaxy Dashboard UI (us-east-1 only), so you should notice that Galaxy Dashboard UI is a lot faster now. Sorry for the trouble.

resolved

This incident has been resolved.

identified

The issue was identified and fixed. The notification system was stuck because a wrong notification was breaking a loop. We no longer have this problem, as we removed the wrong notification and fixed the code that caused it in the first place. This loop was causing performance issues on Galaxy Dashboard UI (us-east-1 only), so you should notice that Galaxy Dashboard UI is a lot faster now. Sorry for the trouble.

investigating

We have an issue in our notification system that is causing it to send old notifications, and we are already working on the fix. If you have received many emails saying your app is unavailable or something similar, they are old notifications. Don't worry, and sorry for the trouble.

Report: "Meteor APM metrics delay"

Last update
resolved

We have identified that the secondary node of our second APM shard was not responding to reads. This caused the metrics shown for some apps allocated in this database to stall. We are working with the database provider to understand the cause, but for now we have switched reads to the primary node, and metrics will be processed from now on. We are sorry for the inconvenience.

identified

We have identified an issue with our DB monitoring: one of our shards had high CPU usage, causing some metrics to be delayed on the second shard. We are deploying a fix right now, but it might take some time to catch up.

Report: "Some Galaxy deploys are failing"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We identified an issue with a deploy release 15 minutes ago that affects deploys. If you are seeing an internal server error when deploying from the cli, please wait 15 minutes as we are already deploying the fix. We are sorry for the inconvenience.

Report: "AWS Australia InternetConnectivity operational issue"

Last update
resolved

AWS solved the issue.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have seen that AWS is having internet connectivity trouble for international connections directed to hosts in Australia. All our services are operating, but you might see occasional failed requests due to this. If you are serving users inside Australia, this doesn't impact you. We are monitoring the AWS fix for this matter; so far, the impact is low.

Report: "CDN not responding (Meteor installs and packages)"

Last update
resolved

Our CDN provider suspended our account by mistake. It's active again and operating as expected. We are analyzing with them why this happened. All package files and Meteor tarballs are preserved and no data was lost; the issue was only in the CDN and not in the content itself.

investigating

Hi, our CDN is not responding to our requests. This affects Meteor installs and package downloads. We are in contact with the provider to restore the service as soon as possible.

Report: "Investigating issues with new deploys"

Last update
postmortem

We use an internal Docker Registry for each region. These Registries use self-signed certificates. The certificates for the US and EU regions expired yesterday, so pushing new images (new deploys) and pulling images (new containers) were failing. These certificates expire every 5 years, but we didn't have any monitor tracking them. A few apps were down until the certificates were renewed. Our cluster scales down machines when we have a lot of space available, so if an app running only one container was on a machine that was turned off during a scale-down, the container failed to start on the new machine because pulling images resulted in an error due to the expired certificate. Actions to avoid this in the future: 1 - We are going to add monitors to these certificates, as we have for all other certificates. 2 - We are going to decrease our level of "accepted" errors in the monitors of the services that a) build new images and b) start new containers. These monitors didn't fire because just a few apps were affected. With these changes, this issue is not going to happen again, and if something similar happens, we will be notified sooner even if just a few apps are affected.
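The first action item, monitoring certificate expiry, can be illustrated with a small check. This is a sketch under assumptions, not the monitor Galaxy deployed: the function names and the 30-day warning threshold are hypothetical, and the input is assumed to be an expiry timestamp in the text format Python's `ssl` module uses for a certificate's `notAfter` field.

```python
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return days remaining before a certificate expires.

    `not_after` is assumed to look like "Jan  5 09:34:43 2018 GMT", the
    format accepted by ssl.cert_time_to_seconds(). `now` (epoch seconds)
    can be passed explicitly; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def needs_renewal(not_after: str, warn_days: float = 30,
                  now: Optional[float] = None) -> bool:
    """True when the certificate expires within warn_days (or already has).

    The 30-day default is an arbitrary illustrative threshold.
    """
    return days_until_expiry(not_after, now) <= warn_days
```

A monitor built on this idea would run the check on a schedule and alert when `needs_renewal` returns true, which gives weeks of notice rather than discovering the expiry through failed image pulls.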

resolved

This incident has been resolved.

monitoring

We've replaced the certificates. Now we are replacing the app machines. This is not immediate, as we don't want to kill many containers at the same time, but new containers are going to be created on the new machines, which already have the new certificates to access the registry. Deploys should be working fine now as well.

identified

Galaxy AP was not affected by this incident.

identified

We have a self-signed certificate in our Docker Registry that needs to be renewed every 5 years, and it expired last night. We are updating these certificates now. We are going to post more details here later. This is affecting new deploys and new containers, as any action requiring a Docker image from our Registry is failing.

investigating

We are investigating issues with new deploys

Report: "Galaxy Dashboard (UI) is failing"

Last update
resolved

We've updated the way we make the calls, so the new root certificate no longer causes the requests to fail. We have reviewed all the other apps and they are working correctly. Customers' apps were not affected at all by this problem. Sorry for the trouble. --- Update (01:24 PM EDT): Some users of your app could experience certificate errors, but this is not directly related to our problem in Galaxy UI. Old browsers are not going to be compatible with the newest Let's Encrypt Root Certificate (https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/), and Galaxy uses Let's Encrypt to generate certificates automatically. If you really need to support old browsers, you need to use a certificate from a provider other than Let's Encrypt and upload your custom certificate in your app settings on Galaxy. --- Update (5:31 PM EDT): Issues related to expired certificates are explained here: https://docs.meteor.com/expired-certificate.html (they don't have a direct relationship with this specific problem, but a few people may check this page anyway).

identified

Galaxy UI is operating normally again. Now we are applying the same fix for the API.

identified

We've identified the problem. The library that we are using is failing to make HTTP requests after this change from Let's Encrypt: https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/ . We are working on a solution.

investigating

Galaxy communication with accounts is failing, and we are investigating the issue. Client apps are not affected; this issue is between Galaxy Dashboard UI and our authentication system. If you need to apply changes to your app, please request the change at support@meteor.com and we can apply it for you.

Report: "Let's Encrypt - Certificate issuance is temporarily unavailable"

Last update
resolved

Certificate issuance is now restored.

identified

The issue has been identified and a fix is being implemented by Let's Encrypt.

investigating

We use Let's Encrypt to generate SSL certificates, and their service is currently unavailable. You can read more here: https://letsencrypt.status.io/pages/incident/55957a99e800baa4470002da/6164b5af714e1f053880ba0c . As we depend on them to generate certificates, we can't generate certificates at the moment.

Report: "AWS High API error rate"

Last update
resolved

AWS has implemented a fix that might still take some time to apply and reestablish all API services, but it's back responding to API calls. Galaxy impact was mild, and all services have resumed the normal operation.

identified

US-EAST-1 components are returning a high volume of errors during AWS API calls. This can affect the time needed for allocating new containers. We are monitoring the AWS status page for more details: https://status.aws.amazon.com . Currently, no running containers are impacted.

Report: "IP Whitelisting failures on AP-SOUTHEAST-2"

Last update
resolved

We've fixed the issue. It was a misconfiguration on our side that changed the NAT gateway from the auto-scaling groups of the professional plan, so new machines were using the wrong NAT gateway without the fixed list of IPs. Sorry for this trouble and if you have any questions please open a ticket at support@meteor.com.

monitoring

We've fixed the auto-scaling groups to use the correct NAT gateways, so IP whitelisting is working fine again.

identified

We have identified that the new machines in the whitelist cluster were being created in a different subnet from the one behind the publicly exposed IP list, causing issues with applications that use whitelisting in their providers. We have applied a change to our auto-scaling groups that requires machine reinitialization, and we are applying it right now. The issue should be solved in a few minutes.
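The failure mode in this incident, connections leaving from addresses that are not on the published list, is the kind of thing a simple membership check can detect. The sketch below is illustrative only: `is_allowlisted` is a hypothetical helper, and the addresses used with it are documentation-range examples, not Galaxy's actual IPs.

```python
import ipaddress
from typing import Iterable


def is_allowlisted(source_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if source_ip matches any allowlist entry.

    Entries may be single addresses ("203.0.113.10") or CIDR ranges
    ("198.51.100.0/24"). Hypothetical helper for illustration only.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(
        # strict=False tolerates entries with host bits set, e.g. "10.0.0.5/24".
        addr in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )
```

For example, with an allowlist of `["203.0.113.10", "198.51.100.0/24"]`, a connection observed from `192.0.2.1` would be reported as outside the list, which is what a database provider's whitelist would also reject.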

investigating

We are investigating an issue with IP Whitelisting in Asia. Some connections might be made from IPs other than the ones listed in our whitelist. If your app is affected, please temporarily disable the whitelist in your MongoDB or Redis while we fix this issue.

Report: "Logs for apps deployed in the Asia region are temporarily unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We're currently working on a problem in the Asian region that is blocking the logs from being displayed. We're incredibly sorry for the incident, and we're working as fast as we can to fix this ASAP.

Report: "Exporting logs from apps deployed to Galaxy is temporarily unavailable"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently working on an issue in Galaxy that is blocking the export of logs. We are very sorry for the incident and are working to fix this as soon as possible.

Report: "AWS is having issues with the ECS service"

Last update
resolved

AWS resolved the incident. Galaxy services are not affected.

investigating

Some of our internal services use ECS, but we're not having any problems so far. We will continue to monitor the status of AWS ECS.

Report: "Galaxy infrastructure problem."

Last update
resolved

The environments remained stable during monitoring. This incident has been resolved.

monitoring

We managed to recover all our infrastructure and we are monitoring the results.

identified

We are having issues related to AWS services. We are still investigating, but we have been able to recover most of our services. Please follow this status page: https://downdetector.com/status/aws-amazon-web-services/

investigating

We are investigating the issue in all regions of the Galaxy.

Report: "Deployment issues when deploying via command-line"

Last update
resolved

This incident has been resolved.

investigating

We are investigating the issue in all Galaxy regions. This affects only the deployments via command-line. Push to Deploy is working normally.

Report: "Building issues when deploying via command-line and Push to Deploy"

Last update
resolved

This incident has been resolved.

investigating

We are investigating the issue in all Galaxy regions. This affects the build process when deploying via command-line and Push to Deploy.

Report: "Free MongoDB Shared Hosting on Galaxy [Virginia, US] is experiencing performance issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Our database provider had an issue with one of their resources, which caused some instability on Free MongoDB Shared Hosting.

investigating

We are currently working on an issue with Free MongoDB Shared Hosting in the Virginia, US region. We are very sorry about the incident and are working to fix this as quickly as possible.

Report: "Problems deploying apps on Galaxy"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We verified that there was instability in some deploys on Galaxy due to an increase in resources and demand. It did not happen to all customers; only some were impacted.

investigating

We are investigating the issue in all regions of the Galaxy. This only affects the deployment layer of client applications.

Report: "Free MongoDB Shared Hosting on Galaxy [Europe, Ireland] is experiencing performance issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently working on an issue with Free MongoDB Shared Hosting in the Europe, Ireland region. We are very sorry about the incident and are working to fix this as quickly as possible.

Report: "Intermittent 404s downloading Node.js during deploys"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

This is an issue related to Node.js, but it affects the deploys on Galaxy. During the deploys, when attempting to download Node.js, it returns an error. For more details on this issue, please go to the official status page for Node.js at https://status.nodejs.org/incidents/svk1f6czgxy6

Report: "Push To Deploy - GitHub Errors"

Last update
resolved

This incident has been resolved.

monitoring

GitHub is recovering, P2D is now working as expected but may be slow at times. We continue to monitor.

investigating

Due to an issue with GitHub, we are experiencing an error with P2D deployments on GitHub. For more information, see the GitHub status: https://www.githubstatus.com/incidents/nf7s6933tnn8

Report: "Problem in viewing metrics in APM"

Last update
resolved

This incident has been resolved.

monitoring

There was a problem with the APM metrics that occurred on May 28th and 29th. The problem only happened with some apps. It's possible that the Apps graph in APM appears empty for this period. We are still monitoring this incident.

identified

We've identified the issue and are making a fix.

investigating

We are currently investigating this issue.

Report: "The APM is currently taking some time to load its metrics."

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Problem with viewing some metrics in APM"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

System metrics like CPU and Memory are having issues.

Report: "Meteor.js version 2.13.1 is not compatible with Linux"

Last update
resolved

A new version of Meteor has been released to fix the issue. It is the latest version available for our installation systems. Released Meteor version: 2.13.3

identified

The issue has been identified and a fix is being implemented.

investigating

The latest Meteor.js version 2.13.1 is not compatible with Linux. You may see an error if you use a CI to deploy your application and 2.13.1 is installed. The current workaround is to ensure you have 2.13.0 installed on your servers, using either ( curl https://install.meteor.com/\?release\=2.13.0 | sh ) or ( npm install -g meteor@2.13.0 ). We are working to release a fix soon in version 2.13.2.

Report: "Downtime at our database service provider."

Last update
resolved

This incident is now resolved. We have successfully restored all the affected services, and the data resynchronization process was completed successfully.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We regret to inform you that we recently encountered some access issues with our Galaxy, APM, Meteor Accounts, Atmosphere, and Dashboard services due to a downtime experienced by our database service provider. While the applications hosted on Galaxy remained unaffected, there was a disruption in deploy functionality. Additionally, applications integrated with accounts.meteor.com experienced difficulties with login processes. In order to restore normal operations, we initiated a backup restoration process. However, please be aware that this might lead to the temporary absence of certain packages. As soon as our database provider is back up, we will perform a comprehensive data resynchronization to rectify any discrepancies. Please rest assured that our dedicated team is working diligently to ensure a seamless transition back to full functionality.

investigating

We are continuing to investigate this issue.

investigating

We are currently experiencing access issues with Galaxy, Meteor Accounts, APM, Atmosphere and Dashboard due to downtime from our database service provider. Customer applications hosted on Galaxy will not be affected. However, deploys are currently not functioning properly. We would also like to inform you that apps integrated with accounts.meteor.com may not be able to use sign-in and sign-up processes. Our team is actively addressing these issues to restore normal operations as soon as possible. We appreciate your patience and understanding during this time.

Report: "Galaxy Frontend - Downtime in Europe and Asia"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Galaxy is experiencing instability."

Last update
resolved

This incident is now resolved. We have successfully restored all the affected services and are monitoring the results.

investigating

Galaxy apps and our website are currently unstable and experiencing issues.

Report: "Email System Performance Issue"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We would like to inform you about an issue with our email sending system, causing delays and excessive email dispatch. We are actively working to resolve the situation promptly. Thank you for your understanding.

Report: "Issue on Galaxy Mailing System"

Last update
resolved

This incident has been resolved.

investigating

We are currently investigating issues with our mailing system. Clients may experience unusual behavior when sending emails. This only affects the mailing system; all other features are working correctly.

Report: "Galaxy UI and PushToDeploy issue"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We have implemented a solution for the issue, and our team is currently monitoring the affected services closely.

investigating

Hi there, We regret to inform you that our service is currently experiencing a temporary disruption. Our technical team is actively working to address the issue and restore normal functionality as quickly as possible. Please rest assured that we are taking immediate and decisive action to resolve the situation. Our top priority is to minimize any inconvenience this may have caused you. We understand the importance of our service to your operations, and we are committed to getting things back to normal at the earliest. We appreciate your patience and understanding during this time. Regular updates on the progress of the restoration will be provided here: status.meteor.com, and we will notify you as soon as the service is fully operational again. We appreciate your understanding. Regards, Support Team

Report: "Incident in deployments made using Push To Deploy - P2D"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Meteor Cloud Dashboard Downtime"

Last update
resolved

Dear Customers, We experienced a brief downtime on both our meteor.cloud dashboard and our meteor.com website today, from 8:00 AM to 12:00 AM. Our team has identified the issue and promptly resolved it. Both services are now fully operational. We apologize for any inconvenience this may have caused and appreciate your understanding. Thank you for your continued support. Best regards, Meteor Team.

Report: "Problem with generating new HTTPS certificates in Galaxy"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

Our certificate provider is experiencing an incident: https://letsencrypt.status.io/

Report: "Accounts system error"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

Problem logging in to platforms: Galaxy | Cloud | Galaxy Beta

investigating

We are currently investigating this issue.

Report: "Push to deploy issues."

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

investigating

We are currently investigating this issue.

Report: "Problem with Let's Encrypt certificate"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Issue with Starting New Containers"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified, and a fix is being implemented.

investigating

We are continuing to investigate this issue.

investigating

If you use MontiAPM, you will encounter difficulties in starting new containers. Our specialist team is addressing this issue.

Report: "Problem viewing Logs"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently experiencing issues with the log viewing functionality for services hosted in the America region on Galaxy. Our team is investigating the root cause and working to resolve the issue as soon as possible. We apologize for the inconvenience and will provide further updates as they become available.

Report: "Instability on our Galaxy Database System Provider"

Last update
resolved

We have received the information that our provider is fully operational. We thank you for your understanding.

monitoring

We are continuously monitoring our provider.

identified

The issue has been identified; it is related to our main database provider's API. We are actively monitoring their status and performing internal tests to ensure stable performance for our users.

investigating

We are currently investigating this issue.

Report: "Minor Instabilities on US region"

Last update
resolved

This incident has been resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We want to inform you that we are currently experiencing minor instabilities in our US region infrastructure. We have already identified and applied changes to overcome that.

Report: "Downtime in Europe region"

Last update
resolved

After implementing security measures, we mitigated most attacks. As our Europe region remained stable during this monitoring period, we consider this matter completely resolved.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are still collecting some more information and monitoring it.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

We are continuing to monitor for any further issues.

monitoring

Dear Customers, We recently experienced a major DDoS attack that hit our systems with over 53 million malicious requests within a short period, causing a brief outage in our Galaxy Europe region today, from 12:00 AM to 12:30 AM GMT. Our team has identified the issue and is actively working on it. The region is stable for now. We apologize for any inconvenience this may have caused and appreciate your understanding. Thank you for your continued support. Best regards, Meteor Team.

Report: "Galaxy MongoDB Database Metrics"

Last update
resolved

This incident has been resolved.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue.

Report: "Push to Deploy Inconsistencies"

Last update
resolved

We have implemented a fix for the PushToDeploy feature. Our team monitored it for a few days after the fix was deployed, and everything is now working as expected.

monitoring

We are continuing to monitor for any further issues.

monitoring

A fix has been implemented and we are monitoring the results.

identified

We have identified the issue, and our team is committed to delivering a solution promptly.

Report: "P2D - Push To Deploy Issues"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented, and we are monitoring the results.

Report: "CLI Deployments failing for some accounts"

Last update
resolved

This incident has been resolved.

monitoring

A fix has been implemented and we are monitoring the results.

identified

The issue has been identified and a fix is being implemented.

investigating

We are currently investigating this issue. It seems to be happening for some accounts.