Subbly

Is Subbly Down Right Now? Check if there is a current outage ongoing.

Subbly is currently Operational

Last checked from Subbly's official status page

Historical record of incidents for Subbly

Report: "Emergency maintenance affecting checkout and API"

Last update
resolved

And we're back to 100% online. API and checkout included. We will continue working on the more robust solution to mitigate a similar situation happening again. The issue was related to either physical hardware failure by our supplier, or an overloaded node on the server. Sorry for any inconvenience caused.

identified

Good news: the hotfix is underway. We should have the checkout and API fully back online in 5 minutes.

identified

Unfortunately the hotfix is proving tricky. We're unable to upgrade the server directly, and we believe the problem stems from a physical hardware issue. We've started working on the permanent fix immediately and will continue to try to finish the hotfix. New ETA is 1 hour. Apologies for any inconvenience.

identified

Putting into maintenance mode now. ETA 25 minutes from now.

identified

Our team has identified that one of our servers, which handles a critical job queue, is underperforming. This job queue is required for the API, which the checkout depends on. We're performing a hotfix which will require 15 minutes of downtime affecting the checkout. We're preparing for this downtime just now. We will update here when the checkout is officially offline. We have already planned a permanent fix for this essential component to ensure stable performance for BFCM and the holidays, which will be actioned after the hotfix. Sorry for any inconvenience. We'll be right back.

Report: "Batch jobs and CSV exports delayed"

Last update
resolved

The root cause of the queue becoming backed up was identified and patched. It was due to a protection mechanism we put in place previously, combined with increased volume from the 1st of the month renewals. We have since patched this and the queue has now cleared. This issue was resolved at 09:00 BST. Sorry for any inconvenience caused.

monitoring

A fix has been implemented and we are monitoring the server loads currently. Increased load times of the admin are still to be expected.

identified

We've identified the problem: a growing queue of batch jobs is causing slight delays with some of the batch jobs on the admin, such as order exports or batch admin changes. The DevOps team is increasing the number of workers draining the queue. Please be patient after you submit a batch job or request an export. It will come through, but it's going to be delayed. Sorry for any inconvenience caused.

Report: "Billing engine partial outage"

Last update
postmortem

## Summary

On April 15, 2024, between 3:00 AM and 9:00 AM UTC, a critical issue was identified in our billing engine affecting approximately 3.9% of our merchants. This error led to unintended multiple charges on some customer subscriptions during the renewal process. We have contacted everyone who was affected already.

## Technical details

The problem originated from an SSL networking issue within a pod on one of our servers. This anomaly prevented the pod from connecting to the necessary database, causing job processing errors where jobs were incorrectly re-attempted. The issue was traced back to a specific data center anomaly affecting the said node.

## Resolution steps

Upon detection of the issue, our engineering team promptly intervened with a series of corrective measures:

1. **Pro-active Communication**: We contacted all affected merchants about the issue with the steps we were taking.
2. **Immediate Refunds:** All duplicate charges were identified and refunded to the affected customers.
3. **Order Management:** Corresponding orders linked to the duplicate charges were archived and removed from our administrative platforms to prevent further confusion.
4. **Fee Credit:** We issued credits for both Subbly and Stripe application fees involved with these transactions, which are typically non-refundable. Please note that the credit process should be completed by the following morning (16th April).

## Preventive measures

In response to this incident, we have implemented a robust solution to prevent the recurrence of this specific error. Enhancements to our network and database interactions have been deployed to fortify the stability of our job processing routines.

## Statement from our team

We sincerely apologize for the inconvenience this may have caused and deeply value your trust and partnership. We are committed to ensuring the reliability of our services and upholding the quality our merchants expect.

Should you need further assistance or have any concerns, our support team is on standby to assist you. We appreciate your understanding and continued support as we move forward from this incident.

### Contact Information

For additional information, follow-ups, or immediate concerns, please do not hesitate to reach out to our customer support team directly.
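The failure mode in the technical details (a job that is re-attempted after a connection error and charges a customer more than once) is commonly guarded against with an idempotency key. The following is a minimal sketch of that general pattern, not a description of the fix Subbly deployed. `FakeGateway`, `renewal_key`, and `charge_renewal` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGateway:
    """Hypothetical stand-in for a payment provider; stores charges by key."""
    charges: dict = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_cents: int) -> str:
        # A repeated key maps to the original charge instead of a new one.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount_cents
        return idempotency_key


def renewal_key(subscription_id: str, period_start: str) -> str:
    # One key per subscription per billing period, stable across retries.
    return f"renewal:{subscription_id}:{period_start}"


def charge_renewal(gateway: FakeGateway, subscription_id: str,
                   period_start: str, amount_cents: int) -> str:
    key = renewal_key(subscription_id, period_start)
    return gateway.charge(key, amount_cents)


gateway = FakeGateway()
# Simulate the same renewal job being re-attempted three times.
for _ in range(3):
    charge_renewal(gateway, "sub_123", "2024-04-15", 2500)
```

Because the key is derived from the subscription and the billing period rather than from the job attempt, a retried job resolves to the same charge record.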

resolved

All extra charges have been refunded. Root cause has been permanently patched. And we are working through crediting the Stripe and Subbly transaction fees which will be finished by tomorrow morning. 3.9% of merchants were affected by this issue. A postmortem will be published shortly. We apologize for any inconvenience caused.

identified

We are continuing to clean up the incorrect charges. ETA 2-3 hours from now.

identified

When the issue was identified, we shut the troublesome worker/job down, stopping the issue in its tracks. Soon after, an immediate permanent fix was put in place. Another, more robust permanent fix is being put into place as well. The initial scope of the damage has also been identified and we're now working on refunding extra charges. We will be in contact with anyone who was affected soon after with a breakdown of the impact and what we're doing about it. We will also update here with a postmortem. Apologies to anyone affected, we understand this can be very disruptive and stressful.

investigating

We are currently having problems with some customers' renewals being charged multiple times. We're investigating the root of the problem and will update you shortly on the outcome. Please stand by, and sorry for any inconvenience.

Report: "Billing Engine - Might delay renewals"

Last update
resolved

Incident has been resolved, impact was minimal. All services are running normally and optimally.

investigating

We are continuing to investigate this issue.

investigating

We're currently investigating an incident with the billing engine. There might be delays to subscription renewals. If you're facing any major issues, please contact support to report them.

Report: "Known minor issues due to a major upgrade"

Last update
resolved

All notable minor bugs have been squashed. If you come across anything else please do let our friendly support team know. Thanks for your patience. P.s. we're very excited to roll out part 2 next week!

identified

Fixes are nearly in place for the minor issues. Scope of impacted services and components changed for accuracy, renewals and API/feeds were unaffected. Will close this incident by end of the day. Thanks for your patience, and please do keep reporting any bugs you come across (as always).

identified

Fixes are under way. The one-time product inventory issue affecting some edge cases will require some temporary workarounds (using Automations) until we finish rolling out the rest of the upgrade early next week. You will have received communication about this already if it's affecting you. Please report any bugs or issues you come across. Thanks for your patience and cooperation, this will be worth it.

investigating

Website builder removed from scope, this issue has been resolved.

investigating

Updated scope of impact, for accuracy.

investigating

Updated scope of impact.

investigating

We're currently deploying a huge upgrade to the platform (improving the product creation login and adding facilities for creating bundles and 'subscribe and save' features to the system). During the deployment, we've noticed some inconsistencies with existing feature functionality as a result of the deployment and we're working hard to get the new features backwards compatible as we speak. Problems with the following functionalities have been identified so far:

- minor problems with inventory management for one-time products
- billing/shipping cadence anomalies for anchored products (the ones with the set billing date and a set cut off date)
- minor problems with backend product changes not fetching through to the frontend if your site is hosted on Subbly

Please report any issues you come across to the support team (use the chat widget in the admin). Hope that you'll get plenty of benefit from the new upgrade and brand new features!

Report: "Cart widget unresponsive"

Last update
postmortem

Due to networking issues at our data center, checkouts were silently failing for a period of time. We’ve since deployed a fix for this and also improved alerting to catch similar cases more easily in the future.

resolved

Problems with server load have been identified and fixed. Cart widgets and checkouts are back and operational.

investigating

We are currently experiencing degraded performance with cart widget functionality. We're investigating the root cause of the issue and working on deploying a fix for it. Sorry for any inconvenience caused.

Report: "Customer support chat and email is unavailable"

Last update
resolved

Chat widget is back and working as usual. Sorry for the inconveniences caused.

monitoring

Our support chat is now coming back online and we will be working to catch up on tickets. We sincerely apologize for the inconvenience caused. We've already implemented measures to prevent this happening again. We know how important swift responses are to you and we strive to uphold a high quality and high speed response time. We are also going to review our backup channels and processes in case this does ever happen again. Thank you for your patience.

identified

We are currently facing technical issues with our customer support chat and emails. We are working hard to bring them back online and expect it to take up to 24 hours. 👉 Got an urgent question or issue? Please use the dedicated chat on our Facebook group: https://m.me/ch/AbZTz1gnOkU_HlcZ/ or our Discord chat: https://subb.ly/discord, whichever you prefer.

identified

We are having issues with our support channels. Sorry for the inconvenience, we're working hard to bring them back online.

Report: "Some automations and jobs not running properly"

Last update
resolved

Fix has been deployed, all jobs are back and running. Jobs and automations that have been clogged up in the queue because of the incident were processed successfully.

identified

We are continuing to work on a fix for this issue.

identified

We've identified problems with some jobs not being run properly in the last 24 hours. As a result, some jobs haven't been executed or have only been partially executed (most notably, customers' renewals). The problem has been diagnosed and we're actively working on fixing it.

Report: "Checkout facilities not working"

Last update
resolved

Fix was deployed.

identified

We are continuing to work on a fix for this issue.

identified

The issue has been identified and a fix is being implemented.

investigating

We're having problems with checkouts affecting a small subset of Enterprise customers. Checkouts are currently not working for Enterprise customers due to infrastructure changes.

Report: "Main Services (Checkout & Admin) degraded performance"

Last update
resolved

We experienced downtime of the core features for ~12 minutes due to high database CPU usage. The cause was a migration of old database events to the new structure (an improvement to logging on the admin). The problem was immediately isolated and we rectified it by splitting the script into smaller chunks to prevent throttling of the CPU. The problem was fixed at 12:31 PM UTC.
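Splitting a migration into smaller chunks, as described above, can be sketched roughly as follows. This is an illustrative assumption about the general technique, not the script that was actually run. `chunks`, `migrate_in_chunks`, the batch size, and the pause between batches are all hypothetical.

```python
import time
from typing import Callable, Iterator, Sequence


def chunks(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def migrate_in_chunks(ids: Sequence[int],
                      migrate_batch: Callable[[Sequence[int]], None],
                      size: int = 1000,
                      pause_s: float = 0.0) -> int:
    """Migrate ids batch by batch, pausing between batches to limit load."""
    batches = 0
    for batch in chunks(ids, size):
        migrate_batch(batch)
        batches += 1
        time.sleep(pause_s)  # gives the database room between batches
    return batches


migrated: list = []
# `migrated.extend` stands in for the real per-batch migration step.
batch_count = migrate_in_chunks(list(range(2500)), migrated.extend, size=1000)
```

With 2,500 records and a batch size of 1,000, the loop runs three batches, and the pause can be tuned to keep CPU usage within bounds.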

Report: "Network connectivity issues"

Last update
postmortem

Yesterday our data centre had issues with network connectivity which affected connections between servers. This may have led to data loss and duplicate data issues. We’re still reviewing the impact of our data centre networking issues, we will reach out to anyone who we believe was affected. Please let us know if you notice anything that doesn’t seem right. We’re also reviewing ways in which we can avoid impact if this situation were to happen again. Apologies for the inconvenience. Best wishes, Subbly Team

resolved

All our services are currently working properly. Our team has been working tirelessly to ensure that our services remain stable and reliable, and we will continue to monitor the situation closely to maintain our high standards of service quality. If you encounter any issues or have any concerns, please do not hesitate to contact our support team. We are always available to assist you and provide the necessary support.

investigating

We're experiencing an incident with externally managed Kubernetes clusters, customers may see increased error rates when using the website. We're investigating and will be back to you shortly. Sorry for any inconvenience, we're on the case.

Report: "Slower response times of the admin"

Last update
resolved

The issue has now been fixed. An unintentional DDoS attack caused the problem, and a solution has been put in place. All systems are up and operational. Apologies for the slower response times. You might have been experiencing problems logging in as well. For any further questions, please check in with the support team.

identified

We found the source of the issue, working on fixing it right now.

investigating

We are continuing to investigate this issue.

investigating

We're having some difficulties with the response times of the admin after the recent admin rollout. We're investigating and will be back with you shortly. Sorry for any inconvenience, we're on the case.

Report: "Checkout/admin not working due to hosting provider outage"

Last update
resolved

We've been experiencing some downtime as a result of hosting server outages. Checkouts and main dashboard were offline for around 12 minutes around 5AM GMT on the 29th of January. Fix has been put in place since then and all systems are operational again. If you are still experiencing some problems please contact the support team. Apologies for any inconveniences that this may have caused.

Report: "Login and dashboard services downtime"

Last update
resolved

This incident has now been resolved, a permanent patch will be in place by close of business tomorrow.

monitoring

Service is back online, we’re monitoring and working on a permanent fix

investigating

Hey everyone! We're experiencing downtime with the Login and account main dashboard but working as fast as we can to normalize the service. Will keep you posted.

Report: "Renewals delayed"

Last update
resolved

This incident has been resolved.

investigating

We have momentarily stopped our subscription renewal workers to investigate an issue. We will be back to normal in a while and restart the queue.

Report: "Filters not applied on orders batch actions"

Last update
resolved

Due to the recent deployment of the "Stacked filters" feature, order batch actions performed between 10:01 am and 03:07 pm UTC from a previously cached session using the old filters may not have applied any filtering to the results, causing the action to take effect on every order. A hotfix was deployed afterward to exclude the old filters from the new functionality and prevent any misbehavior.
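One general safeguard against this class of bug is to refuse a batch action when the resolved filter set is empty unless the caller explicitly opts in. The sketch below assumes orders are plain dictionaries and filters are exact-match key/value pairs. It does not describe the deployed hotfix, and `select_orders`, `UnfilteredBatchError`, and `allow_all` are hypothetical names.

```python
class UnfilteredBatchError(ValueError):
    """Raised when a batch action would run against every order."""


def select_orders(orders: list, filters: dict, allow_all: bool = False) -> list:
    # An empty filter set matches everything, so require explicit opt-in.
    if not filters and not allow_all:
        raise UnfilteredBatchError("batch action requested with no filters")
    return [
        order for order in orders
        if all(order.get(key) == value for key, value in filters.items())
    ]


orders = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "shipped"},
]
pending = select_orders(orders, {"status": "pending"})

try:
    select_orders(orders, {})
    blocked = False
except UnfilteredBatchError:
    blocked = True
```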

Report: "Website hosting affected"

Last update
resolved

The incident was fully resolved. Apologies for any inconvenience.

monitoring

Mitigations have been implemented and we are monitoring the results.

identified

We’ve identified the cause and are deploying mitigation measures. Performance may be intermittent.

identified

We are continuing to work on a fix for this issue.

identified

The server cluster is working slower due to abnormally high traffic. We are proactively working on mitigating the performance deficiency.

Report: "Renewals not processing"

Last update
resolved

Issue is resolved now. Renewals are processing again.

monitoring

Issue has been identified and fix has been put in place. We're monitoring the behavior and renewals are intentionally still paused. We can expect everything to go back to normal in 1-2 hours.

investigating

We're currently experiencing some intermittent problems with some customers' renewals. Renewals are currently not processing while we investigate the issue.

Report: "Website hosting networking issues"

Last update
resolved

The incident was fully resolved. Sorry once again for the hiccup.

monitoring

Sites are online again but the team is continuing to monitor.

identified

Google Cloud is facing some networking issues that are affecting the website hosting (see here: https://status.cloud.google.com/). We hope to resume normal service ASAP. Sorry for any inconvenience.

Report: "Degraded performance of the checkouts"

Last update
resolved

Checkouts for some merchants were affected due to gift-voucher changes in the front end. The problem was diagnosed and a fix was pushed to production within 2 hours of the incident starting. We are sorry for the inconvenience caused, but we're back again.

Report: "Admin and checkout services downtime"

Last update
postmortem

Unfortunately, we experienced some downtime with the admin and checkout services tonight. Websites were not affected.

Here’s what happened: We jumped to action right away to investigate, so there was some lag with us providing updates. After about 20 minutes we managed to isolate the issue to one of our caching server clusters running out of memory. One of our scripts that generates reports was running multiple instances simultaneously and, due to over-caching, it used up all the spare memory, eventually bringing the cache layer offline and the rest of the application with it. Ultimately it came down to poor planning.

To avoid risking losing data we had to free up some of the cache records manually to allow us to then clear unnecessary data. The problem has been resolved, the offending script has been permanently patched, and a good lesson has been learned along with it.

We sincerely apologise for the awful timing of this incident. We know your trust in us is essential and that you depend on us. We will continue to fight for 100% uptime and reliability in order to earn your trust. If you have any follow up questions or concerns, please don’t hesitate to reach out to us.

Sincerely, Stefan Pretty, CEO of Subbly
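A common way to stop a report script from running as multiple simultaneous instances is an exclusive lock file. The following is a minimal sketch under that assumption, not the patch that was applied. `single_instance` and the lock path are hypothetical, and a production version would also need to handle stale locks left behind by a crashed run.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Yield True if this run took the lock, False if another run holds it."""
    try:
        # O_EXCL makes creation fail atomically if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        yield True
    finally:
        os.close(fd)
        os.remove(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "report.lock")
with single_instance(lock_path) as first:
    # A second run started while the first is active does not get the lock.
    with single_instance(lock_path) as second:
        results = (first, second)
```

A run that receives `False` can exit immediately rather than competing for cache memory with the run already in progress.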

resolved

Service is back online and fully operational. The issue has been permanently patched. We sincerely apologise for the inconvenience.

identified

Issue identified and working on deploying the fix.

investigating

Hey everyone! We're experiencing downtime with the Admin & Checkout but working as fast as we can to normalize the service. Will keep you posted.

Report: "SSL issues on Subbly.me domain"

Last update
resolved

This has been fixed now, sorry for any inconvenience.

identified

We are aware of an issue with the subbly.me SSL, we are working on a fix just now.

Report: "Security updates to website hosting infrastructure"

Last update
resolved

We are performing routine security updates to our website hosting infrastructure

Report: "Connection issues at Data Centre (Admin & Checkout affected)"

Last update
resolved

The provider is now monitoring, therefore we are marking this as resolved. Apologies again for the brief interruption and inconvenience.

monitoring

We are continuing to monitor for any further issues.

monitoring

It seems the provider in question has resolved this issue, we will continue to monitor. Sorry again for the inconvenience, we will be reviewing our infrastructure to find ways to prevent this from happening again.

identified

We are continuing to work on a fix for this issue.

identified

There's an issue with our provider and they are currently working on it. Sorry for the hiccup guys! We'll be back soon.

Report: "Connection issues at Data Centre"

Last update
resolved

The Data Centre has implemented a fix for the issue that was impacting connectivity.

monitoring

Order creation and search results for survey answers may be delayed until the issue is resolved with the DC and we have caught up the backlog of jobs.

monitoring

Issues are ongoing with our upstream Data Centre, but they have resumed connectivity. We will keep monitoring and keep this incident updated. Sorry for any inconvenience. Upgrading to "degraded performance" until we have certainty that there will be no further downtime.

investigating

It appears there are issues with the data centre, we are doing everything we can to resume normal service ASAP. Please stand by.

Report: "Website hosting down"

Last update
resolved

Upgrades were being made to the SSL issuing service and a configuration file was not created correctly. We identified and rectified the issue in less than 10 minutes. No permanent resolution is required as this was a one-off task and a human error. Apologies for the inconvenience peeps but we're back on track!

investigating

We are currently investigating the issue. Really sorry for any inconvenience!

Report: "Core services outage"

Last update
postmortem

We would like to apologise for any inconvenience caused by the downtime this morning.

At around 4:30am GMT the core Subbly services went offline (website hosting was unaffected). Our team was immediately notified and sprang into action. We identified a failing API request to a third party which had gone offline. Our engineering team implemented a temporary hotfix as quickly as possible by reducing the timeout for the offending API call, and this brought services back online but with slow loading times.

We identified that the third party had been offline for an entire 6 hours due to a major data centre outage. It didn’t affect us until 5+ hours later due to caching which eventually expired as expected (it’s almost unheard of for something to be down for 6 hours so we hadn’t planned for this).

We decided to refactor the code for the API call and moved it into an asynchronous process so that it cannot affect the uptime of the platform in future. This issue is now considered **permanently resolved**.

Thank you for choosing Subbly, and apologies again for any inconvenience or stress caused. Keep rocking on.

-The Subbly Team
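The postmortem describes a cached third-party response expiring while the third party was still offline. One related pattern, which is not the asynchronous refactor the team describes, is to keep serving the last known value when a refresh fails. The sketch below is an illustration under that assumption. `StaleOkCache`, `flaky_fetch`, and the injected clock are hypothetical.

```python
import time


class StaleOkCache:
    """Cache one value; fall back to the stale value if a refresh fails."""

    def __init__(self, fetch, ttl_s, clock=time.monotonic):
        self._fetch = fetch
        self._ttl_s = ttl_s
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl_s:
            return self._value  # still fresh
        try:
            self._value = self._fetch()
            self._fetched_at = now
        except Exception:
            if self._fetched_at is None:
                raise  # nothing cached yet, so the error must surface
        return self._value


now = [0.0]
calls = {"n": 0}


def flaky_fetch():
    # Succeeds once, then behaves like an offline third party.
    calls["n"] += 1
    if calls["n"] > 1:
        raise TimeoutError("third party offline")
    return {"rate": 1.0}


cache = StaleOkCache(flaky_fetch, ttl_s=60, clock=lambda: now[0])
first = cache.get()
now[0] = 3600.0  # well past the TTL
second = cache.get()
```

The trade-off is that callers may see outdated data during a long outage, which is often preferable to taking dependent services offline.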

resolved

Continuing to optimise, we will write up a postmortem after we have a permanent solution in place. Sorry again for the inconvenience.

monitoring

We've deployed a fix. Response times will be slow while the third party is offline, but we are optimising right now to try to improve this.

identified

We have identified a third-party API which has gone offline and brought our services offline with it. We are working on a fix. Apologies for the inconvenience!

Report: "Website hosting down"

Last update
resolved

Service has stabilised and is back to normal, we will continue to monitor. We apologise for any inconvenience. We will update with a post-mortem once we have more detailed information from the devops team.

monitoring

We are experiencing intermittent server downtime. We're working to get everything back to normal!

Report: "Website hosting down"

Last update
resolved

There was an excessive load incoming to the servers for the last hour, causing an intermittent outage to sites' hosting. The service has normalized now. Sorry for any inconvenience!

Report: "Website Builder Downgraded"

Last update
resolved

The incident has been diagnosed and resolved. The builder is fully operational again. Sorry for the hiccup everyone 🙌

investigating

We're having some issues with the accessibility of the website editor at present. Investigating the problem. Sorry for the inconvenience 🙏 Live sites are unaffected though.

Report: "Checkout facilities and admin analytics downgraded"

Last update
postmortem

On Friday around 9am GMT the checkout and admin services were affected for approximately 30 minutes. As soon as the issue was brought to our attention we reverted the change that caused it.

First of all I’d like to personally apologise on behalf of the Subbly team. This is less than ideal timing given it's the busiest time of the year.

The issue was caused by a minor deployment that included a slight change to some middleware, which led to requests not having the correct information. More specifically, there was a misconfiguration on the production servers which meant this behaviour went undetected until it was live on production.

Our devops team are already working on a plan for the next iteration of our server architecture to handle continual scaling. Part of this plan includes streamlining deployment processes as well as ensuring the same environment state everywhere, including local and staging. We have elevated the priority of the deployment processes and state management as we continue to work on this plan.

In addition, we have updated the configuration on the production servers and the deployment was re-released successfully on Monday at approx 11am GMT.

Apologies again and thanks for choosing Subbly.

resolved

The incident has been resolved. Checkouts and admin analytics are back up and operational. Apologies for the inconvenience! We will be writing a postmortem explaining what happened soon.

identified

The issue has been identified and a fix is being implemented.

investigating

We're currently experiencing some problems with the admin analytics and checkout facilities due to latest deployments. Working on a quick resolution and reverting to the previous state.

Report: "Load balancer problems"

Last update
resolved

There was an issue with the load balancer on Google Cloud affecting website hosting. This has now been resolved. We may potentially schedule a maintenance to move to a different load balancer to avoid this in future. We apologise for any inconvenience caused.