Historical record of incidents for Vanilla Forums
Report: "Network connectivity issues in YUL1/Montréal"
Last update: This incident has been resolved.
The switch is back online, and we're waiting for BGP/routing to settle. Connectivity to Cloudflare in particular is taking a moment to recover.
The issue has been identified and a fix is being implemented.
A core switch went offline during maintenance performed with our hosting provider. Both parties are aware of the issue and are actively working to bring it back online as soon as possible.
Report: "Search & Job Queue Outage"
Last update: This incident has been resolved.
Webhooks, notifications, and automation rules will be delayed by 2-3 hours.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Latency Issues in YUL Data Center"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We have identified the exact ISP causing our issues and are awaiting resolution. There is a routing loop along the way between Cloudflare and our datacenter.
We are talking with Cloudflare's support, and they have confirmed seeing the same network issues to our origin that we identified. We are now awaiting resolution on their end.
We are currently investigating issues with sites located in our YUL1 data center. Sites may be slow or inaccessible.
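The routing loop mentioned in this report is the kind of fault that typically shows up in a traceroute as the same hop address appearing more than once. The following is a minimal sketch of that check, not the tooling actually used during the incident. The function name and the sample addresses (taken from the reserved documentation ranges) are hypothetical, and the sketch treats any repeated address as a sign of a loop.

```python
from typing import Optional, Sequence


def find_routing_loop(hops: Sequence[Optional[str]]) -> Optional[str]:
    """Return the first hop address seen twice in a traceroute path.

    `hops` lists the responding address at each TTL, with None standing
    in for an unanswered probe. Returns None when no address repeats.
    """
    seen = set()
    for hop in hops:
        if hop is None:
            # Unanswered probes carry no address, so they cannot prove a loop.
            continue
        if hop in seen:
            return hop
        seen.add(hop)
    return None


# Hypothetical path: the packet bounces between two routers.
looping_path = [
    "192.0.2.1",
    "198.51.100.7",
    "203.0.113.9",
    "198.51.100.7",
    "203.0.113.9",
]
print(find_routing_loop(looping_path))
```

A path with no repeated address returns None, which is what a healthy route between the CDN and the origin would be expected to produce.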
Report: "Outage in YUL-1 (Montréal) Datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Degraded Performance In YUL1 Webhooks & Queue System"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Report: "Site Outages"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We have identified the issue and are pushing a fix now.
We are continuing to investigate this issue.
An infrastructure update is causing global site instability. We are investigating the issue.
Report: "Interruption of Service"
Last update: The problem has been identified and resolved.
We are currently investigating reports of sites being inaccessible.
Report: "Search services disruption"
Last update: Support is closing this as an active issue, but we will continue to monitor.
A fix has been implemented and we are no longer seeing errors for customers using the search service. Support teams are continuing to monitor the results.
Support teams are investigating an outage affecting search services for some customers.
Report: "Knowledge Base Issues on Enterprise Sites"
Last update: We have deployed a fix for this issue and full Knowledge Base access has been restored.
Navigating to Knowledge Base articles directly is resulting in an error. Navigating articles via the Knowledge Base page is working as expected. Our product team is investigating the issue.
Report: "Intermittent Connectivity Issues"
Last update: All hosted communities should be back to normal.
We are continuing to monitor for any further issues.
The worst of the impact on hosted communities should be over at this time, but we are continuing to monitor and will update this status when the upgrades are complete.
Our host is performing some upgrades and this is leading to intermittent errors and poor performance for community sites. We are monitoring the situation.
Report: "Performance degradation in our Montréal (YUL) datacenter"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Ongoing Maintenance on Enterprise Clusters"
Last update: Upgrades are done for the day of 7 March.
We are performing maintenance updates to Enterprise clusters. We do not expect any downtime for sites, but there may be a performance impact while the updates are being applied.
Report: "Partial Outage on Hosted Communities"
Last update: All hosted communities have been returned to full functionality.
We have identified the issue where some hosted communities were down. We have implemented a fix and are now monitoring the issue.
Report: "Partial Service Outage"
Last update: We have received reports that some clients cannot access their community sites.
Report: "Database issues for certain sites in our YUL/Montréal datacenter"
Last update: We've identified some issues related to the database upgrades we performed yesterday. The affected databases are currently rebuilding, and a few customers' sites are affected. We're doing everything we can to bring those back up ASAP.
Report: "Queue Server Outage"
Last update: This incident has been resolved.
We've resolved the issue and the queue is processing jobs again. We are currently adding capacity and monitoring the situation.
The worker servers that execute jobs in our queue service are currently degraded and not processing jobs. Expect delays with webhooks and addition of new records into search.
Report: "Queue Performance Degradation"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Our message queue system is currently experiencing a backlog, which is causing delays in notifications, webhooks, and analytics data. We have identified the issue and are actively working on a solution.
Report: "Outage in YUL1 / Montréal"
Last update: Systems are stable again. Provider is investigating the cause of the network issues.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
We're in contact with our hosting provider and actively investigating. Some VMs have become unreachable. Control plane suspected.
Report: "Performance issues in our YUL-1 datacenter"
Last update: All systems have recovered and our hosting provider is working on replacing the faulty disk.
We've identified performance issues in our YUL-1 datacenter, which our provider traced to increased latency in our storage cluster, causing delays and timeouts. The disk has been taken offline and services are catching up. We're monitoring until full system recovery.
Report: "Queue Performance in Canada / Montreal datacenters is degraded"
Last update: This incident has been resolved.
We've added additional capacity to this queue system and the backlog is being cleared. Delays are down to ~5 minutes. We will monitor until everything is caught up.
Queued jobs such as webhooks, recording of analytics, and sending of email notifications may be delayed by up to 30 minutes.
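As a rough illustration of why adding capacity clears a backlog like the one in this report, the time to catch up depends on how far the processing rate exceeds the rate at which new jobs arrive. The sketch below uses a simple steady-rate model. The function name and all of the numbers are hypothetical and are not measurements from this incident.

```python
import math


def drain_minutes(backlog_jobs: float, arrival_per_min: float,
                  capacity_per_min: float) -> float:
    """Estimate minutes until a queue backlog is cleared.

    Assumes constant arrival and processing rates. If capacity does not
    exceed the arrival rate, the backlog never shrinks and the result is
    infinity.
    """
    surplus = capacity_per_min - arrival_per_min
    if surplus <= 0:
        return math.inf
    return backlog_jobs / surplus


# Hypothetical figures: 3000 queued jobs, 100 new jobs per minute.
print(drain_minutes(3000, 100, 100))  # capacity only matches arrivals
print(drain_minutes(3000, 100, 200))  # capacity doubled
```

Under this model, capacity that merely matches the arrival rate leaves the backlog in place indefinitely, which is why extra workers are needed to catch up rather than just to keep pace.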
Report: "Intermittent 522 errors from Cloudflare's DEN datacenter"
Last update: Cloudflare has rerouted the traffic for that datacenter and the issue appears resolved. As of about 30 minutes ago, all traffic is back to normal and we're now seeing zero dropped requests from DEN.
We've had a long chat with Cloudflare's support team and the issue is being escalated to their network team. There appear to be some occasional packet drops between Denver and our Montréal datacenter. We estimate about 4% of traffic coming in from Denver is taking more than 5 seconds to respond and/or timing out. Traffic originating from all other regions is unaffected and operating normally. We continue to investigate and await further responses from Cloudflare.
Since this morning at approximately 9:30, we've been observing intermittent 522 errors from Cloudflare that are isolated to their DEN (Denver, Colorado, USA) datacenter. No other datacenters are impacted. Users located in Colorado will experience intermittent 522 errors from Cloudflare. We have opened a ticket with Cloudflare and are awaiting updates.
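An estimate like the "about 4% of traffic from Denver" figure above can be derived from request samples grouped by edge location. The following is a minimal sketch of that calculation, assuming samples are available as pairs of edge location code and response time, with None marking a timeout. The function name, the sample format, and the data are hypothetical and do not reflect the actual logs.

```python
from typing import Iterable, Optional, Tuple

Sample = Tuple[str, Optional[float]]  # (edge location code, seconds or None)


def slow_share(samples: Iterable[Sample], colo: str,
               threshold_s: float = 5.0) -> float:
    """Fraction of requests from `colo` that timed out or exceeded the threshold."""
    total = 0
    slow = 0
    for sample_colo, seconds in samples:
        if sample_colo != colo:
            continue
        total += 1
        # A missing response time is counted as a timeout.
        if seconds is None or seconds > threshold_s:
            slow += 1
    return slow / total if total else 0.0


# Hypothetical data: 24 fast Denver requests, 1 Denver timeout, 1 slow request elsewhere.
samples = [("DEN", 0.2)] * 24 + [("DEN", None), ("YUL", 9.0)]
print(f"{slow_share(samples, 'DEN'):.0%}")
```

Filtering by location first keeps slow requests from other regions out of the Denver figure, matching the report's observation that only one edge datacenter was affected.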
Report: "Some Vanilla sites are unavailable"
Last update: An issue with failed hardware has affected many of our communities. We have migrated affected sites to new hardware and functionality has been restored.
We have identified the cause of the issue and are working to fix it.
Some Vanilla sites are unavailable. We are investigating the issue.
Report: "Some Vanilla sites are Unavailable"
Last update: The issue has been resolved. An occurrence of extremely high demand led us to take immediate action. Servers were rebalanced across multiple physical servers with our infrastructure hosting provider. Monitoring shows the systems are operating within normal ranges.
We are currently investigating an issue with customers hosted in Europe.
Report: "Some Vanilla Sites are Unavailable"
Last update: This incident has been resolved. Full functionality has been restored. The teams will investigate the root cause to determine the timeline and affected customers.
Some Vanilla sites are unavailable. We are investigating the issue.
Report: "Outages affecting Vanilla communities hosted in AMS - Europe"
Last update: This is resolved.
The incident has been resolved.
We are continuing to investigate this issue.
This incident has been resolved.
We are currently investigating this issue.
Report: "Major networking disruption in our YUL1 datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Since 3:00 AM EST, a major network disruption has been causing intermittent outages for our customers. We are investigating.
Report: "OSD failure causing intermittent slowness for communities hosted in YUL (Canada)"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Report: "Cloudflare is investigating wide-spread issues with their services and/or network affecting our communities"
Last update: This incident has been resolved.
A fix has been implemented by the Cloudflare team and we are monitoring the results.
Still awaiting further updates from Cloudflare: https://www.cloudflarestatus.com/
For more information: https://www.cloudflarestatus.com/
Report: "Network outage affecting some hosted communities in SJC - US datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Network outage affecting hosted communities in SJC - US datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Report: "Outages affecting communities hosted in YUL - Canada"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "SJC Datacentre Intermittent Outages"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Network Outage affecting communities hosted in Montreal-YUL"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
A fix has been implemented and we are monitoring the results.
An issue affecting all customers in the Montreal data centre has been identified. A fix is being implemented.
We are currently investigating this issue.
Report: "Intermittent slowness across communities"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Staging Cluster Outage"
Last update: This incident has been resolved.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "DAL1 data centre suffering connectivity problems"
Last update: This incident has been resolved.
We are investigating an ongoing issue that is causing the DAL1 environment to be unavailable.
Report: "Cloudflare experiencing elevated Errors in Chicago and LA affecting communities"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Outages affecting limited number of communities hosted on San Francisco Datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
Report: "DNS errors impacting connectivity to communities"
Last update: This incident has been resolved.
We are currently investigating this issue.
Report: "Partial outage in CAN/YUL datacenter"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Packet loss affecting CA DC"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We continue to troubleshoot this problem and work on alternative solutions.
We're experiencing relatively severe packet loss on an internal link between our two Montreal datacenters, which is contributing to degraded performance for sites hosted there. We're working on a resolution but there is no solid ETA at this time.
Report: "Outages affecting communities"
Last update: This incident has been resolved.
We are continuing to monitor for any further issues.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Partial outage in US/SJC1 datacenter"
Last update: This incident has been resolved.
We are continuing to investigate this issue.
We have detected an outage affecting some sites in our US/SJC1 data centre and are investigating.
Report: "Intermittent outages"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.
Report: "Intermittent connectivity loss (Enterprise clusters)"
Last update: This incident has been resolved.
We have noticed some intermittent connectivity problems on our Enterprise tier internal network and are troubleshooting.
Report: "Outage affecting communities hosted in our US data center"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "Isolated template engine errors"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
Some of our customer communities are currently affected by an issue with our templating system, which is causing "Something Has Gone Wrong" errors on certain pages. We are working on a fix and expect to deploy it shortly.
Report: "Outage affecting communities hosted in our US data center"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
We are currently investigating this issue.
Report: "DNS issues affecting communities"
Last update: From Cloudflare: Update - This afternoon we saw an outage across some parts of our network. It was not as a result of an attack. It appears a router on our global backbone announced bad routes and caused some portions of the network to not be available. We believe we have addressed the root cause and are monitoring systems for stability now.
Cloudflare's DNS service is having degraded performance https://www.cloudflarestatus.com/
We are currently investigating this issue.
Report: "Outage affecting communities hosted in our US data center"
Last update: This incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are currently investigating this issue.