Historical record of incidents for BentoBox
Report: "Clover E-Commerce Issue"
Last updateClover has reported an issue with its service that is impacting BentoBox customers using Clover for POS and/or payment processing. Link to the Clover status page: https://status.clover.com/ BentoBox will continue to monitor the outage and will update this status page when the issue is resolved. We sincerely apologize for any inconvenience this may cause.
Report: "Bentobox Services Down"
Last updateWe are currently investigating a database outage that is impacting BentoBox services. We will provide updates as soon as possible.
Report: "BentoBox websites, backends and e-commerce stores are currently not loading"
Last update
# Post-Mortem: Snowflake outage

Date: 03/20/2024

### Summary of event

At **12:15 PM EST**, BentoBox response time spiked to a critical 15.2 seconds, leading to timeouts on pages such as Sushi, Online Ordering, and Kitchen. By 1:27 PM EST, Snowflake had identified the root cause: throttling of requests from clients hosted in AWS - US East. **At 1:58 PM EST,** a hotfix was deployed that partially disabled features related to reporting and upsell items to mitigate the issue. General users experienced a service interruption lasting **3 hours and 15 minutes.**

### Timeline of events

* 11:25 AM EST: Snowflake issues started. The outage had not yet been reported.
* 11:45 AM EST: Alerts were received in #bot-pse-alerts related to celery queue jobs not being processed.
* 12:15 PM EST: BentoBox response time spiked from 3.52 seconds to 15.2 seconds [critically high response time]. Customers began noticing slow load times on websites.
* 12:16 PM EST: A timeout error was reported in #info-plataform-updates. As a result, sites like Kitchen, TOAD, and Sushi were not available in production.
* 12:21 PM EST: Snowflake reported an issue with DataCloud, classified as a Partial Outage, that was causing intermittent delays or timeouts for customers.
* 12:59 PM EST: After initial investigation, we suspected a DoS attack caused by requests coming from IP addresses in Europe. As a preventive action, we blocked these IPs.
* 01:27 PM EST: Snowflake identified the issue causing throttling for clients hosted in AWS - US East. Snowflake status changed from Partial Outage to Outage.
* 01:58 PM EST: A hotfix was implemented that partially disabled features related to reporting and upsell items.
* 02:40 PM EST: BentoBox response time dropped to 313.71 milliseconds [healthy response time]. Customers regained access to the websites after an interruption lasting 3 hours and 15 minutes.
* 03:45 PM EST: Snowflake recovered from the outage.
* 06:14 PM EST: General users experienced degraded loading of e-commerce websites.
* 07:50 PM EST: A fix was applied, restoring the expected loading behavior of the e-commerce websites.
* 08:11 PM EST: The BentoBox status page was updated to reflect a "Resolved" status.
* March 20, 12:00 PM EST: The hotfix was reverted, and BentoBox's reporting and upsell item features returned to normal.

### Root cause identification

* Snowflake outage.
    * An unexpected full outage in Snowflake was the primary cause of the incident. A flaw in Snowflake’s automatic scaling system triggered a code issue, leading to resource exhaustion and throttling in systems processing customer requests. As a client hosted in AWS - US East, we experienced delays and timeouts when executing queries or using Snowflake services and features via Snowsight, with queries appearing to be stuck in a running state.
* BentoBox < > Snowflake
    * BentoBox relies on Snowflake primarily for gathering and reporting data on sites such as Sushi and Online Ordering, which were significantly impacted by the Snowflake outage. As a result, attempts to access key metrics (such as upsell items and revenue) delayed BentoBox's responses, ultimately causing timeouts on these pages.

### User impact of the incident

* Customers/diners impacted? The incident resulted in temporary unavailability or increased load times for our backend (Sushi) and client sites. Diners were also impacted by the outage: they were unable to add items to carts.
* Revenue impacting? Yes. Since diners were unable to add items to carts, several orders were not placed.

### Impacted customers

* List of impacted customers or link to report with customer information: This outage impacted all customers.

### Lessons Learned

* What went wrong
    * During Snowflake’s outage, we had no mechanism to disable features relying on it, leading to service disruptions. Since code changes were required, mitigation took longer than necessary.
    * Moreover, this third-party service wasn’t included in our high-priority monitoring list, which delayed incident detection.
* What worked
    * Quick response from all the teams across BentoBox. With everyone on the call, we were able to identify the specific endpoints that were affected by the outage.
    * Help from the developers who implemented the feature, who gave us context about how BentoBox integrates with Snowflake.
* For the future
    * A subscription to Snowflake’s status page was added to our monitoring tools.
    * We will refactor our API to eliminate the real-time dependency on Snowflake. Instead of querying Snowflake live when resources like menus or upsell revenue are requested, we will asynchronously query Snowflake and store the results in a cache or database. This will prevent Snowflake outages from directly impacting our API availability.
    * The SRE team is going to investigate the lack of caching for asset files.
    * Additional monitors are going to be created for our web server resources.
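The "For the future" plan in the post-mortem (serve reporting data from a cache that is refreshed asynchronously, and be able to switch the feature off without a code change) could be sketched roughly as follows. This is only an illustration under stated assumptions, not BentoBox's actual implementation: the `CachedMetric` class, its `enabled` flag, and the `fetch` callable standing in for a Snowflake query are all hypothetical names.

```python
import threading
import time
from typing import Callable, Optional


class CachedMetric:
    """Serve a metric from a local cache; refresh it off the request path.

    Hypothetical sketch: `fetch` stands in for a slow warehouse query.
    """

    def __init__(self, fetch: Callable[[], dict], max_age_seconds: float) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._lock = threading.Lock()
        self._value: Optional[dict] = None
        self._fetched_at = 0.0
        # Hypothetical kill switch: lets operators disable the feature
        # without deploying new code.
        self.enabled = True

    def refresh(self) -> bool:
        """Intended to run in a background worker, never in a web request."""
        try:
            value = self._fetch()
        except Exception:
            # Upstream is slow or down: keep serving the last good value.
            return False
        with self._lock:
            self._value = value
            self._fetched_at = time.monotonic()
        return True

    def get(self) -> Optional[dict]:
        """Return the cached value; this never calls the upstream service."""
        if not self.enabled:
            return None
        with self._lock:
            return self._value

    def is_stale(self) -> bool:
        """True if nothing is cached yet or the cached value is too old."""
        with self._lock:
            if self._value is None:
                return True
            return time.monotonic() - self._fetched_at > self._max_age
```

With this shape, a request handler only ever calls `get()`, so an upstream outage can make the data stale but cannot make the page time out. A periodic background job (for example a scheduled task) would call `refresh()`.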
As of 14:50 EDT, the issue affecting customer websites, backends, and e-commerce stores' ability to load has been resolved. Due to an outage with one of our third-party platforms, Snowflake, features like Best Sellers and Upsell on our e-commerce products, as well as revenue and diner insights on the BentoBox dashboard, are still being affected. If you'd like to monitor Snowflake's status, you can visit this page: https://status.snowflake.com/
A fix has been implemented and we are monitoring the results.
This issue has been identified and a fix is being implemented.
We are continuing to investigate this issue.
BentoBox is currently experiencing an issue that is affecting customer websites, backends, and e-commerce stores' ability to load. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "BentoBox e-commerce stores are experiencing high load times"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
BentoBox is currently experiencing performance issues that are affecting e-commerce stores' ability to load. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Clover e-Commerce Issue"
Last updateThis incident has been resolved.
Clover has reported an issue with its service that is impacting BentoBox customers using Clover for POS and/or payment processing. Link to the Clover status page: https://status.clover.com/ BentoBox will continue to monitor the outage and will update this status page when the issue is resolved. We sincerely apologize for any inconvenience this may cause.
Report: "Instagram grids are not rendering on Bentobox websites"
Last updateThis incident has been resolved. Note: Instagram users must disconnect and reconnect existing integrations to grant updated permissions. The app now only supports Instagram Business accounts. Personal account users need to update their account type.
We are in the process of submitting a Meta app review to gain advanced access for Instagram Business. We expect a response by 12/12 so our fix can be committed.
The issue has been identified and a fix is being implemented.
Our engineering team is investigating the issue and is working to resolve this immediately. Again, we're very sorry for the inconvenience.
Report: "Issues with logging into BentoBox Backend for some users"
Last updateToday, we experienced an issue that prevented some users from logging in to the BentoBox Backend. Our team worked as quickly as possible to resolve this issue, and we were back up and running on 10/29 at 4:06 PM EST. Thank you very much for your patience.
A fix has been implemented and we are monitoring the results.
The issue has been identified and a fix is being implemented.
BentoBox is currently experiencing an issue that is affecting some users logging in to the BentoBox platform. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "BentoBox Backend was down for 7 minutes on 10/2"
Last updateCustomer backends were not available for 7 minutes. Our engineering team identified the issue on October 2 at 6:12 pm EST, and it was resolved at 6:19 pm EST. We are sorry for the inconvenience.
Report: "Meta Apps Integrations are disabled"
Last updateThe Meta Apps Integrations issue has been resolved, and the integrations were reactivated as of 9:12 AM EDT this morning.
We are aware that the Meta Apps Integrations are currently disabled. We are working with Meta to resolve the issue as soon as possible.
Report: "Microsoft Outage Impacting 3rd Party Integrations"
Last updateWe have been notified that Campaign Monitor's outage has been resolved as of 8:13 AM EDT. Their status page can be viewed here: https://status.campaignmonitor.com/incident/554595. We have been monitoring on our end and we don't see any additional errors from Campaign Monitor.
The current Microsoft outage has affected our third-party contact form processor, Campaign Monitor. As a result, diners signed up for email marketing and newsletter subscriptions might not have received expected emails this morning. We are actively monitoring and will follow up when this issue has been resolved.
Report: "BentoBox Customers with Clover Payments May be Experiencing an Outage"
Last updateOn the afternoon of July 10, we experienced an issue with Clover Payments that prevented checkout for some of our e-commerce customers. Our team worked as quickly as possible to resolve this issue, and we were back up and running on July 10th, 1:37 EDT. Thank you very much for your patience. You can view more information on the incident here: https://status.clover.com/
BentoBox is currently affected by an outage with Clover Payments. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Some Online Ordering, Catering, and eCommerce stores are experiencing errors upon checkout"
Last updateToday, we experienced an issue with our E-commerce, Online Ordering, and Pre-Order & Catering stores that caused an error during checkout. Our team worked as quickly as possible to resolve this issue, and we were back up and running on 06/25 at 1:50 PM EST. Thank you very much for your patience.
Our engineering team has identified the issue and is working to resolve this immediately. Again, we're very sorry for the inconvenience.
BentoBox is currently experiencing an issue that is affecting Online Ordering, Catering, and eCommerce store checkouts. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "DoorDash Partner Services Disruption"
Last updateAs of 10:32 EST on June 21st, our DoorDash partner has confirmed that their main service disruption was resolved. For more details on this incident, please visit their status page: https://doordash.statuspage.io/
Our DoorDash partner is currently experiencing a disruption to their services, preventing delivery orders at checkout for some users. Please refer to the DoorDash Status Page as we continue to monitor, https://doordash.statuspage.io/ We're very sorry for the inconvenience and appreciate your patience.
Report: "Some BentoBox users are unable to use the file uploader"
Last updateAt 10:29 AM ET on 5/9/24, we experienced an issue that caused the file uploader to fail for some BentoBox users. Our team worked as quickly as possible to resolve this issue, and we were back up and running at 12:23 PM ET. Thank you very much for your patience.
A fix has been implemented and we are monitoring the results.
Our engineering team has identified the issue affecting the file uploader and is working to resolve this immediately. Again, we're very sorry for the inconvenience.
BentoBox is currently experiencing an issue that is affecting the File Uploader functionality. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "CMS Downtime."
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Some users are having trouble logging into BentoBox"
Last updateAt 10:26 AM ET on 3/6, our engineering team identified an issue impacting some users’ ability to log into BentoBox. Our team worked quickly to resolve the issue, and we were back up and running at 11:19 AM ET. Thank you for your patience.
BentoBox is currently experiencing an issue impacting users' ability to log into BentoBox. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Clover Services Disruption"
Last updateAs of 22:32 EST on February 21st, 2024, our Clover partner has confirmed that their service disruption was resolved. For more details on this incident, please visit their status page: https://status.clover.com/incidents/30vzx8g94f39
Our Clover partner is currently experiencing a disruption to their services, affecting some users during checkout. Please refer to the Clover Status Page as we continue to monitor, https://status.clover.com/ We're very sorry for the inconvenience and appreciate your patience.
Report: "Google Food Ordering is currently down"
Last updateOn February 21st, 2024, we experienced an issue with a third-party outage that caused Google Food Ordering to be disabled for restaurants. Our team worked as quickly as possible to restore the service and we were back up and running at 5:04 pm EST. Thank you very much for your patience.
Our engineering team has identified the issue and is working to resolve this immediately. Again, we're very sorry for the inconvenience.
Report: "Square Services Disruption"
Last updateAs of 17:53 EST on February 6th, our Square partner has confirmed that their main service disruption was resolved. For more details on this incident, please visit their status page: https://www.issquareup.com/
Our Square partner is currently experiencing a disruption to their services, affecting some users during checkout. Not all Square customers are impacted, and some diners are able to check out after a few retries. Please refer to the Square Status Page as we continue to monitor, https://www.issquareup.com/ We're very sorry for the inconvenience and appreciate your patience.
Report: "BentoBox Reservations Planned Maintenance on 1/29 at 2:00am ET"
Last updateBentoBox Reservations planned maintenance on 1/29 at 2:00am ET is complete, and the service is operational as of 3:15am ET.
BentoBox Reservations booking widget will be down on Monday, January 29 at 2:00am ET for routine maintenance. The outage will last approximately 30 minutes. Check back here for additional updates.
Report: "Checkout on Pre-Order & Catering Sites is Down"
Last updateAt 3:40 PM EST on Thursday, October 12th, we experienced an outage affecting Pre-Order & Catering checkout. Our team worked as quickly as possible to resolve this issue, and we were back up and running at 6:24 PM EST. Thank you very much for your patience.
BentoBox is currently experiencing an issue that is impacting Pre-Order & Catering checkout. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "BentoBox Reservations Booking Widget is Currently Down"
Last updateAt 4:36pm EST on Monday August 21st, we experienced an outage affecting the online booking widget for BentoBox Reservations. Our team worked as quickly as possible to resolve this issue, and we were back up and running at 5:30pm EST. Thank you very much for your patience.
BentoBox is currently experiencing an issue that is affecting the BentoBox Reservations booking widget. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Printing Issues for Clover Integrated Online Ordering Stores"
Last updateClover experienced an issue that affected some BentoBox Online Ordering stores integrated with Clover POS. This issue occurred on Friday August 11th and was resolved on Sunday August 13th at 9am. We're very sorry for the inconvenience and appreciate your patience as Clover worked to resolve this.
Clover is currently experiencing an issue that is affecting some BentoBox Online Ordering stores integrated with Clover POS. We are actively monitoring the situation with Clover as they work to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Website CTAs Displayed Incorrectly"
Last updateThis incident has been resolved.
BentoBox is currently experiencing an issue that is affecting website CTAs. Our engineering team has identified the issue and is working to resolve this immediately. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Meta Apps Integrations are disabled"
Last updateThis incident has been resolved. Meta has re-enabled our integrations after a successful appeal.
We are aware that the Meta Apps Integrations are currently disabled. We are working with Meta to resolve the issue as soon as possible.
Report: "Sites slow or unable to load"
Last updateOur websites were loading slowly or unable to load from 6:50pm to 7:09pm and again from 7:24pm to 7:30pm. During this time we faced a highly distributed denial-of-service attack, which our automated systems began blocking. Because the attack was highly distributed and came from several thousand bots, it was able to overwhelm our servers during the periods noted above. We have updated our detection rules and automated blocking mechanisms to better respond to this type of attack.
Report: "BentoBox Reservations is currently down"
Last updateAt 2:00 AM EST on 2/8/2023, we experienced an issue with our AWS database that caused our Reservations product to go down. Our team worked as quickly as possible to resolve this issue, and we were back up and running as of 2/8 11:03 AM EST. Thank you very much for your patience.
We are continuing to investigate this issue.
BentoBox is currently experiencing an issue that is affecting the Reservations product due to an AWS database issue. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Incorrect Tax Calculations on Some E-Commerce Stores"
Last updateThis incident has been resolved.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
We are continuing to investigate this issue.
BentoBox is currently experiencing an issue that is impacting some E-commerce customers using Stores. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Degraded Performance - Catering Store"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring performance.
The issue has been identified and a fix is being implemented.
We are continuing to investigate the issue impacting Catering stores.
We are currently investigating a degraded service issue with our Catering stores.
Report: "Dine-In Ordering is currently down for some users"
Last updateThis issue has been resolved. A bug impacted accounts that did not have "ordering options" selected on their online ordering location settings page, and also had dine-in enabled. The bug has been addressed and users should no longer experience issues with the dine-in feature.
The issue has been identified and a fix is being implemented.
We are continuing to investigate this issue. The service disruption is limited to a subset of our Dine-In E-Commerce users.
We are continuing to investigate this issue. The service disruption is limited to our Dine-In E-Commerce functionality.
We are continuing to investigate this issue.
BentoBox is currently experiencing an issue that is affecting Dine-In Ordering E-Commerce customers. Our engineering team is actively working to investigate the issue as quickly as possible. We're very sorry for the inconvenience and appreciate your patience as we work through this.
Report: "Websites timing out"
Last updateThe code responsible for this issue was rolled back and everything is functioning normally.
We are continuing to investigate this issue.
We are currently investigating reports of websites timing out.
Report: "404 not found page returned to users"
Last updateDue to a change in our CDN configuration, some users who are geographically located close to certain data centers were returned a 404 page communicating that the site they were looking for was not found. The error lasted 4 minutes and was resolved as of 9:32 AM.
Report: "Expired SSL certificates"
Last updateOne of the load balancers in our infrastructure was configured to use an expired SSL certificate. This caused a small percentage of requests to fail with a 503 error between 8pm and 9pm.
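An expiry check of the kind that can catch this class of problem ahead of time might look like the sketch below. It is a minimal illustration, not a description of BentoBox's monitoring: the function names and the 30-day warning window are assumptions. Because the report mentions a single misconfigured load balancer, a check like this would presumably need to target each balancer individually rather than only the public hostname.

```python
import socket
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative once expired).

    `not_after` uses the format returned by getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def needs_renewal(not_after: str, now: Optional[float] = None,
                  warn_days: float = 30) -> bool:
    """True when the certificate expires within the (assumed) warning window."""
    return days_until_expiry(not_after, now) <= warn_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from the certificate served by host:port.

    The default context verifies the certificate, so an already-expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `needs_renewal(fetch_not_after(host))` for each endpoint and alert when it returns true or when the handshake itself fails.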
Report: "Incorrect content returned on menu pages"
Last updateAt approximately 5:21 PM EST on Friday May 20th, 2022, our customer success team reported that menu pages on certain sites were serving content from the wrong website. The URL /menus/ on certain BentoBox websites was incorrectly loading content directly from a single BentoBox website. Our support team immediately received a large number of support tickets from concerned customers. Our incident response process was triggered at 5:30pm and our production support engineering team was notified via PagerDuty. Investigation began immediately. The root cause appeared to be related to site caching, but it was unclear whether it was due to something in the BentoBox application or our edge caching provider. Earlier that day, due to an expected flood of traffic on a single site, a caching feature had been enabled on that site in order to better handle the additional load. Our production support team immediately started work to disable the caching feature for this website. At approximately 6pm, using our feature flag functionality, a caching layer was turned off, which appeared to resolve the issue for most affected customers. Remediation work was paused. Shortly thereafter more reports started coming in, including more affected URLs (menus, locations, about). Remediation work was resumed. At 6:25pm the caching feature was fully turned off and the cache cleared with our content distribution network provider. At 6:30pm the issue was declared resolved and no further issues were detected.
The /menus/ url on BentoBox customer websites is returning content from a single, incorrect, BentoBox customer.
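The report above does not state exactly how the cache came to serve one site's pages to another, so the following is only a sketch of one common safeguard against that symptom in a multi-tenant setup: scoping every cache key to the requesting site as well as the path. `PageCache` and `page_cache_key` are hypothetical names used for illustration, not part of the BentoBox application or its CDN configuration.

```python
from typing import Dict, Optional, Tuple

CacheKey = Tuple[str, str]


def page_cache_key(host: str, path: str) -> CacheKey:
    """Build a cache key scoped to one site.

    Keying on the path alone would let "/menus/" from one site be
    served for every other site that shares the cache.
    """
    # Hostnames are case-insensitive, so normalize before building the key.
    return (host.strip().lower(), path)


class PageCache:
    """Tiny in-memory page cache keyed by (host, path)."""

    def __init__(self) -> None:
        self._pages: Dict[CacheKey, str] = {}

    def put(self, host: str, path: str, body: str) -> None:
        self._pages[page_cache_key(host, path)] = body

    def get(self, host: str, path: str) -> Optional[str]:
        return self._pages.get(page_cache_key(host, path))

    def clear(self) -> None:
        """Drop every entry, comparable to purging the cache after an incident."""
        self._pages.clear()
```

For example, after `cache.put("site-a.example", "/menus/", "Menu A")`, a lookup for `("site-b.example", "/menus/")` misses instead of returning site A's menu.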
Report: "Investigating SSL errors with our CDN provider"
Last updateOur monitoring is no longer showing errors and the issue appears to be resolved.
Our content distribution network provider has acknowledged this issue and is working on a resolution. https://status.fastly.com
We are investigating a number of errors with establishing secure connections with our content distribution network provider. Approximately 4% of connections appear to be failing at this time.
Report: "Slow response times"
Last updateThis incident has been resolved.
A fix has been implemented and we are monitoring the results.
We are continuing to investigate this issue.
We are currently investigating this issue.
Report: "Performance degradation"
Last updateThis incident has been resolved.
We are currently investigating this issue.
Report: "Slowdown on BentoBox services"
Last updateThis incident has been resolved.
We are continuing to investigate this issue.
We are currently investigating a platform-wide performance degradation issue.
Report: "Amazon Web Services (AWS) service outage"
Last updateAWS has resolved the issues on their end and services should be operating as usual. Thank you very much for your patience.
Our engineering team is actively monitoring the issue and awaiting status updates from AWS. We're very sorry for the inconvenience and appreciate your patience as we work through this.
BentoBox is currently experiencing an issue with Amazon Web Services (AWS), which started around 11 AM today, 12/7. In turn, this AWS outage is causing issues across various services, such as Square and TaxJar, that rely on AWS as a service provider.
Report: "Twilio (SMS provider) API errors"
Last updateTwilio has resolved the issue, and all services should be operating normally.
Twilio has identified the issue and connection should be restored. Their queues of messages are still being processed. We will continue to monitor the situation until it is resolved.
Twilio, a service used to send SMS (text) messages for BentoBox order notifications, is currently experiencing an outage that might be impacting the ability of some SMS notifications to be sent. The Twilio engineering team is investigating and working on resolving the issue.
Report: "Filestack media upload service outage"
Last updateFilestack has pushed a fix that has resolved the issue with uploading files.
We are continuing to monitor for any further issues.
The Filestack engineering team is working on resolving the issue.
Filestack, a service used to upload files to BentoBox, is currently experiencing an outage. Their engineering team is investigating and working on resolving the issue.
Report: "Slow response times on BentoBox Websites"
Last updateA BentoBox customer was the target of a DDoS attack from 12:12pm to 12:22pm. The BentoBox team responded immediately and began mitigating the attack by blocking IP addresses. Because the attack was distributed, the blocking took a few minutes to have an impact. We will be revisiting our current DDoS protection and re-evaluating our rules to mitigate these attacks in the future.
Due to a distributed denial of service attack, BentoBox website response times increased dramatically between 12:12pm and 12:22pm. Some websites were unavailable intermittently during that time.
Report: "Delay loading orders from Live Orders screen"
Last updateOur online ordering Live Orders screen experienced a delay in loading new orders as they came in.
Report: "Sites unavailable"
Last updateA deployment issue made sites unavailable.
Report: "Square Outage Impacting Ecommerce"
Last updateThis incident has been resolved.
We are continuing to monitor for any further issues.
Square errors have started to recover.
We are continuing to work on a fix for this issue.
Our payment provider Square is currently experiencing issues. BentoBox customers using Square for payments are not able to process orders. We will update here when this issue is resolved. For the most up to date information please see https://www.issquareup.com/
Report: "Service Degraded - Server Load"
Last updateWe have identified and remedied the root cause behind this issue. We will continue to monitor and perform system upgrades.
We have identified and remedied the root cause behind this issue. We will continue to monitor and perform system upgrades.
We have alleviated the main source of the issue, and are continuing to actively monitor the performance of all sites.
We have alleviated the main source of the issue, and will continue to actively monitor the performance of all sites.
We have alleviated a portion of traffic contributing to slower load times. However we are still experiencing intermittent slowdowns on some sites.
We are continuing to work on a fix for this issue.
We are actively working to isolate and remedy the source of the service degradation and slower than normal load times for BentoBox sites.
Report: "Widespread Slow Down of BentoBox Websites"
Last updateOur team has resolved this incident. BentoBox websites have returned to normal performance levels.
Our engineers have identified the cause of the slow down and are currently working on a solution.
We are currently investigating and will have more information shortly.
Report: "This is a test - SITES ARE DOWN"
Last updateThis incident has been resolved.
This issue is close to being resolved. We appreciate your patience while our engineering team works through this.
We have determined the issue and are now working on a fix. Thank you for your patience.
We are currently investigating the issue and will post an update as soon as we have more information.